Wits research sheds light on language learning and AI

Staff Reporter

New research from the University of the Witwatersrand (Wits) suggests that the way children learn language may help explain how human language becomes more structured over generations, and may also offer insights into the behaviour of large-scale artificial intelligence language models.

The study, recently published in the Proceedings of the National Academy of Sciences under the title “Compositionality and Systematicity Emerge from Iterated Learning in Deep Linear Networks”, examined how language-like data changes as it is passed between generations of learners.

The research focused on “iterated learning”, a theory holding that language evolves over generations as each new learner absorbs, adapts and transmits it. The process can make language more structured because patterns that are easier to learn are retained while less organised parts are lost.

“We built a computer brain with similar characteristics to a child’s, and compared it to behaviours we see in children’s brains. We then fed it data with similar properties found in human language and watched how the generations (versions) of the computer brain learn.”

“It turns out, computer brains find the structure in the data in the same way that children favour certain properties of language in learning. It also showed that the dataset (language) becomes more structured over generations because it makes learning easier,” says lead author Dr Devon Jarvis, Lecturer in the School of Computer Science and Applied Mathematics (CSAM), and Fellow in the Wits Machine Intelligence and Neural Discovery (MIND) Institute.

Jarvis said children learn about language and the world around them in stages, moving from basic categories to finer distinctions.

“First, they learn that plants and animals are different things. Then they learn that there are different types of animals. But at some point, there is a depth of understanding of the world that they just have not reached yet,” says Jarvis.

The university said this kind of learning can be seen when children over-generalise. A child may learn that birds have wings and can fly, and assume this holds for all birds, only to discover later that penguins cannot fly but can swim. Correcting such errors helps children refine their understanding.

“While this progressive acquisition of knowledge has its benefits, the work focused on the implications for generations of learners. A child learns some language from their parents, and they will eventually pass it on to their own children. Due to the complexity of language, this transmission introduces mistakes.”

“Just like the penguin example, these mistakes are not arbitrary and result from the over-generalisation of knowledge. The net result is that easy portions of language to learn are remembered and reused, while the more unstructured portions are forgotten. Essentially, individuals are good at learning but only with the pressure of communication do we really see the depth of their intelligence,” explains Jarvis.
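The transmission process Jarvis describes can be illustrated with a toy simulation. This is a minimal sketch and not the model used in the study, which relied on deep linear networks. The eight verbs, the learner and the "bottleneck" of five verbs heard per generation are all hypothetical choices made for illustration. Each learner memorises the forms it hears and over-generalises the most common pattern to everything it did not hear.

```python
import random

# Hypothetical starting language: verb -> past-tense form.
LANGUAGE = {
    "walk": "walked", "jump": "jumped", "play": "played", "talk": "talked",
    "go": "went", "sing": "sang", "run": "ran", "eat": "ate",
}

def learn(observed, verbs):
    """Memorise the forms that were heard; over-generalise to the rest."""
    # Induce the most common suffix among heard forms that extend their verb.
    suffixes = [form[len(verb):] for verb, form in observed.items()
                if form.startswith(verb)]
    rule = max(set(suffixes), key=suffixes.count) if suffixes else ""
    # Unheard verbs get the induced rule (the "penguin"-style mistake).
    return {verb: observed.get(verb, verb + rule) for verb in verbs}

def count_regular(language):
    """Number of verbs that follow the '-ed' pattern (assumed to be the regular one)."""
    return sum(form == verb + "ed" for verb, form in language.items())

def iterate(language, generations=20, bottleneck=5, seed=0):
    """Pass the language down a chain of learners, each hearing only part of it."""
    rng = random.Random(seed)
    history = [count_regular(language)]
    for _ in range(generations):
        heard = rng.sample(sorted(language), bottleneck)
        observed = {verb: language[verb] for verb in heard}
        language = learn(observed, sorted(language))
        history.append(count_regular(language))
    return language, history

final_language, history = iterate(LANGUAGE)
print(history)          # regular-verb count per generation
print(final_language)
```

In this sketch an irregular form survives only for as long as every generation happens to hear it, so the count of regular forms can rise but never fall. That mirrors the point that structured portions of a language are reused while unstructured portions are forgotten.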

The researchers used deep linear neural networks, mathematical models that mimic aspects of how the brain processes information, to investigate the neural basis of this process.

They found that iterated learning worked best when two conditions were met: the network had sufficient depth (multiple layers of processing), and the language was sufficiently complex. Shallow networks, with fewer layers, were unable to capture the structured regularities that make language easier to learn.
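The role of depth in learning by stages can be illustrated with a short sketch. It does not reproduce the study's experiments. It assumes a single pattern in the data can be stood in for by one number, its "strength", and that a deep learner represents it as the product of two weights while a shallow learner uses one weight. The strengths 3.0 and 1.0, the learning rate, the starting weights and the 90% threshold are hypothetical values.

```python
def steps_to_learn(strength, deep, lr=0.01, init=0.01, max_steps=10_000):
    """Count gradient steps until the output reaches 90% of `strength`."""
    w1 = w2 = init   # deep learner: output is the product w2 * w1
    a = 0.0          # shallow learner: output is the single weight a
    for step in range(1, max_steps + 1):
        if deep:
            error = strength - w2 * w1
            # Simultaneous gradient step on 0.5 * error**2 for both layers.
            w1, w2 = w1 + lr * error * w2, w2 + lr * error * w1
            output = w2 * w1
        else:
            error = strength - a
            a += lr * error   # gradient step on 0.5 * error**2
            output = a
        if output >= 0.9 * strength:
            return step
    return max_steps

for strength in (3.0, 1.0):
    print(strength,
          "deep:", steps_to_learn(strength, deep=True),
          "shallow:", steps_to_learn(strength, deep=False))
```

In this toy setting the deep learner picks up the stronger pattern in fewer steps than the weaker one, which echoes learning in stages. The shallow learner reaches the threshold for both patterns after the same number of steps, so no such ordering appears.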

The findings suggest that the design of a learning system, whether biological or artificial, and the richness of the environment in which it learns, are important in determining how language structure is absorbed and passed on.

The research also has implications for understanding generative AI systems, which depend heavily on scale and layered processing for their emerging capabilities.

Jarvis continues: “The pieces of this work have been around in the various literatures for a while now. Deep linear networks are established models of child development and iterated learning has been known to linguists for many years.”

“But it is the combination of these two perspectives that seems to make a useful point: that language evolves to become learnable based on the very specific nature of how children learn in stages and favour reusing information over learning new things.”

“The fact that this was shown in a very simple version of the technology underpinning the modern boom in AI tools is also encouraging and suggests that in the intersection of multiple fields lies the fundamental principles of cognition.”

The paper was co-authored by Professor Richard Klein, Head of the School of CSAM and Fellow in the Wits MIND Institute; Professor Benjamin Rosman, Director of the Wits MIND Institute and researcher in CSAM; and Professor Andrew Saxe of the Gatsby Unit and Sainsbury Wellcome Centre at University College London.

INSIDE EDUCATION