MIT News | Massachusetts Institute of Technology

Cognitive scientists define critical period for learning language

A great deal of evidence suggests that it is more difficult to learn a new language as an adult than as a child, which has led scientists to propose that there is a “critical period” for language learning. However, the length of this period and its underlying causes remain unknown.

A new study performed at MIT suggests that children remain very skilled at learning the grammar of a new language much longer than expected — up to the age of 17 or 18. However, the study also found that it is nearly impossible for people to achieve proficiency similar to that of a native speaker unless they start learning a language by the age of 10.

“If you want to have native-like knowledge of English grammar you should start by about 10 years old. We don’t see very much difference between people who start at birth and people who start at 10, but we start seeing a decline after that,” says Joshua Hartshorne, an assistant professor of psychology at Boston College, who conducted this study as a postdoc at MIT.

People who start learning a language between 10 and 18 will still learn quickly, but since they have a shorter window before their learning ability declines, they do not achieve the proficiency of native speakers, the researchers found. The findings are based on an analysis of a grammar quiz taken by nearly 670,000 people, which is by far the largest dataset that anyone has assembled for a study of language-learning ability.

“It’s been very difficult until now to get all the data you would need to answer this question of how long the critical period lasts,” says Josh Tenenbaum, an MIT professor of brain and cognitive sciences and an author of the paper. “This is one of those rare opportunities in science where we could work on a question that is very old, that many smart people have thought about and written about, and take a new perspective and see something that maybe other people haven’t.”

Steven Pinker, a professor of psychology at Harvard University, is also an author of the paper, which appears in the journal Cognition on May 1.

Quick learners

While it’s typical for children to pick up languages more easily than adults — a phenomenon often seen in families that immigrate to a new country — this trend has been difficult to study in a laboratory setting. Researchers who brought adults and children into a lab, taught them some new elements of language, and then tested them, found that adults were actually better at learning under those conditions. Such studies likely do not accurately replicate the process of long-term learning, Hartshorne says.

“Whatever it is that results in what we see in day-to-day life with adults having difficulty in fully acquiring the language, it happens over a really long timescale,” he says.

Following people as they learn a language over many years is difficult and time-consuming, so the researchers came up with a different approach. They decided to take snapshots of hundreds of thousands of people who were in different stages of learning English. By measuring the grammatical ability of many people of different ages, who started learning English at different points in their life, they could get enough data to come to some meaningful conclusions.

Hartshorne’s original estimate was that they needed at least half a million participants — unprecedented for this type of study. Faced with the challenge of attracting so many test subjects, he set out to create a grammar quiz that would be entertaining enough to go viral.

With the help of some MIT undergraduates, Hartshorne scoured scientific papers on language learning to discover the grammatical rules most likely to trip up a non-native speaker. He wrote questions that would reveal these errors, such as determining whether a sentence such as “Yesterday John wanted to won the race” is grammatically correct. 

To entice more people to take the test, he also included questions that were not necessary for measuring language learning, but were designed to reveal which dialect of English the test-taker speaks. For example, an English speaker from Canada might find the sentence “I’m done dinner” correct, while most others would not.

Within hours after being posted on Facebook, the 10-minute quiz “Which English?” had gone viral.

“The next few weeks were spent keeping the website running, because the amount of traffic we were getting was just overwhelming,” Hartshorne says. “That’s how I knew the experiment was sufficiently fun.”

A long critical period

After taking the quiz, users were asked to reveal their current age and the age at which they began learning English, as well as other information about their language background. The researchers ended up with complete data for 669,498 people, and once they had this huge amount of data, they had to figure out how to analyze it.

“We had to tease apart how many years has someone been studying this language, when they started speaking it, and what kind of exposure have they been getting: Were they learning in a class or were they immigrants to an English-speaking country?” Hartshorne says.

The researchers developed and tested a variety of computational models to see which was most consistent with their results, and found that the best explanation for their data is that grammar-learning ability remains strong until age 17 or 18, at which point it drops. The findings suggest that the critical period for learning language is much longer than cognitive scientists had previously thought.
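The reasoning behind this result can be illustrated with a toy calculation. The sketch below is not the researchers' model; it is a minimal illustration that assumes a learning rate that stays constant until a hypothetical critical age and then decays exponentially, with proficiency saturating as learning accumulates. The names `learning_rate` and `proficiency` and every numeric constant are assumptions chosen for illustration only.

```python
import math

# All numbers below are hypothetical, chosen only to illustrate the argument.
CRITICAL_AGE = 17.5   # assumed age at which learning ability starts to drop
BASE_RATE = 0.6       # assumed learning rate per year before the drop
DECAY = 0.5           # assumed speed of the decline after the drop


def learning_rate(age):
    """Constant rate up to CRITICAL_AGE, exponential decline afterwards."""
    if age <= CRITICAL_AGE:
        return BASE_RATE
    return BASE_RATE * math.exp(-DECAY * (age - CRITICAL_AGE))


def proficiency(start_age, years_of_exposure=30, step=0.01):
    """Accumulate learning over the exposure period (midpoint rule).

    Proficiency is 1 - exp(-accumulated learning), so it approaches
    but never exceeds 1.
    """
    steps = int(round(years_of_exposure / step))
    total = 0.0
    for i in range(steps):
        age = start_age + (i + 0.5) * step
        total += learning_rate(age) * step
    return 1.0 - math.exp(-total)


if __name__ == "__main__":
    for start in (0, 10, 15, 20, 30):
        print(f"start at {start:2d}: proficiency {proficiency(start):.3f}")
```

Under these assumed values, learners who start at birth and at 10 end up almost indistinguishable, while those who start later have fewer high-rate years before the decline and plateau at a lower level, even though the rate itself does not fall until the late teens.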

“It was surprising to us,” Hartshorne says. “The debate had been over whether it declines from birth, starts declining at 5 years old, or starts declining starting at puberty.”

The authors note that adults are still good at learning foreign languages, but they will not be able to reach the level of a native speaker if they begin learning as a teenager or as an adult.

“Although it has long been observed that learning a second language is easier early in life, this study provides the most compelling evidence to date that there is a specific time in life after which the ability to learn the grammar of a new language declines,” says Mahesh Srinivasan, an assistant professor of psychology at the University of California at Berkeley, who was not involved in the study. “This is a major step forward for the field. The study also opens surprising, new questions, because it suggests that the critical period closes much later than previously thought.”

Still unknown is what causes the critical period to end around age 18. The researchers suggest that cultural factors may play a role, but there may also be changes in brain plasticity that occur around that age.

“It’s possible that there’s a biological change. It’s also possible that it’s something social or cultural,” Tenenbaum says. “There’s roughly a period of being a minor that goes up to about age 17 or 18 in many societies. After that, you leave your home, maybe you work full time, or you become a specialized university student. All of those might impact your learning rate for any language.”

Hartshorne now plans to run some related studies in his lab at Boston College, including one that will compare native and non-native speakers of Spanish. He also plans to study whether individual aspects of grammar have different critical periods, and whether other elements of language skill such as accent have a shorter critical period.

The researchers also hope that other scientists will make use of their data, which they have posted online, for additional studies.

“There are lots of other things going on in this data that somebody could analyze,” Hartshorne says. “We do want to draw other scientists’ attention to the fact that the data is out there and they can use it.”

The research was funded by the National Institutes of Health and MIT’s Center for Minds, Brains, and Machines.

Press mentions

Bucking conventional wisdom, research co-authored by Prof. Josh Tenenbaum shows that “picking up the subtleties of grammar in a second language does not fade until well into the teens,” writes Dana G. Smith for Scientific American. “To become completely fluent, however, learning should start before the age of 10.”

New research suggests “children are highly skilled at learning the grammar of a new language up until the age of 17 or 18, much longer than previously thought,” reports Kashmira Gander in Newsweek. “We may need to go back to the drawing board in trying to explain why adults have trouble learning language,” Joshua Hartshorne, who co-wrote the study as a postdoc at MIT, tells Gander.

A study co-authored by Prof. Josh Tenenbaum finds that learning a new language should start before age 10 to achieve a native-like grasp of the grammar, reports BBC News. People remain highly skilled language learners until about 17 or 18, but then fall off, which Tenenbaum says could be due to “a biological change” or “something social or cultural.”

Rethinking the critical period for language: New insights into an old question from American Sign Language

Rachel I. Mayberry

Department of Linguistics, University of California San Diego

Robert Kluender

We thank the commentators for their thoughtful critiques, which we found both insightful and stimulating to our own thinking. Our first response is that, while debates about the CPL in theoretical contexts are important, the vigor and intensity of these debates should not overshadow the fact that the main goal of our article was to highlight a finding of vital importance: Sufficient language input in early childhood matters deeply because it has long-term consequences ( Lillo-Martin, 2018 ). Woll sums up this point both succinctly and poignantly in her report of a similar case of very late L1 exposure in adulthood who had decades of experience: “For a [deaf] child who, even in the context of early intervention, does not acquire a spoken language, the danger is that they will never have native-like mastery of any L1.” This is what truly matters. Our hope is that our keynote article and the accompanying commentaries might have a positive effect on clinical practice, educational policy, and even parental choice in this regard. In what follows, we discuss the main issues arising from the commentaries. First we note the points of agreement followed by a clarification of what we did not claim in our article. Researchers continue to debate what the shape of the AoA function looks like and its theoretical implications, which we address third. We then address the issues raised as to whether late L1 acquisition and late L2 learning differ in degree or kind, and last we discuss what we mean when we say that language acquisition during post-natal brain growth creates the capacity to learn language.

Points of agreement

The good news is that there is widespread agreement among language researchers about the CPL. Until recently, to our knowledge, no one had questioned that there is a steady decline in ultimate attainment the longer that exposure to any language – first, second, or Nth – is delayed over the childhood years. This consensus may be in need of modification in light of the large-scale findings reported in Hartshorne, Tenenbaum, and Pinker (in press) with regard to L2 syntactic learning ability, as cited by Birdsong and Quinto-Pozos in their commentary. There are also questions about the exact slope of this decline at different levels of linguistic analysis (Bley-Vroman, 2018; Lillo-Martin, 2018; Long & Granena, 2018; Veríssimo, 2018). However, there seems to be no dispute about the general phenomenon and that it affects sign language too (Bialystok & Kroll, 2018; Birdsong & Quinto-Pozos, 2018; Bley-Vroman, 2018; DeKeyser, 2018; Hyltenstam, 2018; Lillo-Martin, 2018; Veríssimo, 2018; White, 2018). This in and of itself is a significant advance over Lenneberg’s (1967) original hypothesis, which made no such prediction, and we should not lose sight of this fact. Indeed, research conducted by several of the commentators helped establish the finding, and for this we can all be thankful. At the same time we recognize that this piece of scientific convergence has not made its way into public discourse, which behooves us to redouble our outreach efforts.

A major point of disagreement among the commentators centers on the question of whether there is a similar steady decline across the adult years. Regardless of which answer ends up being correct, we need to bear in mind what is at stake. Adults (at least those with a first language established during childhood) will simply continue to muddle through, learning an additional language one way or the other as they have always done. As White puts it, the L2 “ [...] acquisition task involves coming up with a linguistic system that allows the learner to use the L2 (in comprehension and production). The task is NOT to arrive at a grammar identical to that of a native speaker.” We would add that the L1 acquisition task is to come up with grammar in the first place, and that the ability to do so declines sharply the longer the child matures without language. Not all commentators agreed, however, about the differences between these two learning situations, which we address in detail below. Before discussing the shape and interpretation of the L2 AoA function, and comments about the L1 AoA function, we wish to set the record straight with regard to what we did not claim in our keynote article.

What we did not claim

For the sake of clarity, we would like to disavow certain positions that have been attributed to us in some of the commentaries, but that we do not in fact hold: Namely, that language learning ability functions flawlessly in adulthood, that L1 and L2 learning are the same, that L2 learners consistently perform on par with native speaker or signer controls, that all L2 learners attain near-native levels of proficiency, or that there are no differences between near-native and native-like language ability (Abrahamsson, 2018; Hyltenstam, 2018; Birdsong & Quinto-Pozos, 2018). Some of these misconceptions may have arisen from our discussion of the possibility of L2 learners acquiring a native-like accent in section 2.1 ‘Phonological Effects.’ As native-like pronunciation has often been touted as the sine qua non of critical period studies, we chose to focus precisely on this aspect of L2 acquisition in some detail in order to ascertain the extent to which it may, in fact, be at all possible. In this vein, we hasten to add that the 4% figure cited in a footnote, which several commentators mentioned (Abrahamsson, 2018; Lillo-Martin, 2018; Long & Granena, 2018), was merely the lowest percentage of participants claimed to have performed within native speaker accent norms by the authors of any of the L2 pronunciation studies that we reviewed. We fully recognize the limitations of subjective assessments of accent, which is why we also included acoustic studies of pronunciation in our review. By no means did we intend to claim that 4% of L2 learners in the studies we reviewed demonstrate equivalent proficiency to native speakers at all levels of linguistic analysis, nor that 4% of all L2 learners perform at native levels. Rather, we explicitly stated that “[w]e are not claiming [native-like accent] to be the norm in L2 acquisition. To the contrary, everyone is anecdotally aware of the difficulty of achieving native-like pronunciation in a language acquired after childhood.”

Some commentators also came away with the impression that – because we stated in our keynote article that “there is no animal model with which to study a CP for language (CPL)” – we must believe that the study of neuronal mechanisms underlying CP plasticity in various animal species is irrelevant to studies of the CPL (Reh, Arredondo & Werker, 2018). To the contrary, we find such studies vitally important, and in fact cited one such study in our keynote article that showed effects of both critical period social isolation and corresponding knockout experiments in mice on oligodendrocyte maturation/myelination (Makinodan, Rosen, Ito & Corfas, 2012). The relevance of this animal study is that myelination has been demonstrated to persist into the third decade of human life (Miller, Duka, Stimpson, Schapiro, Baze, McArthur, Fobbs, Sousa, Sestan, Wildman, Lipovich, Kuzawa, Hof & Sherwood, 2012) and post-critical period L2 learning has been suggested to enhance myelination in college age populations (as measured by increases in fractional anisotropy and decreases in radial diffusivity) and thereby expand “the functionality of networks involved in learning by altering the underlying anatomy” (Schlegel, Rudelson & Tse, 2012, p. 1669). Moreover, even studies of adult songbirds have demonstrated plasticity of vocal learning late in life. After deafening and recovery from hair cell destruction, adult domesticated society finches were not only able to relearn their songs, but these relearned songs more closely matched those of cagemates with which they were housed during recovery than their own original songs before deafening (Woolley & Rubel, 2002). Thus the claim that “the evolutionary function of a CP is [not] to develop language-learning skills to be utilized beyond the closure of the CP, no more than the function of a CP for birdsong is for the bird to develop skill for future birdsong learning” (Abrahamsson, 2018) cannot stand, at least for this particular species.

The shape of the L2 AoA function

One aspect of our keynote article that elicited comments from the greatest number of commentators concerned the shape of the AoA function, whether the linearity of the documented decline in ultimate attainment across the lifespan is continuous or discontinuous, and whether this decline reflects biological, L1 entrenchment, or socioenvironmental factors including L2 input ( Bialystok & Kroll, 2018 ; Birdsong & Quinto-Pozos, 2018 ; DeKeyser, 2018 ; Hyltenstam, 2018 ; Long & Granena, 2018 ; Newport, 2018 ; Veríssimo, 2018 ). Unsurprisingly, these are the commentators who have conducted studies investigating these questions. Opinions range all the way from the conclusion that there is no critical period at all ( Bialystok & Kroll, 2018 ) to the conclusion that there are multiple critical periods with different age cutoffs for different linguistic phenomena ( Bley-Vroman, 2018 ; Lillo-Martin, 2018 ; Long & Granena, 2018 ; Veríssimo, 2018 ).

Here we flesh out some of the ideas in our keynote article that we were unable to elaborate on for lack of space. First and foremost, we emphasize that we are, for the most part, consumers rather than producers of L2 AoA research, and therefore come to this literature as interested but reasonably dispassionate observers. One inference that seems apparent to us, as alluded to in our original article, is that it would be desirable for the field to agree on how much of a difference (in variance accounted for) is sufficient to warrant choosing one alternative hypothesis over the other.

The problem is exemplified by the analytical and interpretative differences across several of the L2 studies we reviewed. For example, Johnson and Newport (1989) reported correlation coefficients of −.77 for one linear function across all participants in their study with AoAs of 3–39 years, but of −.87 for participants with AoAs of 3–15 years, and no significant correlation for participants with AoAs of 17–39 years. Reanalyzing the data, Elman, Bates, Johnson, Karmiloff-Smith, Parisi, and Plunkett (1996) reported that Johnson and Newport’s two linear functions accounted for 39.25% of the variance in the distribution, while a curvilinear function accounted for 63.1%. As another example, when restricting the range of AoAs to 7–18 years in their analyses of morphosyntactic ability, Flege, Yeni-Komshian, and Liu (1999) reported that a sigmoid function accounted for 5% more of the variance than did a linear function, and conceded that this might be evidence for a discontinuity around an AoA of 12 years, although they also found linear correlations on either side of the 12–15 age range, with a more robust correlation before than after. In contrast, Granena and Long (2013) reported better fits with regression models incorporating two versus no AoA breakpoints for phonology, morphosyntax, and lexis/collocation alike, but then conceded that “the increase in variance accounted for, even if significant, was only around 5%. This could mean that the less complex (i.e. more parsimonious) model with no breakpoints is already a good enough fit to the data [...]” (pp. 326f). Note that both sets of authors argue against their own preferred hypotheses in this instance. We were also aware that when age at time of testing was partialed out in the DeKeyser, Alfi-Shabtay, and Ravid (2010) study, only L2 learners with an AoA of up to 18 years still showed a correlation with proficiency. This correlation disappeared when the group was split at age 12 (possibly for lack of statistical power, as the authors suggest), but mean scores on the grammaticality judgment task still differed significantly between those with an AoA above or below age 12. Our issue is that both the correlation and the mean split appear to be largely attributable to where the earliest cutoff was drawn in the data set, namely at age 18. It is not clear that either would hold up if the cutoff point had been changed to age 15, or for that matter age 19: Namely, the scores of the three U.S. participants with an AoA of 19 fell squarely in the middle of the distribution for those with AoAs of 15 or below, and well above those of participants with AoAs of 16 or 17. Similar problems exist in the Israeli data set.
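The kind of comparison at issue in these studies can be made concrete with a small sketch. The code below does not reproduce any of the cited analyses; it assumes entirely synthetic scores (a linear decline up to a hypothetical AoA of 16, flat afterwards, plus deterministic pseudo-noise) and compares the proportion of variance accounted for by one straight line against two separately fitted lines on either side of a breakpoint. The function names and the breakpoint value are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_ss(xs, ys):
    """Sum of squared residuals around the least-squares line."""
    slope, intercept = fit_line(xs, ys)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def total_ss(ys):
    """Sum of squared deviations from the mean."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r2_single_line(xs, ys):
    """Variance accounted for by one line across the whole AoA range."""
    return 1.0 - residual_ss(xs, ys) / total_ss(ys)


def r2_two_lines(xs, ys, breakpoint):
    """Variance accounted for by separate lines at or below and above the breakpoint."""
    early = [(x, y) for x, y in zip(xs, ys) if x <= breakpoint]
    late = [(x, y) for x, y in zip(xs, ys) if x > breakpoint]
    rss = residual_ss(*zip(*early)) + residual_ss(*zip(*late))
    return 1.0 - rss / total_ss(ys)


def synthetic_scores():
    """Synthetic data only: decline until AoA 16, flat afterwards, small noise."""
    aoas = list(range(3, 40))
    scores = []
    for i, aoa in enumerate(aoas):
        trend = 270 - 6 * (aoa - 3) if aoa <= 16 else 192
        noise = (i * 7) % 5 - 2  # deterministic values in -2..2
        scores.append(trend + noise)
    return aoas, scores


if __name__ == "__main__":
    aoas, scores = synthetic_scores()
    single = r2_single_line(aoas, scores)
    split = r2_two_lines(aoas, scores, breakpoint=16)
    print(f"one line:  R^2 = {single:.3f}")
    print(f"two lines: R^2 = {split:.3f}")
    print(f"gain:      {split - single:.3f}")
```

Because two independently fitted segments can always reproduce a single line, the two-line model can never account for less variance than the one-line model. A gain in variance accounted for is therefore expected by construction, and the open question is how large the gain must be before the more complex model is preferred.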

Beyond debates about the shape of the AoA function, however, there is the question of what counts as ‘puberty’. As we noted in our original article, and discussed above, across L2 AoA studies, the cutoff points (for what might pass for puberty) span nearly a decade: age 12 ( DeKeyser et al., 2010 ), age 15 ( Flege et al., 1999 ), age 16 ( Abrahamsson & Hyltenstam, 2008 , 2009 ), age 17 ( Johnson & Newport, 1989 ), age 18 ( DeKeyser et al., 2010 ), and age 20 ( Birdsong & Molis, 2001 ). Obviously, the rate at which individuals reach sexual maturity varies widely from case to case, and if the goal is to fix a point at which childhood neural development officially ends, it might be worth trying to locate this point on an individual basis. This is not an academic exercise. It would parallel the Pena, Werker, and Dehaene-Lambertz (2012) study of premature babies in order to determine if the benchmarks of phonological organization in the first year of life are tied to neural development or to the extent of language exposure. This work shows that sensitivity to linguistic input, at least for phonological learning, is yoked to phases in brain development. As Newport points out in her commentary, we now know much more about the brain changes that occur throughout adulthood that may, or may not, relate to L2 AoA effects in adulthood. In their commentary, Reh, Arredondo, and Werker (2018) suggest that the slower maturation of the frontal lobe relative to the earlier maturation of posterior brain areas may play a role in L2 learning, as also proposed by Thompson-Schill, Ramscar, and Chrysikou (2009) .

Several commentators ( Abrahamsson, 2018 ; Hyltenstam, 2018 ; Veríssimo, 2018 ) suggested that the materials used to test L2 proficiency are often lacking in theoretical sophistication and empirical rigor, and that the results of such studies should be taken with a grain of salt. We could not agree more, not only with regard to studies of ultimate attainment, but also – and perhaps especially – in neuroimaging studies. L2 proficiency has been defined in mostly ad hoc ways in the literature. L2 researchers could avail themselves of already established, comprehensive and detailed systems for determining language proficiency levels, including near-native and native: the Interagency Language Roundtable (ILR) Oral Proficiency Interview (OPI), used across U.S. federal service agencies for decades now, and the Common European Framework of Reference (CEFR) for languages, in use within the EU since 1996. In fact, these two scales have been calibrated against each other for a decade now as well, so their neglect in L2 research is puzzling.

Language measurement and cognitive factors in late L1 acquisition

The fact that sign languages are subject to AoA effects prompted several commentators to conclude that sign languages are just like spoken languages ( Birdsong & Quinto-Pozos, 2018 ; Bialystok & Kroll, 2018 ; DeKeyser, 2018 ; Veríssimo, 2018 ) and we of course agree. We also agree with the commentators who pointed out that the stimuli and tasks for AoA studies need to be carefully selected to determine whether different critical periods may differentially affect varying levels or domains of linguistic structures ( Abrahamsson, 2018 ; Lillo-Martin, 2018 ; Veríssimo, 2018 ). In turn, these commentators would probably agree with us that the particular question under investigation and the formal linguistic descriptions available for the language under study determine how AoA experiments can be crafted. It is easy to lose sight of the fact that sign languages have only recently been distinguished from gesture and admitted into the family of human languages ( Goldin-Meadow & Brentari, 2015 ). Research detailing the grammar of sign languages, ASL in particular, remains in its infancy compared with the long-available descriptions of, for example, English, German, Turkish, or Swedish. Our initial studies had the goal of determining whether AoA effects were apparent in ASL at all – hence, our use of global processing measures like shadowing or sentence memory ( Veríssimo, 2018 ). While much progress has been made describing ASL grammar ( Sandler & Lillo-Martin, 2006 ), linguists disagree about such basic linguistic phenomena in ASL as syllabification, verb agreement, anaphora, or pronominal forms, among others ( Frederiksen & Mayberry, 2016 ; Lillo-Martin & Meier, 2011 ; Wilbur, 2011 ). Such ambiguities in formal linguistic description make it difficult, but not impossible, to ask whether late L1 acquisition affects particular domains of ASL grammar more than others.

We agree that all of a person’s language representations, which can include more than one language in more than one sensory-motor modality, come into play during language processing ( Bialystok & Kroll, 2018 ; Birdsong & Quinto-Pozos, 2018 ). Our working definition of late L1 acquisition is that the learner has few linguistic representations available at the onset of his or her initial ASL exposure. Deaf ASL signers who have linguistic representations available to them in other languages and forms that were established in early life perform at levels closer to those of earlier learners than late L1 learners. We were able to observe this in our original AoA studies. For example, when the task became difficult, one deaf signer began to reproduce ASL sentences entirely in fingerspelling, which formed the primary basis of this individual’s early education. Another participant began subvocalizing when the ASL sentence task became hard; this participant became deaf at age 4 rather than at birth. It was these observations of deaf participants using languages and forms other than ASL that led us to hypothesize that critical period primarily affects first language acquisition ( Mayberry, 1993 ; Mayberry & Lock, 2003 ; Mayberry, Lock & Kazmi, 2002 ). We do not pre-screen potential participants for their ASL skills. Instead we screen them for early language experience according to self-report. Some individuals who self-report as late L1 learners are clearly more akin to late “quasi-L2 ASL” learners. We believe this accounts for the individual variation apparent in some of our studies ( Emmorey, 2018 ).

In her commentary, Emmorey asks whether individual differences among late L1 learners might be due to varying levels of motivation or cognitive abilities. Although we have not attempted to measure it, the motivation of deaf individuals to learn ASL, including those who learn it after minimal childhood language experience, is extremely high. This is illustrated by the stories deaf signers tell about their ASL learning and how it has changed their lives ( Valli, Lucas, Farb & Kulick, 1992 ). The life transforming attributes of learning ASL are a common theme in ASL poetry and literature ( Perlmutter, 2008 ). The logical problem with attributing attenuated levels of ASL development to working memory is that its development is known to be inextricably tied to language development ( Gathercole & Baddeley, 1993 ). Working memory further relates to the development of executive function, which is also correlated with level of language development ( Botting, Jones, Marshall, Denmark, Atkinson & Morgan, 2017 ; Hall, Eigsti, Bortfeld & Lillo-Martin, 2017 ). In this sense, working memory and executive function might be considered as being comorbidities of acquiring a first language after early childhood.

With respect to general cognitive functioning, it is important to know that hundreds of studies of deaf individuals’ IQ – spanning more than a century – have repeatedly found the deaf population to score within the normal range of the hearing population on non-verbal IQ scales, despite widespread language deprivation in this population (Braden, 1994; Mayberry, 2002). Among the cognitive skills tapped by various non-verbal IQ tasks, spatial cognition is a notable strength among late L1 learners. For example, many late L1 learners have excellent navigation and drawing skills. Consistent with spatial cognitive strengths, late L1 learners show greater proficiency with ASL classifier constructions that encode spatial relations than with ASL syntactic constructions that do not (Boudreault & Mayberry, 2006; Mayberry, Cheng, Hatrak & Ilkbasaran, in preparation). These linguistic strengths begin to address the question of whether some aspects of linguistic structure are more sensitive to AoA than others (Bley-Vroman, 2018; Lillo-Martin, 2018; Long & Granena, 2018; Veríssimo, 2018), an important question in need of further investigation.

The quantity and quality of linguistic input, in various cognitive domains, and education may interact with L1 AoA effects, as several commentators suggested (Birdsong & Quinto-Pozos, 2018; Emmorey, 2018; Flege, 2018; Long & Granena, 2018; Newport, 2018). We agree and note that studies of how linguistic frequency interacts with L1 AoA effects have yet to be conducted. The most common source of language input for deaf late learners is through education. Deaf late L1 learners who are able to attend school with other deaf signers receive more ASL input than those without such opportunities. The extent to which this increased input boosts ASL language levels remains to be investigated (Henner, Caldwell-Harris, Novogrodsky & Hoffmeister, 2016).

Woll describes a case study of a deaf man, M, whose L1 acquisition began in his late 20s and who had 25 years of experience with British Sign Language. M’s language skills are consistent with those of our case study, Martin, who began to acquire sign language as an L1 at age 21 and had 30 years of experience with ASL (Mayberry, Davenport, Roth & Halgren, 2018). Both cases showed limited morphological and syntactic ability and reduced abilities to comprehend and produce sign language, British Sign Language for M and ASL for Martin. Notably, both individuals are described as having excellent navigation skills. Unlike Bialystok and Kroll, we do not interpret the fact that individuals such as these are able to learn some sign language as evidence against a critical period for language. A modicum of vocabulary assembled in utterances with sparse morphology or syntax does not, in our view, constitute a functional language system, just as the ability to detect light and the edges of objects after cataract removal in adulthood does not indicate a functional visual system. Nor do we think that late L1 learners are similar to heritage language learners (Bialystok & Kroll, 2018; Birdsong & Quinto-Pozos, 2018; Lillo-Martin, 2018; White, 2018), for the simple reason that heritage language users have fully developed linguistic representations and processes available to them in the form of their dominant language.

Creating the capacity to learn language

A number of commentators noted that Johnson and Newport (1989) originally proposed two possible mechanisms to underlie AoA effects, maturation versus exercise. The latter hypothesis is also referred to as the “use it or lose it” explanation (Abrahamsson, 2018; Bley-Vroman, 2018; DeKeyser, 2018; Hyltenstam, 2018; Veríssimo, 2018; White, 2018). While our research with deaf late L1 learners suggests that the capacity to learn language diminishes with age (Bley-Vroman, 2018; Hyltenstam, 2018; DeKeyser, 2018; Veríssimo, 2018), we think that a more accurate theoretical framing of the phenomenon is that a prolonged delay in language exposure leads to a diminished capacity to learn language. In the case of deaf late L1 learners, the infant brain was ready to interact with the environment linguistically, but the environment failed to yield the necessary language input. In his figure, Hyltenstam shows a steep decline in ultimate language outcome, incorporating data from L2 learners with that of the late L1 learners reported in the literature. We think this figure summarizes these phenomena well, but our theoretical reframing would turn it upside down. All language learners begin with an intercept of zero. Reflecting the creation of language ability, individuals whose language experience begins in infancy show a steep upward trajectory in the acquisition of language structure. The older the onset of L2 learning, the further below native-speaker/signer levels this language learning curve asymptotes. In addition, the probability that the learning curve will approach native-like levels declines sharply the longer the delay in L1 exposure. This framework illustrates the “hybrid” hypothesis proposed by Newport.
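
The qualitative shape of this reframed curve can be sketched numerically. The following is a minimal illustration only: the exponential forms, the function names (`attainment_ceiling`, `proficiency`), and every parameter value are assumptions introduced here, not quantities estimated from any of the studies discussed. The sketch captures just three properties named above (a zero intercept, a steep early rise, and an asymptote that sits lower the later exposure begins); it does not distinguish L1 from L2 learners or model the probability of reaching native-like levels.

```python
import math


def attainment_ceiling(onset_age, native_level=1.0, decline_rate=0.08):
    """Hypothetical asymptote: the later language exposure begins, the lower it sits.

    native_level and decline_rate are illustrative assumptions, not fitted values.
    """
    return native_level * math.exp(-decline_rate * onset_age)


def proficiency(years_of_experience, onset_age, growth_rate=0.5):
    """Saturating curve with a zero intercept that levels off at the ceiling."""
    if years_of_experience < 0 or onset_age < 0:
        raise ValueError("ages and durations must be non-negative")
    ceiling = attainment_ceiling(onset_age)
    return ceiling * (1.0 - math.exp(-growth_rate * years_of_experience))


if __name__ == "__main__":
    # Compare learners whose exposure begins at different (hypothetical) ages.
    for onset_age in (0, 6, 14, 21):
        curve = [round(proficiency(t, onset_age), 2) for t in (0, 2, 5, 10, 30)]
        print(f"onset at age {onset_age:>2}: {curve}")
```

Every simulated learner starts at zero, and after the same number of years of experience the later-onset curves level off below the earlier-onset ones, which is the upside-down version of a declining ultimate-attainment plot.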

Studies of language acquisition in late L1 learners indicate that the outcome of language acquisition is not governed by cognitive maturation but by the cognitive processes of deciphering linguistic structure in synchrony with neural development. Despite being cognitively mature, adolescents acquiring language for the first time begin the process by learning vocabulary, which they subsequently combine into single predicate utterances. The older the age of onset of L1 experience, the less likely the learner will progress to more complex morphological and syntactic structures (Cheng & Mayberry, under review; Mayberry, Cheng, Hatrak & Ilkbasaran, 2017). Although the infant brain shows activation in response to spoken language in the expected left hemisphere areas (Dehaene-Lambertz, Dehaene & Hertz-Pannier, 2002), multiple neural changes that occur throughout childhood affect the brain language system as well. The infant brain matures from posterior to anterior regions, and this is evident in children’s language processing too (Schlaggar, Brown, Lugar, Visscher, Miezin & Petersen, 2002). During childhood the brain language system becomes more lateralized and consolidated (Berl, Mayo, Parks, Rosenberger, VanMeter, Ratner, Vaidya & Gaillard, 2014). The brain language system also becomes more robustly connected over childhood. Dorsal pathways connecting language areas in the temporal and frontal lobes become increasingly myelinated (Pujol, Soriano-Mas, Oritz, Sebastián-Gallés & Deus, 2006). Increased myelination of left hemisphere fiber tracts correlates with the onset of complex sentence comprehension in typically developing children during late childhood (Skeide, Brauer & Friederici, 2016). Deaf signers who began to learn ASL after childhood show reduced myelination of these fiber tracts and show concomitant difficulty comprehending complex ASL sentences (Cheng, Roth, Halgren & Mayberry, under review).
Thus the act of learning language may trigger neural development throughout varying stages in the development of the brain language system. This scenario of reciprocal linguistic input effects on postnatal brain growth and vice versa would create the ability to learn language by enlarging the information processing capacity of the neurolinguistic system to recognize and manipulate linguistic representations. The task of L2 learning would thus be facilitated when a lexicon, grammar, and a neural language network are either being developed simultaneously or already in place. The task of L1 learning at older ages is impeded when the requisite neural architecture is not in place.

More research is required to determine the specific links between language acquisition and the development of the brain language system and the extent to which they are reciprocally causal. In this way, the study of this atypical, but unfortunately all too common, situation of late L1 acquisition among individuals born deaf promises to illuminate basic acquisitional and neurodevelopmental processes that together create the faculty of human language.

Acknowledgments

The research reported in this publication with ASL signers was supported in part by NIH grant R01DC012797. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Contributor Information

RACHEL I. MAYBERRY, Department of Linguistics, University of California San Diego.

ROBERT KLUENDER, Department of Linguistics, University of California San Diego.

  • Abrahamsson N (2018). But first, let’s think again! Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000251
  • Abrahamsson N, & Hyltenstam K (2008). The robustness of aptitude effects in near-native second language acquisition. Studies in Second Language Acquisition, 30, 481–509.
  • Abrahamsson N, & Hyltenstam K (2009). Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning, 59(2), 249–306.
  • Berl MM, Mayo J, Parks EN, Rosenberger LR, VanMeter J, Ratner NB, Vaidya CJ, & Gaillard WD (2014). Regional differences in the developmental trajectory of lateralization of the language network. Human Brain Mapping, 35(1), 270–284. doi: 10.1002/hbm.22179
  • Bialystok E, & Kroll JF (2018). Can the critical period be saved? A bilingual perspective. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000202
  • Birdsong D, & Molis M (2001). On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language, 44(2), 235–249.
  • Birdsong D, & Quinto-Pozos D (2018). Signers and speakers, age and attainment. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000226
  • Bley-Vroman R (2018). Language as “something strange”. Bilingualism: Language and Cognition. doi: 10.1017/S136672891800024X
  • Botting N, Jones A, Marshall C, Denmark T, Atkinson J, & Morgan G (2017). Nonverbal executive function is mediated by language: A study of deaf and hearing children. Child Development, 88(5), 1689–1700. doi: 10.1111/cdev.12659
  • Boudreault P, & Mayberry RI (2006). Grammatical processing in American Sign Language: Age of first-language acquisition effects in relation to syntactic structure. Language and Cognitive Processes, 21(5), 608–635. doi: 10.1080/01690960500139363
  • Braden JP (1994). Deafness, Deprivation, and IQ. New York: Plenum Press.
  • Cheng Q, & Mayberry RI (under review). The trajectory of variable word order acquisition in American Sign Language: Insights from adolescent first-language learners.
  • Cheng Q, Roth A, Halgren E, & Mayberry RI (under review). Language pathways in deaf native signers and late L1 learners of ASL: Effects of modality and early language deprivation on brain connectivity for language.
  • Dehaene-Lambertz G, Dehaene S, & Hertz-Pannier L (2002). Functional neuroimaging of speech perception in infants. Science, 298(5600), 2013–2015.
  • DeKeyser R (2018). The critical period hypothesis - a diamond in the rough. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000147
  • DeKeyser R, Alfi-Shabtay I, & Ravid D (2010). Cross-linguistic evidence for the nature of age effects in second language acquisition. Applied Psycholinguistics, 31, 413–438.
  • Elman JL, Bates E, Johnson MH, Karmiloff-Smith A, Parisi D, & Plunkett K (1996). Rethinking Innateness. Cambridge, MA: MIT Press.
  • Emmorey K (2018). Variation in late L1 acquisition? Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000196
  • Flege JE (2018). It’s input that matters most, not age. Bilingualism: Language and Cognition. doi: 10.1017/S136672891800010X
  • Flege JE, Yeni-Komshian GH, & Liu S (1999). Age constraints on second-language acquisition. Journal of Memory and Language, 41(1), 78–104. doi: 10.1006/Jmla.1999.2638
  • Frederiksen AT, & Mayberry RI (2016). Who’s on First? Investigating the referential hierarchy in simple native ASL narratives. Lingua, 180, 49–68. doi: 10.1016/j.lingua.2016.03.007
  • Gathercole SE, & Baddeley AD (1993). Working Memory and Language. Hove, New York: Psychology Press.
  • Goldin-Meadow S, & Brentari D (2015). Gesture, sign and language: The coming of age of sign language and gesture studies. Behavioral and Brain Sciences, 40, 1–82. doi: 10.1017/S0140525X15001247
  • Granena G, & Long MH (2013). Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains. Second Language Research, 29(3), 313–343.
  • Hall ML, Eigsti IM, Bortfeld H, & Lillo-Martin D (2017). Auditory Deprivation Does Not Impair Executive Function, But Language Deprivation Might: Evidence From a Parent-Report Measure in Deaf Native Signing Children. Journal of Deaf Studies and Deaf Education, 22(1), 9–21. doi: 10.1093/deafed/enw054
  • Hartshorne JK, Tenenbaum JB, & Pinker S (in press). A critical period for second language acquisition: Evidence from 2/3 million English speakers. Cognition.
  • Henner J, Caldwell-Harris CL, Novogrodsky R, & Hoffmeister R (2016). American Sign Language Syntax and Analogical Reasoning Skills Are Influenced by Early Acquisition and Age of Entry to Signing Schools for the Deaf. Frontiers in Psychology, 07. doi: 10.3389/fpsyg.2016.01982
  • Hyltenstam K (2018). Second language ultimate attainment: Effects of maturation, exercise, and social/psychological factors. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000172
  • Johnson JS, & Newport EL (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60–90.
  • Lenneberg E (1967). Biological Foundations of Language. New York: John Wiley & Sons.
  • Lillo-Martin D (2018). Differences and similarities between late first-language and second-language learning. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000159
  • Lillo-Martin D, & Meier RP (2011). On the linguistic status of ‘agreement’ in sign languages. Theoretical Linguistics, 37(3/4), 95–141. doi: 10.1515/THLI.2011.009
  • Long MH, & Granena G (2018). Sensitive periods and language aptitude in second language acquisition. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000184
  • Makinodan M, Rosen KM, Ito S, & Corfas G (2012). A critical period for social experience-dependent oligodendrocyte maturation and myelination. Science, 337(6100), 1357–1360. doi: 10.1126/science.1220845
  • Mayberry RI (1993). First language acquisition after childhood differs from second language acquisition: The case of American Sign Language. Journal of Speech and Hearing Research, 36(6), 1258–1270.
  • Mayberry RI (2002). Cognitive development in deaf children: The interface of language and perception in neuropsychology. In Segalowitcz SJ & Rapin I (Eds.), Handbook of neuropsychology (2 ed., Vol. 8, pp. 71–107).
  • Mayberry RI, Cheng Q, Hatrak M, & Ilkbasaran D (2017). Late L1 learners acquire simple but not syntactically complex structures. Paper presented at the International Association for the Study of Child Language, Lyon, France.
  • Mayberry RI, Cheng Q, Hatrak M, & Ilkbasaran D (in preparation). For arborized trees, plant early: How language deprivation affects syntactic development. Manuscript in preparation.
  • Mayberry RI, Davenport T, Roth A, & Halgren E (2018). Neurolinguistic processing when the brain matures without language. Cortex, 99, 390–403. doi: 10.1016/j.cortex.2017.12.011
  • Mayberry RI, & Lock E (2003). Age constraints on first versus second language acquisition: Evidence for linguistic plasticity and epigenesis. Brain and Language, 87(3), 369–384. doi: 10.1016/S0093-934x(03)00137-8
  • Mayberry RI, Lock E, & Kazmi H (2002). Linguistic ability and early language exposure. Nature, 417(6884), 38. doi: 10.1038/417038a
  • Miller DJ, Duka T, Stimpson CD, Schapiro SJ, Baze WB, McArthur MJ, Fobbs AJ, Sousa AM, Sestan N, Wildman DE, Lipovich L, Kuzawa CW, Hof PR, & Sherwood CC (2012). Prolonged myelination in human neocortical evolution. Proceedings of the National Academy of Sciences of the USA, 109(41), 16480–16485. doi: 10.1073/pnas.1117943109
  • Newport EL (2018). Is there a critical period for L1 but not L2? Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000305
  • Pena M, Werker JF, & Dehaene-Lambertz G (2012). Earlier speech exposure does not accelerate speech acquisition. Journal of Neuroscience, 32(33), 11159–11163. doi: 10.1523/JNEUROSCI.6516-11.2012
  • Perlmutter DM (2008). Nobilior est vulgaris: Dante’s hypothesis and sign language poetry. In Lindgren KA, DeLuca D, & Napoli D (Eds.), Signs and Voices: Deaf culture, identity, language and arts (pp. 189–213). Washington, DC: Gallaudet University Press.
  • Pujol J, Soriano-Mas C, Oritz H, Sebastián-Gallés N, & Deus J (2006). Myelination of language-related areas in the developing brain. Neurology, 66, 339–343.
  • Reh RK, Arrendo MM, & Werker JF (2018). Understanding individual variation in levels of second language attainment through the lens of critical period mechanisms. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000263
  • Sandler W, & Lillo-Martin D (2006). Sign Language and Linguistic Universals. Cambridge University Press.
  • Schlaggar BL, Brown TT, Lugar HM, Visscher KM, Miezin FM, & Petersen SE (2002). Functional neuroanatomical differences between adults and school-age children in the processing of single words. Science, 296(5572), 1476–1479.
  • Schlegel A, Rudelson J, & Tse P (2012). White matter structure changes as adults learn a second language. Journal of Cognitive Neuroscience, 24(8), 1664–1670.
  • Skeide MA, Brauer J, & Friederici AD (2016). Brain functional and structural predictors of language performance. Cerebral Cortex, 26(5), 2127–2139. doi: 10.1093/cercor/bhv042
  • Thompson-Schill SL, Ramscar M, & Chrysikou EG (2009). Cognition without control. Current Directions in Psychological Science, 18(5), 259–263.
  • Valli C, Lucas C, Farb E, & Kulick P (1992). ASL Pah!: Deaf students’ perspectives on their language. Silver Spring, MD: Linstok Press.
  • Veríssimo J (2018). Sensitive periods in both L1 and L2: Some conceptual and methodological suggestions. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000275
  • White L (2018). Nonconvergence on the native speaker grammar: Defining L2 success. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000214
  • Wilbur RB (2011). Sign syllables. In Oostendorpp M. v., Ewen C, Hume E, & Rice K (Eds.), The Blackwell Companion to Phonology (pp. 1–26). Blackwell Publishing.
  • Woll B (2018). The consequences of very late exposure to BSL as an L1. Bilingualism: Language and Cognition. doi: 10.1017/S1366728918000238
  • Woolley S, & Rubel E (2002). Vocal memory and learning in adult Bengalese finches with regenerated hair cells. The Journal of Neuroscience, 22(1), 7774–7787.

Bilingualism: Language and Cognition, Volume 21, Issue 5

Article contents

  • Introduction
  • CPL effects on ultimate spoken L2 outcome
  • AoA effects on ASL outcome
  • The trajectory of late L1 acquisition
  • Late L1 vs. L2 effects on neurolinguistic processing
  • The scope and nature of the CPL

Rethinking the critical period for language: New insights into an old question from American Sign Language

Published online by Cambridge University Press:  26 December 2017

The hypothesis that children surpass adults in long-term second-language proficiency is accepted as evidence for a critical period for language. However, the scope and nature of a critical period for language have been the subject of considerable debate. The controversy centers on whether the age-related decline in ultimate second-language proficiency is evidence for a critical period or something else. Here we argue that age-onset effects for first vs. second language outcome are largely different. We show this by examining psycholinguistic studies of ultimate attainment in L2 vs. L1 learners, longitudinal studies of adolescent L1 acquisition, and neurolinguistic studies of late L2 and L1 learners. This research indicates that L1 acquisition arises from post-natal brain development interacting with environmental linguistic experience. By contrast, L2 learning after early childhood is scaffolded by prior childhood L1 acquisition, both linguistically and neurally, making it a less clear test of the critical period for language.

1. Introduction

Why would we need to know if there is a critical period for language acquisition? This information might be useful for educational policy to enable more children to become proficient in a second language. Likewise, the information might be useful in order to improve language programs for immigrants who speak other languages to help them integrate into their new countries more quickly. Clinical and rehabilitation language programs for children and adults often cite critical period (CP) research as a rationale. Last but not least, research into CP learning informs longstanding questions in cognitive science about language and brain development and how they affect one another, in addition to the role the environment plays in this development. Given that research on CP learning is vital to so many domains of inquiry, the next question is what kind of experiments and data are required to answer the question.

Generally speaking, CP phenomena are thought to reflect a unique type of learning that occurs when an animal or human is exquisitely sensitive to a particular stimulus in the environment during development. The main characteristic of this sensitivity is that it is limited to a temporal phase in development. Before the opening and after the closing of the CP, sensitivity to the stimulus is either diminished or absent, hence the notion of a period. CP learning is commonly observed throughout the animal kingdom, with one frequently cited example being birdsong. The white-crowned sparrow learns the song of its species beginning around day 10 after hatching. The CP window for learning its species’ song closes around day 50 after hatching. A lack of exposure to adult song during this temporal window results in an abnormal song. The onset, closing, and duration of the CP for birdsong learning vary by species (Marler, 1989). Another example of CP learning is the discovery by Lorenz (1965) that baby geese imprint on the first moving stimulus they see beginning at 13 hours and ending around 16 hours after hatching. Typically the first moving object is the mother, and gosling survival depends upon learning to follow the gaggle, hence the notion critical. Socialization phenomena in animals have also been found to be governed by a CP. Developmental timing effects in dogs (Lord, 2013) and domesticated Siberian silver foxes (Trut, 1999) have been associated with enhanced abilities to interpret human signals of shared attention, such as gaze and pointing, both of which are prerequisites for human language acquisition (Hare, Brown, Williamson & Tomasello, 2002; Virányi, Gácsi, Kubinyi, Topál, Belényi, Ujfalussy & Miklósi, 2008).
Mice that are isolated during the fourth and fifth weeks postpartum, a CP in their development, show decreases in myelination of neuronal axons related to behavioral and cognitive deficits (Makinodan, Rosen, Ito & Corfas, 2012). Some of the most well studied CP effects are found in the development of the visual system (Wiesel, 1982). Animal models provide a rich and detailed means to investigate CP effects on the development of perception, behavior and the underlying neural mechanisms of these effects (Hensch, 2005). However, there is no animal model with which to study a CP for language (CPL).

1.1 The Critical Period for Language

The existence of a CPL has been the subject of considerable debate. Skinner (1957) initially proposed that children learn language as a result of stimulus-response reinforcements emanating from the environment. Chomsky (1959) countered that features of the environment cannot explain language development, which he proposed to be the knowledge of linguistic structure, otherwise known as the human language faculty, and not linguistic behavior per se (Chomsky, 1965). Given the widespread findings of CP learning in animals, we might postulate that a CPL bridges these language domains, one centered in the environment and the other centered in the mind and brain. The CPL may function to link the experience of language present in the environment to development of the brain language system.

1.2 Early CPL proposals

For centuries the folk observation that children develop language quickly and effortlessly while adults often fail to learn a second language well enough to pass as a native speaker has been interpreted as evidence for the existence of a CPL. In 1894, the physician Itard (1962) concluded that the speechless enfant sauvage he had tutored for two years, Victor, failed to learn French because the boy was simply too old. In 1967, Lenneberg marshaled evidence suggesting that language development co-occurs with brain development. He described several phenomena in language acquisition that appear to occur during childhood but not later. First, language development is stage-like in a fashion akin to the milestones of, for example, the development of walking by infants. Second, recovery of language ability after brain damage is possible for children but less so for adults. Third, the ability to acquire a second language (L2) spontaneously from mere exposure without conscious effort or a residual accent declines with age. Fourth, the language development of cognitively impaired children is delayed compared with that of typically developing children and appears to stop around puberty. And last, the effects of deafness on spoken language development are inversely related to its age-onset. Lenneberg (1967) argued that these linguistic age-related phenomena were not coincidental but were instead the effects of brain development and thus constituted evidence for a CPL. He further proposed that the closing of the CPL was marked by hemispheric lateralization for language, which he, along with other scholars at the time, erroneously believed to occur around puberty.

Since Lenneberg's seminal monograph, much research has been devoted to ascertaining the validity and scope of the putative CPL. Some research has focused on the interaction effects of the L1 syntactic and morphological structure on the acquisition of these structures in the L2 during childhood and beyond. This kind of cross-linguistic research is beyond the scope of the present paper, however. Other research has used an experimental paradigm measuring the ultimate outcome of L2 learning in relation to the age when the learning began, otherwise known as age of acquisition (AoA), although more accurate terms might be age of exposure or onset (Footnote 1). The results of this body of research may not yield the clearest insights into the veracity and nature of the putative CPL, however, for reasons we explain below.

1.3 Current Proposal

The main arguments we make here are, first, that conflating second (L2) with first language (L1) acquisition creates a confounded language learning situation that needs to be teased apart in order to illuminate the putative CPL. Logically, the CPL should govern the initial acquisition of language in early life, from both a behavioral and neural perspective, rather than the subsequent learning of an L2 after early childhood, after grammatical structure and its neural circuitry have been acquired and established. A childhood L1 and subsequent L2 acquired during early childhood have been shown to interact with one another in fascinating ways (Meisel, 2013). However, we argue here that sign languages, due to the unique environmental circumstances under which they are acquired, provide unique insights into the CPL that are hidden from the exclusive study of L2 spoken language acquisition. Indeed, comparing the outcome of post-childhood L2 acquisition with that of post-childhood L1 acquisition provides the necessary comparison for titrating the effects of the two learning situations. We begin by considering how CPL effects have been investigated with studies of ultimate attainment for L2 spoken language learning. Next we turn to the case of American Sign Language (ASL) and psycholinguistic studies of ASL ultimate attainment, and studies comparing late L1 with late L2 attainment. Case studies of late L1 acquisition provide the linguistic links necessary to understand the brain language processing effects associated with late L1 vs. L2 development. Together these diverse studies provide new insights into the scope and nature of the CPL, which we address last.

2. CPL effects on ultimate spoken L2 outcome

The structure of language consists of hierarchical layers of rules. Perceptual and motor processes link to phonological structure (Jackendoff, 2011), which is interleaved throughout the lexicon, morphology, and syntax. Two levels of linguistic structure in particular, phonology and morphosyntax, have each been hypothesized to be more sensitive to a CPL than other aspects of linguistic structure. In addition, the shape of the function between AoA and L2 proficiency has also been scrutinized with the goal of identifying the age at which a possible CPL closes, with the assumption that it opens at birth.

2.1 Phonological effects

Any language has a vast lexicon that is expressed and comprehended by way of a phonological system. Speaking with a non-native accent has long been cited as the most salient effect of learning language after early childhood (Lenneberg, 1967; Scovel, 1988). Infants are especially sensitive to the phonological system of the ambient language in the environment. During the first year of life, infants show perceptual learning of the vowel space and consonantal features of the environmental language (Kuhl, Williams, Lacerda, Stevens & Lindblom, 1992; Werker & Tees, 1984). This early perceptual-phonological learning has formed the basis of two related hypotheses about the nature of the CPL. One is that CPL effects arise from infant phonological learning, rendering the subsequent learning of another language difficult because phonemic categories have already been established in infancy for the L1 (Flege, Schirru & MacKay, 2003; Norrman & Bylund, 2016). A related hypothesis is that CPL effects observed in levels of linguistic structure other than phonology may in fact originate from early phonological learning of the L1. CPL effects at the level of L2 morphology and syntax, for example, have been postulated to be cascaded effects emanating from early phonological attunement (Werker & Tees, 2005), perhaps by way of working memory (Pierce, Genesee, Delcenserie & Morgan, 2017).
The central role attributed to infant phonological discrimination with respect to the subsequent learning of linguistic structure in an L2 is reminiscent of models of reading in which phonological discrimination is posited to play a central and causal role in reading comprehension, both for successful and unsuccessful development (Dickinson & McCabe, 2001).

In a series of studies, Flege and his colleagues found robust AoA effects on the ability to speak English without an accent. They proposed that the main difficulty for L2 learning by older learners is adjusting the phonological categories of the natively acquired L1 to accommodate the altered or novel phonological categories of the L2. They further proposed that the degree of phonological match and mismatch between the L1 and L2 causes varying degrees of L2 learning success (Flege et al., 2003). That said, there is evidence both for and against the notion of strict maturational limits on the possibility of attaining a native-like accent in an L2. Based on Lenneberg's (1967) original hypothesis, this cutoff is usually assumed to fall sometime during adolescence. Yet meticulously designed investigations of L2 phonology (Abrahamsson & Hyltenstam, 2009; Granena & Long, 2012; Moyer, 1999) provide equivocal evidence on this point. While there is some evidence for the presence of a discontinuity, albeit at different ages (age 6 in Granena & Long, 2012; age 12 in Flege, Yeni-Komshian & Liu, 1999; and age 16 in Abrahamsson & Hyltenstam, 2009), all these studies also report linear relationships between the degree of foreign accent and AoA up into the late twenties.

Another approach has been to determine if any late L2 learners can achieve native-like pronunciation. In a series of studies, Bongaerts and colleagues (Bongaerts, Mennen & van der Slik, 2000; Bongaerts, van Summeren, Planken & Schils, 1997; Palmen, Bongaerts & Schils, 1997) concluded that learners first exposed to an L2 after age 12 were still able to do so under certain circumstances. They also concluded that variables that provide “input enhancement” (Ioup, 1995), such as instructional support, motivation to sound like a native speaker, and biographical history (i.e., having a native speaker partner or family) also played a large role in achieving a native-like accent, as did the typological similarity of the L1 (e.g., either German or English) and the L2 (e.g., Dutch).

Individual case studies support these group studies. Ioup, Boustagui, El Tigi, and Moselle (1994) reported a case study of two women first exposed to (Egyptian) Arabic in their early twenties: one had had years of instruction and had done extensive graduate work in Standard Arabic, while the other was entirely self-taught. Both women had married Egyptian men. Eight out of 13 teachers of L2 Arabic judged their spontaneous spoken production to be native-like. A similar case study reported an exceptional learner with an AoA of 22 who was also largely self-taught, but whose pronunciation fell well within native speaker norms (Moyer, 1999). In addition to native speaker perceptions of accent, Birdsong (2003) examined both vowel duration and voice onset time (VOT) of initial stop consonants in L2 speakers of French first exposed to the language at 18 years of age or older. He reported two late learners out of 22 studied who fell consistently within native speaker norms on both the acoustic and the impressionistic parameters. Both of these late L2 learners also reported high levels of motivation. Another study reported an exceptional, advanced L2 learner of Spanish (out of five with advanced degrees in Spanish) with an AoA of 24 who performed consistently within native speaker norms on acoustic measures of Spanish stop-liquid clusters: VOT, rhotic quality, and vowel epenthesis (Colantoni & Steele, 2006).

Clearly, L2 exposure that is delayed until as late as the third decade of life is not an absolute biological barrier to the acquisition of a native-like accent; certain exceptional and highly motivated individuals are still somehow able to surmount this impediment. If such a constraint were truly hard-wired, there could be no such exceptions. We are not claiming this to be the norm in L2 acquisition. To the contrary, everyone is anecdotally aware of the difficulty of achieving native-like pronunciation in a language acquired late in life. Nevertheless, by the same token, most people are also anecdotally familiar with some individual who, despite delayed exposure, seems to have been able to achieve a native accent in a second language. More importantly, with the exception of Granena and Long (2012), who studied Chinese speakers of Spanish, every other study that has drilled down on such individuals in order to determine their capabilities in the phonological domain has found at least one and up to four individuals who perform at native-like levels on both subjective (native speaker assessment) and objective (acoustic) measures of accent. Aside from Granena and Long (2012), who found none, the lowest percentage of L2 speakers deemed to exhibit native-like pronunciation in any study is a little over 4% of the population studied [Footnote 2].

2.2 Lexical, morphological, and syntactic effects

From the theoretical perspective of universal grammar and its variants, knowledge of the morphological and syntactic structure of a language is often taken as the sine qua non of the human language capacity. As with early phonological development, children acquire the linguistic features specific to the language they use early in life, and these features are thought to be difficult to reset at older ages (Wexler & Culicover, 1980). For example, Curtiss and colleagues (Curtiss, 1977; Fromkin, Krashen, Curtiss, Rigler & Rigler, 1974) characterized the spoken language development of Genie (who began learning English at the age of 13, after having been socially isolated from people from around the age of 20 months) as being deficient with respect to morphological and syntactic learning, but spared for lexical learning. This description of what Genie found easy and hard in language acquisition is consistent with the UG framework prevalent at the time, which considered lexical development to be unrelated to the acquisition of syntactic structure. The problem with this account of the CPL, however, is that Genie's lexical development was not systematically studied. Since this landmark study, lexical acquisition has been found to play a pivotal role in language development (Fenson, Dale, Reznick, Bates, Thal, Pethick, Tomasello, Mervis & Stiles, 1994; Bates & Goodman, 1997). Nonetheless, the idea that the CPL primarily affects morphological and syntactic development is a common hypothesis. Given this perspective, many studies have scrutinized AoA effects on L2 outcome with respect to morphological and syntactic proficiency.

2.3 AoA effects on L2 morphological and syntactic outcomes

A frequently used measure of morphological and syntactic knowledge is the grammaticality judgment task because it requires the detection of rule violations in these domains, which native speakers do unconsciously. In a seminal study using a grammaticality judgment task with Chinese and Korean native speakers who were L2 learners of English, Johnson and Newport (1989) found performance to decline as a linear function of AoA up to age 16, with no systematic relation to AoA afterwards. They interpreted this linear trend prior to puberty, and the lack of one afterwards, as evidence that a CPL governs language acquisition during childhood, but not afterward.

Shape of the AoA function

As described above, one of the main observable features of CP phenomena is a closing of the temporal learning window sometime during development. For this reason, researchers have searched for a closing of the CPL by scrutinizing the shape of the function between AoA and L2 performance at the level of morphology and syntax (and to some extent for phonology too, described above). Testing native Spanish L2 learners of English using the same task and stimuli as Johnson and Newport, Birdsong and Molis (2001) found few AoA effects until puberty, after which performance declined, suggesting that the linguistic similarity between the L1 and L2 modulates AoA effects. In a massive study using census data and a self-assessment of L2 proficiency, Hakuta, Bialystok, and Wiley (2003) found no break points in the linear function for AoA effects, but rather a continuous decline in self-reported L2 English proficiency in native Chinese and native Spanish speakers, a decline that continued into the seventh decade of life.

DeKeyser, Alfi-Shabtay, and Ravid (2010) and Granena and Long (2012) both reported similar overall linear declines across the decades, but argued for the existence of two underlying points of discontinuity in this linear function, albeit at different ages of arrival. The results were again largely equivocal. In a study of Russian–English bilinguals in the U.S. and Canada, and of Russian–Hebrew bilinguals in Israel, DeKeyser et al. (2010) found robust linear correlations of AoA with morphosyntactic ability across the entire lifespan, as in Hakuta et al. (2003), albeit this time as measured by a grammaticality judgment task. Partial analyses of the dataset showed a linear correlation only up to an AoA of 40 years and not beyond. However, when age at time of testing was partialed out, only those with an AoA of up to 18 years still showed a linear correlation with proficiency. This correlation disappeared when this group was split at 12 years, but mean scores on the grammaticality judgment task still differed significantly between those with an AoA above or below 12. The important point for our line of argumentation here is that L2 proficiency declines with AoA, regardless of whether AoA is binned into separate age groups or the data are aggregated and regression functions are computed over the entire data set. The unanswered question is whether arbitrary cutoffs in AoA reflect real breakpoints in the ability to acquire an L2 proficiently, which is where language aptitude and motivation come into play.
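The logic of partialing out age at testing can be illustrated with a minimal sketch. The Python code below uses invented numbers for six hypothetical learners (not data from DeKeyser et al.), and the function and variable names are our own assumptions. It computes a first-order partial correlation and is meant only to show how a raw correlation between AoA and proficiency can shrink once a third variable is controlled.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_r(xs, ys, zs):
    """Correlation of xs and ys with the linear effect of zs removed."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented values for six hypothetical learners (illustration only).
age_at_test = [20, 30, 40, 50, 60, 70]
aoa = [6, 9, 15, 20, 24, 31]
proficiency = [91, 86, 78, 73, 71, 66]

raw = pearson_r(aoa, proficiency)
controlled = partial_r(aoa, proficiency, age_at_test)
print(f"raw r = {raw:.2f}, partial r = {controlled:.2f}")
```

In this constructed example the raw correlation is strongly negative, whereas the partial correlation is essentially zero, because the invented scores were built so that both AoA and proficiency track age at testing and share nothing else.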

Based on Johnson and Newport (1989) and Flege et al. (1999), Granena and Long (2012) chose AoA breakpoints of 6 and 15 years in their study of Chinese–Spanish bilinguals living in Spain. Based on both grammaticality judgment and production measures, Granena and Long (2012) reported linear correlations with morphosyntactic and lexical ability only for those with an AoA between 7 and 15 years when the participants were grouped into three AoA categories spanning the first three decades of life. However, in the end, Granena and Long (2012, pp. 326-327) conceded that a model with two break points accounted for only 5% more of the variance in their data than a linear model with none at all, and that the latter (as in Hakuta et al., 2003) might in fact provide a more parsimonious, less complex fit.

Thus in both studies, linear correlations accounted for the data across multiple decades of first exposure to an L2, as in Hakuta et al. (2003), but more limited linear correlations up to age 15 or 18 were detected when the data were divided into discrete subsets of AoA ranges, as in Johnson and Newport (1989). In other words, much like the famous rabbit vs. duck optical illusion, these results leave it mostly to the observer to decide whether they show linearity or discontinuity.
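The comparison between a single linear function and a function with two break points can be made concrete with a small sketch. The Python code below is not the analysis used in any of the studies cited. It fits both models by ordinary least squares to invented scores, with knots placed at 6 and 15 years (the breakpoints chosen by Granena and Long), and compares the proportion of variance each model explains. All function names and data values are assumptions made for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(design, y):
    """Variance explained by a least-squares fit of y on the design matrix."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def hinge(x, knot):
    """Zero up to the knot, then rising linearly; lets the slope change there."""
    return max(0.0, x - knot)


# Invented data: a steady decline with AoA plus a small deterministic wobble.
aoa = list(range(1, 31))
score = [95 - 1.2 * a + 3 * ((a * 7) % 5 - 2) for a in aoa]

linear_design = [[1.0, float(a)] for a in aoa]
break_design = [[1.0, float(a), hinge(a, 6), hinge(a, 15)] for a in aoa]

r2_linear = r_squared(linear_design, score)
r2_break = r_squared(break_design, score)
gain = r2_break - r2_linear
print(f"linear R2 = {r2_linear:.3f}, two-breakpoint R2 = {r2_break:.3f}")
```

Because the linear model is nested within the breakpoint model, the breakpoint model can never explain less variance. The substantive question is whether the gain is large enough to justify the additional parameters, which is the parsimony issue raised above.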

Language aptitude and motivation

Both DeKeyser et al. (2010) and Granena and Long (2012) reported correlations between language aptitude and L2 proficiency primarily for learners exposed to an L2 later in life: between 18 and 40 for morphosyntax (DeKeyser et al., 2010) and between 16 and 29 for phonology and lexical knowledge (Granena & Long, 2012). These correlations were interpreted as indicating that language aptitude plays a role in successful L2 acquisition, but only for late learners. However, in a study of Spanish–Swedish bilinguals who were first screened for near-native proficiency, Abrahamsson and Hyltenstam (2009) reported a robust correlation between language aptitude and performance on a grammaticality judgment task testing fine points of Swedish grammar in early learners, with ages of arrival between 1 and 11. Those with higher language aptitude scores tended to be the ones who performed within native speaker norms. There was a similar trend in the late learners. This suggests that language aptitude plays a role in near-native L2 performance regardless of age of first exposure.

Case studies of high proficiency L2 learners have also found that aptitude and motivation modulate AoA effects. In a study described above, Ioup (1995) examined two high proficiency late learners of Egyptian Arabic. They were identified as such because they performed at near-native levels on a production task targeting intricate features of Egyptian Arabic morphology and syntax. Their high level of proficiency, in spite of having been first exposed to the language in their early twenties, was attributed to a desire to assimilate to the culture for family reasons along with language aptitude (with regard to the late learner who was self-taught), and formal instruction (with regard to the late learner who had done graduate work in Standard Arabic).

L2 use and education

Education in the L2 along with the amount of use of the L2 has also been found to exert robust effects on L2 outcome. Hakuta et al. (2003) found a significant effect of education level (independent of age of arrival) on self-assessed proficiency level in their analysis of U.S. census data. Birdsong and Molis (2001) also found robust effects of education and L2 use on L2 outcome. In their study of Korean–English immigrants to the U.S., Flege et al. (1999) found that individuals with more years of formal U.S. education outperformed those with less education on tests of rule-based morphosyntax, while those who spoke more English than Korean on a daily basis performed better than those who did not on tests of lexically idiosyncratic morphosyntax.

AoA and L2 outcome

AoA effects on L2 outcome are robust under some circumstances, but also plainly interact with non-age related factors, such as the amount of linguistic experience with the L2, the typological relationship of the L1 to the L2, the amount of education received in the L2, the amount of L2 use, as well as learning factors such as motivation and aptitude. It is possible that these learning factors may contribute to some of the age-related decline in L2 outcome. For example, motivation to learn and use a new language, and amount of education received in the L2, may decline with age. If L2 outcome were fully under the control of a CPL, these learning variables should not predict L2 outcome, and the outcome of L2 learning would not be consistently observed to be so variable (Meisel, 2013).

Seidenberg and Zevin (2006) interpret AoA effects on L2 outcome as arising from what they call the “paradox of success.” They propose that prior language learning itself alters the outcome of L2 learning. Within the framework of connectionist modeling, learning is accomplished by the creation of associative links among nodes within a network. These paths become weighted according to the frequency of their associations with other nodes within the network over time. Difficulties and limitations for L2 learning from a connectionist perspective thus arise from conflicts among the weights and associations of the originally developed language network with those of the new language. From this perspective, the linear AoA function often observed in relation to L2 outcome is due to the increasing entrenchment of the L1 language network with age. Note that this theoretical model of AoA effects on L2 outcome is somewhat related to the models discussed above that posit cascaded effects emanating from infant phonological attunement, which might be thought of as different weights to acoustic features of the speech signal. The difference is that a connectionist network accounts for morphological, syntactic, and lexical knowledge, as well as phonology.

Common to all these models is the assumption that infant L1 development is unlike L2 learning because L1 development begins from scratch in infancy while L2 learning is filtered through, and interacts with, L1 knowledge. In an abstract sense, all these theoretical proposals suggest that the putative CPL applies to L1 learning, and that L2 effects are a consequence of this prior learning. We turn now to a new source of data with which we can examine the CPL, AoA effects on sign language attainment.

3. AoA effects on ASL outcome

When considering AoA effects on ASL outcome, it is important to be mindful of the ways in which infant deafness radically alters the linguistic input from the environment necessary for language acquisition.

3.1 Infant deafness and language acquisition

The evolution of human social groups has conspired to create the ideal linguistic environment for language acquisition. Infants who hear normally are immersed in spoken language from birth and even before (Moon, Cooper & Fifer, 1993). The language spoken in the environment is sufficient for infants who hear to develop complex language by the ages of seven to nine without any overt instruction on the part of their caretakers, or any explicit practice on the part of the children (Ambridge & Lieven, 2011; Diessel, 2004). By contrast, infants born severely or profoundly deaf are isolated from the language spoken around them by virtue of their inability to hear it. Unfortunately, the visual signal of speech, as in lipreading, is too impoverished to support spontaneous language acquisition because most speech sounds are articulated inside the mouth [Footnote 3]. However, deaf (and also hearing) infants do spontaneously develop sign language when the people around them sign. This fact alone indicates that the human capacity for spontaneous language development transcends the sensorimotor characteristics of the communication channel.

However, the majority of infants born deaf are not exposed to any sign language until ages well past infancy, when they first interact with signers. This initial exposure to sign language typically occurs outside the home in a school or social setting [Footnote 4]. Many deaf infants in North America and Europe receive special intervention to promote spoken language development, but such intervention, which typically discourages the use of sign language, is not always available nor is it always successful. In the absence or paucity of prior spoken language development, a deaf child's first exposure to a sign language marks the initial onset of language acquisition, albeit at a late age. In addition, however, sign languages, like spoken ones, are also learned as second languages by many individuals, deaf or hearing, at a range of ages past infancy. These circumstances create naturally occurring variation in the age-onset of first- and second-language acquisition and thus provide a unique means with which we can investigate the postulated CPL.

Given that language is a complex system requiring several years of experience to fully develop, the next question is what CPL effects might look like in signed languages.

3.2 AoA and ASL outcome

The first study to find that AoA affects ASL proficiency used narrative shadowing and sentence recall because these tasks yield insights into the psycholinguistic processing of natural language (Mayberry & Fischer, 1989). Shadowing is an on-line task where attention is split by comprehending while simultaneously reproducing a linguistic stimulus. Performance on this task is sensitive to linguistic structure (Cherry & Taylor, 1954; Marslen-Wilson, 1973). College students who were native signers (whose deaf parents signed to them from birth) and born deaf were more accurate shadowing ASL narratives than college students also born deaf who were non-native signers (whose hearing parents did not sign to them). The non-native signers began to learn ASL between the ages of 9 and 16 in school through interaction with peers who signed. Prior to this, they had attended “oral” schools, where sign language was discouraged. The second study used sentence recall, an off-line task where comprehension and expression are sequential and thus place a greater load on memory compared with the shadowing task. For the second study, another group of deaf college students participated whose AoA ranged from birth to age 15. Replicating and extending the first study results, ASL sentence recall accuracy declined as a linear function of AoA. However, because the participants of both studies were college students of a similar age, length/years of ASL experience was confounded with AoA; the two factors were inversely correlated.
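The confound follows directly from the arithmetic: when all participants are tested at roughly the same age, years of experience equal age at testing minus AoA. A minimal sketch in Python, using an invented testing age and invented AoA values rather than data from the studies, makes the inverse relation explicit.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


AGE_AT_TEST = 21  # hypothetical: every participant is tested at the same age

# Invented ages of first ASL exposure, in years (0 = exposure from birth).
aoa = [0, 3, 6, 9, 12, 15]

# With a fixed testing age, experience is fully determined by AoA.
years_of_experience = [AGE_AT_TEST - a for a in aoa]

r = pearson_r(aoa, years_of_experience)
print(f"r(AoA, experience) = {r:.2f}")
```

With testing age held constant, the correlation is perfectly negative, so any effect attributed to AoA in such a sample could equally be attributed to length of experience. Separating the two requires participants whose experience does not simply mirror their AoA.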

Titrating the effects of AoA from linguistic experience is a design challenge for CPL studies. How much linguistic experience does a learner need to achieve maximum, or ultimate, proficiency? Some studies address the problem by recruiting only participants with high proficiency levels and then searching for AoA effects among these pre-screened L2 learners, as discussed above (Abrahamsson & Hyltenstam, 2009; Coppieters, 1987; White & Genesee, 1996).

The third study controlled linguistic experience by recruiting deaf signers who had used ASL for a minimum of 20 years and who, for the most part, were not college graduates. The task again was sentence recall but here the sentences were long and complex. Recall accuracy declined as a linear function of AoA, which ranged between birth and 13 years, with no correlation with length of experience (Mayberry & Eichen, 1991).

Across the three studies, AoA affected knowledge of ASL structure rather than cognitive processing constraints per se. For example, AoA showed no significant effects on the rate with which the signers produced signs, no effects on the overall number of signs the signers produced for each trial, and no effects on performance on non-verbal cognitive tasks such as block design. Rather, AoA showed differential effects on morphosyntactic processing. Later learners tended to strip inflectional morphology from the stimulus signs in complex ASL sentences to produce bare stems instead. By contrast, early learners tended to re-analyze and re-arrange them, all the while maintaining the overall meaning of the stimulus sentence.

Given the AoA effects described thus far, it comes as no surprise that native deaf signers show greater on-line sensitivity to violations of verb agreement compared with non-native deaf signers (Emmorey, Bellugi, Friederici & Horn, 1995), or that grammaticality judgment accuracy for ASL sentences, ranging from simple to complex, was found to decline as a linear function of AoA in a Canadian sample of highly experienced deaf signers (Boudreault & Mayberry, 2006). Similarly, Newport (1990) found that performance on tasks requiring knowledge of complex ASL verb morphology declined as a linear function of AoA in a sample of highly experienced deaf signers with a minimum of 30 years’ exposure to ASL. However, she found no AoA effect on tasks involving basic word order, suggesting that not all aspects of ASL morphology and syntax decline with AoA, an intriguing finding to which we return below.

To summarize AoA effects on ASL outcome, the results of several studies investigating the ASL outcome in diverse groups of deaf signers using varying proficiency measures concur to show AoA predicts ASL learning outcome. These effects show a linear relation to AoA from birth to adolescence in those studies that tested for this function. This indicates, first, that AoA effects are not unique to spoken L2 but instead transcend the sensorimotor modality of the communication channel. This is not surprising given that ASL is a language. Learning a manual-visual language does not circumvent AoA effects, suggesting that these effects do not originate from sensorimotor learning per se, or if they do, that all sensorimotor modalities associated with all language are affected. However, the fact that AoA effects on ASL learning outcome parallel those of AoA effects on L2 spoken language learning over the same age range does not address the question of L1 outcome in relation to AoA.

3.3 AoA effects on L2 vs L1 morphological and syntactic outcome

As explained above, not all signers who are deaf learn ASL as an L1. Some signers acquire another sign language in infancy and learn ASL later as an L2, although this situation has been little studied. Other deaf signers acquire spoken English to varying degrees before learning ASL later, also as an L2. A small proportion of L2 signers were not born deaf but instead suddenly became deaf due to viral infections [Footnote 5]. Because such signers fully acquired spoken English as hearing infants before learning ASL, they are indisputably L2 learners of ASL and thus provide an ideal test of the CPL. Comparing their ASL proficiency to signers who were born deaf, but who acquired minimal language prior to learning ASL at the same ages, provides a critical test, a means to ascertain the extent to which brain maturation alone predicts the outcome of language acquisition. If brain maturation alone affects language learning, then AoA will equally affect L1 and L2 outcome.

To this end, we matched signers who became deaf between the ages of 8 and 12 by age, sex, and length of ASL experience to signers who were born deaf and self-reported knowing minimal language prior to ASL exposure at the same AoA. All the signers were highly motivated to learn ASL and highly experienced, having 20 years or more of continuous experience. Native deaf signers served as the controls. Again, the task was ASL sentence recall. The results showed a marked advantage of infant language learning. The deaf L2 learners performed at near-native levels. By contrast, the deaf late L1 learners performed at low levels (Mayberry, 1993). These results provide initial evidence that L1 experience begun in infancy is necessary for later L2 learning to be successful. Note that these findings also confirm the amodal nature of language ability. Infant spoken language acquisition facilitates later sign language acquisition. Given that language ability is amodal, the facilitative effects of infant language acquisition should be bi-directional: that is to say, infant sign language acquisition should support later L2 learning of spoken language (Mayberry, Lock & Kazmi, 2002).

Next, we turned to English as the target language to further probe this hypothesis using grammaticality judgment, as commonly used in L2 AoA studies described above, along with a sentence-to-picture matching task. To assess the amodal nature of AoA effects on L1 vs. L2 acquisition, we tested two kinds of L2 learners: one group was native deaf signers of ASL who learned English as an L2 in school; the other group was native hearing speakers of Urdu, Spanish, German, and French who also learned English as an L2 in school at similar ages. To verify what we have called the L1 timing hypothesis, we also recruited late L1 learners who were born deaf but who acquired minimal language in early childhood prior to learning ASL and English in school at the same ages as the L2 groups. As predicted, both groups of L2 learners performed at near-native levels on both tasks, despite the fact that one group had normal hearing and acquired a spoken language in infancy and the other group was born deaf and acquired ASL in infancy. The contribution of infant language experience to life-long language learning ability is demonstrably amodal.

Also as predicted, the late L1 learners, who had used ASL and English for the same length of time as the two L2 groups, but who also had acquired minimal language during early life, showed low levels of English proficiency on the two tasks, but not across all sentence structures. They performed at near-native levels on simple SVO structures, as Newport ( Reference Newport 1990 ) had previously found in her study testing late acquisition of ASL. As the morphosyntactic complexity of the English structures increased, their performance declined to chance levels on both tasks (Mayberry & Lock, Reference Mayberry and Lock 2003 ). These results were replicated in a study of deaf signers of British Sign Language (BSL) using the ASL grammaticality judgment task of Boudreault and Mayberry ( Reference Boudreault and Mayberry 2006 ) translated into BSL (Cormier, Schembri, Vinson & Orfanidou, Reference Cormier, Schembri, Vinson and Orfanidou 2012 ). The developmental timing of initial language experience during childhood clearly exerts robust effects on ultimate language proficiency across languages and sensorimotor modalities.

3.4 AoA effects on ASL phonology

Up to this point, we have focused exclusively on AoA effects on ASL outcome with respect to morphology and syntax. As described above, AoA shows robust effects on spoken L2 phonology, with some researchers proposing that infant phonological learning is the source of these effects. Like all languages, the linguistic architecture of ASL contains a phonological level of structure: signs, i.e., words, are constructed from highly constrained bundles of articulatory features (Brentari, 1998; Perlmutter, 1992; Wilbur, 2011). In the above-described studies employing shadowing and sentence recall tasks, AoA also showed effects on the signers’ ASL phonological production. Specifically, the lexical errors made by the non-native signers were often phonological in nature. These phonological-lexical errors were real signs, not neologisms, that violated the morphosyntactic structure of the stimulus sentence, but, at the same time, they were clearly derived from the phonological structure of the original stimulus. An example of this kind of error, rendered in English, would be “At Thanksgiving, I ate too much turkey sleep potato,” where the verb “sleep” is substituted for the conjunction “and.” In ASL, the two signs differ in only one sublexical feature, location. These kinds of phonological errors suggest that the stimulus sentence was incompletely processed. Perhaps phonological pieces of the stimulus item were perceived, but the sentence was insufficiently processed to catch and rectify the error.

Lexical-phonological processing errors

The fact that these phonological-lexical errors typically violated the morphosyntactic structure of the stimulus corroborates the grammaticality judgment results of other studies (Boudreault & Mayberry, 2006; Mayberry & Lock, 2003; Mayberry et al., 2002). Moreover, the interpretation that these phonological-lexical errors reflect incomplete language processing was supported by the finding that they were negatively correlated with comprehension accuracy, both of which in turn were negatively correlated with AoA. These phonologically based lexical errors suggest that non-native deaf signers process the linguistic signal differently from native deaf signers. Their processing appears to be more shallow and often snagged at the surface level of lexical structure, leading to what we have called a phonological bottleneck in language processing (Mayberry & Fischer, 1989), a psycholinguistic phenomenon related to the results of subsequent neuroimaging work discussed below. These effects are reminiscent of the extra effort L2 speakers have to expend to comprehend speech in a noisy environment.

Other studies have also found AoA effects on phonological processing in sign language by comparing the performance of native vs. non-native deaf signers. For example, native deaf signers showed clusters of similarity judgments for movements extracted from ASL signs suggestive of phonemic categories, a kind of phonemic clustering not exhibited by sign-naïve hearing participants (Poizner, Reference Poizner 1981 ). AoA effects have been found for lexical decision tasks in Spanish Sign Language and British Sign Language (Carreiras, Gutierrez-Sigut, Baquero & Corina, Reference Carreiras, Gutierrez-Sigut, Baquero and Corina 2008 ; Dye & Shih, Reference Dye and Shih 2006 ). Native and non-native deaf signers appear to differentially weight phonological features in ASL and BSL (Hildebrandt & Corina, Reference Hildebrandt and Corina 2002 ; Orfanidou, Adam, Morgan & McQueen, Reference Orfanidou, Adam, Morgan and McQueen 2010 ). These AoA effects on phonological processing in sign language are consistent with the widely observed AoA effects on phonological skills for spoken L2 learning. Given the differential and amodal AoA effects on L1 vs. L2 outcome on morphosyntactic processing in ASL, the next key question is whether similar differential effects are observed in ASL phonological processing.

Differential AoA effects on L1 vs L2 phonological processing

Some studies have found hints that phonological experience in early life facilitates rather than hinders later language learning, contrary to what has been proposed for spoken L2 learning described above. In a categorical perception study of an ASL handshape, Best, Mathur, Miranda, and Lillo-Martin ( Reference Best, Mathur, Miranda and Lillo-Martin 2010 ) found native deaf signers and hearing L2 signers to show category boundaries for the tested phonological feature. By contrast, non-native deaf signers, who perhaps were quasi-late L1 learners, were less categorical. In another study, native deaf and L2 hearing signers did not differ in their phonological similarity judgments for pairs of ASL signs. However, non-native deaf signers showed phonological similarity ratings that differed from those of both the native deaf and L2 hearing ASL learners, but not from those made by sign-naïve hearing participants (Hall, Ferreira & Mayberry, Reference Hall, Ferreira and Mayberry 2012 ). In a third study, native deaf and L2 hearing signers performed similarly on a gated sign recognition task (Morford and Carlson, Reference Morford and Carlson 2011 ). By contrast, non-native deaf signers performed significantly less well, occasionally responding to stimuli containing only partial phonological information with gestures. As has been observed in other studies, the non-native deaf signers showed differential weighting patterns for phonological features of ASL sign. Morford and Carlson ( Reference Morford and Carlson 2011 ) interpret these results to indicate that language experience during early life affects the organization of the mental lexicon. This interpretation was supported by the results of an eye-tracking study of ASL lexical recognition. 
Native deaf signers showed sensitivity to sign phonological structure during online lexical recognition whereas non-native deaf signers did not (Lieberman, Borovsky & Mayberry, 2016; Lieberman, Borovsky, Hatrak & Mayberry, 2015).

The studies comparing late L1 with late L2 acquisition in adults have mostly investigated phonological effects in the context of the lexeme. Recent studies with deaf children developing sign and spoken languages bilingually have corroborated the facilitative effects of early phonological experience cross-linguistically and cross-modally in children's language development (Davidson, Lillo-Martin & Chen Pichler, 2014; Hassanzadeh, 2012).

AoA effects on L1 outcome are thus clearly distinct from those for L2 outcome at the level of phonological processing. Like morphosyntactic learning, infant language acquisition facilitates later L2 learning at the phonological level, independent of sensorimotor modalities of the early L1 or the later L2. Also like morphosyntactic development, aspects of phonological development are clearly amodal. L1 phonology may interfere to some extent with successful L2 learning of a spoken language, even when the L1 is a sign language, which has not been studied yet. Nonetheless, it is equally clear that early-acquired L1 phonology facilitates L2 learning. A lack of language experience, including phonological experience, during early life renders language acquisition begun after childhood incomplete. Research investigating the trajectory of L1 acquisition begun after early childhood supports this interpretation and provides further insight into this phenomenon.

4. The trajectory of late L1 acquisition

Infants born deaf who experience ASL in infancy spontaneously acquire it in a fashion similar to the spoken language acquisition of infants who hear (Bates, Marchman, Thal, Fenson, Dale, Reznick, Reilly & Hartung, 1994; Fenson et al., 1994). They babble with rhythmic arm movements aligned with ASL phonological features and prosody (Petitto, Holowka, Sergio, Levy & Ostry, 2004). The first 36 months of life are a period of rapid language development, and deaf infants exposed to ASL begin by learning more nouns than verbs, as is common for many languages. As their lexicon expands, they acquire more predicates. Vocabulary size rather than age predicts subsequent milestones such as word combination and morphosyntactic acquisition (Anderson & Reilly, 2002; Berk & Lillo-Martin, 2012). The development of ASL grammar extends over the next several years as children acquire complex syntactic structures such as question formation, conjoining, complementation, topicalization and others (Chen Pichler, 2012; Mayberry & Squires, 2006; Reilly, 2006).

4.1 Input and language acquisition

In their seminal study, Hart and Risley (1995) discovered that the amount of language spoken to children has robust effects on their language development. Since this landmark work, numerous studies have found that the amount and kind of linguistic input children receive affect the trajectory of their lexical and syntactic acquisition (Hoff, 2003; J. Huttenlocher, Vasilyeva, Cymerman & Levine, 2002). By any measure, when sign language is absent from the environment, the linguistic environment of infants born deaf is grossly impoverished compared to that of infants who hear.

4.2 Homesign

Deaf children who are isolated from spoken and signed language provide another means, aside from retrospective outcome studies, to test the hypothesis that a CPL constrains L1 acquisition. Unlike cases of severe abuse where children are deprived of human contact and as a consequence are isolated from language (Curtiss, Reference Curtiss 1977 ; Fromkin et al., Reference Fromkin, Krashen, Curtiss, Rigler and Rigler 1974 ; Fujinaga & Kasuga, Reference Fujinaga and Kasuga 1990 ; Koluchova, Reference Koluchova 1972 ), deaf children do not typically suffer from social isolation. They are as nurtured as children who hear. In the absence of spoken language development, deaf children are observed to gesture for communication, known as homesign . Homesign has been shown to have some linguistic properties, such as ordering patterns, and these patterns are neither observed in the caregivers’ gestures (Goldin-Meadow & Mylander, Reference Goldin-Meadow and Mylander 1983 ), nor fully understood by them (Carrigan & Coppola, Reference Carrigan and Coppola 2017 ). These findings have been taken to mean that any linguistic properties observed in homesign arise from within the child and not from the environment, suggesting that some features of the human language faculty require no linguistic input to emerge from the child (Goldin-Meadow, Reference Goldin-Meadow 2005 ). However, studies of L1 acquisition begun after childhood indicate that homesign does not function as an L1 for the deaf child.

4.3 Case studies of late L1 acquisition

Spoken language learning

The spoken Spanish development of EM, a boy who was born profoundly deaf and fitted for the first time with hearing aids at the age of 16, was studied by Grimshaw, Adelstein, Bryden, and MacKinnon (1998). Although EM could detect speech with amplification, his development of spoken Spanish was limited after 48 months of instruction. His mean length of utterance (MLU), the average number of morphemes or words combined in an utterance, was less than 2.0. EM's minimal spoken Spanish development might be attributed to modality effects. Perhaps he could not learn spoken language because he had missed experiencing spoken phonology in infancy. However, we cannot assume that the amplified speech signal was of sufficient clarity for EM to discern the acoustic details of speech. His auditory system was damaged. Genie, whose auditory system was intact, was able to develop intelligible speech at the age of 13, although it is unknown how much spoken English she had acquired prior to being isolated from her family around the age of 20 months, or whether she overheard speech during her isolation (Curtiss, 1977). These language outcomes contrast sharply with a case studied by Vargha-Khadem and colleagues (Vargha-Khadem, Carr, Isaacs, Brett, Adams & Mishkin, 1997) showing rapid acquisition by a 10-year-old after hemispherectomy. Unlike EM and Genie, the child had been exposed to spoken phonology from infancy.
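The MLU measure used in these case studies is a simple average, and a small worked example may make the threshold of 2.0 more concrete. The sketch below is illustrative only: the function name `mean_length_of_utterance` and the sample utterances are hypothetical, invented here rather than taken from any of the studies cited, and it assumes the utterances have already been segmented into words or morphemes by a human coder.

```python
def mean_length_of_utterance(utterances):
    """Return the mean number of units (words or morphemes) per utterance.

    Each utterance is a list of already-segmented units. How speech or
    sign is segmented into units is a coding decision made beforehand.
    """
    # Skip empty utterances so they do not pull the mean toward zero.
    counted = [u for u in utterances if u]
    if not counted:
        raise ValueError("need at least one non-empty utterance")
    return sum(len(u) for u in counted) / len(counted)


# Hypothetical sample, invented for illustration (not data from any study).
sample = [
    ["want", "ball"],
    ["ball"],
    ["mother", "eat", "apple"],
    ["dog", "run"],
]

# 8 units spread over 4 utterances gives an MLU of 2.0.
print(mean_length_of_utterance(sample))
```

On this toy sample, an MLU below 2.0 would mean that single-unit utterances outnumber the longer combinations, which is the sense in which an MLU under 2.0 indicates very limited combinatorial language.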

ASL acquisition

Other studies have investigated ASL development begun at older ages. Berk and Lillo-Martin (2012) analyzed the spontaneous language of two children, Mei and Cal, who were born deaf and first experienced ASL at the ages of 5;9 and 6;0 (years;months) with no prior language acquisition. After three months of immersion, they had learned proportionately more nouns than verbs and had begun to combine them into two-word utterances. The semantic content of their two-word combinations was similar to that of 2-year-old deaf children acquiring ASL and that of 2-year-old hearing English learners reported in the literature. These results suggest that the beginning stages of late L1 development are similar to those of infant acquisition even after 5 to 6 years of delay. The question is whether this is true with an extreme delay in language experience.

We followed the ASL acquisition of three adolescents who were born deaf, Shawna, Cody, and Carlos, and who were first immersed in ASL at the ages of 13;8, 14;7, and 14;8 respectively, after little or no prior language acquisition. After 12, 18, and 24 months of ASL experience, each adolescent had acquired a vocabulary that resembled that of young children, both in terms of the words they learned and the distribution of lexical types (nouns, predicates, and closed class items) as measured with the MacArthur ASL-CDI (Anderson & Reilly, Reference Anderson and Reilly 2002 ) and verified with analyses of their spontaneous language. The adolescents’ rate of lexical learning was faster than that of infants, a likely effect of their being significantly more cognitively mature than infants. Similar to the 5- and 6-year-old late L1 learners, they also began to quickly combine signs into two-word utterances that were devoid of any inflectional morphology, as is the case for infant learners. Their utterance complexity, as measured by MLU, was related to the number of months each adolescent had been immersed in ASL (Ferjan Ramirez, Lieberman & Mayberry, Reference Ferjan Ramirez, Lieberman and Mayberry 2013 ).

Like that of the 5- and 6-year-olds Mei and Cal, the adolescents’ language development showed promising beginnings. Both studies suggest that the ability to rapidly learn lexical items and combine them into two-word utterances is a latent human linguistic ability that is unperturbed by brain maturation in the absence of linguistic experience. Having no record of the adolescents’ homesign, it is impossible to tell how it related to their initial ASL acquisition. However, their initial rapid lexical learning and word combinations indicate their ability to parse the linguistic signal into meaningful units remained intact despite their extremely late exposure to natural language. Although this skill might relate to early experience with gesture, it is important to note that Genie also displayed this skill, although this was not elaborated upon in the original reports of her language acquisition. However, this early linguistic parsing ability does not appear to develop into the ability to use sign phonological structure during language processing tasks, as demonstrated by the late L1 outcome studies described above.

Cross-sectional, longitudinal analyses of the word order acquisition of Shawna, Cody, Carlos, and Chris (who began learning ASL for the first time at age 13 with no prior language) from 12 months to 6 years of ASL experience revealed patterns that also resembled child ASL development. The patterns of word order acquisition for all four adolescent learners were initially variable, similar to those of 2- to 3-year-old native deaf learners reported in the literature. ASL uses variable word orders that are marked by morphosyntactic rules. Like young, native deaf learners, the adolescent learners progressed to a generalization stage of using SVO word order. However, unlike child ASL learners, they showed no indication of continued development of ASL word order beyond this stage, which involves complex sentence structure (Chen & Mayberry, under review).

Long-range outcome

Additional evidence that the trajectory of late L1 learner development becomes asymptotic at low levels of language development comes from a study by Morford ( Reference Morford 2003 ). She analyzed the ASL development of Maria and Marcus, two children who were born deaf and immersed in ASL at the ages of 12 and 13. Neither child had previously acquired any language; both were reported and observed to have used gestures with their hearing families prior to ASL exposure. She elicited language samples longitudinally from 1 to 32 months after ASL immersion with a wordless picture book. Analyses revealed that, by 7 to 9 months of ASL experience, they primarily used signs instead of gestures. They also began to quickly combine signs into utterances, as corroborated in the subsequent studies of late L1 acquisition described above. In a follow-up study conducted seven years later, however, Maria and Marcus both exhibited low levels of ASL comprehension on a sentence-to-picture matching task using utterances describing pictures from the original elicitation materials. They also made multiple phonological-lexical substitution errors on a sentence repetition task consistent with the adult, late L1 outcome studies described above.

The results of these longitudinal and cross-sectional studies of adolescent L1 acquisition converge, which is remarkable given the fact that these late L1 learners were born and raised in different countries and cultures, and first experienced ASL in a variety of home and school settings in the USA and Canada. Late L1 learners exhibit initial rapid learning of lexical items in different grammatical categories and subsequent word combinations that are reminiscent of the acquisition of young child language learners, but at a faster pace. At the same time, however, accumulating evidence suggests that two major characteristics of language acquisition begun for the first time at age 12 or older are, first, rapid initial language acquisition, and second, a subsequent protracted period of limited language development, despite rich linguistic environments and language instruction. The language development of adolescent late L1 learners does not progress to complex morphosyntactic structures, but remains limited to simple structures. Corroborating evidence for limited language development when language is not experienced in childhood comes from an ASL sentence comprehension study using a sentence-to-picture matching task. Individuals born deaf who experienced little or no language until the age of 12, with 10 years of experience, showed high accuracy on SV and SVO structures, but near chance performance on more complex structures (Mayberry, Cheng, Hatrak & Ilkbasaran, Reference Mayberry, Cheng, Hatrak and Ilkbasaran 2017 ).

The trajectory of language acquisition begun after childhood is unlike infant language acquisition beyond the initial stages of word learning and word combinations. Very late L1 acquisition is characterized by protracted and limited acquisition beyond the initial stages of language acquisition. The unique trajectory of late L1 acquisition begins to explain the differential effects of AoA on L1 vs. L2 observed in ultimate attainment studies. L2 learners begin the task of new language learning with an already acquired and established linguistic system through which they can begin to learn and remember words and structures in the new language. By contrast, late L1 learners begin the task of language acquisition with no prior knowledge of words or of any linguistic structures. Next, we ask whether the asymptotic levels of language acquisition we observe in adolescent L1 learners relate to neurolinguistic processing. Specifically, we ask whether late L1 exposure affects development of the brain language system.

5. Late L1 vs. L2 effects on neurolinguistic processing

Before turning to the effects of late L1 and L2 learning on the neuroprocessing of language, it is necessary to first ask whether sign and spoken language are processed in similar brain areas under the typical circumstances of infant language acquisition. One obvious difference between sign and spoken language is the sensorimotor modalities through which they are sent and received. Instead of using the vocal tract, signers use the hands and arms in concert with movements of the head, torso, and face for articulation. Instead of listening to the language signal through the auditory system, signers watch the language signal through the visual system. Many studies have investigated the question of whether the sensorimotor characteristics of the communication channel affect how language is structured and how the brain processes this structure. For example, unlike spoken English, ASL uses the spatial positions and orientations of the moving hands to mark verbal arguments, case and number, as well as prepositions, and syntactic categories, among other morphosyntactic phenomena (Lillo-Martin & Meier, 2011; Sandler & Lillo-Martin, 2006). Given the unique use of space by ASL to express linguistic structure, the question arises as to whether the brain processes it in a similar fashion to that of spoken language structure.

5.1 Neural processing of ASL in native learners

Initial research employed classic methods to investigate sign language in the brain. Poizner, Klima, and Bellugi (1987) discovered that the cognitive purpose to which space is put, linguistic vs. non-linguistic, determines which hemisphere processes it. Signers who were deaf and suffered lesions to the left hemisphere language areas exhibited language deficits involving morphosyntactic structures that are instantiated with spatial contrasts. These same left hemisphere damaged signers showed no deficits for non-linguistic spatial processing, such as recognizing pictures, making block designs, or arranging miniature furniture in a room. By contrast, right hemisphere lesioned deaf signers showed the reverse deficit pattern, namely, no deficits for comprehending spatially marked morphosyntax, but significant difficulty recognizing and reproducing pictures or block designs. The extent of left temporal lobe damage was further found to correspond to how patients performed on ASL comprehension tasks, which in turn resembled deficits exhibited by hearing English patients with left hemisphere lesions (Hickok, Love-Geffen & Klima, 2002). In another study, direct stimulation of the left temporal cortex of an epileptic deaf signer disrupted his sign expression in a fashion akin to what is observed for hearing epileptic patients (Corina, McBurney, Dodrill, Hinshaw, Brinkley & Ojemann, 1999). These pioneering studies show that the canonical language areas of the left perisylvian region are dedicated to the task of linguistic processing independent of the sensorimotor channel of language.
Subsequent neuroimaging studies of healthy deaf adults, who acquired sign language from infancy, have corroborated these findings in several sign languages including ASL, BSL, Japanese Sign Language, and Swedish Sign Language (Cardin, Orfanidou, Ronnberg, Capek, Rudner & Woll, 2013; MacSweeney, Campbell, Woll, Brammer, Giampietro, David, Calvert & McGuire, 2006; Newman, Supalla, Fernandez, Newport & Bavelier, 2015; Petitto, Zatorre, Guana, Nikelski, Dostie & Evans, 2000; Sakai, Tatsuno, Suzuki, Kimura & Ichida, 2005).

Localization of sensory processing

The neural processing dissociation between sensory perception and linguistic processing was further demonstrated in a study using anatomically constrained magnetoencephalography, aMEG. In a picture-word priming task, the neural responses of native English hearing speakers who listened to spoken words were compared to those of native ASL deaf signers who watched ASL signs. Approximately 100 ms after presentation of the spoken word, the hearing English speakers showed activation in primary auditory cortex, as would be expected. Approximately 100 ms after presentation of the ASL sign, the deaf ASL signers showed activation in primary visual cortex, as also would be expected (Leonard, Ferjan Ramirez, Torres, Travis, Hatrak, Mayberry & Halgren, Reference Leonard, Ferjan Ramirez, Torres, Travis, Hatrak, Mayberry and Halgren 2012 ). A bit later, around 400 ms after presentation of the spoken word, the hearing speakers exhibited expected activation in left superior temporal areas, showing the well-known pattern for semantic processing indexed by the N400 effect. Likewise, around 400 ms after presentation of the signed word, the deaf signers exhibited activation in the same superior temporal region, showing the same N400 effect. These results indicate that, although the initial stages of sensory processing for spoken and signed words occur in the cortical areas responsible for auditory vs. visual sensory processing respectively, the subsequent stage of lexico-semantic processing is the same regardless of sensory input. Given that the neural processing of spoken and signed languages, beyond the initial stages of sensory perception, is remarkably similar, we are now in a position to ask if there are differential effects of late L1 vs. late L2 learning on neurolinguistic processing.

5.2 Late L1 acquisition neural processing effects

fMRI studies

Using fMRI, Mayberry, Chen, Witcher, and Klein ( Reference Mayberry, Chen, Witcher and Klein 2011 ) neuroimaged 22 signers as they performed an ASL grammaticality judgment task. The signers, all of whom were right handed and born deaf, learned ASL as an L1 in an immersion situation either at home or school between the ages of birth and 14. The non-native signers had used ASL daily for a minimum of 15 years. All the signers had begun school by the age of 6. Some non-native signers had acquired minimal spoken language prior to learning ASL. Other non-native signers, those who first learned ASL at ages older than 8 years, had begun their education in classrooms or schools where the use of sign language was actively discouraged. Their educational placement was subsequently switched to classrooms where sign language was used due to the fact that their spoken language was not functional for educational purposes.

The neural activation exhibited by the signers was analyzed with whole brain regression analyses to determine which brain areas were affected by AoA. Of the nine identified brain areas, seven were located within the language network, five in the left hemisphere and two in homologous right hemisphere areas. Two other identified brain areas were located in the left occipital-visual cortex (Mayberry et al., 2011). AoA for the L1 affected neural activation patterns along the anterior to posterior dimension of the left hemisphere. Anterior frontal and temporal language areas showed a significant negative relation to AoA. As the L1 AoA became older, the BOLD signal in the frontal and temporal language areas decreased. The reverse effect was observed in the visual processing areas of the left occipital cortex. Specifically, as the AoA for the L1 became older, the BOLD signal increased. In other words, signers who were born deaf and experienced ASL in early life showed robust neural activation patterns in the expected frontal and temporal language areas of the left hemisphere (and in two homologous areas in the right hemisphere). These same signers exhibited BOLD signals in left visual cortex that were below baseline (i.e., less neural activation when processing ASL sentences than when watching a still face). This suggests that when language is acquired in early life, the adult brain primarily allocates neural resources to the linguistic aspects of language processing and requires minimal information from sensory-perceptual processing, perhaps because it is unnecessary for comprehension. Knowing a language means being able to predict with a high degree of probability the upcoming words in a sentence; this top-down prediction may require minimal perceptual information, suggesting that the language processing areas in frontal and temporal lobes may be in a feedback loop with the sensory-perceptual areas in visual cortex.

The neural processing results for the later L1 learners, despite their substantial linguistic experience, were quite different. Late L1 learners exhibited BOLD signals in left posterior visual cortex that were significantly greater than those they exhibited in the frontal and temporal language areas of the left hemisphere. Late L1 learners also showed greater neural activation patterns when watching ASL sentences than they did watching a still face (Mayberry et al., 2011). These results lead us to question whether the neural pathways connecting visual processing areas to language processing are fully developed when the onset of linguistic experience occurs late during brain development, a question to which we return below.

These neural results parallel some of the psycholinguistic effects of late L1 described above, in particular the finding that late L1 learners, but not early L1 and late L2 learners, tend to produce phonologically based errors divorced from lexical meaning and sentence structure when performing psycholinguistic tasks. Such late L1 learners also exhibit less evidence for phonologically organized lexical processing on a variety of tasks. These contrasting neural patterns for early vs. late L1 acquisition of ASL suggest that the brain language network requires linguistic experience during early life to develop fully. Subsequent studies provide evidence for this hypothesis.

aMEG studies

Using aMEG with a picture-sign priming task, we compared the neural correlates of lexical processing in hearing L2 learners of ASL with those of native deaf ASL signers. The L2 learners had acquired spoken English in infancy; they began to learn ASL in late adolescence to early adulthood. The L2 learners exhibited neural activation patterns for lexico-semantic processing of ASL signs that were highly similar to those of native ASL deaf signers, with primary neural activation in left hemisphere perisylvian language areas and some additional activation in homologous right hemisphere and left parietal areas (Ferjan Ramirez, Leonard, Torres, Hatrak, Halgren & Mayberry, Reference Ferjan Ramirez, Leonard, Torres, Hatrak, Halgren and Mayberry 2014 ).

These findings for L2 learners of ASL are consistent with studies of spoken language L2 neural processing. For example, native Spanish speakers showed neural activation patterns primarily in left hemisphere perisylvian language areas when listening to words in their native Spanish. When listening to spoken words in their less proficient L2, English, they exhibited neural activation patterns in the same left hemisphere perisylvian language areas but with some additional activation in left parietal and right occipitotemporal areas (Leonard, Torres, Travis, Brown, Hagler Jr, Dale, Elman & Halgren, Reference Leonard, Torres, Travis, Brown, Hagler, Dale, Elman and Halgren 2011 ). Other studies have also found that a less proficient and/or a late acquired L2 engages language areas in the left hemisphere with the addition of some right hemisphere activation (Dehaene, Dupoux, Mehler, Cohen, Paulesu, Perani, van de Moortele, Lehericy & Le Bihan, Reference Dehaene, Dupoux, Mehler, Cohen, Paulesu, Perani, van de Moortele, Lehericy and Le Bihan 1997 ; Wartenburger, Heekeren, Abutalebi, Cappa, Villringer & Perani, Reference Wartenburger, Heekeren, Abutalebi, Cappa, Villringer and Perani 2003 ). This pattern for L2 neural processing was confirmed by the results of a meta-analysis analyzing 30 neuroimaging studies of spoken L2 processing (Indefrey, Reference Indefrey 2006 ). By definition, all L2 learners, whether hearing or deaf, learning a spoken or signed L2, share a common factor: infant language experience. Neurolinguistic studies of individuals who were bereft of language experience during childhood paint an entirely different picture.

5.3 Case studies of extremely late L1 acquisition

We studied the neural activation patterns of two cases of extreme late L1 acquisition described above, Carlos and Shawna, whose acquisition began at ages 13;8 and 14;7 respectively. When they were neuroimaged with aMEG using the same picture-sign priming paradigm mentioned above, they had 38 and 24 months of language experience respectively. The control groups were native deaf signers and hearing L2 signers (whose length of ASL experience was comparable to that of Carlos and Shawna). As expected, both the deaf native and hearing L2 signer control groups exhibited activation in frontal and temporal areas of the left hemisphere for lexico-semantic processing that was highly similar, as described above. Although Carlos and Shawna were nearly as accurate and fast as the hearing L2 learners when recognizing the signs in the scanner, they both showed strong activation in right occipital-parietal areas; Shawna showed some additional activation in right frontal and left temporal areas. We neuroimaged them a second time after they had accumulated 15 more months of ASL experience. Both Carlos and Shawna continued to show right occipital-parietal activation, but now they both also exhibited some additional neural activations in the temporal language areas, left for Shawna and bilaterally for Carlos, in response to signs with which they were most familiar, as indexed by response time (Ferjan Ramirez, Leonard, Davenport, Torres, Halgren & Mayberry, Reference Ferjan Ramirez, Leonard, Davenport, Torres, Halgren and Mayberry 2016 ). In the absence of childhood language experience, the adolescent brain exhibits radically altered neural processing patterns for lexical processing. At the same time, however, the results further indicate that the canonical language areas of the left hemisphere retain some capacity to process some language (familiar words) after three to four years of language experience late in life. 
The next question is whether late L1 learners exhibit more typical neurolinguistic activation patterns after decades of experience. Our previous fMRI study of highly experienced signers with late L1 acquisition (who were not as severely linguistically deprived as Shawna and Carlos) indicates that the answer is no (Mayberry et al., Reference Mayberry, Chen, Witcher and Klein 2011 ).

We investigated the question more directly in another neuroimaging case study. Martin was born profoundly deaf and grew up as the only deaf person in his hearing family and community in rural Mexico. He attended no school in childhood and reported communicating with a sister through gesture. At the age of 21, he began to learn Mexican Sign Language and then, after immigrating to the USA at the age of 23, he began to learn ASL through immersion and classroom instruction. Martin had 30 years of continuous ASL experience when we neuroimaged him with aMEG using the same picture-sign priming paradigm mentioned above. Although he was as accurate and fast on the scanner task as the hearing L2 learner control group (and nearly as accurate and fast as the native deaf control group), Martin exhibited neural activation patterns that were primarily located in dorsolateral, superior parietal, and occipital areas bilaterally. This neural activation pattern was highly similar to the ones exhibited by Carlos and Shawna after they had 24 and 38 months of linguistic experience. Unlike the adolescent L1 learners, however, Martin, who was a young adult L1 learner, showed almost no activation in either the left or right temporal language areas (Mayberry, Davenport, Roth & Halgren, under review).

These contrasting results using different imaging techniques and paradigms indicate that the neurolinguistic processing of late L1 learners contrasts sharply with that of native learners, who experienced language in infancy, as well as with that of L2 learners, who also experienced language in infancy. Late L1 learners show greater activation in visual perceptual areas compared with native and L2 learners. That is, even though L2 learners have similar visual exposure to ASL as late L1 learners, they process it like language. Cases of extreme language delay show neural activation patterns more commonly associated with watching meaningless human actions than processing lexical items from a language (Decety & Grèzes, Reference Decety and Grèzes 1999 ). Although adolescent L1 learners show some activation in left hemisphere perisylvian language areas as they accrue more ASL experience, the young adult L1 learner with decades of experience did not. This suggests that the left hemisphere language areas retain some capacity to process language when language is first experienced in adolescence, but this capacity is lost by young adulthood. No such reduction or absence of linguistic processing capacity in left hemisphere language areas is ever observed for L2 learners of signed or spoken languages.

6. The scope and nature of the CPL

From the array of research discussed here, it should now be clear that AoA effects on the ultimate outcome of L1 acquisition differ substantially from those of L2 outcome, both from a linguistic and a neurolinguistic perspective. Linguistically, AoA effects on L1 ultimate attainment are much greater than those for L2 attainment across a variety of psycholinguistic tasks. Late L1 learners perform at significantly lower levels than do late L2 learners on measures of morphology and syntax, phonological processing, and comprehension. This attenuated language attainment is unrelated to overall non-verbal cognitive skills or motivation to learn ASL (Valli, Lucas, Farb & Kulick, Reference Valli, Lucas, Farb and Kulick 1992 ). Limited language structure is acquired when the onset of L1 experience begins in adolescence and young adulthood. The stages of initial adolescent L1 acquisition resemble infant language acquisition, minus a babbling stage. Unlike L2 learning, late L1 acquisition slows and then stops at the level of simple sentence structure. The circumscribed level of language attainment observed for cases of adolescent and adult L1 acquisition begins to explain the low comprehension levels found across the studies of late L1 attainment. Our understanding of whether and how these effects are modulated by linguistic input, both in and out of school, is an important question in need of further research.

Parallel effects for late L1 acquisition are found for neurolinguistic processing. Neurolinguistic processing patterns for the signed L2 are highly similar to those found for native signed L1 neuroprocessing, with some additional activation elsewhere in the brain. The neurolinguistic processing patterns associated with AoA effects on L1 outcome show attenuated activation patterns in the frontal and temporal language areas of the left hemisphere, accompanied by increased neural activation patterns in sensory-perceptual processing areas in the parietal and occipital cortex. Extreme delays in the onset of L1 experience are associated with unique neurolinguistic processing patterns in dorsolateral occipital, parietal and frontal areas, processing patterns not observed when language – any language in any sensorimotor modality – is acquired from infancy. Perisylvian language areas show limited activation when language is first experienced in adolescence and nearly none when it is first experienced at the end of brain maturation in young adulthood.

The unique effects of AoA on L1 acquisition, attainment, and neurolinguistic processing suggest that the hierarchical structure of language and the architecture of the brain language processing system arise from their interaction over the course of early childhood when brain maturation and language acquisition are temporally synchronized. Although hearing infants show neural activation in response to speech in canonical language brain areas from birth (Vannasing, Florea, Gonzales-Frankenberger, Tremblay, Paquette, Safi, Wallois, Lepore, Beland, Lassonde & Gallagher, Reference Vannasing, Florea, Gonzales-Frankenberger, Tremblay, Paquette, Safi, Wallois, Lepore, Beland, Lassonde and Gallagher 2016 ), their brain language network is not yet developed. The brain language system shows organizational shifts over the course of development from infancy through adolescence. Neural responses when processing language are more posterior in the young child's brain and become more anterior with maturation (Brown, Lugar, Coalson, Miezin, Petersen & Schlaggar, Reference Brown, Lugar, Coalson, Miezin, Petersen and Schlaggar 2005 ; Schlaggar, Brown, Lugar, Visscher, Miezin & Petersen, Reference Schlaggar, Brown, Lugar, Visscher, Miezin and Petersen 2002 ). Neural language processing is more bilaterally represented in children and becomes more localized to the left hemisphere with maturation (Berl, Mayo, Parks, Rosenberger, VanMeter, Ratner, Vaidya & Gaillard, Reference Berl, Mayo, Parks, Rosenberger, VanMeter, Ratner, Vaidya and Gaillard 2014 ; Ressel, Wilke, Lidzba, Lutzenberger & Krägeloh-Mann, Reference Ressel, Wilke, Lidzba, Lutzenberger and Krägeloh-Mann 2008 ). The L1 AoA effects on brain language processing discussed above add to this developmental picture by demonstrating that the brain language system requires linguistic experience in order to potentiate its development from infancy to adolescence. 
Moreover, the onset of linguistic experience needs to be synchronous with post-natal brain maturation.

There are multiple environmental effects on brain development that are only beginning to be understood. For example, complex environments induce greater proliferation of neuronal growth during the exuberant phase of brain development as compared to impoverished environments (Greenough, Black & Wallace, Reference Greenough, Black and Wallace 1987 ). Synapses that survive the pruning phase of neural development are those that have been stimulated by environmental input (Hensch, Reference Hensch 2005 ; P. R. Huttenlocher, Reference Huttenlocher 1990 ). As Hebb ( Reference Hebb 1949 ) initially proposed, synapses that fire together, wire together. Synaptic firing prompts myelination (Ishibashi, Dakin, Stevens, Kozlov, Stewart, Lee & Fields, Reference Ishibashi, Dakin, Stevens, Lee, Kozlov, Stewart and Fields 2006 ). Myelination of the fiber pathways connecting language areas in the temporal lobe to those in the frontal lobe develops with age and with vocabulary development (Pujol, Soriano-Mas, Oritz, Sebastián-Gallés & Deus, Reference Pujol, Soriano-Mas, Oritz, Sebastián-Gallés, Losilla and Deus 2006 ). Development of the dorsal pathway, the arcuate fasciculus, has been found to correlate with the comprehension of complex sentence structure in children (Skeide, Brauer & Friederici, Reference Skeide, Brauer and Friederici 2016 ). Rather than being a strictly biologically driven maturational constraint on language acquisition, current work in our laboratory suggests that development of this neural fiber tract is driven, in part, by language experience and acquisition. The arcuate fasciculus is significantly less developed in cases of late L1 acquisition compared with the same fiber tract in native deaf signers and hearing L2 signers, for whom anatomical measures of this fiber tract do not differ (Chen, Roth, Halgren & Mayberry, under review).
Given these structural findings about the brain language system, one explanation for the circumscribed level of language acquisition attained by very late L1 learners is that their brain language systems are incompletely developed due to a lack of linguistic experience during childhood brain maturation. In other words, language acquisition and development of the brain language system appear to reciprocally affect one another, but only when the onset of language experience is synchronous with the onset of post-natal brain development. Under this scenario, L1 acquisition and development of the brain language system can be considered an example of critical period learning. These factors are not so clearly at play in L2 learning where the acquisition of a linguistic system and its neural underpinnings are already established.

In conclusion, nearly half a century of scientific discovery has occurred since Lenneberg ( Reference Lenneberg 1967 ) made his paradigm changing observations and hypotheses about the biological and developmental nature of language acquisition. Sign language was not considered to be a language at the time, and neuroimaging technologies existing at the time provided limited understanding of the healthy brain. Much information remains to be learned about how the remarkable human achievement of language acquisition and the neural systems that enable it develop, and how environmental language affects them both. Sign language research has changed our thinking about the role sensorimotor modalities play in language structure and brain language processing. It also promises to reveal the complex and intertwined processes of language acquisition.

* Preparation of this paper and the recent research reported here was supported by NIH grant R01DC012797 to RM. Some research was supported by a grant from the Kavli Institute for Brain & Mind at UCSD, and other research conducted at McGill University was supported by grants from the Natural Science and Engineering Research Council of Canada to RM. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors thank the many individuals who graciously volunteered for the studies described here, and Marla Hatrak, Deniz Ilkbasaran, Drucilla Ronchen, and Pamela Witcher for invaluable research assistance.

1 Because AoA is the more traditional term found in this line of research, we adopt it here.

2 This rate of incidence, although low, is roughly equivalent to the mean percentage of individuals who claim same-sex attraction among the general population in studies worldwide. Clearly same-sex attraction is not the norm, and it would be folly to claim so. But it is equally implausible to claim that same-sex attraction is biologically impossible, because it occurs with this rate of frequency, even if it is not the norm.

3 The intensity level of speech is typically described as being around 60 dB (spoken by an average size man standing one meter from the listener). The term deaf as used here is defined as severe, hearing loss greater than 70 dB, to profound, greater than 90 dB. Individuals with severe to profound hearing loss cannot perceive speech auditorily.

4 Less than 10% of the population of deaf individuals is born into deaf families (Mitchell & Karchmer, Reference Mitchell and Karchmer 2004 ).

5 Meningitis or measles epidemics are less common today in North America, but viral infections continue to be a major cause of post-infancy deafness in many developing countries (Morgan & Mayberry, Reference Morgan and Mayberry 2012 ).


  • Volume 21, Issue 5
  • RACHEL I. MAYBERRY (a1) and ROBERT KLUENDER (a1)
  • DOI: https://doi.org/10.1017/S1366728917000724


Age and the critical period hypothesis


Christian Abello-Contesse, Age and the critical period hypothesis, ELT Journal , Volume 63, Issue 2, April 2009, Pages 170–172, https://doi.org/10.1093/elt/ccn072


In the field of second language acquisition (SLA), how specific aspects of learning a non-native language (L2) may be affected by when the process begins is referred to as the ‘age factor’. Because of the way age intersects with a range of social, affective, educational, and experiential variables, clarifying its relationship with learning rate and/or success is a major challenge.

There is a popular belief that children as L2 learners are ‘superior’ to adults ( Scovel 2000 ), that is, the younger the learner, the quicker the learning process and the better the outcomes. Nevertheless, a closer examination of the ways in which age combines with other variables reveals a more complex picture, with both favourable and unfavourable age-related differences being associated with early- and late-starting L2 learners ( Johnstone 2002 ).

The ‘critical period hypothesis’ (CPH) is a particularly relevant case in point. This is the claim that there is, indeed, an optimal period for language acquisition, ending at puberty. However, in its original formulation ( Lenneberg 1967 ), evidence for its existence was based on the relearning of impaired L1 skills, rather than the learning of a second language under normal circumstances.

Furthermore, although the age factor is an uncontroversial research variable extending from birth to death ( Cook 1995 ), and the CPH is a narrowly focused proposal subject to recurrent debate, ironically, it is the latter that tends to dominate SLA discussions ( García Lecumberri and Gallardo 2003 ), resulting in a number of competing conceptualizations. Thus, in the current literature on the subject ( Bialystok 1997 ; Richards and Schmidt 2002 ; Abello-Contesse et al. 2006), references can be found to (i) multiple critical periods (each based on a specific language component, such as age six for L2 phonology), (ii) the non-existence of one or more critical periods for L2 versus L1 acquisition, (iii) a ‘sensitive’ yet not ‘critical’ period, and (iv) a gradual and continual decline from childhood to adulthood.

It therefore needs to be recognized that there is a marked contrast between the CPH as an issue of continuing dispute in SLA, on the one hand, and, on the other, the popular view that it is an invariable ‘law’, equally applicable to any L2 acquisition context or situation. In fact, research indicates that age effects of all kinds depend largely on the actual opportunities for learning which are available within overall contexts of L2 acquisition and particular learning situations, notably the extent to which initial exposure is substantial and sustained ( Lightbown 2000 ).

Thus, most classroom-based studies have shown not only a lack of direct correlation between an earlier start and more successful/rapid L2 development but also a strong tendency for older children and teenagers to be more efficient learners. For example, in research conducted in the context of conventional school programmes, Cenoz (2003) and Muñoz (2006) have shown that learners whose exposure to the L2 began at age 11 consistently displayed higher levels of proficiency than those for whom it began at 4 or 8. Furthermore, comparable limitations have been reported for young learners in school settings involving innovative, immersion-type programmes, where exposure to the target language is significantly increased through subject-matter teaching in the L2 ( Genesee 1992 ; Abello-Contesse 2006 ). In sum, as Harley and Wang (1997) have argued, more mature learners are usually capable of making faster initial progress in acquiring the grammatical and lexical components of an L2 due to their higher level of cognitive development and greater analytical abilities.

In terms of language pedagogy, it can therefore be concluded that (i) there is no single ‘magic’ age for L2 learning, (ii) both older and younger learners are able to achieve advanced levels of proficiency in an L2, and (iii) the general and specific characteristics of the learning environment are also likely to be variables of equal or greater importance.


  • Online ISSN 1477-4526
  • Print ISSN 0951-0893
  • Copyright © 2024 Oxford University Press

The Critical Period Hypothesis in Second Language Acquisition: A Review of the Literature

  • iJARS International Journal Of Humanities and Social Studies 8(4):20

Samia Azieb at Najran University

  • DOI: 10.1017/S1366728918001025
  • Corpus ID: 149642164

Critical periods for language acquisition: New insights with particular reference to bilingualism research

  • J. Abutalebi , H. Clahsen
  • Published in Bilingualism: Language and… 23 October 2018
  • Linguistics
  • Bilingualism: Language and Cognition


The Critical Period Hypothesis for Second Language Acquisition: Tailoring the Coat of Many Colors

  • First Online: 01 January 2013


  • David Birdsong

Part of the book series: Second Language Learning and Teaching ((SLLT))


The present contribution represents an extension of David Singleton’s ( 2005 ) IRAL chapter, “The Critical Period Hypothesis: A coat of many colours”. I suggest that the CPH in its application to L2 acquisition could benefit from methodological and theoretical tailoring with respect to: the shape of the function that relates age of acquisition to proficiency, the use of nativelikeness for falsification of the CPH, and the framing of predictors of L2 attainment.


Granena and Long (2013) applied multiple linear regression analyses to the relationship of Chinese natives’ AoA to their attainment in L2 Spanish morphosyntax, phonology, and lexis and collocation. For each of these three linguistic domains, including breakpoints in the model revealed a small (5%) but statistically significant increase in variance accounted for, as compared to the variance accounted for in a model with no breakpoints. According to the authors, the fact that the improvement was so small “could mean that the less complex (i.e. more parsimonious) model with no breakpoints is already a good enough fit to the data or, alternatively, that a larger sample size is needed to compensate for the loss of degrees of freedom and to minimize the risk of overfitting” (2013: 326–327).
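A comparison of this kind, a regression with and without a breakpoint, can be sketched in Python. This is only an illustration under stated assumptions: the data are synthetic and are not Granena and Long's, and the breakpoint age of 15, the sample size, the coefficients, and the helper name r_squared are all hypothetical choices. The sketch fits an ordinary least-squares line with and without a hinge term and reports the gain in variance accounted for.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of variance in y explained by an OLS fit on design matrix X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

# Synthetic data for illustration only (NOT the data from the study).
rng = np.random.default_rng(0)            # fixed seed for reproducibility
aoa = rng.uniform(3, 40, size=200)        # hypothetical ages of acquisition
breakpoint_age = 15.0                     # hypothetical breakpoint
hinge = np.maximum(aoa - breakpoint_age, 0.0)  # zero before the breakpoint
score = 90 - 0.5 * aoa - 0.8 * hinge + rng.normal(0, 5, size=200)

ones = np.ones_like(aoa)
X_linear = np.column_stack([ones, aoa])          # model with no breakpoint
X_break = np.column_stack([ones, aoa, hinge])    # slope may change at the breakpoint

r2_linear = r_squared(X_linear, score)
r2_break = r_squared(X_break, score)
delta_r2 = r2_break - r2_linear

print(f"R^2 without breakpoint: {r2_linear:.3f}")
print(f"R^2 with breakpoint:    {r2_break:.3f}")
print(f"Increase in R^2:        {delta_r2:.3f}")
```

Because the no-breakpoint model is nested in the breakpoint model, the increase in R² can never be negative. A small gain is therefore weak evidence on its own, which is consistent with the authors' caution about parsimony and overfitting.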

DeKeyser (2000: 515) erroneously reports that the correlation of years of schooling and GJ scores is r = 0.006 ns, for early arrivals, and r = 0.08 ns, for late arrivals. In fact, these reported coefficients reflect correlations of years of schooling with aptitude; see discussion to follow.

Abrahamsson, N. and K. Hyltenstam. 2009. Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning 59: 249–306.

Ayer, A. J. 1959. History of the Logical Positivist movement. In Logical Positivism, ed. A. J. Ayer, 3–28. New York: Free Press.

Birdsong, D. 2005. Interpreting age effects in second language acquisition. In Handbook of bilingualism, eds. J. Kroll and A. DeGroot, 109–127. Oxford: Oxford University Press.

Birdsong, D. and M. Molis. 2001. On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language 44: 235–249.

Carroll, J. B. and S. M. Sapon. 1959. Modern Language Aptitude Test: Manual. New York: Psychological Corporation.

Cook, V. 2003. Effects of the second language on the first. Clevedon, UK: Multilingual Matters.

DeKeyser, R. M. 2000. The robustness of critical period effects in second language acquisition. Studies in Second Language Acquisition 22: 499–533.

DeKeyser, R., I. Alfi-Shabtay and D. Ravid. 2010. Cross-linguistic evidence for the nature of age effects in second language acquisition. Applied Psycholinguistics 31: 413–438.

Fowler, C. A., V. Sramko, D. J. Ostry, S. A. Rowland and P. Hallé. 2008. Cross language phonetic influences on the speech of French-English bilinguals. Journal of Phonetics 36: 649–663.

Granena, G. and M. H. Long. 2013. Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains. Second Language Research 29: 311–343.

Hakuta, K., E. Bialystok and E. Wiley. 2003. Critical evidence: A test of the Critical-Period Hypothesis for second-language acquisition. Psychological Science 14: 31–38.

Hyltenstam, K. and N. Abrahamsson. 2003. Maturational constraints in SLA. In The handbook of second language acquisition, eds. M. H. Long and C. J. Doughty, 539–588. Malden, MA: Blackwell.

Johnson, J. S. and E. L. Newport. 1989. Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology 21: 60–99.

Lenneberg, E. H. 1967. Biological foundations of language. New York: Wiley.

Long, M. H. 1990. Maturational constraints on language development. Studies in Second Language Acquisition 12: 251–285.

Ortega, L. 2009. Understanding second language acquisition. London: Hodder Education.

Penfield, W. and L. Roberts. 1959. Speech and brain mechanisms. Princeton, NJ: Princeton University Press.

Popper, K. 1959. The logic of scientific discovery. New York: Basic Books.

Singleton, D. 2005. The Critical Period Hypothesis: A coat of many colours. International Review of Applied Linguistics in Language Teaching 43: 269–285.

Stevens, G. 2004. Using census data to test the critical-period hypothesis for second-language acquisition. Psychological Science 15: 215–216.

Vanhove, J. 2013. The critical period hypothesis in second language acquisition: A statistical critique and a reanalysis. PLoS ONE 8(7): e69172. doi: 10.1371/journal.pone.0069172

Author information

Authors and affiliations.

University of Texas at Austin, Texas, USA

David Birdsong

Corresponding author

Correspondence to David Birdsong.

Editor information

Editors and affiliations.

Faculty of Pedagogy and Fine Arts Dept. of English Studies, Adam Mickiewicz University, Kalisz, Wielkopolskie, Poland

Mirosław Pawlak

Graduate Studies Faculty, Oranim Academic College of Education, Tivon, Israel

Larissa Aronin

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Birdsong, D. (2014). The Critical Period Hypothesis for Second Language Acquisition: Tailoring the Coat of Many Colors. In: Pawlak, M., Aronin, L. (eds) Essential Topics in Applied Linguistics and Multilingualism. Second Language Learning and Teaching. Springer, Cham. https://doi.org/10.1007/978-3-319-01414-2_3

DOI: https://doi.org/10.1007/978-3-319-01414-2_3

Published: 19 September 2013

Publisher Name: Springer, Cham

Print ISBN: 978-3-319-01413-5

Online ISBN: 978-3-319-01414-2

eBook Packages: Humanities, Social Sciences and Law Education (R0)

Critical Period In Brain Development and Childhood Learning

Charlotte Nickerson

Research Assistant at Harvard University

Undergraduate at Harvard University

Charlotte Nickerson is a student at Harvard University obsessed with the intersection of mental health, productivity, and design.

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Key Takeaways

  • Critical period is an ethological term that refers to a fixed and crucial time during the early development of an organism when it can learn things that are essential to survival. These influences impact the development of processes such as hearing and vision, social bonding, and language learning.
  • The term is most often encountered in the study of imprinting, where it is thought that young birds can only develop an attachment to the mother during a fixed time soon after hatching.
  • Neurologically, critical periods are marked by high levels of plasticity in the brain before neural connections become more solidified and stable. In particular, critical periods tend to end when the inhibitory synapses that use the neurotransmitter GABA mature.
  • In contrast to critical periods, sensitive periods, otherwise known as “weak critical periods,” happen when an organism is more sensitive than usual to outside factors influencing behavior, but this influence is not necessarily restricted to the sensitive period.
  • Scholars have debated the extent to which older organisms can develop certain skills, such as natively-accented foreign languages, after the critical period.

The critical period is a biologically determined stage of development where an organism is optimally ready to acquire some pattern of behavior that is part of typical development. This period, by definition, will not recur at a later stage.

If an organism does not receive exposure to the appropriate stimulus needed to learn a skill during a critical period, it may be difficult or even impossible for that organism to develop certain functions associated with that skill later in life.

This happens because a range of functional and structural elements prevent passive experiences from eliciting significant changes in the brain (Cisneros-Franco et al., 2020).

The first strong proponent of the theory of critical periods was Charles Stockard (1921), a biologist who experimented with the effects of various chemicals on the development of fish embryos, though he gave credit to Dareste for originating the idea 30 years earlier (Scott, 1962).

Stockard’s experiments showed that applying almost any chemical to fish embryos at a certain stage of development would result in one-eyed fish.

These experiments established that the most rapidly growing tissues in an embryo are the most sensitive to any change in conditions, leading to effects later in development (Scott, 1962).

Meanwhile, psychologist Sigmund Freud attempted to explain the origins of neurosis in human patients as the result of early experiences, implying that infants are particularly sensitive to influences at certain points in their lives.

Lorenz (1935) later emphasized the importance of critical periods in the formation of primary social bonds (otherwise known as imprinting) in birds, remarking that this psychological imprinting was similar to critical periods in the development of the embryo.

Soon thereafter, McGraw (1946) pointed out the existence of critical periods for the optimal learning of motor skills in human infants (Scott, 1962).

Example: Infant-Parent Attachment

The concept of critical or sensitive periods can also be found in the domain of social development, for example, in the formation of the infant-parent attachment relationship (Salkind, 2005).

Attachment describes the strong emotional ties between the infant and caregiver, a reciprocal relationship developing over the first year of the child’s life and particularly during the second six months of the first year.

During this attachment period, the infant’s social behavior becomes increasingly focused on the principal caregivers (Salkind, 2005).

The 20th-century English psychiatrist John Bowlby formulated and presented a comprehensive theory of attachment influenced by evolutionary theory.

Bowlby argued that the infant-parent attachment relationship develops because it is important to the survival of the infant and that the period from six to twenty-four months of age is a critical period of attachment.

This coincides with an infant’s increasing tendency to approach familiar caregivers and to be wary of unfamiliar adults. After this critical period, it is still possible for a first attachment relationship to develop, albeit with greater difficulty (Salkind, 2005).

This has brought into question, in a similar vein to language development, whether there is actually a critical development period for infant-caregiver attachment.

Sources debating this issue typically include cases of infants who did not experience consistent caregiving due to being raised in institutions prior to adoption (Salkind, 2005).

Early research into the critical period of attachment, published in the 1940s, consistently reported that children raised in orphanages subsequently showed unusual and maladaptive patterns of social behavior, difficulty in forming close relationships, and indiscriminate friendliness toward unfamiliar adults (Salkind, 2005).

Later, research from the 1990s indicated that adoptees were actually still able to form attachment relationships after the first year of life and also made developmental progress following adoption.

Nonetheless, these children had an overall increased risk of insecure or maladaptive attachment relationships with their adoptive parents. This evidence supports the notion of a sensitive period, but not a critical period, in the development of first attachment relationships (Salkind, 2005).

Mechanisms for Critical Periods

Both genetics and sensory experiences from outside the body shape the brain as it develops (Knudsen, 2004). However, the developmental stage that an organism is in significantly impacts how much the brain can change based on these experiences.

In scientific terms, the brain’s plasticity changes over the course of a lifespan. The brain is very plastic in the early stages of life before many key connections take root, but less so later.

This is why researchers have shown that early experience is crucial for the development of, say, language and musical abilities, and these skills are more challenging to take up in adulthood (Skoe and Kraus, 2013; White et al., 2013; Hartshorne et al., 2018).

As brains mature, the connections in them become more fixed. The brain’s transition from a more plastic to a more fixed state is advantageous because it allows the brain to retain new and complex processes, such as perceptual, motor, and cognitive functions (Piaget, 1962).

Children’s gestures, for example, precede and predict how they will acquire oral language skills (Colonnesi et al., 2010), which in turn are important for developing executive functions (Marcovitch and Zelazo, 2009).

However, this formation of stable connections in the brain can limit how the brain’s neural circuitry can be revised in the future. For example, if a young organism has abnormal sensory experiences during the critical period – such as auditory or visual deprivation – the brain may not wire itself in a way that processes future sensory inputs properly (Gallagher et al., 2020).

One illustration of this is the timing of cochlear implants – a prosthesis that restores hearing in some deaf people. Children who receive cochlear implants before two years of age are more likely to benefit from them than those who are implanted later in life (Kral and Eggermont, 2007; Gallagher et al., 2020).

The visual deprivation caused by cataracts in infants can have similar consequences. When cataracts are removed during early infancy, individuals can develop relatively normal vision; however, when the cataracts are not removed until adulthood, vision remains substantially poorer (Martins Rosa et al., 2013).

After the critical period closes, abnormal sensory experiences have a less drastic effect on the brain and lead to – barring direct damage to the central nervous system – reversible changes (Gallagher et al., 2020). Much of what scientists know about critical periods derives from animal studies, as these allow researchers greater control over the variables that they are testing.

This research has found that different sensory systems, such as vision, auditory processing, and spatial hearing, have different critical periods (Gallagher et al., 2020).

The brain regulates when critical periods open and close by regulating how much the brain’s synapses take up neurotransmitters, which are chemical substances that affect the transmission of electrical signals between neurons.

In particular, over time, synapses decrease their uptake of gamma-aminobutyric acid, better known as GABA. At the beginning of the critical period, outside sources become more effective at influencing changes and growth in the brain.

Meanwhile, as the inhibitory circuits of the brain mature, the mature brain becomes less sensitive to sensory experiences (Gallagher et al., 2020).

Critical Periods vs Sensitive Periods

Critical periods are similar to sensitive periods, and scholars have, at times, used them interchangeably. However, they describe distinct but overlapping developmental processes.

A sensitive period is a developmental stage where sensory experiences have a greater impact on behavioral and brain development than usual; however, this influence is not exclusive to this time period (Knudsen, 2004; Gallagher et al., 2020). These sensitive periods are important for skills such as learning a language or instrument.

In contrast, a critical period is a special type of sensitive period: a well-defined window during which sensory experience is necessary to shape the neural circuits involved in basic sensory processing (Gallagher et al., 2020).

Researchers also refer to sensitive periods as weak critical periods. Some examples of strong critical periods include the development of vision and hearing, while weak critical periods include phoneme tuning (how children learn to organize the sounds of a language), grammar processing, vocabulary acquisition, musical training, and sports training (Gallagher et al., 2020).

Critical Period Hypothesis

One of the most notable applications of the concept of a critical period is in linguistics. Scholars usually trace the origins of the debate around age in language acquisition to Penfield and Roberts’ book Speech and Brain Mechanisms (first published in 1959; reprinted 2014).

In the 1950s and 1960s, Penfield was a staunch advocate of early immersion education (Kroll and De Groot, 2009). Nonetheless, it was Lenneberg, in his book Biological Foundations of Language (1967), who coined the term critical period in describing this period for language.

Lenneberg (1967) described a critical period as a period of “automatic acquisition from mere exposure” that “seems to disappear after this age.” Scovel (1969) later summarized and narrowed Penfield’s and Lenneberg’s view on the critical period hypothesis into three main claims:

  • Adult native speakers can identify non-natives by their accents immediately and accurately.
  • The loss of brain plasticity at about the age of puberty accounts for the emergence of foreign accents.
  • The critical period hypothesis only holds for speech (whether or not someone has a native accent) and does not affect other areas of linguistic competence.

Linguists have since investigated whether scientific evidence actually supports the critical period hypothesis, whether there is a critical period for acquiring accentless speech and for “morphosyntactic” competence, and, if so, how age-related differences can be explained on the neurological level (Scovel, 2000).

The critical period hypothesis applies to both first and second-language learning. Until recently, research around the critical period’s role in first language acquisition revolved around findings about so-called “feral” children who had failed to acquire language at an older age after having been deprived of normal input during the critical period.

However, these case studies did not account for the extent to which social deprivation, and possibly food deprivation or sensory deprivation, may have been confounded with language input deprivation (Kroll and De Groot, 2009).

More recently, researchers have focused more systematically on deaf children born to hearing parents who are therefore deprived of language input until at least elementary school.

These studies have found the effects of lack of language input without extreme social deprivation: the older the age of exposure to sign language is, the worse its ultimate attainment (Emmorey, Bellugi, Friederici, and Horn, 1995; Kroll and De Groot, 2009).

However, Kroll and De Groot argue that the critical period hypothesis does not apply to the rate of acquisition of language. Adults and adolescents can learn languages at the same rate or even faster than children in their initial stage of acquisition (Slavoff and Johnson, 1995).

However, adults tend to have a more limited ultimate attainment of language ability (Kroll and De Groot, 2009).

There has been a long lineage of empirical findings around the age of acquisition. The most fundamental of these come from a series of studies since the late 1970s documenting a negative correlation between age of acquisition and ultimate language mastery (Kroll and De Groot, 2009).

Nonetheless, different periods correspond to sensitivity to different aspects of language. For example, shortly after birth, infants can perceive and discriminate speech sounds from any language, including ones they have not been exposed to (Eimas et al., 1971; Gallagher et al., 2020).

Around six months of age, exposure to the primary language in the infant’s environment guides phonetic representations of language and, subsequently, the neural representations of speech sounds of the native language while weakening those of unused sounds (McClelland et al., 1999; Gallagher et al., 2020).

Vocabulary learning experiences rapid growth at about 18 months of age (Kuhl, 2010).

Critical Evaluation

More than any other area of applied linguistics, the critical period hypothesis has impacted how teachers teach languages. Consequently, researchers have critiqued how important the critical period is to language learning.

For example, several studies in early language acquisition research showed that children were not necessarily superior to older learners in acquiring a second language, even in the area of pronunciation (Olson and Samuels, 1973; Snow and Hoefnagel-Hohle, 1978; Scovel, 2000).

In fact, the majority of researchers at the time appeared to be skeptical about the existence of a critical period, with some explicitly denying its existence.

Counter to one of the primary tenets of Scovel’s (1969) critical period hypothesis, there have been several cases of people who have acquired a second language in adulthood speaking with native accents.

For example, Moyer’s (1999) study of highly proficient English-speaking learners of German found that at least one of the participants was judged to have native-like pronunciation in his second language, and several participants in Bongaerts’s (1999) study of highly proficient Dutch speakers of French spoke with accents judged to be native (Scovel, 2000).

References

Bongaerts, T. (1999). Ultimate attainment in L2 pronunciation: The case of very advanced late L2 learners. Second language acquisition and the critical period hypothesis, 133-159.

Cisneros-Franco, J. M., Voss, P., Thomas, M. E., & de Villers-Sidani, E. (2020). Critical periods of brain development. In Handbook of Clinical Neurology (Vol. 173, pp. 75-88). Elsevier.

Colonnesi, C., Stams, G. J. J., Koster, I., & Noom, M. J. (2010). The relation between pointing and language development: A meta-analysis. Developmental Review, 30(4), 352-366.

Eimas, P. D., Siqueland, E. R., Jusczyk, P., & Vigorito, J. (1971). Speech perception in infants. Science, 171(3968), 303-306.

Emmorey, K., Bellugi, U., Friederici, A., & Horn, P. (1995). Effects of age of acquisition on grammatical sensitivity: Evidence from on-line and off-line tasks. Applied Psycholinguistics, 16(1), 1-23.

Knudsen, E. I. (2004). Sensitive periods in the development of the brain and behavior. Journal of Cognitive Neuroscience, 16(8), 1412-1425.

Hartshorne, J. K., Tenenbaum, J. B., & Pinker, S. (2018). A critical period for second language acquisition: Evidence from 2/3 million English speakers. Cognition, 177, 263-277.

Kral, A., & Eggermont, J. J. (2007). What’s to lose and what’s to learn: Development under auditory deprivation, cochlear implants and limits of cortical plasticity. Brain Research Reviews, 56(1), 259-269.

Kroll, J. F., & De Groot, A. M. (Eds.). (2009). Handbook of bilingualism: Psycholinguistic approaches. Oxford University Press.

Kuhl, P. K. (2010). Brain mechanisms in early language acquisition. Neuron, 67(5), 713-727.

Lenneberg, E. H. (1967). The biological foundations of language. Hospital Practice, 2(12), 59-67.

Lorenz, K. (1935). Der Kumpan in der Umwelt des Vogels. Journal für Ornithologie, 83(2), 137-213.

Marcovitch, S., & Zelazo, P. D. (2009). A hierarchical competing systems model of the emergence and early development of executive function. Developmental Science, 12(1), 1-18.

McClelland, J. L., Thomas, A. G., McCandliss, B. D., & Fiez, J. A. (1999). Understanding failures of learning: Hebbian learning, competition for representational space, and some preliminary experimental data. Progress in Brain Research, 121, 75-80.

McGraw, M. B. (1946). Maturation of behavior. In Manual of child psychology (pp. 332-369). John Wiley & Sons Inc.

Moyer, A. (1999). Ultimate attainment in L2 phonology: The critical factors of age, motivation, and instruction. Studies in Second Language Acquisition, 21(1), 81-108.

Gallagher, A., Bulteau, C., Cohen, D., & Michaud, J. L. (2019). Neurocognitive Development: Normative Development. Elsevier.

Olson, L. L., & Jay Samuels, S. (1973). The relationship between age and accuracy of foreign language pronunciation. The Journal of Educational Research, 66(6), 263-268.

Penfield, W., & Roberts, L. (2014). Speech and brain mechanisms. Princeton University Press.

Piaget, J. (1962). The stages of the intellectual development of the child. Bulletin of the Menninger Clinic, 26(3), 120.

Rosa, A. M., Silva, M. F., Ferreira, S., Murta, J., & Castelo-Branco, M. (2013). Plasticity in the human visual cortex: An ophthalmology-based perspective. BioMed Research International, 2013.

Salkind, N. J. (Ed.). (2005). Encyclopedia of human development. Sage Publications.

Scott, J. P. (1962). Critical periods in behavioral development. Science, 138(3544), 949-958.

Scovel, T. (1969). Foreign accents, language acquisition, and cerebral dominance. Language Learning, 19(3‐4), 245-253.

Scovel, T. (2000). A critical review of the critical period research. Annual Review of Applied Linguistics, 20, 213-223.

Skoe, E., & Kraus, N. (2013). Musical training heightens auditory brainstem function during sensitive periods in development. Frontiers in Psychology, 4, 622.

Slavoff, G. R., & Johnson, J. S. (1995). The effects of age on the rate of learning a second language. Studies in Second Language Acquisition, 17(1), 1-16.

Snow, C. E., & Hoefnagel-Höhle, M. (1978). The critical period for language acquisition: Evidence from second language learning. Child Development, 1114-1128.

Stockard, C. R. (1921). Developmental rate and structural expression: An experimental study of twins, ‘double monsters’ and single deformities, and the interaction among embryonic organs during their origin and development. American Journal of Anatomy, 28(2), 115-277.

White, E. J., Hutka, S. A., Williams, L. J., & Moreno, S. (2013). Learning, neural plasticity and sensitive periods: Implications for language acquisition, music training and transfer across the lifespan. Frontiers in Systems Neuroscience, 7, 90.


Open Access

Peer-reviewed

Research Article

The Critical Period Hypothesis in Second Language Acquisition: A Statistical Critique and a Reanalysis

Affiliation Department of Multilingualism, University of Fribourg, Fribourg, Switzerland

  • Jan Vanhove

PLOS

  • Published: July 25, 2013
  • https://doi.org/10.1371/journal.pone.0069172

17 Jul 2014: The PLOS ONE Staff (2014) Correction: The Critical Period Hypothesis in Second Language Acquisition: A Statistical Critique and a Reanalysis. PLOS ONE 9(7): e102922. https://doi.org/10.1371/journal.pone.0102922

In second language acquisition research, the critical period hypothesis (CPH) holds that the function between learners' age and their susceptibility to second language input is non-linear. This paper revisits the indistinctness found in the literature with regard to this hypothesis's scope and predictions. Even when its scope is clearly delineated and its predictions are spelt out, however, empirical studies, with few exceptions, use analytical (statistical) tools that are irrelevant with respect to the predictions made. This paper discusses statistical fallacies common in CPH research and illustrates an alternative analytical method (piecewise regression) by means of a reanalysis of two datasets from a 2010 paper purporting to have found cross-linguistic evidence in favour of the CPH. This reanalysis reveals that the specific age patterns predicted by the CPH are not cross-linguistically robust. Applying the principle of parsimony, it is concluded that age patterns in second language acquisition are not governed by a critical period. To conclude, this paper highlights the role of confirmation bias in the scientific enterprise and appeals to second language acquisition researchers to reanalyse their old datasets using the methods discussed in this paper. The data and R commands that were used for the reanalysis are provided as supplementary materials.

Citation: Vanhove J (2013) The Critical Period Hypothesis in Second Language Acquisition: A Statistical Critique and a Reanalysis. PLoS ONE 8(7): e69172. https://doi.org/10.1371/journal.pone.0069172

Editor: Stephanie Ann White, UCLA, United States of America

Received: May 7, 2013; Accepted: June 7, 2013; Published: July 25, 2013

Copyright: © 2013 Jan Vanhove. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: No current external funding sources for this study.

Competing interests: The author has declared that no competing interests exist.

Introduction

In the long term and in immersion contexts, second-language (L2) learners starting acquisition early in life – and staying exposed to input and thus learning over several years or decades – undisputedly tend to outperform later learners. Apart from being misinterpreted as an argument in favour of early foreign language instruction, which takes place in wholly different circumstances, this general age effect is also sometimes taken as evidence for a so-called ‘critical period’ (CP) for second-language acquisition (SLA). Derived from biology, the CP concept was famously introduced into the field of language acquisition by Penfield and Roberts in 1959 [1] and was refined by Lenneberg eight years later [2]. Lenneberg argued that language acquisition needed to take place between age two and puberty – a period which he believed to coincide with the lateralisation process of the brain. (More recent neurological research suggests that different time frames exist for the lateralisation process of different language functions. Most, however, close before puberty [3].) However, Lenneberg mostly drew on findings pertaining to first language development in deaf children, feral children or children with serious cognitive impairments in order to back up his claims. For him, the critical period concept was concerned with the implicit “automatic acquisition” [2, p. 176] in immersion contexts and does not preclude the possibility of learning a foreign language after puberty, albeit with much conscious effort and typically less success.

SLA research adopted the critical period hypothesis (CPH) and applied it to second and foreign language learning, resulting in a host of studies. In its most general version, the CPH for SLA states that the ‘susceptibility’ or ‘sensitivity’ to language input varies as a function of age, with adult L2 learners being less susceptible to input than child L2 learners. Importantly, the age–susceptibility function is hypothesised to be non-linear. Moving beyond this general version, we find that the CPH is conceptualised in a multitude of ways [4]. This state of affairs requires scholars to make explicit their theoretical stance and assumptions [5], but has the obvious downside that critical findings risk being mitigated as posing a problem to only one aspect of one particular conceptualisation of the CPH, whereas other conceptualisations remain unscathed. This overall vagueness concerns two areas in particular, viz. the delineation of the CPH's scope and the formulation of testable predictions. Delineating the scope and formulating falsifiable predictions are, needless to say, fundamental stages in the scientific evaluation of any hypothesis or theory, but the lack of scholarly consensus on these points seems to be particularly pronounced in the case of the CPH. This article therefore first presents a brief overview of differing views on these two stages. Then, once the scope of their CPH version has been duly identified and empirical data have been collected using solid methods, it is essential that researchers analyse the data patterns soundly in order to assess the predictions made and that they draw justifiable conclusions from the results. As I will argue in great detail, however, the statistical analysis of data patterns as well as their interpretation in CPH research – and this includes both critical and supportive studies and overviews – leaves a great deal to be desired. Reanalysing data from a recent CPH-supportive study, I illustrate some common statistical fallacies in CPH research and demonstrate how one particular CPH prediction can be evaluated.

Delineating the scope of the critical period hypothesis

First, the age span for a putative critical period for language acquisition has been delimited in different ways in the literature [4]. Lenneberg's critical period stretched from two years of age to puberty (which he posits at about 14 years of age) [2], whereas other scholars have drawn the cutoff point at 12, 15, 16 or 18 years of age [6]. Unlike Lenneberg, most researchers today do not define a starting age for the critical period for language learning. Some, however, consider the possibility of the critical period (or a critical period for a specific language area, e.g. phonology) ending much earlier than puberty (e.g. age 9 years [1], or as early as 12 months in the case of phonology [7]).

Second, some vagueness remains as to the setting that is relevant to the CPH. Does the critical period constrain implicit learning processes only, i.e. only the untutored language acquisition in immersion contexts, or does it also apply to (at least partly) instructed learning? Most researchers agree on the former [8], but much research has included subjects who have had at least some instruction in the L2.

Third, there is no consensus on the scope of the CP as far as the areas of language are concerned. Most researchers agree that a CP is most likely to constrain the acquisition of pronunciation and grammar and, consequently, these are the areas primarily looked into in studies on the CPH [9]. Some researchers have also tried to define distinguishable CPs for the different language areas of phonetics, morphology and syntax and even for lexis (see [10] for an overview).

Fourth and last, research into the CPH has focused on ‘ultimate attainment’ (UA) or the ‘final’ state of L2 proficiency rather than on the rate of learning. From research into the rate of acquisition (e.g. [11]–[13]), it has become clear that the CPH cannot hold for the rate variable. In fact, it has been observed that adult learners proceed faster than child learners at the beginning stages of L2 acquisition. Though theoretical reasons for excluding the rate can be posited (the initial faster rate of learning in adults may be the result of more conscious cognitive strategies rather than of less conscious implicit learning, for instance), rate of learning might from a different perspective also be considered an indicator of ‘susceptibility’ or ‘sensitivity’ to language input. Nevertheless, contemporary SLA scholars generally seem to concur that UA and not rate of learning is the dependent variable of primary interest in CPH research. These and further scope delineation problems relevant to CPH research are discussed in more detail by, among others, Birdsong [9], DeKeyser and Larson-Hall [14], Long [10] and Muñoz and Singleton [6].

Formulating testable hypotheses

Once the relevant CPH's scope has satisfactorily been identified, clear and testable predictions need to be drawn from it. At this stage, the lack of consensus on what the consequences or the actual observable outcome of a CP would have to look like becomes evident. As touched upon earlier, CPH research is interested in the end state or ‘ultimate attainment’ (UA) in L2 acquisition because this “determines the upper limits of L2 attainment” [9, p. 10]. The range of possible ultimate attainment states thus helps researchers to explore the potential maximum outcome of L2 proficiency before and after the putative critical period.

One strong prediction made by some CPH exponents holds that post-CP learners cannot reach native-like L2 competences. Identifying a single native-like post-CP L2 learner would then suffice to falsify all CPHs making this prediction. Assessing this prediction is difficult, however, since it is not clear what exactly constitutes sufficient nativelikeness, as illustrated by the discussion on the actual nativelikeness of highly accomplished L2 speakers [15], [16]. Indeed, there exists a real danger that, in a quest to vindicate the CPH, scholars set the bar for L2 learners to match monolinguals ever higher – up to Swiftian extremes. Furthermore, the usefulness of comparing the linguistic performance in mono- and bilinguals has been called into question [6], [17], [18]. Put simply, the linguistic repertoires of mono- and bilinguals differ by definition and differences in the behavioural outcome will necessarily be found, if only one digs deep enough.

A second strong prediction made by CPH proponents is that the function linking age of acquisition and ultimate attainment will not be linear throughout the whole lifespan. Before discussing what this function would have to look like in order for it to constitute CPH-consistent evidence, I point out that the ultimate attainment variable can essentially be considered a cumulative measure dependent on the actual variable of interest in CPH research, i.e. susceptibility to language input, as well as on such other factors as duration and intensity of learning (within and outside a putative CP) and possibly a number of other influencing factors. To elaborate, the behavioural outcome, i.e. ultimate attainment, can be assumed to be integrative to the susceptibility function, as Newport [19] correctly points out. Other things being equal, ultimate attainment will therefore decrease as susceptibility decreases. However, decreasing ultimate attainment levels in and by themselves represent no compelling evidence in favour of a CPH. The form of the integrative curve must therefore be predicted clearly from the susceptibility function. Additionally, the age of acquisition–ultimate attainment function can take just about any form when other things are not equal, e.g. duration of learning (Does learning last up until time of testing or only for a more or less constant number of years or is it dependent on age itself?) or intensity of learning (Do learners always learn at their maximum susceptibility level or does this intensity vary as a function of age, duration, present attainment and motivation?). The integral of the susceptibility function could therefore be of virtually unlimited complexity and its parameters could be adjusted to fit any age of acquisition–ultimate attainment pattern. It seems therefore astonishing that the distinction between level of sensitivity to language input and level of ultimate attainment is rarely made in the literature. Implicitly or explicitly [20], the two are more or less equated and the same mathematical functions are expected to describe the two variables if observed across a range of starting ages of acquisition.
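The difference between a susceptibility function and its integral can be made concrete with a small numerical sketch. Everything in it is an assumption made purely for illustration: the shape of the hypothetical `susceptibility` function (flat until age 10, declining until age 20, flat afterwards), the fixed ten-year learning duration, and the function names. It is not a model proposed in the literature discussed here.

```python
def susceptibility(age):
    """Hypothetical susceptibility to input: 1.0 until age 10,
    linear decline to 0.2 at age 20, constant afterwards."""
    if age <= 10:
        return 1.0
    if age >= 20:
        return 0.2
    return 1.0 - 0.08 * (age - 10)


def ultimate_attainment(aoa, years_of_learning=10, step=0.01):
    """Integrate susceptibility over a fixed learning duration that
    starts at the age of onset of acquisition (midpoint rule)."""
    n_steps = int(round(years_of_learning / step))
    total = sum(susceptibility(aoa + (i + 0.5) * step) for i in range(n_steps))
    return total * step


# Susceptibility is identical at ages 0 and 5, yet the cumulative outcome
# already differs, because learning that starts at 5 extends past age 10.
for aoa in (0, 5, 10, 15, 20, 30):
    print(aoa, susceptibility(aoa), round(ultimate_attainment(aoa), 2))
```

Under these assumptions the attainment curve starts to decline well before the susceptibility curve does, so the two cannot be described by the same function. Changing the assumed learning duration changes the shape of the attainment curve again.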

But even when the susceptibility and ultimate attainment variables are equated, there remains controversy as to what function linking age of onset of acquisition and ultimate attainment would actually constitute evidence for a critical period. Most scholars agree that not any kind of age effect constitutes such evidence. More specifically, the age of acquisition–ultimate attainment function would need to be different before and after the end of the CP [9]. According to Birdsong [9], three basic possible patterns proposed in the literature meet this condition. These patterns are presented in Figure 1. The first pattern describes a steep decline of the age of onset of acquisition (AOA)–ultimate attainment (UA) function up to the end of the CP and a practically non-existent age effect thereafter. Pattern 2 is an “unconventional, although often implicitly invoked” [9, p. 17] notion of the CP function which contains a period of peak attainment (or performance at ceiling), i.e. performance does not vary as a function of age, which is often referred to as a ‘window of opportunity’. This time span is followed by an unbounded decline in UA depending on AOA. Pattern 3 includes characteristics of patterns 1 and 2. At the beginning of the AOA range, performance is at ceiling. The next segment is a downward slope in the age function which ends when performance reaches its floor. Birdsong points out that all of these patterns have been reported in the literature. On closer inspection, however, he concludes that the most convincing function describing these age effects is a simple linear one. Hakuta et al. [21] sketch further theoretically possible predictions of the CPH in which the mean performance drops drastically and/or the slope of the AOA–UA proficiency function changes at a certain point.

The graphs are based on Figure 2 in [9].

https://doi.org/10.1371/journal.pone.0069172.g001

Although several patterns have been proposed in the literature, it bears pointing out that the most common explicit prediction corresponds to Birdsong's first pattern, as exemplified by the following crystal-clear statement by DeKeyser, one of the foremost CPH proponents:

[A] strong negative correlation between age of acquisition and ultimate attainment throughout the lifespan (or even from birth through middle age), the only age effect documented in many earlier studies, is not evidence for a critical period…[T]he critical period concept implies a break in the AoA–proficiency function, i.e., an age (somewhat variable from individual to individual, of course, and therefore an age range in the aggregate) after which the decline of success rate in one or more areas of language is much less pronounced and/or clearly due to different reasons. [22, p. 445].

DeKeyser and before him among others Johnson and Newport [23] thus conceptualise only one possible pattern which would speak in favour of a critical period: a clear negative age effect before the end of the critical period and a much weaker (if any) negative correlation between age and ultimate attainment after it. This ‘flattened slope’ prediction has the virtue of being much more tangible than the ‘potential nativelikeness’ prediction: Testing it does not necessarily require comparing the L2-learners to a native control group and thus effectively comparing apples and oranges. Rather, L2-learners with different AOAs can be compared amongst themselves without the need to categorise them by means of a native-speaker yardstick, the validity of which is inevitably going to be controversial [15]. In what follows, I will concern myself solely with the ‘flattened slope’ prediction, arguing that, despite its clarity of formulation, CPH research has generally used analytical methods that are irrelevant for the purposes of actually testing it.

Inferring non-linearities in critical period research: An overview

Group mean or proportion comparisons.

[T]he main differences can be found between the native group and all other groups – including the earliest learner group – and between the adolescence group and all other groups. However, neither the difference between the two childhood groups nor the one between the two adulthood groups reached significance, which indicates that the major changes in eventual perceived nativelikeness of L2 learners can be associated with adolescence. [15, p. 270].

Similar group comparisons aimed at investigating the effect of AOA on UA have been carried out by both CPH advocates and sceptics (among whom Bialystok and Miller [25, pp. 136–139], Birdsong and Molis [26, p. 240], Flege [27, pp. 120–121], Flege et al. [28, pp. 85–86], Johnson [29, p. 229], Johnson and Newport [23, p. 78], McDonald [30, pp. 408–410] and Patkowski [31, pp. 456–458]). To be clear, not all of these authors drew direct conclusions about the AOA–UA function on the basis of these group comparisons, but their group comparisons have been cited as indicative of a CPH-consistent non-continuous age effect, as exemplified by the following quote by DeKeyser [22]:

Where group comparisons are made, younger learners always do significantly better than the older learners. The behavioral evidence, then, suggests a non-continuous age effect with a “bend” in the AoA–proficiency function somewhere between ages 12 and 16. [22, p. 448].

The first problem with group comparisons like these and drawing inferences on the basis thereof is that they require that a continuous variable, AOA, be split up into discrete bins. More often than not, the boundaries between these bins are drawn in an arbitrary fashion, but what is more troublesome is the loss of information and statistical power that such discretisation entails (see [32] for the extreme case of dichotomisation). If we want to find out more about the relationship between AOA and UA, why throw away most of the AOA information and effectively reduce the UA data to group means and the variance in those groups?
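The information loss caused by binning can be illustrated with a deliberately artificial example. In the sketch below, the data are hypothetical and perfectly linear, and the split at AOA 10 is an arbitrary choice; `pearson_r` is a helper written for this illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: UA declines perfectly linearly with AOA.
aoa = list(range(1, 21))
ua = [100 - 2 * a for a in aoa]

# Dichotomise AOA into 'early' (0) and 'late' (1) arrivals at an arbitrary cut.
late_arrival = [1 if a > 10 else 0 for a in aoa]

r_continuous = pearson_r(aoa, ua)
r_dichotomised = pearson_r(late_arrival, ua)

print("variance explained, continuous AOA:  ", round(r_continuous ** 2, 3))
print("variance explained, dichotomised AOA:", round(r_dichotomised ** 2, 3))
```

Even though the underlying relationship is exactly linear, the two-group predictor accounts for markedly less of the variance in the outcome than the continuous predictor does, and it says nothing about the shape of the function within each group.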

Comparison of correlation coefficients.

Correlation-based inferences about slope discontinuities have similarly explicitly been made by CPH advocates and sceptics alike, e.g. Bialystok and Miller [25, pp. 136 and 140], DeKeyser and colleagues [22], [44] and Flege et al. [45, pp. 166 and 169]. Others did not explicitly infer the presence or absence of slope differences from the subset correlations they computed (among others Birdsong and Molis [26], DeKeyser [8], Flege et al. [28] and Johnson [29]), but their studies nevertheless featured in overviews discussing discontinuities [14], [22]. Indeed, the most recent overview draws a strong conclusion about the validity of the CPH's ‘flattened slope’ prediction on the basis of these subset correlations:

In those studies where the two groups are described separately, the correlation is much higher for the younger than for the older group, except in Birdsong and Molis (2001) [= [26], JV], where there was a ceiling effect for the younger group. This global picture from more than a dozen studies provides support for the non-continuity of the decline in the AoA–proficiency function, which all researchers agree is a hallmark of a critical period phenomenon. [22, p. 448].

In Johnson and Newport's specific case [23], their correlation-based inference that UA levels off after puberty happened to be largely correct: the GJT scores are more or less randomly distributed around a near-horizontal trend line [26]. Ultimately, however, it rests on the fallacy of confusing correlation coefficients with slopes, which seriously calls into question conclusions such as DeKeyser's (cf. the quote above).

https://doi.org/10.1371/journal.pone.0069172.g002

Lower correlation coefficients in older AOA groups may therefore be largely due to differences in UA variance, which have been reported in several studies [23], [26], [28], [29] (see [46] for additional references). Greater variability in UA with increasing age is likely due to factors other than age proper [47], such as the concomitant greater variability in exposure to literacy, degree of education, motivation and opportunity for language use, and by itself represents evidence neither in favour of nor against the CPH.
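That correlation coefficients and slopes answer different questions can be shown with a minimal sketch. The two groups below are hypothetical: both follow exactly the same regression slope, but the scatter around the line is larger in the ‘late’ group. The residual pattern is constructed to sum to zero and to be uncorrelated with AOA, so that it leaves the fitted slope untouched; `slope_and_r` is a helper defined for this illustration.

```python
from math import sqrt


def slope_and_r(xs, ys):
    """Return the least-squares slope and the Pearson correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Residual pattern that sums to zero and is orthogonal to equally spaced x.
pattern = [1, -1, -1, 1, 1, -1, -1, 1]

aoa_early = list(range(4, 12))
aoa_late = list(range(20, 28))

# Same true slope (-3) in both groups, but more scatter in the late group.
ua_early = [200 - 3 * a + 2 * e for a, e in zip(aoa_early, pattern)]
ua_late = [200 - 3 * a + 12 * e for a, e in zip(aoa_late, pattern)]

slope_early, r_early = slope_and_r(aoa_early, ua_early)
slope_late, r_late = slope_and_r(aoa_late, ua_late)

print("early arrivals: slope", round(slope_early, 2), "r", round(r_early, 2))
print("late arrivals:  slope", round(slope_late, 2), "r", round(r_late, 2))
```

The correlation is much weaker in the late group although the slope has not flattened at all; only the variability around the line has changed.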

Regression approaches.

Having demonstrated that neither group mean or proportion comparisons nor correlation coefficient comparisons can directly address the ‘flattened slope’ prediction, I now turn to the studies in which regression models were computed with AOA as a predictor variable and UA as the outcome variable. Once again, this category of studies is not mutually exclusive with the two categories discussed above.

In a large-scale study using self-reports and approximate AOAs derived from a sample of the 1990 U.S. Census, Stevens found that the probability with which immigrants from various countries stated that they spoke English ‘very well’ decreased curvilinearly as a function of AOA [48]. She noted that this development is similar to the pattern found by Johnson and Newport [23] but that it contains no indication of an “abruptly defined ‘critical’ or sensitive period in L2 learning” [48, p. 569]. However, she modelled the self-ratings using an ordinal logistic regression model in which the AOA variable was logarithmically transformed. Technically, this is perfectly fine, but one should be careful not to read too much into the non-linear curves found. In logistic models, the outcome variable itself is modelled linearly as a function of the predictor variables and is expressed in log-odds. In order to compute the corresponding probabilities, these log-odds are transformed using the logistic function. Consequently, even if the model is specified linearly, the predicted probabilities will not lie on a perfectly straight line when plotted as a function of any one continuous predictor variable. Similarly, when the predictor variable is first logarithmically transformed and then used to linearly predict an outcome variable, the function linking the predicted outcome variables and the untransformed predictor variable is necessarily non-linear. Thus, non-linearities follow naturally from Stevens's model specifications. Moreover, CPH-consistent discontinuities in the AOA–UA function cannot be found using her model specifications as they did not contain any parameters allowing for this.
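Both sources of built-in curvature can be checked numerically. The coefficients in the following sketch are invented for illustration and have nothing to do with Stevens's estimates; the point is only that equal steps in AOA produce equal steps on the linear predictor but unequal steps in predicted probability, and that a log-transformed predictor yields unequal steps in the predicted outcome.

```python
from math import exp, log


def logistic(log_odds):
    """Transform log-odds into a probability."""
    return 1.0 / (1.0 + exp(-log_odds))


def steps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


ages = [5, 15, 25, 35, 45]  # equally spaced AOA values

# (1) Model that is linear in the log-odds (hypothetical coefficients).
INTERCEPT, SLOPE = 2.0, -0.1
log_odds = [INTERCEPT + SLOPE * a for a in ages]
probabilities = [logistic(v) for v in log_odds]

# (2) Linear model on a log-transformed predictor (hypothetical coefficients).
predicted = [10.0 - 2.0 * log(a) for a in ages]

print("steps in log-odds:     ", [round(s, 3) for s in steps(log_odds)])
print("steps in probability:  ", [round(s, 3) for s in steps(probabilities)])
print("steps with log(AOA):   ", [round(s, 3) for s in steps(predicted)])
```

The curvature in the second and third rows is a consequence of the model specification alone and therefore cannot count as evidence for or against a change in the age effect.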

Using data similar to Stevens's, Bialystok and Hakuta found that the link between the self-rated English competences of Chinese- and Spanish-speaking immigrants and their AOA could be described by a straight line [49]. In contrast to Stevens, Bialystok and Hakuta used a regression-based method allowing for changes in the function's slope, viz. locally weighted scatterplot smoothing (LOWESS). Informally, LOWESS is a non-parametric method that relies on an algorithm that fits the dependent variable for small parts of the range of the independent variable whilst guaranteeing that the overall curve does not contain sudden jumps (for technical details, see [50]). Hakuta et al. used an even larger sample from the same 1990 U.S. Census data on Chinese- and Spanish-speaking immigrants (2.3 million observations) [21]. When LOWESS curves were fitted to these data, no discontinuities in the AOA–UA slope could be detected. Moreover, the authors found that piecewise linear regression models, i.e. regression models containing a parameter that allows a sudden drop in the curve or a change of its slope, did not provide a better fit to the data than did an ordinary regression model without such a parameter.

To sum up, I have argued at length that regression approaches are superior to group mean and correlation coefficient comparisons for the purposes of testing the ‘flattened slope’ prediction. Acknowledging the reservations vis-à-vis self-estimated UAs, we still find that while the relationship between AOA and UA is not necessarily perfectly linear in the studies discussed, the data do not lend unequivocal support to this prediction. In the following section, I will reanalyse data from a recent empirical paper on the CPH by DeKeyser et al. [44]. The first goal of this reanalysis is to further illustrate some of the statistical fallacies encountered in CPH studies. Second, by making the computer code available I hope to demonstrate how the relevant regression models, viz. piecewise regression models, can be fitted and how the AOA representing the optimal breakpoint can be identified. Lastly, the findings of this reanalysis will contribute to our understanding of how AOA affects UA as measured using a GJT.

Summary of DeKeyser et al. (2010)

I chose to reanalyse a recent empirical paper on the CPH by DeKeyser et al. [44] (henceforth DK et al.). This paper lends itself well to a reanalysis since it exhibits two highly commendable qualities: the authors spell out their hypotheses lucidly and provide detailed numerical and graphical data descriptions. Moreover, the paper's lead author is very clear on what constitutes a necessary condition for accepting the CPH: a non-linearity in the age of onset of acquisition (AOA)–ultimate attainment (UA) function, with UA declining less strongly as a function of AOA in older, post-CP arrivals compared to younger arrivals [14], [22]. Lastly, it claims to have found cross-linguistic evidence from two parallel studies backing the CPH and should therefore be a source beyond suspicion for CPH proponents.

The authors set out to test the following hypotheses:

  • Hypothesis 1: For both the L2 English and the L2 Hebrew group, the slope of the age of arrival–ultimate attainment function will not be linear throughout the lifespan, but will instead show a marked flattening between adolescence and adulthood.
  • Hypothesis 2: The relationship between aptitude and ultimate attainment will differ markedly for the young and older arrivals, with significance only for the latter. (DK et al., p. 417)

Both hypotheses were purportedly confirmed, which in the authors' view provides evidence in favour of the CPH. The problem with this conclusion, however, is that it is based on a comparison of correlation coefficients. As I have argued above, correlation coefficients are not to be confused with regression coefficients and cannot be used to directly address research hypotheses concerning slopes, such as Hypothesis 1. In what follows, I will reanalyse the relationship between DK et al.'s AOA and GJT data in order to address Hypothesis 1. Additionally, I will lay bare a problem with the way in which Hypothesis 2 was addressed. The extracted data and the computer code used for the reanalysis are provided as supplementary materials, allowing anyone interested to scrutinise and easily reproduce my whole analysis and carry out their own computations (see ‘supporting information’).

Data extraction

In order to verify whether we did in fact extract the data points to a satisfactory degree of accuracy, I computed summary statistics for the extracted AOA and GJT data and checked these against the descriptive statistics provided by DK et al. (pp. 421 and 427). These summary statistics for the extracted data are presented in Table 1. In addition, I computed the correlation coefficients for the AOA–GJT relationship for the whole AOA range and for AOA-defined subgroups and checked these coefficients against those reported by DK et al. (pp. 423 and 428). The correlation coefficients computed using the extracted data are presented in Table 2. Both checks strongly suggest the extracted data to be virtually identical to the original data, and Dr DeKeyser confirmed this to be the case in response to an earlier draft of the present paper (personal communication, 6 May 2013).

https://doi.org/10.1371/journal.pone.0069172.t001

https://doi.org/10.1371/journal.pone.0069172.t002

Results and Discussion

Modelling the link between age of onset of acquisition and ultimate attainment.

I first replotted the AOA and GJT data we extracted from DK et al.'s scatterplots and added non-parametric scatterplot smoothers in order to investigate whether any changes in slope in the AOA–GJT function could be revealed, as per Hypothesis 1. Figures 3 and 4 show this not to be the case. Indeed, simple linear regression models that model GJT as a function of AOA provide decent fits for both the North America and the Israel data, explaining 65% and 63% of the variance in GJT scores, respectively. The parameters of these models are given in Table 3.

The trend line is a non-parametric scatterplot smoother. The scatterplot itself is a near-perfect replication of DK et al.'s Fig. 1.

https://doi.org/10.1371/journal.pone.0069172.g003

The trend line is a non-parametric scatterplot smoother. The scatterplot itself is a near-perfect replication of DK et al.'s Fig. 5.

https://doi.org/10.1371/journal.pone.0069172.g004

https://doi.org/10.1371/journal.pone.0069172.t003

To ensure that both segments are joined at the breakpoint, the predictor variable is first centred at the breakpoint value, i.e. the breakpoint value is subtracted from the original predictor variable values. For a blow-by-blow account of how such models can be fitted in R, I refer to an example analysis by Baayen [55, pp. 214–222].
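For readers who do not use R, the centring step and a simple breakpoint search can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the supplementary materials: the data are hypothetical and noise-free, the helper names (`solve_linear_system`, `fit_piecewise`, `best_breakpoint`) are made up for this sketch, and the breakpoint is chosen by minimising the residual sum of squares over a grid of candidate values, which is one common criterion.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_piecewise(xs, ys, breakpoint):
    """Least-squares fit of two line segments joined at the breakpoint.

    The predictor is centred at the breakpoint, so the intercept is the
    fitted value at the breakpoint and the two slopes share that point.
    Returns ([intercept, slope_before, slope_after], residual sum of squares).
    """
    rows = [[1.0, min(x - breakpoint, 0.0), max(x - breakpoint, 0.0)] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return coefs, rss


def best_breakpoint(xs, ys, candidates):
    """Candidate breakpoint with the smallest residual sum of squares."""
    return min(candidates, key=lambda bp: fit_piecewise(xs, ys, bp)[1])


# Hypothetical noise-free data: slope -4 up to AOA 16, slope -0.5 afterwards.
aoa = list(range(5, 41))
gjt = [180 - 4.0 * min(a - 16, 0) - 0.5 * max(a - 16, 0) for a in aoa]

# Candidates are kept away from the edges so both segments contain data.
optimal = best_breakpoint(aoa, gjt, range(8, 38))
coefficients, rss = fit_piecewise(aoa, gjt, optimal)
print("optimal breakpoint:", optimal)
print("intercept, slope before, slope after:", [round(c, 3) for c in coefficients])
```

With real, noisy data the breakpoint found in this way is itself estimated from the data, so a comparison with a model without a breakpoint has to take this additional flexibility into account.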

Solid: regression with breakpoint at AOA 18 (dashed lines represent its 95% confidence interval); dot-dash: regression without breakpoint.

https://doi.org/10.1371/journal.pone.0069172.g005

Solid: regression with breakpoint at AOA 18 (dashed lines represent its 95% confidence interval); dot-dash (hardly visible due to near-complete overlap): regression without breakpoint.

https://doi.org/10.1371/journal.pone.0069172.g006

https://doi.org/10.1371/journal.pone.0069172.t004

https://doi.org/10.1371/journal.pone.0069172.g007

Solid: regression with breakpoint at AOA 16 (dashed lines represent its 95% confidence interval); dot-dash: regression without breakpoint.

https://doi.org/10.1371/journal.pone.0069172.g008

Solid: regression with breakpoint at AOA 6 (dashed lines represent its 95% confidence interval); dot-dash (hardly visible due to near-complete overlap): regression without breakpoint.

https://doi.org/10.1371/journal.pone.0069172.g009

https://doi.org/10.1371/journal.pone.0069172.t005

https://doi.org/10.1371/journal.pone.0069172.t006

https://doi.org/10.1371/journal.pone.0069172.t007

https://doi.org/10.1371/journal.pone.0069172.t008

In sum, a regression model that allows for changes in the slope of the AOA–GJT function to account for putative critical period effects provides a somewhat better fit to the North American data than does an everyday simple regression model. The improvement in model fit is marginal, however, and including a breakpoint does not result in any detectable improvement of model fit to the Israel data whatsoever. Breakpoint models therefore fail to provide solid cross-linguistic support in favour of critical period effects: across both data sets, GJT can satisfactorily be modelled as a linear function of AOA.

On partialling out ‘age at testing’

As I have argued above, correlation coefficients cannot be used to test hypotheses about slopes. When the correct procedure is carried out on DK et al.'s data, no cross-linguistically robust evidence for changes in the AOA–GJT function is found. In addition to comparing the zero-order correlations between AOA and GJT, however, DK et al. computed partial correlations in which the variance in AOA associated with the participants' age at testing (AAT; a potentially confounding variable) was filtered out. They found that these partial correlations between AOA and GJT, which are given in Table 9, differed between age groups in that they are stronger for younger than for older participants. This, DK et al. argue, constitutes additional evidence in favour of the CPH. At this point, I can no longer provide my own analysis of DK et al.'s data seeing as the pertinent data points were not plotted. Nevertheless, the detailed descriptions by DK et al. strongly suggest that the use of these partial correlations is highly problematic. Most importantly, and to reiterate, correlations (whether zero-order or partial ones) are actually of no use when testing hypotheses concerning slopes. Still, one may wonder why the partial correlations differ across age groups. My surmise is that these differences are at least partly the by-product of an imbalance in the sampling procedure.

https://doi.org/10.1371/journal.pone.0069172.t009

The upshot of this brief discussion is that the partial correlation differences reported by DK et al. are at least partly the result of an imbalance in the sampling procedure: AOA and AAT were simply less intimately tied for the young arrivals in the North America study than for the older arrivals with L2 English or for all of the L2 Hebrew participants. In an ideal world, we would like to fix AAT or ascertain that it at most only weakly correlates with AOA. This, however, would result in a strong correlation between AOA and another potential confound variable, length of residence in the L2 environment, bringing us back to square one. Allowing for only moderate correlations between AOA and AAT might improve our predicament somewhat, but even in that case, we should tread lightly when making inferences on the basis of statistical control procedures [61].
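How strongly a partial correlation depends on the tie between the predictor and the control variable can be seen from the standard formula for a first-order partial correlation. The correlation values below are hypothetical and are not taken from DK et al.; they merely hold the zero-order AOA–GJT and AAT–GJT correlations constant while varying the AOA–AAT correlation.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after partialling z out of both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical zero-order correlations, identical in both scenarios.
R_AOA_GJT = -0.6
R_AAT_GJT = -0.5

# Scenario 1: AOA and AAT only loosely tied (e.g. r = 0.3).
loosely_tied = partial_correlation(R_AOA_GJT, 0.3, R_AAT_GJT)

# Scenario 2: AOA and AAT tightly tied (e.g. r = 0.8).
tightly_tied = partial_correlation(R_AOA_GJT, 0.8, R_AAT_GJT)

print("partial r, AOA and AAT loosely tied:", round(loosely_tied, 3))
print("partial r, AOA and AAT tightly tied:", round(tightly_tied, 3))
```

In this sketch the partial correlation shrinks considerably when AOA and AAT are tightly tied, even though the zero-order relationship between AOA and GJT is the same in both scenarios. A difference in partial correlations between groups can thus arise from the sampling design alone.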

On estimating the role of aptitude

Having shown that Hypothesis 1 could not be confirmed, I now turn to Hypothesis 2, which predicts a differential role of aptitude for UA in SLA in different AOA groups. More specifically, it states that the correlation between aptitude and GJT performance will be significant only for older arrivals. The correlation coefficients of the relationship between aptitude and GJT are presented in Table 10.

https://doi.org/10.1371/journal.pone.0069172.t010

The problem with both the wording of Hypothesis 2 and the way in which it is addressed is the following: it is assumed that a variable has a reliably different effect in different groups when the effect reaches significance in one group but not in the other. This logic is fairly widespread within several scientific disciplines (see e.g. [62] for a discussion). Nonetheless, it is demonstrably fallacious [63]. Here I will illustrate the fallacy for the specific case of comparing two correlation coefficients.
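The logic of the fallacy can be sketched with hypothetical numbers, using the Fisher z-transformation and a normal approximation both for testing a single correlation and for testing the difference between two independent correlations. The correlation coefficients and sample sizes are invented for the example and do not correspond to the values reported by DK et al.; the function names are likewise illustrative.

```python
from math import atanh, erfc, sqrt


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return erfc(abs(z) / sqrt(2))


def p_single_correlation(r, n):
    """Approximate test of H0: rho = 0 via the Fisher z-transformation."""
    return two_sided_p(atanh(r) * sqrt(n - 3))


def p_difference(r1, n1, r2, n2):
    """Approximate test of H0: rho1 = rho2 for two independent samples."""
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return two_sided_p((atanh(r1) - atanh(r2)) / standard_error)


# Hypothetical example: two groups of 20 participants each.
p_group_a = p_single_correlation(0.5, 20)   # 'significant' on its own
p_group_b = p_single_correlation(0.3, 20)   # 'not significant' on its own
p_a_vs_b = p_difference(0.5, 20, 0.3, 20)   # the comparison that matters

print("group A:", round(p_group_a, 3))
print("group B:", round(p_group_b, 3))
print("A vs B: ", round(p_a_vs_b, 3))
```

One correlation passes the conventional 0.05 threshold and the other does not, yet the test of their difference is far from significant. A difference in significance is therefore not a significant difference.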

Apart from not being replicated in the North America study, does this difference actually show anything? I contend that it does not: what is of interest are not so much the correlation coefficients, but rather the interactions between aoa and aptitude in models predicting gjt . These interactions could be investigated by fitting a multiple regression model in which the postulated cp breakpoint governs the slope of both aoa and aptitude. If such a model provided a substantially better fit to the data than a model without a breakpoint for the aptitude slope and if the aptitude slope changes in the expected direction (i.e. a steeper slope for post- cp than for younger arrivals) for different L1–L2 pairings, only then would this particular prediction of the cph be borne out.

Using data extracted from a paper reporting on two recent studies that purport to provide evidence in favour of the cph and that, according to its authors, represent a major improvement over earlier studies (DK et al., p. 417), it was found that neither of its two hypotheses was actually confirmed when using the proper statistical tools. As a matter of fact, the gjt scores continue to decline at essentially the same rate even beyond the end of the putative critical period. According to the paper's lead author, such a finding represents a serious problem for his conceptualisation of the cph [14] . Moreover, although modelling a breakpoint representing the end of a cp at aoa 16 may improve the statistical model slightly in the study on learners of English in North America, the study on learners of Hebrew in Israel fails to confirm this finding. In fact, even if we were to accept the optimal breakpoint computed for the Israel study, it lies at aoa 6 and is associated with a different geometrical pattern.
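As a rough illustration of the breakpoint modelling discussed here, the following Python sketch fits a straight line and a piecewise ('hinge') regression to synthetic data and compares their residual sums of squares (RSS). It is not the R code in Script S1: the simulated data, the helper names (solve, ols, fit_breakpoint) and the candidate breakpoint range are all assumptions made for illustration.

```python
import random

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(columns, y):
    """Least-squares coefficients (intercept first) and residual sum of squares."""
    cols = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(len(y))]
    return beta, sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def fit_breakpoint(aoa, gjt, bp):
    """gjt ~ b0 + b1 * aoa + b2 * max(0, aoa - bp); b2 is the change in slope at bp."""
    hinge = [max(0.0, a - bp) for a in aoa]
    return ols([list(aoa), hinge], gjt)

# Synthetic data with a strictly linear age trend plus noise (no true breakpoint).
rng = random.Random(1)
aoa = [float(a) for a in range(1, 41)]
gjt = [190.0 - 1.5 * a + rng.gauss(0.0, 8.0) for a in aoa]

_, rss_linear = ols([aoa], gjt)
rss_by_bp = {bp: fit_breakpoint(aoa, gjt, bp)[1] for bp in range(5, 31)}
best_bp = min(rss_by_bp, key=rss_by_bp.get)

print(f"linear model RSS: {rss_linear:.1f}")
print(f"best breakpoint: aoa {best_bp}, RSS: {rss_by_bp[best_bp]:.1f}")
```

Because the linear model is nested within the breakpoint model, adding a breakpoint can only lower the RSS, even when the simulated trend is strictly linear. A raw drop in RSS is therefore not in itself evidence of a critical period; the improvement has to be assessed with a model comparison that accounts for the extra parameters.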

Diverging age trends in parallel studies with participants with different L2s have similarly been reported by Birdsong and Molis [26] and are at odds with an L2-independent cph . One parsimonious explanation of such conflicting age trends may be that the overall, cross-linguistic age trend is in fact linear, but that fluctuations in the data (due to factors unaccounted for or randomness) may sometimes give rise to a ‘stretched L’-shaped pattern ( Figure 1, left panel ) and sometimes to a ‘stretched 7’-shaped pattern ( Figure 1 , middle panel; see also [66] for a similar comment).

Importantly, the criticism that DeKeyser and Larson-Hall level against two studies reporting findings similar to the present [48] , [49] , viz. that the data consisted of self-ratings of questionable validity [14] , does not apply to the present data set. In addition, DK et al. did not exclude any outliers from their analyses, so I assume that DeKeyser and Larson-Hall's criticism [14] of Birdsong and Molis's study [26] , i.e. that the findings were due to the influence of outliers, is not applicable to the present data either. For good measure, however, I refitted the regression models with and without breakpoints after excluding one potentially problematic data point per model. The following data points had absolute standardised residuals larger than 2.5 in the original models without breakpoints as well as in those with breakpoints: the participant with aoa 17 and a gjt score of 125 in the North America study and the participant with aoa 12 and a gjt score of 117 in the Israel study. The resultant models were virtually identical to the original models (see Script S1 ). Furthermore, the aoa variable was sufficiently fine-grained and the aoa – gjt curve was not ‘presmoothed’ by the prior aggregation of gjt across parts of the aoa range (see [51] for such a criticism of another study). Lastly, seven of the nine “problems with supposed counter-evidence” to the cph discussed by Long [5] do not apply either, viz. (1) “[c]onfusion of rate and ultimate attainment”, (2) “[i]nappropriate choice of subjects”, (3) “[m]easurement of AO”, (4) “[l]eading instructions to raters”, (6) “[u]se of markedly non-native samples making near-native samples more likely to sound native to raters”, (7) “[u]nreliable or invalid measures”, and (8) “[i]nappropriate L1–L2 pairings”. Problem No. 5 (“Assessments based on limited samples and/or “language-like” behavior”) may be apropos given that only gjt data were used, leaving open the theoretical possibility that other measures might have yielded a different outcome. Finally, problem No. 9 (“Faulty interpretation of statistical patterns”) is, of course, precisely what I have turned the spotlight on.
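The outlier screen mentioned above (absolute standardised residuals larger than 2.5) can be sketched as follows for a simple linear regression. This is a hedged Python illustration, not the procedure in Script S1: the data are invented, the function name is hypothetical, and the residuals are standardised using each point's leverage, which may differ in detail from what a given statistics package reports.

```python
import math

def standardised_residuals(x, y):
    """Internally studentised residuals of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))   # residual standard error
    leverage = [1 / n + (a - mx) ** 2 / sxx for a in x]   # hat values
    return [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]

# Invented data: a linear decline with a small alternating wiggle,
# plus one score at x = 12 that is far lower than the trend predicts.
x = list(range(1, 21))
y = [180.0 - 2.0 * a + (1.0 if a % 2 == 0 else -1.0) for a in x]
y[x.index(12)] -= 30.0

z_scores = standardised_residuals(x, y)
flagged = [a for a, z in zip(x, z_scores) if abs(z) > 2.5]
print("flagged x values:", flagged)
```

Refitting the model without the flagged point and checking that the coefficients barely change is then a simple robustness check of the kind described above.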

Conclusions

The critical period hypothesis remains a hotly contested issue in the psycholinguistics of second-language acquisition. Discussions about the impact of empirical findings on the tenability of the cph generally revolve around the reliability of the data gathered (e.g. [5] , [14] , [22] , [52] , [67] , [68] ) and such methodological critiques are of course highly desirable. Furthermore, the debate often centres on the question of exactly what version of the cph is being vindicated or debunked. These versions differ mainly in terms of their scope, specifically with regard to the relevant age span, setting and language area, and the testable predictions they make. But even when the cph 's scope is clearly demarcated and its main prediction is spelt out lucidly, the issue remains to what extent the empirical findings can actually be marshalled in support of the relevant cph version. As I have shown in this paper, empirical data have often been taken to support cph versions predicting that the relationship between age of acquisition and ultimate attainment is not strictly linear, even though the statistical tools most commonly used (notably group mean and correlation coefficient comparisons) were, crudely put, irrelevant to this prediction. Methods that are arguably valid, e.g. piecewise regression and scatterplot smoothing, have been used in some studies [21] , [26] , [49] , but these studies have been criticised on other grounds. To my knowledge, such methods have never been used by scholars who explicitly subscribe to the cph .

I suspect that what may be going on is a form of ‘confirmation bias’ [69] , a cognitive bias at play in diverse branches of human knowledge seeking: Findings judged to be consistent with one's own hypothesis are hardly questioned, whereas findings inconsistent with one's own hypothesis are scrutinised much more strongly and criticised on all sorts of points [70] – [73] . My reanalysis of DK et al.'s recent paper may be a case in point. cph exponents used correlation coefficients to address their prediction about the slope of a function, as had been done in a host of earlier studies. Finding a result that squared with their expectations, they did not question the technical validity of their results, or at least they did not report this. (In fact, my reanalysis is actually a case in point in two respects: for an earlier draft of this paper, I had computed the optimal position of the breakpoints incorrectly, resulting in an insignificant improvement of model fit for the North American data rather than a borderline significant one. Finding a result that squared with my expectations, I did not question the technical validity of my results – until this error was kindly pointed out to me by Martijn Wieling (University of Tübingen).) That said, I am keen to point out that the statistical analyses in this particular paper, though suboptimal, are, as far as I could gather, reported correctly, i.e. the confirmation bias does not seem to have resulted in the blatant misreportings found elsewhere (see [74] for empirical evidence and discussion). An additional point to these authors' credit is that, apart from explicitly identifying their cph version's scope and making crystal-clear predictions, they present data descriptions that actually permit quantitative reassessments and have a history of doing so (e.g. the appendix in [8] ). This leads me to believe that they analysed their data all in good conscience and to hope that they, too, will conclude that their own data do not, in fact, support their hypothesis.

I end this paper on an upbeat note. Even though I have argued that the analytical tools employed in cph research generally leave much to be desired, the original data are, so I hope, still available. This provides researchers, cph supporters and sceptics alike, with an exciting opportunity to reanalyse their data sets using the tools outlined in the present paper and publish their findings at minimal cost of time and resources (for instance, as a comment to this paper). I would therefore encourage scholars to engage their old data sets and to communicate their analyses openly, e.g. by voluntarily publishing their data and computer code alongside their articles or comments. Ideally, cph supporters and sceptics would join forces to agree on a protocol for a high-powered study in order to provide a truly convincing answer to a core issue in sla .

Supporting Information

Dataset S1.

aoa and gjt data extracted from DeKeyser et al.'s North America study.

https://doi.org/10.1371/journal.pone.0069172.s001

Dataset S2.

aoa and gjt data extracted from DeKeyser et al.'s Israel study.

https://doi.org/10.1371/journal.pone.0069172.s002

Script S1.

Script with annotated R code used for the reanalysis. All add-on packages used can be installed from within R.

https://doi.org/10.1371/journal.pone.0069172.s003

Acknowledgments

I would like to thank Irmtraud Kaiser (University of Fribourg) for helping me to get an overview of the literature on the critical period hypothesis in second language acquisition. Thanks are also due to Martijn Wieling (currently University of Tübingen) for pointing out an error in the R code accompanying an earlier draft of this paper.

Author Contributions

Analyzed the data: JV. Wrote the paper: JV.

  • 1. Penfield W, Roberts L (1959) Speech and brain mechanisms. Princeton: Princeton University Press.
  • 2. Lenneberg EH (1967) Biological foundations of language. New York: Wiley.
  • 10. Long MH (2007) Problems in SLA. Mahwah, NJ: Lawrence Erlbaum.
  • 14. DeKeyser R, Larson-Hall J (2005) What does the critical period really mean? In: Kroll and De Groot [75], 88–108.
  • 19. Newport EL (1991) Contrasting conceptions of the critical period for language. In: Carey S, Gelman R, editors, The epigenesis of mind: Essays on biology and cognition, Hillsdale, NJ: Lawrence Erlbaum. 111–130.
  • 20. Birdsong D (2005) Interpreting age effects in second language acquisition. In: Kroll and De Groot [75], 109–127.
  • 22. DeKeyser R (2012) Age effects in second language learning. In: Gass SM, Mackey A, editors, The Routledge handbook of second language acquisition, London: Routledge. 442–460.
  • 24. Weisstein EW. Discontinuity. From MathWorld –A Wolfram Web Resource. Available: http://mathworld.wolfram.com/Discontinuity.html . Accessed 2012 March 2.
  • 27. Flege JE (1999) Age of learning and second language speech. In: Birdsong [76], 101–132.
  • 36. Champely S (2009) pwr: Basic functions for power analysis. Available: http://cran.r-project.org/package=pwr . R package, version 1.1.1.
  • 37. R Core Team (2013) R: A language and environment for statistical computing. Available: http://www.r-project.org/ . Software, version 2.15.3.
  • 47. Hyltenstam K, Abrahamsson N (2003) Maturational constraints in sla . In: Doughty CJ, Long MH, editors, The handbook of second language acquisition, Malden, MA: Blackwell. 539–588.
  • 49. Bialystok E, Hakuta K (1999) Confounded age: Linguistic and cognitive factors in age differences for second language acquisition. In: Birdsong [76], 161–181.
  • 52. DeKeyser R (2006) A critique of recent arguments against the critical period hypothesis. In: Abello-Contesse C, Chacón-Beltrán R, López-Jiménez MD, Torreblanca-López MM, editors, Age in L2 acquisition and teaching, Bern: Peter Lang. 49–58.
  • 55. Baayen RH (2008) Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
  • 56. Fox J (2002) Robust regression. Appendix to An R and S-Plus Companion to Applied Regression. Available: http://cran.r-project.org/doc/contrib/Fox-Companion/appendix.html .
  • 57. Ripley B, Hornik K, Gebhardt A, Firth D (2012) MASS: Support functions and datasets for Venables and Ripley's MASS. Available: http://cran.r-project.org/package=MASS . R package, version 7.3–17.
  • 58. Zuur AF, Ieno EN, Walker NJ, Saveliev AA, Smith GM (2009) Mixed effects models and extensions in ecology with R. New York: Springer.
  • 59. Pinheiro J, Bates D, DebRoy S, Sarkar D, R Core Team (2013) nlme: Linear and nonlinear mixed effects models. Available: http://cran.r-project.org/package=nlme . R package, version 3.1–108.
  • 65. Field A (2009) Discovering statistics using SPSS. London: SAGE 3rd edition.
  • 66. Birdsong D (2009) Age and the end state of second language acquisition. In: Ritchie WC, Bhatia TK, editors, The new handbook of second language acquisition, Bingley: Emerald. 401–424.
  • 75. Kroll JF, De Groot AMB, editors (2005) Handbook of bilingualism: Psycholinguistic approaches. New York: Oxford University Press.
  • 76. Birdsong D, editor (1999) Second language acquisition and the critical period hypothesis. Mahwah, NJ: Lawrence Erlbaum.

2009 Articles

The Critical Period Hypothesis: Support, Challenge, and Reconceptualization

Schouten, Andy

Given the general failure experienced by adults when attempting to learn a second or foreign language, many have hypothesized that a critical period exists for the domain of language learning. Supporters of the Critical Period Hypothesis (CPH) contend that language learning that takes place outside of this critical period (roughly defined as ending sometime around puberty) will inevitably be marked by non-nativelike features. In opposition to this position, several researchers have postulated that, although rare, nativelike proficiency in a second language is in fact possible for adult learners. Still others, in light of the robust debate and research both supporting and challenging the CPH, have reconceptualized their views regarding a possible critical period for language learning, claiming that, in combination with age of exposure, sociological, psychological, and physiological factors must also be considered when determining the factors that facilitate and debilitate language acquisition. In this paper, a review of literature describing the support, challenges, and reconceptualizations of the CPH is provided.

  • Second language acquisition--Research
  • Applied linguistics--Research
  • Critical periods (Biology)
  • English language--Study and teaching--Foreign speakers

Critical period effects in second language learning: the influence of maturational state on the acquisition of English as a second language

  • PMID: 2920538
  • DOI: 10.1016/0010-0285(89)90003-0

Lenneberg (1967) hypothesized that language could be acquired only within a critical period, extending from early infancy until puberty. In its basic form, the critical period hypothesis need only have consequences for first language acquisition. Nevertheless, it is essential to our understanding of the nature of the hypothesized critical period to determine whether or not it extends as well to second language acquisition. If so, it should be the case that young children are better second language learners than adults and should consequently reach higher levels of final proficiency in the second language. This prediction was tested by comparing the English proficiency attained by 46 native Korean or Chinese speakers who had arrived in the United States between the ages of 3 and 39, and who had lived in the United States between 3 and 26 years by the time of testing. These subjects were tested on a wide variety of structures of English grammar, using a grammaticality judgment task. Both correlational and t-test analyses demonstrated a clear and strong advantage for earlier arrivals over the later arrivals. Test performance was linearly related to age of arrival up to puberty; after puberty, performance was low but highly variable and unrelated to age of arrival. This age effect was shown not to be an inadvertent result of differences in amount of experience with English, motivation, self-consciousness, or American identification. The effect also appeared on every grammatical structure tested, although the structures varied markedly in the degree to which they were well mastered by later learners. The results support the conclusion that a critical period for language acquisition extends its effects to second language acquisition.

Similar articles

  • Critical period effects on universal properties of language: the status of subjacency in the acquisition of a second language. Johnson JS, Newport EL. Cognition. 1991 Jun;39(3):215-58. doi: 10.1016/0010-0277(91)90054-8. PMID: 1841034
  • Acquisition of English grammatical morphology by native Mandarin-speaking children and adolescents: age-related differences. Jia G, Fuse A. J Speech Lang Hear Res. 2007 Oct;50(5):1280-99. doi: 10.1044/1092-4388(2007/090). PMID: 17905912
  • A critical period for second language acquisition: Evidence from 2/3 million English speakers. Hartshorne JK, Tenenbaum JB, Pinker S. Cognition. 2018 Aug;177:263-277. doi: 10.1016/j.cognition.2018.04.007. Epub 2018 May 2. PMID: 29729947 Free PMC article.
  • Cross-linguistic universals in reading acquisition with applications to English-language learners with reading disabilities. Gorman BK. Semin Speech Lang. 2009 Nov;30(4):246-60. doi: 10.1055/s-0029-1241723. Epub 2009 Oct 22. PMID: 19851952 Review.
  • Characteristics of Korean phonology: review, tutorial, and case studies of Korean children speaking English. Ha S, Johnson CJ, Kuehn DP. J Commun Disord. 2009 May-Jun;42(3):163-79. doi: 10.1016/j.jcomdis.2009.01.002. Epub 2009 Jan 20. PMID: 19203766 Review.
  • Orientation towards the vernacular and style-shifting as language behaviours in speech of first-generation Polish migrant communities speaking Norwegian in Norway. Malarski K, Castle C, Awedyk W, Wrembel M, Jensen IN. Front Psychol. 2024 Aug 14;15:1330494. doi: 10.3389/fpsyg.2024.1330494. eCollection 2024. PMID: 39205976 Free PMC article.
  • Beyond age: exploring ultimate attainment in heritage speakers and late L2 learners. Prela L, Dąbrowska E, Llompart M. Front Psychol. 2024 Aug 8;15:1419116. doi: 10.3389/fpsyg.2024.1419116. eCollection 2024. PMID: 39176043 Free PMC article.
  • Enhancing Foreign Language Learning Approaches to Promote Healthy Aging: A Systematic Review. Klimova B, de Paula Nascimento E Silva C. J Psycholinguist Res. 2024 May 17;53(4):48. doi: 10.1007/s10936-024-10088-3. PMID: 38758475 Free PMC article.
  • Hearing parents learning American Sign Language with their deaf children: a mixed-methods survey. Lieberman AM, Mitchiner J, Pontecorvo E. Appl Linguist Rev. 2022 May 23;15(1):309-333. doi: 10.1515/applirev-2021-0120. eCollection 2024 Jan. PMID: 38221976 Free PMC article.
  • Prenatal experience with language shapes the brain. Mariani B, Nicoletti G, Barzon G, Ortiz Barajas MC, Shukla M, Guevara R, Suweis SS, Gervain J. Sci Adv. 2023 Nov 24;9(47):eadj3524. doi: 10.1126/sciadv.adj3524. Epub 2023 Nov 22. PMID: 37992161 Free PMC article.

Grants and funding

  • R01 DC000167/DC/NIDCD NIH HHS/United States
  • R01 DC000167-26/DC/NIDCD NIH HHS/United States
  • HD07205/HD/NICHD NIH HHS/United States
  • NS16878/NS/NINDS NIH HHS/United States

Multilingual Pedagogy and World Englishes

Linguistic Variety, Global Society

Critical Period Hypothesis (CPH)

Tom Scovel writes, “The CPH [critical period hypothesis] is conceivably the most contentious issue in SLA because there is disagreement over its exact age span; people disagree strenuously over which facets of language are affected; there are competing explanations for its existence; and, to top it off, many people don’t believe it exists at all” (113). Proposed by Wilder Penfield and Lamar Roberts in 1959, the Critical Period Hypothesis (CPH) argues that there is a specific period of time in which people can learn a language without traces of the L1 (a so-called “foreign” accent or even L1 syntactical features) manifesting in L2 production (Scovel 48). If a learner’s goal is to sound “native,” there may be age-related limitations or “maturational constraints” as Kenneth Hyltenstam and Niclas Abrahamsson call them, on how “native” they can sound. Reducing the impression left by the L1 is certainly possible after puberty, but eliminating that impression entirely may not be possible.

Kenji Hakuta et al. explain that the relationship between age and L1 interference in L2 production is really not up for debate:

“The diminished average achievement of older learners is supported by personal anecdote and documented by empirical evidence….What is controversial, though, is whether this pattern meets the conditions for concluding that a critical period constrains learning in a way predicted by the theory” (31).

Some learners manage to overcome the “constraints” that Scovel believes are “probably accounted for by neurological factors that are genetically specified in our species” (114), but these learners are exceptional rather than the rule. It may be biology; it may be due to something else. The debate will continue, but evidence seems to indicate that the older learners become, the more difficult complete acquisition can be.

“David Birdsong, Looking Inside and Beyond the Critical Period Hypothesis.”  YouTube,  uploaded by IWL Channel, 09 May 2016, https://www.youtube.com/watch?v=9Bo0C4dj7Mw.

Application

Instructors should consider taking the CPH into account when assessing their students’ oral communication in the target language. When “maturational constraints” are a potential concern, it seems fairer for instructors to weight comprehension more heavily than nativeness. A thorough understanding of the CPH can also help instructors to counteract adult learners’ “self-handicapping” by helping the learners understand that, in spite of constraints due to aging, they are still capable of acquiring many, if not most, aspects of the target language.

Bibliography

Hakuta, Kenji, et al. “Critical Evidence: A Test of the Critical-Period Hypothesis for Second-Language Acquisition.”  Psychological Science , vol. 14, no. 1, 2003, pp. 31–38.  JSTOR , www.jstor.org/stable/40063748.

Hyltenstam, Kenneth, and Niclas Abrahamsson. “Comments on Stefka H. Marinova-Todd, D. Bradford Marshall, and Catherine E. Snow’s ‘Three Misconceptions about Age and L2 Learning’: Age and L2 Learning: The Hazards of Matching Practical ‘Implications’ with Theoretical ‘Facts.’”  TESOL Quarterly , vol. 35, no. 1, 2001, pp. 151–170.  JSTOR , www.jstor.org/stable/3587863.

Nemer, Randa. “Critical Period Hypothesis.”  Prezi,  04 Dec. 2013, https://prezi.com/zzuch40ibrlq/critical-period-hypothesis-sla/#.

Scovel, Tom.  Learning New Languages . Heinle & Heinle, 2001.

What are the main arguments for and against the critical period hypothesis, and what are alternative explanations?

Why is the critical period hypothesis so heavily disputed, yet widely accepted; what are its major strengths and weaknesses; what other explanations exist for the perceived "critical period", if it does not exist?

  • critical-period

Let us start with a simple, relatively informal statement: “in most cases, those who start learning a language as children ultimately become more proficient in that language than those who start learning it later”. This is uncontroversial, and something I think the vast majority of second-language acquisition researchers would agree on. However, this is not how the Critical Period Hypothesis (CPH) is understood within the field of Second Language Acquisition (SLA). CPH is a large subject, and your question is hard to answer in a few paragraphs. Therefore, I am reusing large fragments of an assignment I wrote on this very topic for an SLA course a few years ago. Let me know if something is unclear or the style is too terse at some points.

Summary (TL;DR)

There is no universally accepted definition of a critical period within linguistics and some of the controversies are caused by the fact that different researchers use different definitions.

There are a few key findings that are not controversial:

  • early learners perform consistently well in all aspects of language use,
  • as the starting age increases, learners perform statistically worse and worse until puberty,
  • however, the decrease in performance is not uniform.

An explanation, provided by Bialystok (1997) as an alternative to CPH, is the different learning style of children, compared to late learners.

Paradoxically, the differences (or lack thereof) among those who learn a foreign language as adults are the key factor in deciding whether CPH is true or not, and a controversial one:

  • Some studies found correlations between the age at which adult learners started learning a language and their ultimate attainment. In other words, these studies suggest that if we compare people who have been learning a language for a very long time, the ultimate attainment of those who started at the age of 20 is statistically higher than the ultimate attainment of those who started at the age of 40. These studies argue that there is no critical period in childhood, but rather that our ability to learn a new language decreases steadily throughout our whole lives.
  • Other studies found no clear correlations between the starting age and the ultimate attainment among adult language learners. They point out that the correlation between the starting age and ultimate attainment is clear for those who started before puberty. Based on that, they argue that there is something qualitatively different about starting to learn at an early age, and therefore conclude that it is an argument for CPH.

Definitions of the critical period used by those who argue against CPH

Controversies with the Critical Period Hypothesis (CPH) are related to the issue of ultimate attainment of early and late language learners, that is, the highest language proficiency level they can attain. The patterns in ultimate attainment may be explained by CPH, but they may also have different explanations. Some researchers support some form of the Critical Period Hypothesis (Johnson and Newport 1989, DeKeyser and Larson-Hall 2005, Patkowsky 1994, Scovel 1988), while others argue against it (Bialystok 1997).

A major problem with the Critical Period Hypothesis is that there appears to be no universally accepted definition of a critical period within linguistics. Bialystok (1997) bases her discussion of the critical/sensitive period (which she takes to be synonymous 1) on a specific technical definition used in ethology, which includes 14 essential structural characteristics that describe such a period (Bornstein 1989). She argues that one of these characteristics is especially problematic – the system: “structure or function altered in the sensitive period” (Bornstein 1989:184). In other words, she argues that there is no period where a structure in the brain is modified in a way that makes subsequent language learning harder or impossible. Bialystok does, however, agree that there is an optimal period for language learning – something that can be characterised by the statement “On average, children are more successful than adults when faced with the task of learning a second language” (Bialystok 1997:117). Despite the controversy around other issues, this fact is uncontested and has been verified by numerous studies.

Bialystok (1997) rejects the existence of a critical period because no structure has been identified that is modified when the period is over. She postulates that an important factor that causes differences in ultimate attainment between early and late starters is learning style: children prefer accommodation (creating new concepts) over assimilation (extending existing concepts). The question remains: why do they prefer accommodation? She suggests that “[t]his may be because children are in the process of creating new categories all the time as they are learning new information” (Bialystok 1997:132).

Definitions of the critical period used by supporters of CPH

The researchers who support some form of the Critical Period Hypothesis (Johnson and Newport 1989, DeKeyser and Larson-Hall 2005), formulate it in a form that is much weaker than Bialystok's (1997) formulation. What they postulate often resembles what Bialystok calls the optimal age.

Johnson and Newport (1989) reformulated CPH into two alternative hypotheses, in order to fit second language acquisition into the picture:

The exercise hypothesis : “Early in life, humans have a superior capacity for acquiring languages. If the capacity is not exercised during this time, it will disappear or decline with maturation. If the capacity is exercised, however, further language learning abilities will remain intact throughout life.” (Johnson and Newport 1989:64)

The maturational state hypothesis : “Early in life, humans have a superior capacity for acquiring languages. This capacity disappears or declines with maturation.” (Johnson and Newport 1989:64)

We can see that if a critical period were found for second language acquisition, we could be almost sure that it exists for L1 acquisition as well (the maturational state hypothesis). However, we cannot reason in the same way under the exercise hypothesis: the non-existence of a critical period for L2 acquisition does not in any way exclude the possibility of such a period for the first language (Bialystok 1997).

DeKeyser and Larson-Hall (2005) formulate the hypothesis as: “language acquisition from mere exposure (i.e. implicit learning) […] is severely limited in older adolescents and adults”. Their formulation is quite vague, as is the claim that there is a “qualitative change in language learning capacities somewhere between 4 and 18 years”.

There are also definitions that restrict the Critical Period Hypothesis to specific subareas of the language faculty. The most commonly mentioned area is phonology, see e.g. Patkowsky (1994, cited in Bialystok 1997), Scovel (1988, cited in Bialystok 1997).

Age effects before and after puberty

The current consensus is that early learners perform consistently well in all aspects of language use. As the starting age increases, performance declines statistically until puberty. The decline is not uniform, and in some areas (such as phonology) it is particularly visible. Performance on the same level as early bilinguals is possible, but rare.

Probably the most controversial aspect is the performance of adult learners. After puberty there is much bigger variance in the performance, so data are more prone to different interpretations. The results obtained by Derwing and Munro (2013) suggest that comprehensibility and good accent are negatively correlated with the age of arrival, that is, the age when English language immersion started. Johnson and Newport (1989) found no correlation of starting age after puberty with ultimate language proficiency, while Bialystok (1997) re-analysed these data and found some negative correlation. A meta-analysis by DeKeyser and Larson-Hall (2005) downplays the role of post-adolescent correlations. As we can see, the jury is still out on this debate.

1 In neuroscience critical period and sensitive period are two separate concepts, see Knudsen (2004).

Bibliography

  • Bialystok, E. 1997. The structure of age: in search of barriers to second language acquisition. Second Language Research 13(2): 116-137.
  • Bornstein, M.H. 1989. Sensitive periods in development: structural characteristics and causal interpretations. Psychological Bulletin 105,179–97.
  • DeKeyser, R. and J. Larson-Hall. 2005. What does the critical period really mean? In J. F. Kroll and A. M. B. de Groot. 2005. Handbook of bilingualism: Psycholinguistic approaches . Cary, NC: Oxford University Press. Pp. 109–27.
  • Derwing, T. M., & Munro, M. J. 2013. The development of L2 oral language skills in two L1 groups: A 7-year study. Language Learning 63, 163-185.
  • Johnson, J.S., & Newport, E.L. 1989. Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology , 21, 60-99.
  • Knudsen, E. I. 2004. Sensitive periods in the development of the brain and behavior. Journal of Cognitive Neuroscience , 16, 1412-25
  • Newport, E. L., & Supalla, T. 1987. A critical period effect in the acquisition of a primary language .
  • Patkowsky, M. 1980. The sensitive period for the acquisition of syntax in a second language. Language Learning 30, 449–72
  • Scovel, T. 1988. A time to speak: a psycholinguistic inquiry into the critical period for human speech . New York: Newbury House

  • Open access
  • Published: 31 August 2024

A framework for the emergence and analysis of language in social learning agents

  • Tobias J. Wieczorek 1 , 2 ,
  • Tatjana Tchumatchenko   ORCID: orcid.org/0000-0001-9137-809X 2 ,
  • Carlos Wert-Carvajal 2   na1 &
  • Maximilian F. Eggl   ORCID: orcid.org/0000-0001-5815-1045 2   na1  

Nature Communications volume 15, Article number: 7590 (2024)

  • Learning algorithms
  • Neural decoding

Neural systems have evolved not only to solve environmental challenges through internal representations but also, under social constraints, to communicate these to conspecifics. In this work, we aim to understand the structure of these internal representations and how they may be optimized to transmit pertinent information from one individual to another. Thus, we build on previous teacher-student communication protocols to analyze the formation of individual and shared abstractions and their impact on task performance. We use reinforcement learning in grid-world mazes where a teacher network passes a message to a student to improve task performance. This framework allows us to relate environmental variables with individual and shared representations. We compress high-dimensional task information within a low-dimensional representational space to mimic natural language features. In coherence with previous results, we find that providing teacher information to the student leads to a higher task completion rate and an ability to generalize tasks it has not seen before. Further, optimizing message content to maximize student reward improves information encoding, suggesting that an accurate representation in the space of messages requires bi-directional input. These results highlight the role of language as a common representation among agents and its implications on generalization capabilities.

Introduction

In exploring task representations in biological and artificial agents, studies have traditionally emphasized the role of self-experience and common circuitry priors 1 , 2 . Intriguingly, shared neural representations underlie similar behaviors among conspecifics 3 . Indeed, common convergent abstractions are also essential for communication among individuals of the same species or group 4 . Such social pressure implies that neural circuits may have evolved to produce internal representations that are not only useful for a given individual but also maximize communication efficacy, which has been argued to be essential in the development of cognition 5 , 6 , 7 .

We posit that social communication is crucial in providing task-efficient representations underpinning the generalization of experiences among cooperative agents. The hypothesis that context and communication alter the task representation can be attributed to the introduction of language games 8 and supported by the ability of the neural activity to represent semantic hierarchies 9 . Early studies in this direction focused on the conditions and constraints that would allow an artificial language to evolve and how similar this construction would be to human communication 10 , 11 , 12 , 13 . With the advent of deep learning, there has been a surge of work that combines multi-agent systems with communication policies 14 , 15 , 16 , 17 , 18 , 19 , 20 . This includes studies on multi-agent games where agents send and receive discrete messages to perform tasks 21 , translation tasks 22 , and low-level policy formation through competition or collaboration 23 . Further work has highlighted the importance of pragmatic approaches 24 , the contrast between scientific or applied models in language emergence 25 , and multi-agent cooperative learning 26 . However, these studies have focused more on the performance consequences of the communication system instead of examining the nature of shared representations.

To understand the interplay between the environmental experiences and the internal abstractions, we build on a teacher-student framework to develop a communication protocol that allows agents to cooperate while solving a common task 27 , 28 . We employ a reinforcement learning (RL) framework 29 , which has been previously used in artificial agents 21 , 30 , to produce empirical task abstractions that vary among agents. Using this RL-based student-teacher framework, we can recapitulate features considered to be critical for language 31 , including “interchangeability” 32 , where individuals can both send and receive messages 33 , “total feedback”, where speakers can hear and internally monitor their own speech 34 or “productivity”, where individuals can create and understand an infinite number of messages that have not been expressed before 35 .

In contrast with previous work, we focus on understanding how hidden representations can be shared between agents and what effect the structure of the lower-dimensional language space has. We are particularly interested in three aspects: how agents internally abstract real-world variables, how these abstractions translate into a common, shareable language, and how these elements interact. Hence, we opted for a non-discrete language model to directly compare the continuous nature of both brain processes and real-world phenomena. By feeding into the language model the learned information provided by RL agents performing a navigational task 36 , we investigated the development of natural language as it arises from social and decision-making processes 37 . This leads to individualized abstractions emerging organically rather than being pre-defined, in contrast to supervised learning methods 22 , 38 , 39 . By analyzing the structure of the language embedding, we can gain insights into information content in the message space and its relation to neural representations underpinning task performance and generalization. Additionally, our approach stands apart from previous non-supervised, symbolic methods 28 , 40 , 41 , 42 , 43 , taking cues from continuous language generation models 44 , 45 and animal communication systems, like the bee waggle dance, which translate a continuous environment into a concise message space 46 , 47 , also seen in human languages 44 , 45 .

To summarize, we present a tractable framework for studying emergent communication, drawing upon established multi-agent language models. We disentangle the relation between the internal neural representations and the message space, contributing the following three results to the neuroscience and neuroAI communities:

Reveal structural features in the lower-dimensional embedding space necessary for higher student success and task generalization 6 , 48 .

Demonstrate how the structure of the lower-dimensional embedding or message space is altered to enhance the information content when the communication channel is provided with feedback to optimize student performance.

Understand how the hidden representations can be used in studying symmetric communication, i.e., where the sender and receiver can be interchanged, highlighted recently as an important challenge in the field 32 .

Model architecture

To study the emergence of language between agents, we define two agents passing information to each other: a teacher and a student. Both of these agents are modeled as deep neural networks, whereby the teacher network is trained in an RL framework, and the student learns to interpret the instructions of the teacher 29 , 49 , 50 . We used RL due to our interest in analyzing shared and generalizable abstractions arising from individual experiences and strategies instead of predetermined labels. Additionally, RL provides an intuitive and robust connection to neuroscience 29 , 51 , which we aim to take advantage of to gain insight into the mechanisms and features of language emergence.

In our setup, the teacher agent is presented with a task with complete access to its observations and rewards (Fig.  1 ). After a certain amount of training, the teacher will have obtained sufficient information to represent the task. The teacher network aims to produce a state-action value function or Q-matrix ( Q ( s ,  a )) of the task, which contains the expected return of state-action pairs, hence learning in a model-free and off-policy form. The student then aims to solve the same task but with additional information from the teacher, which we will refer to as a “message” (Fig.  1 a). Thus, the student must learn and complete the task through its observations and the message from the teacher. In our framework, we assume each teacher observes and learns from a single task and then passes a relevant task message—e.g., information derived from the Q-matrix—to the student. In that way, students can succeed on tasks they have yet to encounter by correctly interpreting the given information (Fig.  1 b).
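As a rough illustration of how a teacher could arrive at a Q-matrix for one maze, the sketch below uses plain tabular Q-learning on a 4 × 4 grid. This is a simplification and not the authors' method: the teacher described above is a deep network, and the reward values, learning rate, exploration rate, and the function name `train_teacher_q` are all assumptions chosen for illustration.

```python
import numpy as np

def train_teacher_q(goal, walls=(), size=4, episodes=2000, alpha=0.5,
                    gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning stand-in for a teacher; returns Q of shape (size*size, 4)."""
    rng = np.random.default_rng(seed)
    moves = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left as (row, col)
    q = np.zeros((size * size, 4))
    start = (size - 1, 0)  # bottom-left corner, as in the maze description
    for _ in range(episodes):
        pos = start
        for _ in range(4 * size * size):  # cap on episode length (assumed)
            s = pos[0] * size + pos[1]
            # epsilon-greedy behaviour policy; the update below is off-policy
            a = rng.integers(4) if rng.random() < epsilon else int(np.argmax(q[s]))
            nxt = (pos[0] + moves[a][0], pos[1] + moves[a][1])
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if not inside or nxt in walls:
                nxt, reward = pos, -1.0   # assumed penalty: stay in place
            elif nxt == goal:
                reward = 1.0              # assumed goal reward
            else:
                reward = -0.01            # assumed small step cost
            done = nxt == goal
            s2 = nxt[0] * size + nxt[1]
            target = reward + (0.0 if done else gamma * q[s2].max())
            q[s, a] += alpha * (target - q[s, a])
            pos = nxt
            if done:
                break
    return q
```

Calling `train_teacher_q(goal=(0, 3), walls=((1, 1),))` returns a 16 × 4 array whose row-wise argmax gives a greedy policy; information derived from such an array is what would be compressed into a message.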

figure 1

a Model sketch depicting a generalist student agent that is provided messages from teacher agents for various tasks. The student learns to decode these messages and then perform the relevant tasks. b (Top) Representative navigation tasks used to train and test agents to analyze the social learning framework. Beginning in the bottom left corner, the agents aim to reach the goal (trophy) in as few steps as possible while avoiding the walls (light blue squares). (Bottom) Overlaid are example policies for tasks learned by the teacher agents. The student needs to decode the encoded version of this information it receives. Messages ( m i ) may contain erroneous instructions or be misunderstood by the student (red squares). c Detailed communication architectures used in this study. In each of the three approaches, task information (Q-matrices in our framework) is first learned by teacher agents, who then pass this information through a sparse autoencoder (language proxy), which generates the associated low-dimensional representations, m i . When student feedback is absent (top row), these representations m i are provided directly to the student who learns to interpret them to solve task i . In the case of student feedback (middle row), we also allow feedback from the student performance to propagate back to the language training and enhance the usefulness of the messages. The final schematic (bottom row) depicts the “closing-the-loop" architecture. Here, the student is trained on a set of messages from expert teachers. Once it is sufficiently competent, its task information is supplied to itself (after being passed through the language embedding trained with feedback), and the effect on performance is studied.

The most crucial component of the architecture is the communication process. Natural language is a lower-dimensional representation of higher-dimensional concepts. When one individual speaks to another, high-dimensional descriptors (e.g., time, location, shape, context) of a concept in the brain of the sender are encoded into a low-dimensional vocabulary that is decoded back into a higher-dimensional and distributed representation in the brain of the receiver. This is supported by the observed semantic correlations and low-dimensional embedding space of human language representations 45 , 52 , 53 and in the brain activity 9 , 54 , 55 , which is congruent across species 6 , 56 . To mimic this interaction, we introduced a sparse autoencoder (SAE) 57 that takes the information from the teacher and produces a compressed message, m , passed to the student alongside the task. SAEs are neural networks that consist of two parts, an encoder and a decoder, and promote sparsity in the lower-dimensional representations. The encoder continuously projects the teacher network’s output, Q , onto a message, m , which is a real-valued vector of length K . The decoder then uses this message to minimize the difference between its reconstruction \(Q^{\prime}\) and the true Q .

Furthermore, inspired by the sparse coding hypothesis 54 , we assume that the brain, and thus, by extension, language, is inherently sparsity-promoting 58 , 59 . We implemented this by adding the norm of the message vector to the autoencoder loss, which follows the principle of least effort to guide our artificial communication closer to natural language (see eq. ( 4 ) in the “Methods”). Here, we utilized the L 1 -norm of the message vector, which promotes zeroes in the message and therefore yields sparse messages that carry less information.
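The encoder, decoder, and L 1 penalty described above can be sketched as a single forward pass. This is a minimal sketch under stated assumptions, not the architecture of the paper: it uses one linear layer on each side, a ReLU on the message, a flattened Q-matrix, and a hypothetical `sparsity_weight`; the exact form of eq. (4) is given in the “Methods”.

```python
import numpy as np

def sae_loss(q, enc_w, dec_w, sparsity_weight=0.1):
    """Reconstruction plus L1 sparsity loss for one flattened Q-matrix.

    q:      flattened teacher output, shape (D,)
    enc_w:  encoder weights, shape (K, D)  (single layer, an assumption)
    dec_w:  decoder weights, shape (D, K)  (single layer, an assumption)
    """
    message = np.maximum(0.0, enc_w @ q)        # encoder: project Q onto a length-K message
    q_rec = dec_w @ message                     # decoder: reconstruction Q'
    reconstruction = np.mean((q - q_rec) ** 2)  # difference between Q' and Q
    sparsity = np.sum(np.abs(message))          # L1 norm of the message
    return reconstruction + sparsity_weight * sparsity, message
```

With K = 5 and a 4 × 4 maze with four actions, `q` would have 64 entries and `message` five. Minimizing the L1 term pushes individual message entries to exactly zero, which is the sparsity effect the text refers to.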

The combination of one teacher, SAE, and student for an arbitrary task can be seen in Fig.  1 c, where we depict three different language-student interaction protocols. We note that this framework differs from an approach where all agents and languages are connected via one network. Instead, the teachers are always trained separately to generate the relevant task information. Then, we either sequentially train the language and student networks (no feedback in Fig.  1 c) or connect the language and the student by providing the auto-encoder feedback on the student performance (Fig.  1 c, with feedback). In essence, the teacher and the language (feedback and non-feedback) are connected conceptually through the information transfer process but not in a way that results in a single neural network or shared gradient flow. Variations of this approach were employed by Foerster et al. 34 and Tampuu et al. 60 , who studied “independent Q-learning” where the agents each learn their own network parameters, treating the other agents as part of the environment. In this framework, we study a goal-directed navigational task in a grid-world maze (see “Methods”, Fig.  1 b). We chose this relatively simple toy problem for three reasons: its straightforward implementation, which allows us to focus on analyzing the message structure; its usefulness in studying generalization and exploration strategies; and the possibility of extending the framework to more complex navigational settings 29 . We emphasize that the above architecture does not rely on a predetermined vocabulary to which the agents must assign meaning. Instead, the language evolves naturally from the task and the lower-dimensional encoding, mirroring natural language evolution.

The purpose of this study is two-fold: (i) analyze the structure of the lower-dimensional representations generated by the trained language (which are lower-dimensional representations of our tasks), and (ii) evaluate the performance of an agent who has learned to interpret a message coming from this embedding space.

The structure of the lower-dimensional message

We trained a set of teachers to solve one maze task each with a specific goal location and wall setting. As mentioned above, we use the trained language to embed the Q-matrices into a lower-dimensional space—firstly, considering a language created without feedback by the student. The resulting latent space shows wall positions as the most prominent dimension in the lower-dimensional representations (Fig.  2 a(ii)), with goal locations being a secondary feature of the variability (Fig.  2 a (iii)). This structure is represented in the lower-dimensional PCA through discrete groupings with minimal overlap and stratification within each grouping according to the goal location. This result follows intuitively from the fact that the language is trained without student feedback, only relying on the reconstruction of the Q-matrix and regularization of the message space (eq. ( 4 )). Thus, to achieve this reconstruction most sparsely, a hierarchical structure appears: first, we distinguish mazes, and then, within each maze, we distinguish the goal location. This structure appears regardless of whether this information is helpful for the student. We note that when we used linear activations or singular value decomposition for the language encoding, we did not reproduce this clear grouping (cf. Supplementary Figs.  S1 and  S2 ).

figure 2

a Principal Component Analysis (PCA) of the lower-dimensional messages of size K  = 5 obtained from a language encoding without student feedback (eq. ( 4 )) for all possible tasks in the 4 × 4 mazes with ≤1 walls (see “Methods” for a description of the tasks). i) Explained variance by principal component. ii)-iii) depicts the messages highlighted by the position of the single wall (gray refers to the maze with no walls) and by the position of the goal, respectively. b Result of the message encoding now including student feedback achieved by using eq. ( 1 ) for the loss function. i)-iii) depict the same concepts as in ( a ). iv) shows messages highlighted by the preferred first student action (step up or right). c PCA of the messages with student feedback from an example grid-world (depicted in ii)).

While direct labeling of the tasks by such dimensions may help the student solve trained tasks, the average performance concerning trained tasks and generalization is significantly lower than when student feedback helps shape the language (see Supplementary Figs.  S3 and   S4 ). Furthermore, this interaction is purely one-directional and does not reflect the natural emergence of language, which is a back-and-forth between the receiver and sender. Therefore, we introduced student feedback into the message structure to encourage this natural evolution of language. Such feedback is implemented by including and maximizing the probability of the student finding the goal in the language training. This translates to a compound autoencoder loss function of the form

where the \({{{\mathcal{L}}}}_{{{\rm{goal}}} \, {{\rm{finding}}}}\) is defined by eq. ( 5 ) in the “Methods” and ζ is a tunable hyperparameter. After each trial of the student, the language is updated to (i) maintain the reconstruction of the information, (ii) promote sparsity of the message, and (iii) increase the success rate of the student given the message.
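Read together with the three update goals listed above and the SAE loss of eq. ( 4 ), the compound loss can be written schematically as below. This is a reconstruction from the surrounding description, not the typeset equation of the paper: the subscripts on the first two terms are assumed names, and any separate weight on the sparsity term is taken to be absorbed into it.

```latex
\[
\mathcal{L}_{\rm SAE,feedback}
  = \underbrace{\mathcal{L}_{\rm reconstruction} + \mathcal{L}_{\rm sparsity}}_{\text{SAE loss, eq. (4)}}
  + \zeta \, \mathcal{L}_{\rm goal\ finding}
\]
```

Setting ζ = 0 recovers the language trained without student feedback.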

Notably, the latent structure of the language space significantly changes through this reward-maximizing term (Fig.  2 b(ii)–(iv)). Even if the variance distribution remains similar (compare Fig.  2 a(i),b(i)), task settings are no longer clustered in the latent space, but instead form a more continuous gradient when marked by wall position (Fig.  2 b(ii)) or goal location (Fig.  2 b(iii)). Therefore, the feedback changes the lower-dimensional task representations so that the student obtains more information on where to go, i.e., the policy, rather than the actual composition of the state space. We note some overlap in the middle of the cluster when marking the tasks by goal location; here, the policy differences are negligible as there might be two competing policies that are equally optimal. This focus on policy is additionally emphasized by the variability along the initial action of the student (Fig.  2 b(iv)), where a clear split between the two choices of going right or up can be observed. By providing this policy label, language moves away from providing maze labels and towards a framework that can generalize to tasks the student has not seen before. Table  1 shows the changes in explained variability by wall position and goal location in both languages without and with student feedback. Notably, the message variability between groups of goal locations (see Methods) rises when the utility constraint is introduced, marking the increased importance of describing the goal location accurately in the language.

This focus on policy, rather than state space, appears to be independent of the architecture of the autoencoder we use (cf. all linear activations in Supplementary Fig.  S1 ) or the dimensionality reduction technique we employ (see Supplementary Figs.  S5 – S9 for results using PCA, UMAP, and t-SNE, related within and between variances in Supplementary Table  S1 ). Additionally, the projection of messages to the main dimensions of a linear decoder was consistent with the unsupervised representational space (Supplementary Fig.  S10 ). This implies that transmitting this representational feature is fundamental to the success of the student.

We can extend this analysis to understand the student feedback representation of the different goal locations for a single maze, where more than 80% of the variance is explained by a single principal component (Fig.  2 c(i)). Geometric structure (Fig.  2 c(iii)) and action selectivity (Fig.  2 c(iv)) are well represented in the embedding, the former indicating that language is performing a simple linear transformation of the geometric shape of the maze. We hypothesize that such information hierarchy benefits overall learning and generalization. We note that these results hold independent of the activation function (Supplementary Figs.  S1 ,   S2 ).

One crucial feature of language is its compositionality, which can be measured through neighborhood properties; for example, close meanings in a compositional language should map to nearby messages. To test whether our communication protocol includes this feature in its mapping from meaning to message space, we performed topographic similarity analysis 21 , 61 by comparing the distances in the message space (Euclidean distance) against (i) the task labels, i.e., spatial difference in the mazes and (ii) the information that the teacher provides to the autoencoder (Q-matrices). The distance in the labels was calculated as a weighted sum of the differences between the goal and the wall locations (see Fig.  3 a). In contrast, the distance of the teacher Q-matrices, representing a space-based and action-based meaning, was calculated using the Frobenius norm.

figure 3

a Visualization of goal (Δ g ) and wall (Δ w ) distance vectors between two tasks (task 1: solid, task 2: checkered). In combination, these are used to compute a spatial task distance Δ t  =  ∣ ∣ R wall Δ w ∣ ∣ 2  +  ∣ ∣ R goal Δ g ∣ ∣ 2 for ( b ). b , c Comparison of pairwise task meaning distances and pairwise distances of the corresponding messages. Message and task distances are measured with the Euclidean norm, Q-matrix distances with the equivalent for matrices, the Frobenius norm. Center and end points of the bars refer to the mean over five languages and ± one standard deviation, respectively. m refers to the gradient of linear fit. d Entropy analysis through discretization: The first two PCs of normalized samples of each data type are binned. The entropy is then computed for different choices of bin size. Here, we depict an example discretization for 5 bins in each PC direction (bin side lengths are identical along both axes). e Calculating the entropy for each PC discretization demonstrates a clear ranking of the teacher, message, and student information-carrying capabilities. The maximum possible entropy is that of a uniform distribution over all maze tasks.

The calculated pairwise distances are plotted in Fig.  3b , c. The message and meaning spaces show topographic similarity as indicated by the positive slope parameters of the linear regressions. Taking this slope as a quantitative measure, we also find that languages trained with feedback show higher degrees of topographic similarity (and thus compositionality) than languages trained without feedback (cf. blue vs red).
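The slope used here as a measure of topographic similarity can be computed along the following lines. The sketch covers only the Q-matrix variant of the meaning distance (Frobenius norm) against Euclidean message distance; the function name and the use of an ordinary least-squares line via `np.polyfit` are assumptions, since the fitting procedure is not spelled out in this section.

```python
import numpy as np

def topographic_slope(messages, q_matrices):
    """Slope of a linear fit of pairwise message distance on pairwise meaning distance.

    messages:    sequence of 1-D message vectors, one per task
    q_matrices:  sequence of 2-D teacher Q-matrices, one per task
    """
    meaning_d, message_d = [], []
    n = len(messages)
    for i in range(n):
        for j in range(i + 1, n):
            # meaning distance: Frobenius norm between teacher Q-matrices
            meaning_d.append(np.linalg.norm(q_matrices[i] - q_matrices[j], ord="fro"))
            # message distance: Euclidean norm between message vectors
            message_d.append(np.linalg.norm(messages[i] - messages[j]))
    slope, _intercept = np.polyfit(meaning_d, message_d, deg=1)
    return slope
```

A positive slope means that tasks with similar Q-matrices receive similar messages; comparing slopes between two languages requires that their message spaces be on comparable scales.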

Another property of emergent (discrete) communication is its tendency to minimize entropy 62 , i.e., the general pressure towards communication efficiency. To see whether this effect was reproduced in our framework, we performed an information theoretic analysis using Shannon entropy to measure the information-carrying capabilities of our messages. As Shannon entropy is restricted to random variables taking discrete values and our messages arise from a continuous embedding space, we projected our messages onto a set of bins and then calculated the entropy of that discrete distribution. For visualization purposes, we only show the first two PCs and normalized the distributions with an identical factor along both PC axes so that samples were restricted to [−1, 1] 2 (see Fig.  3d for an example). While this means that some information is omitted, direct comparison of the entropy of tasks (the maximum entropy value), teacher outputs, messages, and student output becomes possible. We calculated the entropy across various bin widths to verify that our findings are independent of our chosen bin size.
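The discretization step can be sketched as follows: project the samples onto their first two principal components, rescale both axes by one common factor so that all points fall in [−1, 1]², bin, and take the Shannon entropy of the bin occupancies. The function name, the use of an SVD for the PCA, and base-2 logarithms are assumptions made for this illustration.

```python
import numpy as np

def binned_entropy(samples, n_bins=5):
    """Shannon entropy (in bits) of samples binned on their first two PCs."""
    x = np.asarray(samples, dtype=float)
    x = x - x.mean(axis=0)                       # center before PCA
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    pcs = x @ vt[:2].T                           # scores on the first two PCs
    pcs = pcs / (np.abs(pcs).max() or 1.0)       # one factor for both axes -> [-1, 1]^2
    counts, _, _ = np.histogram2d(pcs[:, 0], pcs[:, 1], bins=n_bins,
                                  range=[[-1, 1], [-1, 1]])
    p = counts[counts > 0] / counts.sum()        # occupied-bin probabilities
    return float(-(p * np.log2(p)).sum())
```

Evaluating this for several values of `n_bins` on teacher outputs, messages, and student outputs gives curves that can be compared directly, in the spirit of the bin-width check described above.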

We found that the entropy decreases when moving through the communication framework, i.e., the entropy of the teacher outputs is highest, followed by the entropy of the messages and finally the entropy of the student outputs (see Fig.  3e ). This result is intuitive because, at each stage, information is lost as it passes through an agent/network. Nonetheless, when comparing the framework where the auto-encoder is provided with feedback on the student performance to the one without feedback, we see that more information is retained. This implies that bi-directionality is critical to the ability of the language to transmit helpful information.

Additionally, we confirm one of the findings of ref. 62 , which states that there is pressure for a language to be as simple as possible and that this pressure is amplified as we increase communication channel discreteness. We simulate this scenario by removing the reconstruction loss from the auto-encoder training so that the auto-encoder loss only consists of the sparsity promotion and the student performance feedback. This amplifies the pressure of the auto-encoder to generate a sparse message space. We find that the entropy of the messages and student output is significantly lower than the situation with the reconstruction loss (no reconstruction data in Fig.  3e ). Furthermore, the entropy values of the messages and student output do not differ significantly, meaning that the messages and student output have collapsed onto distributions with similar information-carrying capabilities.

Interestingly, the addition of student feedback reduced the overall reconstruction error of the message space (Fig.  4 a–c, Eq. ( 4 )). This may suggest that the reconstruction of the teacher Q-matrix benefits from including features guided by utility and transmissibility criteria. Nevertheless, this comes at the cost of lower sparsity (Fig.  4 c), mirroring the effect of natural language: communication aims to transmit the most sparse message, allowing for the best reconstruction of the underlying idea. Overall, these three items achieve a similar level of compound loss in both feedback and non-feedback (Fig.  4 d).

figure 4

Components of the compound autoencoder training losses with and without student feedback; ( a ) reconstruction loss, ( b ) sparsity loss, and ( c ) SAE loss, which additionally includes the goal finding loss when student feedback is included (see Eq. ( 4 ) and Eq. ( 5 )). d shows the difference \({{{\mathcal{L}}}}_{{{\rm{SAE}}}}-{{{\mathcal{L}}}}_{{{\rm{SAE,feedback}}}}\) which highlights that the student-feedback autoencoder achieves a lower reconstruction loss. e – h Student performance on training and test maze tasks (see “Methods” for a description of the tasks). A comparison is made between the informed student, who receives the correct message, the misinformed student, who receives a message corresponding to a random task, and two random walkers, one of which never walks into walls (smart random walker). The performance is further evaluated for seven sets of trained goal locations, i)-vii), displayed as the green squares in the inset figures of ( e ). In the 4 panels, the bars represent the mean task performance  ± the SEM. In ( e ), the performance is measured for the trained goal locations and trained mazes with 0 or 1 wall state. f shows the solve rate for the unknown goal locations (white) for mazes with 0 or 1 wall state. g , h depict the solve rates for new mazes (2 wall states) with trained and unknown goal locations, respectively. The error bars for each case refer to the variance across languages (five languages were trained for each case). * refers to p  < 0.05 where the p is obtained using a two-sided and one-sided t -test (multiple tests were accounted for using a Bonferroni correction factor) for the informed vs misinformed and smart random walker, respectively. Exact p -values for the significant results can be found in Table  6 .

The effect of the message on student performance

In order to test the performance and generalization capabilities of the student, we used messages from teachers who mastered mazes with zero or one wall state and trained the student on patterned subsets of their goal locations (Fig.  4 e, inset). We define the task solve rate as the percentage of goals attained in under \(2{s}_{{\rm{opt}}}\) steps, where \({s}_{{\rm{opt}}}\) is the length of the shortest path from start to goal (see “Methods”).

Under these terms, we can observe an increased performance of the student against misinformed students (given incorrect messages) and random walkers (one of which avoids walls) when evaluating the trained goal sets (Fig.  4 e). We note that even in this scenario, this misinformed student slightly outperforms the random walkers, which we hypothesize is because of the initial action preference we observe for all messages, which allows the misinformed student to avoid the outer walls. To ascertain whether the generalization of the goal locations across the messages was achieved, we tested the performance of the students on unknown goals. We observe that the best generalization is achieved under checkerboard patterns. However, the performances of the other four cases do not differ significantly from the random walkers (Fig.  4 f). This implies that generalization is difficult when large portions of the task space are unknown, and interpolation between known tasks is not possible. Training on far-away goal locations leads to slightly better performance (Fig.  4 f(v), (vii)), but this might also be due to a wall-avoiding action preference. In this respect, when adding new wall locations, the overall performance is reduced, but the improvement against the other agents is preserved (Fig.  4 g, h). These results highlight the importance of the goal-oriented structure of the lower-dimensional representations for these tasks and reinforce the benefits of the altered language achieved by the student feedback. In line with previous observations (Fig.  2 b), the main features of the encoded message are the policy and goal location. Therefore, when the agent attempts to solve the maze with unknown goals, it performs markedly worse. This behavior is only avoided when the student is trained on the checkerboard pattern, which means it has seen the entire maze and can use the information presented and its own experience to compensate for the lack of information. 
In other words, new tasks must be composable from other tasks within the language framework for communication to succeed.

We note that the above results arise from a language generated with student feedback, i.e., the representations that help the student have direct knowledge of the student parameters. To ascertain whether this language is useful to students who were not directly involved in the language training, we studied the performance of novel students who were trained to interpret “frozen” languages without feedback (Supplementary Fig.  S3 ) and with feedback (Supplementary Fig.  S4 ). We note that the former approach treats all the components of our framework (teacher, language, student) as separate networks and that no gradient information is propagated back through the language channel. We see that in both cases, the students perform well on many tasks and can generalize to unknown scenarios. However, the student trained to interpret the frozen feedback language performs better across all scenarios (and outperforms the smart walker). This implies that initial feedback is fundamental to generating helpful language features that other new students can use.

Closing the loop

As natural language is not usually restricted to sender and receiver but is a robust exchange between two agents (i.e., “interchangeability” 31 ), our final analysis studies the effect of passing task information gained by the student through the language encoding to obtain a set of novel messages. Rather than solely relying on a set of teachers that perform single tasks and pass on compressed information, we allow the student to generate messages itself after performing, and thus learning, tasks with messages from teachers. These student messages are then passed back to the student, and its performance with these messages is assessed. A schematic depicting this structure can be seen in Fig.  1 c (bottom row). Thus, we attempt to create a simple generalist agent to supply information through the same language encoding, which we keep fixed. This communication process will naturally erode the message, inviting comparison with the children’s game “telephone”. That setting is often used to study which information is robust to communication erosion, and we use the analogy to identify the type of information that is more transmissible between agents 63 .

Firstly, we can observe a degradation of the information content. Notably, the low-dimensional form of the student-generated task information entails that variability among student messages is mainly concentrated on a single dimension that is identifiable by the goal location and initial action (Fig.  5 a(i), Table  2 ). This contrasts with previous findings that the message space of teachers was not dominated by one principal component, and variability also corresponded to wall arrangements (Fig.  2 b). We then turn to the task completion rates of the students. Here, both the informed and misinformed students are given messages resulting from encoding the task information the student has learned when supplied with the teacher messages. The informed student is supplied with the encoded message corresponding to the current task, while the misinformed student is provided a message from a random task. From a performance perspective, we note that the degradation of the message content translates into lower task solve rates (Fig.  5 b–e). This decrease can be seen even when considering trained goal locations (Fig.  5 b). Nevertheless, informed students performed better than the misinformed agents, which implies that the degraded messages still include sufficient information to avoid walls and find the goal state.

figure 5

a PCA on the student messages (encoded messages arising from the student task information): i) shows the variance explained by PC. ii) - iv) depict the student messages marked by wall position, goal location, and probability of initial student action, respectively. b – e Informed student performance on training and test maze tasks (see “Methods” for details) is compared against the misinformed student and two random walkers. The comparison is once more performed for the seven sets of trained goal locations, (i)–(vii). In the four panels, the bars represent the mean task performance ± the SEM. The relevant tasks per panel are identical to Fig.  4 . For ( b – e ), 25 languages were originally trained and evaluated, but a subset was excluded (see “Methods” for details on this exclusion). * refers to p  < 0.05 where the p is obtained using a two-sided and one-sided t -test (multiple tests were accounted for using a Bonferroni correction factor) for the informed vs misinformed and smart random walker, respectively. Exact p -values for the significant results can be found in Table  6 .

Given that the key features of the message arising from the lower-dimensional representations are the goal location and initial action features, it is unsurprising that, as long as the goals are known, the informed student performs well on maze tasks it has not seen before (Fig.  5 d). When considering the performance of the student on unknown goals, for both trained (Fig.  5 c) and untrained mazes (Fig.  5 e), we note that, in most cases, the informed student performs, at most, on par with the smart random walker. This indicates that the message has a detrimental effect on the student: it cannot generalize to goals it has not seen, as the information provided does not allow it to build an adequate representation of the task. Finally, even though the degraded messages do not carry significant information about the world configuration, we hypothesize that what remains is sufficient to produce minimally better performance in the known worlds than in the unknown worlds.

We conclude that the student output retains pertinent task information that can enhance the performances of other students, even if degraded. This can be seen in the solve rates of the informed student. They are always higher than those of the misinformed student, allowing us to assume that the student can use relevant information within the degraded message. However, generalizability to unknown goals is lost under this framework, even when the student previously achieved high success rates (Fig.  4 f, h, checkerboard).

Nevertheless, these results represent an early attempt to analyze task-driven communication with generalist agents. Notably, one key aspect is how a compromise or balance between tutoring and learning can be achieved in multi-task and multi-agent systems to keep a relevant and generalizable message space. In other words, relevant features across tasks can be captured by a centralized embedding generated by individual experiences of agents, similar to how biological agents behave.

Task-relevant representations, either in the brain 2 , 64 , as part of a linguistic system 4 , 65 or in artificial agents 1 , ought to be generalizable. Humans take advantage of this generalizability to perform new or slightly different tasks from the ones they may have encountered before. For instance, when learning to ride a bicycle, an individual does not need to relearn all the principles of balance and coordination when switching to a different bike or even another mode of transportation, like a scooter or a motorcycle. Similarly, an artificial agent faced with an out-of-distribution task may need to draw on its internal representations and their generalizability to complete it successfully. However, it remains an open question how social agents can reconcile abstractions from their own experience with those acquired through communication.

We present a multi-agent RL system of teacher-to-student communication that accounts for task-wide variability. Notably, embedding the state-action value function into a low-dimensional format leads to effective abstractions, enabling agents to learn goals and states flexibly across trials from model-free instructors. Additionally, we introduce a framework to analyze the nature of such communication protocols. Drawing inspiration from Tucker et al. 28 , our research builds upon their findings that agents can effectively communicate in noisy environments by clustering tokens within a learned, continuous space. Additionally, we reference the work of Foerster et al. 34 , who developed a model for independent parameter learning by agents, along with a system facilitating real-valued messages during centralized learning. Unlike the approach of Foerster et al., which shares gradient information between agents for end-to-end training, our method uses a continuous channel solely for task representations and trains agent parameters separately, without shared gradient data. This approach yielded a latent structure that prioritized variability along the goal space instead of the maze configuration, contrasting with the prominence of the state space in solely teacher-based models.

Using this framework, we studied the representational nature of the message space that includes the student return in training. We found this approach improved performance and yielded a latent structure that prioritized variability along the goal space instead of the maze configuration, in contrast with the prominence of the state space in the solely teacher-based one. Thus, by including reward-based constraints in language development, we saw that the communication channel could prioritize answering to task structure while acquiring a similar—or superior—reconstruction error. Akin to “total feedback” 31 in human speech, where the speaker modifies their message based on environmental factors/presence of other individuals, the student-autoencoder structure adjusted the message space based on the student performance.

Additionally, we studied the importance and sensitivity of this language space by feeding back the state-action value maps of students in a similar manner to the telephone game 63 . The degraded information confirms the relationship between the quality of representations of agents and their performance and points to the importance of a good sample space for constructing language. Despite this degradation, we retained certain features that were important for task performance (e.g., the ability to interpret novel messages using known messages, akin to “productivity” 31 ), an effect similar to one that has been observed in human speech 66 .

Overall, our results indicate that a generalist agent should be able to relate to the language space in an invariant manner. This opens up avenues to study the importance of specific social structures, such as teacher or student roles, which may be critical to a robust language space and to moderate the information flow.

The implications of our study suggest possible analogies with natural languages. First, our system evolves according to a utility or gain function, not solely to error minimization or comprehensibility. Lossless information transmission is insufficient for competent behavior, and the message space needs to adapt to be advantageous for other agents. This is similar to natural language, where morphemes evolve according to the motives, goals, and efficiency of a group 37 , 67 . For instance, in birds, it has been observed that utility drives the emergence of new linguistic relations or compositions 68 , 69 . Second, introducing dimensionality and sparsity constraints is motivated by anatomical and cognitive limitations, such as vocal tract size or memory capacity 70 , 71 . Hence, by allocating a predefined number of dimensions to our communication system, we replicate such properties and observe that these are organized into hierarchical task-relevant modes. However, ongoing work still aims to answer how channel size relates to the representational space, as machine learning and brain activity tend to converge to a high-dimensional space in the representations that are not shared by the actual symbolic space 55 , 72 . Studies have shown that brain activity is compressed relative to the message space even if our languages are not precisely low-dimensional 9 , 53 .

In this study, we did not utilize sequential composition to generalize our message 39 , 73 . Instead, we aimed to generalize through interpolation of the continuous messages. Nonetheless, the framework can readily be extended to include composable messages using sequential sets of tasks (termed “duality of patterning” in the work of Hockett 31 ), which will be the focus of future studies. Additionally, in contrast to natural language, our model lacks predefined syntax or grammar with respect to behavioral variables 74 , 75 . By encoding task-relevant information (e.g., Q-matrices), the lower-dimensional embedding space was biased to include features of the task information indirectly. This was observed in the emergent hierarchical latent structure obeying task variables and is similar to social species that show cultural or experience-dependent complexity in their linguistic traits, like non-primate mammals such as bottlenose dolphins 76 or naked-mole rats 77 . In this sense, we presume that the neural representations and circuitry of the agents evolve and rewire to enable social learning 5 . By doing so, we look at the interplay between the community scale and the cognitive one instead of fixing communication or neuronal representations. Thus, research around generalist agents performing dual roles as teachers and students is crucial. This involves creating agents with distinct sender and receiver units and an experience-based policy. Additionally, examining the impact of the social graph on language construction and expanding to further tasks is vital.

Thus, inspired by Tieleman et al. 48 , who employed an encoder-decoder model to examine how community size influences message representations, future work should systematically examine diverse channel structures. This would make it possible, for instance, to reverse the student-teacher roles and understand how information emerges and propagates through embeddings. Additionally, it would be interesting to examine more model-agnostic outcomes using sequential network architectures, e.g., recurrent neural networks or transformers. Furthermore, since discrete communication protocols are more commonly used than the continuous approach in our work 28 , 40 , 41 , 42 , 43 , they should be integrated into our existing framework. Finally, as introduced by Dupoux 78 , there are several features critical for the study of language emergence and language learning: (i) being computationally tractable, (ii) using realistic tasks that can be performed by real biological agents, and (iii) using the results of those biological agents as benchmarks for the artificial agent performance. In this sense, while tractability is a key component of our framework, we emphasize its utility to neuroethologists, who can work with biological data within our framework to study brain activity in relation to language abilities in future studies.

In conclusion, we have introduced a multidisciplinary approach to studying language emergence using reinforcement agents and an encoding network. Instead of treating our system in a fashion akin to linguistics, we approach the communication problem as a top-down representation problem starting at the neural representations. This framework opens compelling avenues to generate hypotheses about the interplay between individual and collective behavior and those abstractions, both internal and external, emerging from social communication.

Teacher agent Q-learning

In our communication model, shown in Fig.  1 , the navigation task solutions are learned by the teacher agent (implemented via a multilayer perceptron) via deep Q-learning 29 . The Q -value for action a and state s , \(Q(s,a)\), represents the maximum future return achievable by any policy after taking action a in state s . Despite the small state-action space, using an artificial neural network provides the flexibility to apply the framework to future tasks that may have much larger state-action spaces. Concretely, the teacher agents are trained to output Q -values satisfying the Bellman equation:

\(Q(s,a)={{{\mathcal{R}}}}^{s,a}+{\gamma }_{{{\rm{Bellman}}}}\mathop{\max }_{a^{\prime} }Q(s^{\prime},a^{\prime} )\)   (2)

Thus the expected future reward is composed of the immediate reward, \({{{\mathcal{R}}}}^{s,a}\) , of the action, a , and the maximum reward the agent can expect from the next state \(s^{\prime}\) onward when behaving optimally, i.e., picking the action that promises the most reward. The temporal discount \({\gamma }_{{{\rm{Bellman}}}}\in \left[0,\, 1\right]\) signifies the uncertainty about rewards obtained for future actions (\({\gamma }_{{{\rm{Bellman}}}}=0\) would be maximum uncertainty; we use \({\gamma }_{{{\rm{Bellman}}}}=0.99\), see Supplementary Table  S2 ).
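As a small worked illustration of the target described above (immediate reward plus the discounted best value of the next state), the following sketch evaluates it for a single transition. The function name, the example Q-values, and the handling of terminal states are assumptions made for illustration; only the discount of 0.99 is taken from the text.

```python
GAMMA_BELLMAN = 0.99  # temporal discount reported in the text


def bellman_target(reward, next_q_values, terminal=False, gamma=GAMMA_BELLMAN):
    """Immediate reward plus the discounted maximum Q-value of the next state.

    Assumption: no future value is added once the episode has ended.
    """
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical Q-values of the next state for the four actions.
target = bellman_target(-0.1, [0.5, 1.2, -0.3, 0.0])
print(round(target, 3))  # -0.1 + 0.99 * 1.2 = 1.088
```

With a discount close to one, the target is dominated by the best value reachable from the next state, which is why distant goals still shape the Q-values near the start.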

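To train the DQN, we minimize the mean squared error between the left- and right-hand sides of Eq. ( 2 ), i.e., we minimize

\({{{\mathcal{L}}}}_{{{\rm{DQN}}}}=\frac{1}{| {{\mathcal{T}}}| }{\sum }_{\langle s,a,{{{\mathcal{R}}}}^{s,a}\rangle \in {{\mathcal{T}}}}{\left(Q(s,a)-\left[{{{\mathcal{R}}}}^{s,a}+{\gamma }_{{{\rm{Bellman}}}}\mathop{\max }_{a^{\prime} }Q(s^{\prime},a^{\prime} )\right]\right)}^{2}\)   (3)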

where \({{\mathcal{T}}}\) is a set of transitions (state, action, and corresponding reward) \(\langle s,a,{{{\mathcal{R}}}}^{s,a}\rangle\) that the teacher DQN is trained on in the current optimization step. One optimization step is performed after each step the agent takes in the maze. The transition set \({{\mathcal{T}}}\) is composed of two kinds of transitions: i) “long-term memory” transitions, which are all unique transitions the agent has seen since training began, and ii) additionally weighted “short-term memory” transitions, which are the last L transitions the agent has seen. Therefore, the transitions that have recently been executed several times have a higher impact on the loss function \({{{\mathcal{L}}}}_{{{\rm{DQN}}}}\) than the ones that were encountered a long time ago.
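The construction of the transition set and the resulting loss can be sketched as follows. This is a minimal illustration, not the authors' implementation: a plain dictionary of Q-values stands in for the network, the extra weight of short-term memory is assumed to come from simply repeating recent transitions, and each transition is assumed to store the next state explicitly.

```python
def build_transition_set(history, short_term_len):
    """Unique transitions seen so far, plus the last L transitions repeated."""
    long_term = list(dict.fromkeys(history))  # unique, first-seen order
    short_term = history[-short_term_len:] if short_term_len > 0 else []
    return long_term + short_term


def dqn_loss(q, transitions, gamma=0.99):
    """Mean squared difference between Q(s, a) and its Bellman target."""
    total = 0.0
    for state, action, reward, next_state in transitions:
        target = reward + gamma * max(q[next_state])
        total += (q[state][action] - target) ** 2
    return total / len(transitions)


# Hypothetical two-state example with two actions per state.
q_table = {0: [0.0, 1.0], 1: [2.0, 0.0]}
history = [(0, 1, -0.1, 1), (1, 0, -0.1, 0), (0, 1, -0.1, 1)]
transition_set = build_transition_set(history, short_term_len=2)
print(len(transition_set), dqn_loss(q_table, transition_set))
```

Because recent transitions appear in both memories, they are counted more than once in the mean, which matches the stated intent that recently executed transitions weigh more heavily in the loss.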

Network specifications

The student and teacher networks are identical multilayer perceptrons apart from the input dimension. Each neuron in the two networks (except for the K message neurons) has a ReLU activation function and a bias parameter. The number of parameters per layer for the student and teachers is listed in Table  3 . The autoencoder neural network, which we use as a language proxy, consists of convolutional layers in addition to the fully connected layers. We use convolutions because the entries of the Q-matrix represent the states of the two-dimensional grid-world and, therefore, include spatial information that the network needs to learn. Thus, the input (Q-matrix of the teacher) is processed by two convolutional layers in the first half of the autoencoder, followed by one fully connected linear layer that outputs the message vector. After this dimensionality reduction, the decoding half of the autoencoder aims at reconstructing the original Q-matrix from the message vector. The architecture of the autoencoder is summarized in Table  4 . For a brief study of the effect of different hyperparameters see Supplementary Figs.  S11 and   S12 .

Training and test tasks

The square grid-world setting consists of a grid of size n  ×  n (see examples in Figs.  6 , 7 ). Given that impenetrable walls surround each maze, this gives the agent an effective number of possible states (including the initial state where the agent starts, the goal state the agent has to reach, and the wall states the agent cannot cross) equal to \(\tilde{n}\times \tilde{n}\) , where \(\tilde{n}=n-2\) . In all cases, the agent starts in the bottom left corner. During the training of the SAE and student, we only include mazes with zero and one interior wall state, which gives us \(({\tilde{n}}^{2}-1)+({\tilde{n}}^{2}-1)({\tilde{n}}^{2}-2)={({\tilde{n}}^{2}-1)}^{2}\) possible maze-solving tasks. The agent moves through the grid-worlds with four discrete actions: single steps to the right, up, left, and down. Each episode starts with the agent at the initial state and ends when the goal is reached or the maximum number of steps has been taken. To avoid potential infinite loops or movements into the walls, the agent receives a small negative reward for any action (\({R}_{{\rm{step}}}=-0.1\)) and a larger negative reward for hitting any wall (\({R}_{{\rm{wall}}}=-0.5\)). If the agent reaches the goal, it receives a large positive reward (\({R}_{{\rm{goal}}}=2\)).
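The dynamics and reward scheme described above can be sketched as a single step function. The coordinate convention (origin in the bottom left corner), the function and variable names, and the choice to return exactly one of the three rewards per step rather than summing them are assumptions for illustration; the reward values and the bounce-back at walls follow the text.

```python
R_STEP, R_WALL, R_GOAL = -0.1, -0.5, 2.0
ACTIONS = {"right": (1, 0), "up": (0, 1), "left": (-1, 0), "down": (0, -1)}


def step(state, action, walls, goal, size=4):
    """Return (next_state, reward, done) for one move in the grid-world."""
    dx, dy = ACTIONS[action]
    nxt = (state[0] + dx, state[1] + dy)
    outside = not (0 <= nxt[0] < size and 0 <= nxt[1] < size)
    if outside or nxt in walls:
        return state, R_WALL, False  # bounce back: the agent stays in place
    if nxt == goal:
        return nxt, R_GOAL, True  # reaching the goal ends the episode
    return nxt, R_STEP, False


# Start in the bottom left corner of a maze with one interior wall state.
print(step((0, 0), "right", walls={(1, 0)}, goal=(3, 3)))  # ((0, 0), -0.5, False)
print(step((0, 0), "up", walls={(1, 0)}, goal=(3, 3)))     # ((0, 1), -0.1, False)
```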

figure 6

The student always starts in the bottom left corner. Light blue squares mark wall locations, which cannot be accessed. The 4 × 4 mazes with 0 or 1 wall state comprise the training tasks.

As tasks for training the language and the student, we used all 4 × 4 grid-worlds with 0 or 1 wall state, amounting to 16 worlds in total (see Fig.  6 ). In world 0 (top left), there are 15 possible tasks, i.e., goal locations, namely all states that are neither a wall nor the starting location (all white squares without inset in the figure). Similarly, in each of the 15 worlds with a single wall, there are 14 possible tasks. In total, this amounts to 15 + 15 × 14 = 225 tasks used for training the language and the student agent. During the teacher training, the Q -values of the wall state positions of the teacher Q-matrix are set to 0, as the agent can never visit them (due to bounce back).

For unknown tasks, we chose all possible configurations of mazes with two wall states, six examples of which are shown in Fig.  7 . We eliminated mazes that led to inaccessible states, leading to 101 possible configurations with two walls, each with 13 goal locations. Therefore, the test set was made up of 1313 test tasks in total.
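The task counts quoted above can be checked by brute-force enumeration. The sketch below places a given number of wall states on every combination of non-start cells, keeps only mazes in which all open cells remain reachable from the start, and counts one task per possible goal location. The function name and coordinate convention are illustrative choices.

```python
from collections import deque
from itertools import combinations


def count_mazes_and_tasks(n_walls, size=4, start=(0, 0)):
    """Count valid mazes with n_walls wall states and their goal-location tasks."""
    cells = [(x, y) for x in range(size) for y in range(size)]
    candidates = [c for c in cells if c != start]
    n_mazes = n_tasks = 0
    for walls in combinations(candidates, n_walls):
        open_cells = set(cells) - set(walls)
        # Breadth-first search from the start over open cells.
        seen, queue = {start}, deque([start])
        while queue:
            x, y = queue.popleft()
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nxt = (x + dx, y + dy)
                if nxt in open_cells and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if seen == open_cells:  # no open state is cut off by walls
            n_mazes += 1
            n_tasks += len(open_cells) - 1  # every open cell except the start
    return n_mazes, n_tasks


print(count_mazes_and_tasks(0))  # (1, 15)
print(count_mazes_and_tasks(1))  # (15, 210)
print(count_mazes_and_tasks(2))  # (101, 1313)
```

Of the 105 ways to place two walls on the 15 non-start cells, only the four placements that seal off a corner are excluded, which gives the 101 test mazes.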

figure 7

The student always starts in the bottom left corner. Light blue squares mark wall locations, which cannot be accessed. The 4 × 4 mazes with 2 wall states, except those where permissible states are cut off by walls, comprise the test tasks.

The full autoencoder loss

The loss function for the SAE (which does not include student feedback) is defined as:

where κ is a hyperparameter, which we can adjust to increase the importance of either the reconstruction or sparsity, Q is the input Q-matrix, \(Q^{\prime}\) the reconstruction of the autoencoder, and m is the lower-dimensional message.
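To make the trade-off controlled by κ concrete, the sketch below combines a reconstruction term and a sparsity term into one scalar. The specific choices here, a mean squared reconstruction error, a mean absolute value of the message entries as the sparsity penalty, and κ multiplying only the sparsity term, are assumptions for illustration and may differ from the exact form of Eq. ( 4 ).

```python
def sae_loss(q, q_reconstructed, message, kappa):
    """Assumed form: MSE(Q, Q') + kappa * mean(|m|)."""
    reconstruction = sum(
        (a - b) ** 2 for a, b in zip(q, q_reconstructed)
    ) / len(q)
    sparsity = sum(abs(v) for v in message) / len(message)
    return reconstruction + kappa * sparsity


# Hypothetical flattened Q-matrix, its reconstruction, and a 2-dimensional message.
print(sae_loss([1.0, 0.0], [0.0, 0.0], [0.5, -0.5], kappa=2.0))  # 0.5 + 2 * 0.5 = 1.5
```

Under this assumed form, a larger κ pushes the encoder toward messages with small entries at the cost of a less faithful reconstruction.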

We also included the student in the training of the language. Therefore, we augmented the autoencoder loss to include a term that enforced the usefulness of the messages to the student. This was done by first generating the student output for each possible state \(s=({x}_{s},{y}_{s})\), which consisted of four real numbers representing the four possible actions. Applying a softmax function to those values, we obtained action probabilities for the four actions in each state. Given all the action probabilities, we could calculate the state occupancy probabilities for the student after any number of steps k . We then defined the solve rate of a task as the state occupancy probability of the goal state after k steps, as this state could not be left once it had been reached. We aimed for optimal solutions to be found; therefore, we always allowed the student only \(k={k}_{{\rm{opt}}}\) steps to solve the task during training, where \({k}_{{\rm{opt}}}\) was the length of the shortest path to the goal.
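The occupancy-based solve rate described above can be sketched as follows: per-state outputs are turned into action probabilities with a softmax, and the probability mass is propagated for k steps with the goal treated as absorbing. The data layout (a dictionary from state to four output values), the action order, and the assumption that blocked moves leave the agent in place are illustrative choices, not details confirmed by the text.

```python
import math

ACTIONS = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # right, up, left, down


def softmax(values):
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def solve_rate(student_out, walls, goal, k, size=4, start=(0, 0)):
    """Probability of occupying the goal state after k steps."""
    occupancy = {start: 1.0}
    for _ in range(k):
        updated = {}
        for state, prob in occupancy.items():
            if state == goal:  # absorbing: the goal cannot be left
                updated[state] = updated.get(state, 0.0) + prob
                continue
            for (dx, dy), p_action in zip(ACTIONS, softmax(student_out[state])):
                nxt = (state[0] + dx, state[1] + dy)
                inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
                target = nxt if inside and nxt not in walls else state
                updated[target] = updated.get(target, 0.0) + prob * p_action
        occupancy = updated
    return occupancy.get(goal, 0.0)


# A student with identical outputs everywhere acts uniformly at random.
uniform = {(x, y): [0.0] * 4 for x in range(4) for y in range(4)}
print(solve_rate(uniform, walls=set(), goal=(1, 0), k=1))  # 0.25
```

Because the result is a smooth function of the student outputs, a quantity of this kind can serve as a training signal, whereas a sampled success/failure outcome cannot be differentiated directly.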

This amounts to the first term in eq. ( 5 ) of the student goal finding loss. The exponent was chosen to avoid the local minimum of the loss in which a small number of training tasks are not solved at all while the majority is solved perfectly. The second term in eq. ( 5 ) is a regularization of the student output while the hyperparameter γ controls the relation between the two parts. The regularization of the student Q-matrix is also normalized by the number of its entries \(4{\tilde{n}}^{2}\) .

Analysis of variance in the message spaces

We analyze the structure of the different message spaces by studying the relative variances explained by the two features describing each navigation task: the placement of the walls and the goal’s location.

In this context, two types of variance can be computed: a variance within groups and a variance between groups , where a group is made up of either all tasks within a maze (i.e., same wall position) or all tasks with the same goal location. The former variance is lower when each group is clustered tightly, but the distance between groups is large. The latter variance is lower when the means of the groups cluster tightly, but there is a larger data spread within each group. To simplify the equations that follow, we introduce M , the total number of messages, N , the number of distinct groups, \({M}_{i}\), the number of elements in group i , and \({m}_{ij}\), which refers to the j -th message of group i . Then, the mean of each group is \({\bar{m}}_{i}=\frac{1}{{M}_{i}}{\sum }_{j=1}^{{M}_{i}}{m}_{ij}\) and the overall mean is \(\bar{x}=\frac{1}{M}\mathop{\sum }_{i=1}^{N}\mathop{\sum }_{j=1}^{{M}_{i}}{m}_{ij}\) . Thus the variance within and between groups of messages is defined by

Here, we introduce a value β , which allows for a comparison between the two different variances. When β is close to one, the variance between groups dominates and vice versa.

Using the above variances, we can statistically test whether the means of all message groups (grouping either by wall position or goal location) are significantly different from each other by introducing the concept of the F -value, which is defined as the ratio of the mean square distance between groups, MSB, and the mean square distance within groups, MSW:

The group means differ significantly when the F -value is greater than a critical F -statistic (depending on a significance threshold p and the degrees of freedom). As we removed the world with no wall states in the analysis of variances, the values of \({F}_{{\rm{crit}}}\) (listed in Table  5 ) are the same in both grouping cases (by maze and by goal). The two degrees of freedom are N  − 1 = 14 and M  −  N  = 195.
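The variance analysis can be illustrated with a standard one-way decomposition for vector-valued messages. The sketch assumes the usual sums of squared Euclidean distances for the within- and between-group terms, divides them by M − N and N − 1 to obtain the mean squares, and assumes β to be the between-group share of the total sum of squares; the exact definitions used in the paper may differ.

```python
def anova_messages(groups):
    """groups: list of groups, each a list of message vectors (lists of floats).

    Returns (F-value, beta) under the assumptions stated above.
    """
    dim = len(groups[0][0])
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)

    def mean(vectors):
        return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    overall = mean([m for g in groups for m in g])
    group_means = [mean(g) for g in groups]
    ss_within = sum(
        sq_dist(m, gm) for g, gm in zip(groups, group_means) for m in g
    )
    ss_between = sum(
        len(g) * sq_dist(gm, overall) for g, gm in zip(groups, group_means)
    )
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    beta = ss_between / (ss_between + ss_within)  # assumed definition
    return ms_between / ms_within, beta


# Two tight, well-separated groups of one-dimensional messages.
f_value, beta = anova_messages([[[0.0], [2.0]], [[10.0], [12.0]]])
print(f_value, round(beta, 4))  # 50.0 0.9615
```

For the setting in the text, 15 groups and 210 messages give the stated degrees of freedom of 14 and 195.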

Statistical methods

One-sided t -tests were used when comparing the informed and misinformed students against the smart random walker, and two-sided t -tests were used when comparing the informed student against the misinformed student. When multiple groups were considered, a Bonferroni correction factor was included in the t -tests.
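The Bonferroni correction mentioned above can be sketched in a few lines: each p-value is multiplied by the number of tests (capped at one) before being compared with the significance threshold. The p-values below are hypothetical and serve only to show the mechanics; the threshold of 0.05 matches the figure captions.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, significant?) for each test."""
    n_tests = len(p_values)
    adjusted = [min(1.0, p * n_tests) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical raw p-values from three comparisons.
for p_adj, significant in bonferroni([0.001, 0.02, 0.04]):
    print(p_adj, significant)
```

In this example only the first comparison remains significant after correction, although all three raw p-values are below 0.05.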

Language filtering used in the close-the-loop protocol

Initially, 25 languages were trained and evaluated when the student information was encoded and used for the navigation maze in Fig.  5 . However, within that set of languages, we encountered a subset of languages (~30%) that led to lower solving rates for the informed student than the misinformed student or random walker on the trained tasks. The structure of these languages, which we defined as inefficient, led to a loss of task-critical information during the encoding. These languages were only removed from the set of languages we analyzed in Fig.  5 . This language filtering was performed by retaining languages that led to a higher average task-solving rate for the informed student (receiving the message from encoded student information) compared to the average solving rate of the misinformed student and the random walker (all measured on the trained tasks). We included the misinformed student in the criterion to test whether our language is dysfunctional, i.e., the correct message leads to worse performance than a random message. Additionally, we included the random walker in this criterion because, although it is intuitive that the informed student should perform better than the misinformed student, the language may nonetheless not provide a competitive advantage over taking random actions. Therefore, we check (i) if the language is inherently functional and (ii) if it provides information that the student can use. The languages that fail our criterion are those that lead to task information loss or do not imbue the student with generalization abilities. Mathematically, we retain a language if

where \({\mathbb{E}}\) is the expected value over all trained tasks, \({S}_{I}\) refers to the informed student who is provided the correct distorted message, and \({S}_{M}\) is the misinformed student. We argue that this can be viewed through the biological lens where selective pressures favor more adaptive or efficient systems. Akin to the effect of natural evolution, where weak and inefficient members (in this case, languages) die out, languages that are detrimental to the student do not survive. For completeness’ sake, the full set of languages is displayed in Supplementary Fig.  S13 . As expected, we see a decrease in the performance of the informed student for all scenarios. Nonetheless, in all cases where the informed student previously performed better than the random walker/misinformed student, that ordering is maintained, i.e., even using the sub-optimal language embeddings allows the informed student to perform better at the tasks.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Source data are provided with this paper. The data used to generate the figures in this work can be found in the following public GitHub repository: github.com/meggl23/multi_agent_language, zenodo.org/doi/10.5281/zenodo.7885526 (ref. 79).

Code availability

Computer code to train the agents, generate languages, and plot the figures can be found in the following public GitHub repository: github.com/meggl23/multi_agent_language, zenodo.org/doi/10.5281/zenodo.7885526 (ref. 79).

References

1. Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828 (2013).
2. Behrens, T. E. et al. What is a cognitive map? Organizing knowledge for flexible behavior. Neuron 100, 490–509 (2018).
3. Safaie, M. et al. Preserved neural dynamics across animals performing similar behaviour. Nature 623, 765–771 (2023).
4. Tomasello, M. The cultural origins of human cognition (Harvard University Press, 2009).
5. Dunbar, R. I. The social brain hypothesis. Evolut. Anthropol. 6, 178–190 (1998).
6. Hasson, U., Ghazanfar, A. A., Galantucci, B., Garrod, S. & Keysers, C. Brain-to-brain coupling: a mechanism for creating and sharing a social world. Trends Cogn. Sci. 16, 114–121 (2012).
7. Wilson, E. O. The Social Conquest of Earth (Liveright Publishing Corporation, 2012).
8. Wittgenstein, L. Philosophical Investigations (transl. Anscombe, G. E. M.) (Basil Blackwell, 1953).
9. McKenzie, S. et al. Hippocampal representation of related and opposing memories develop within distinct, hierarchically organized neural schemas. Neuron 83, 202–215 (2014).
10. Kirby, S. & Hurford, J. Learning, culture and evolution in the origin of linguistic constraints. In Proc. Fourth European Conference on Artificial Life 493–502 (MIT Press, 1997).
11. Steels, L. The synthetic modeling of language origins. Evolut. Commun. 1, 1–34 (1997).
12. Cangelosi, A. & Parisi, D. Computer simulation: a new scientific approach to the study of language evolution. In Simulating the Evolution of Language (Springer, 2002).
13. Wagner, K., Reggia, J. A., Uriagereka, J. & Wilkinson, G. S. Progress in the simulation of emergent communication and language. Adapt. Behav. 11, 37–69 (2003).
14. Havrylov, S. & Titov, I. Emergence of language with multi-agent games: learning to communicate with sequences of symbols. In Proc. Advances in Neural Information Processing Systems, Vol. 30 (2017).
15. Kottur, S., Moura, J. M., Lee, S. & Batra, D. Natural language does not emerge 'naturally' in multi-agent dialog. arXiv preprint arXiv:1706.08502 (2017).
16. Jaques, N. et al. Social influence as intrinsic motivation for multi-agent deep reinforcement learning. In Proc. International Conference on Machine Learning 3040–3049 (PMLR, 2019).
17. Lowe, R., Foerster, J., Boureau, Y.-L., Pineau, J. & Dauphin, Y. On the pitfalls of measuring emergent communication. arXiv preprint arXiv:1903.05168 (2019).
18. Kajić, I., Aygün, E. & Precup, D. Learning to cooperate: emergent communication in multi-agent navigation. arXiv preprint arXiv:2004.01097 (2020).
19. Yuan, L. et al. Emergence of pragmatics from referential game between theory of mind agents. arXiv preprint arXiv:2001.07752 (2020).
20. Andreas, J. Language models as agent models. In Findings of the Association for Computational Linguistics: EMNLP 2022 5769–5779 (Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022).
21. Lazaridou, A., Peysakhovich, A. & Baroni, M. Multi-agent cooperation and the emergence of (natural) language. arXiv preprint arXiv:1612.07182 (2016).
22. Lee, J., Cho, K., Weston, J. & Kiela, D. Emergent translation in multi-agent communication. arXiv preprint arXiv:1710.06922 (2017).
23. Sukhbaatar, S., Denton, E., Szlam, A. & Fergus, R. Learning goal embeddings via self-play for hierarchical reinforcement learning. arXiv preprint arXiv:1811.09083 (2018).
24. Fried, D., Tomlin, N., Hu, J., Patel, R. & Nematzadeh, A. Pragmatics in language grounding: phenomena, tasks, and modeling approaches. arXiv preprint arXiv:2211.08371 (2022).
25. Lazaridou, A. & Baroni, M. Emergent multi-agent communication in the deep learning era. arXiv preprint arXiv:2006.02419 (2020).
26. Oroojlooy, A. & Hajinezhad, D. A review of cooperative multi-agent deep reinforcement learning. Appl. Intell. 53, 1–46 (2022).
27. Rita, M., Chaabouni, R. & Dupoux, E. "LazImpa": lazy and impatient neural agents learn to communicate efficiently. arXiv preprint arXiv:2010.01878 (2020).
28. Tucker, M. et al. Emergent discrete communication in semantic spaces. Adv. Neural Inf. Process. Syst. 34, 10574–10586 (2021).
29. Sutton, R. S. & Barto, A. G. Reinforcement learning: an introduction (MIT Press, 2018).
30. Ndousse, K. K., Eck, D., Levine, S. & Jaques, N. Emergent social learning via multi-agent reinforcement learning. In Proc. International Conference on Machine Learning 7991–8004 (PMLR, 2021).
31. Hockett, C. F. The origin of speech. Sci. Am. 203, 88–97 (1960).
32. Galke, L., Ram, Y. & Raviv, L. Emergent communication for understanding human language evolution: what's missing? arXiv preprint arXiv:2204.10590 (2022).
33. Haber, J. et al. The PhotoBook dataset: building common ground through visually-grounded dialogue. arXiv preprint arXiv:1906.01530 (2019).
34. Foerster, J., Assael, I. A., De Freitas, N. & Whiteson, S. Learning to communicate with deep multi-agent reinforcement learning. Adv. Neural Inf. Process. Syst. 29 (2016).
35. Lake, B. M. & Baroni, M. Human-like systematic generalization through a meta-learning neural network. Nature 623, 115–121 (2023).
36. Ku, A., Anderson, P., Patel, R., Ie, E. & Baldridge, J. Room-across-room: multilingual vision-and-language navigation with dense spatiotemporal grounding. arXiv preprint arXiv:2010.07954 (2020).
37. Seyfarth, R. M. & Cheney, D. L. The evolution of language from social cognition. Curr. Opin. Neurobiol. 28, 5–9 (2014).
38. Chaabouni, R., Kharitonov, E., Dupoux, E. & Baroni, M. Anti-efficient encoding in emergent communication. In Proc. Advances in Neural Information Processing Systems, Vol. 32 (2019).
39. Chaabouni, R., Kharitonov, E., Bouchacourt, D., Dupoux, E. & Baroni, M. Compositionality and generalization in emergent languages. arXiv preprint arXiv:2004.09124 (2020).
40. Bratman, J., Shvartsman, M., Lewis, R. L. & Singh, S. A new approach to exploring language emergence as boundedly optimal control in the face of environmental and cognitive constraints. In Proc. 10th International Conference on Cognitive Modeling 7–12 (Drexel University, 2010).
41. Mordatch, I. & Abbeel, P. Emergence of grounded compositional language in multi-agent populations. In Proc. AAAI Conference on Artificial Intelligence, Vol. 32 (2018).
42. Chaabouni, R. et al. Emergent communication at scale. In Proc. International Conference on Learning Representations (2021).
43. Rita, M., Strub, F., Grill, J.-B., Pietquin, O. & Dupoux, E. On the role of population heterogeneity in emergent communication. arXiv preprint arXiv:2204.12982 (2022).
44. Rosenfeld, R. et al. A maximum entropy approach to adaptive statistical language modelling. Comput. Speech Lang. 10, 187 (1996).
45. Bengio, Y., Ducharme, R. & Vincent, P. A neural probabilistic language model. In Proc. Advances in Neural Information Processing Systems, Vol. 13 (2000).
46. Tinbergen, N. The evolution of signalling devices. In Social Behavior and Organization Among Vertebrates 206–230 (1964).
47. Dong, S., Lin, T., Nieh, J. C. & Tan, K. Social signal learning of the waggle dance in honey bees. Science 379, 1015–1018 (2023).
48. Tieleman, O., Lazaridou, A., Mourad, S., Blundell, C. & Precup, D. Shaping representations through communication: community size effect in artificial learning systems. arXiv preprint arXiv:1912.06208 (2019).
49. Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 1140–1144 (2018).
50. François-Lavet, V. et al. An introduction to deep reinforcement learning. Found. Trends Mach. Learn. 11, 219–354 (2018).
51. Lee, D., Seo, H. & Jung, M. W. Neural basis of reinforcement learning and decision making. Annu. Rev. Neurosci. 35, 287–308 (2012).
52. Rocktäschel, T., Bošnjak, M., Singh, S. & Riedel, S. Low-dimensional embeddings of logic. In Proc. Annual Meeting of the Association for Computational Linguistics (2014).
53. Antonello, R., Turek, J., Vo, V. A. & Huth, A. G. Low-dimensional structure in the space of language representations is reflected in brain responses. In Proc. Neural Information Processing Systems (2021).
54. Huth, A. G., De Heer, W. A., Griffiths, T. L., Theunissen, F. E. & Gallant, J. L. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature 532, 453–458 (2016).
55. Caucheteux, C. & King, J.-R. Brains and algorithms partially converge in natural language processing. Commun. Biol. 5, 134 (2022).
56. Robotka, H. et al. Sparse ensemble neural code for a complete vocal repertoire. Cell Rep. 42, 112034 (2023).
57. Ng, A. et al. Sparse autoencoder. CS294A Lect. Notes 72, 1–19 (2011).
58. Manning, C. & Schutze, H. Foundations of statistical natural language processing (MIT Press, 1999).
59. Olshausen, B. A. & Field, D. J. Sparse coding of sensory inputs. Curr. Opin. Neurobiol. 14, 481–487 (2004).
60. Tampuu, A. et al. Multiagent cooperation and competition with deep reinforcement learning. PLoS ONE 12, e0172395 (2017).
61. Brighton, H. & Kirby, S. Understanding linguistic evolution by visualizing the emergence of topographic mappings. Artif. Life 12, 229–242 (2006).
62. Kharitonov, E., Chaabouni, R., Bouchacourt, D. & Baroni, M. Entropy minimization in emergent languages. In Proc. International Conference on Machine Learning 5220–5230 (2020).
63. Mesoudi, A. & Whiten, A. The multiple roles of cultural transmission experiments in understanding human cultural evolution. Philos. Trans. R. Soc. B Biol. Sci. 363, 3489–3501 (2008).
64. Flesch, T., Saxe, A. & Summerfield, C. Continual task learning in natural and artificial agents. Trends Neurosci. 46, 199–210 (2023).
65. Ten Cate, C. Assessing the uniqueness of language: animal grammatical abilities take center stage. Psychon. Bull. Rev. 24, 91–96 (2017).
66. Breithaupt, F., Li, B., Liddell, T. M., Schille-Hudson, E. B. & Whaley, S. Fact vs. affect in the telephone game: all levels of surprise are retold with high accuracy, even independently of facts. Front. Psychol. 9, 2210 (2018).
67. McMahon, A. & McMahon, R. Evolutionary linguistics (Cambridge University Press, 2012).
68. Engesser, S. & Townsend, S. W. Combinatoriality in the vocal systems of nonhuman animals. Wiley Interdiscip. Rev. Cogn. Sci. 10, e1493 (2019).
69. Suzuki, T. N., Wheatcroft, D. & Griesser, M. Experimental evidence for compositional syntax in bird calls. Nat. Commun. 7, 10986 (2016).
70. Fitch, W. T. The evolution of language (Cambridge University Press, 2010).
71. Christiansen, M. H. & Chater, N. The now-or-never bottleneck: a fundamental constraint on language. Behav. Brain Sci. 39, e62 (2016).
72. Mikolov, T., Yih, W.-t. & Zweig, G. Linguistic regularities in continuous space word representations. In Proc. 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 746–751 (Association for Computational Linguistics, 2013).
73. Kharitonov, E. & Baroni, M. Emergent language generalization and acquisition speed are not tied to compositionality. arXiv preprint arXiv:2004.03420 (2020).
74. Nowak, M. A. & Krakauer, D. C. The evolution of language. Proc. Natl Acad. Sci. 96, 8028–8033 (1999).
75. Spranger, M. The evolution of grounded spatial language. Computational Models of Language Evolution, Vol. 5 (Language Science Press, Berlin, 2016).
76. Janik, V. M. & Slater, P. J. Context-specific use suggests that bottlenose dolphin signature whistles are cohesion calls. Anim. Behav. 56, 829–838 (1998).
77. Barker, A. J. et al. Cultural transmission of vocal dialect in the naked mole-rat. Science 371, 503–507 (2021).
78. Dupoux, E. Cognitive science in the era of artificial intelligence: a roadmap for reverse-engineering the infant language-learner. Cognition 173, 43–59 (2018).
79. Wieczorek, T. J., Tchumatchenko, T., Wert-Carvajal, C. & Eggl, M. F. A framework for the emergence and analysis of language in social learning agents, version v5. https://doi.org/10.5281/zenodo.7885526 (Apr. 2024).

Acknowledgements

We acknowledge the support of the Institute of Experimental Epileptology and Cognition Research at the University of Bonn Medical Center and the Joachim Herz Foundation (M.F.E. and C.W-C.). This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—Project-ID 227953431—SFB 1089 (T.T.). We thank Alison Barker and Martin Fuhrmann for fruitful discussions, and all members of the Tchumatchenko group, particularly Pietro Verzelli, for feedback on the manuscript. This work was supported by the Open Access Publication Fund of the University of Bonn.

Open Access funding enabled and organized by Projekt DEAL.

Author information

These authors jointly supervised this work: Carlos Wert-Carvajal, Maximilian F. Eggl.

Authors and Affiliations

Department of Computer Science, Technical University Darmstadt, Darmstadt, Germany

Tobias J. Wieczorek

Institute of Experimental Epileptology and Cognition Research, University of Bonn Medical Center, Bonn, Germany

Tobias J. Wieczorek, Tatjana Tchumatchenko, Carlos Wert-Carvajal & Maximilian F. Eggl

Contributions

T.J.W., experimental design, code writing, data analysis, visualization, and writing; T.T., supervision, project administration, experimental design, funding, and writing; C.W-C., conceptualization, supervision, experimental design, data analysis, and writing; M.F.E., conceptualization, supervision, experimental design, data analysis, and writing.

Corresponding authors

Correspondence to Carlos Wert-Carvajal or Maximilian F. Eggl.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Communications thanks Lukas Galke and Roma Patel for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information, Peer Review File, Reporting Summary

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article.

Wieczorek, T.J., Tchumatchenko, T., Wert-Carvajal, C. et al. A framework for the emergence and analysis of language in social learning agents. Nat Commun 15, 7590 (2024). https://doi.org/10.1038/s41467-024-51887-5

Received: 30 May 2023

Accepted: 16 August 2024

Published: 31 August 2024

DOI: https://doi.org/10.1038/s41467-024-51887-5
