Sex differences in the speech sound development of young children | Research Square

Research Article

Beate Peter, Diane A. Ogiela, Laurel Bruce

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8408784/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted 5

Abstract

Past studies have demonstrated higher prevalence rates of speech sound disorders among male than female children, based on standardized articulation tests with sex-averaged norms. Here, we survey the most commonly used standardized articulation tests for properties (size, age range, sex) of their norming samples. Based on the articulation assessment with sex-specific norms across the longest age span, we investigate sex differences in raw error scores corresponding to key standard score bands.
Of 12 tests of speech sound production, six provided separate norms for females and males, but only two provided sex-specific norms for the entire sampled age span from 2 years 0 months to 21 years 11 months: the Goldman-Fristoe Test of Articulation – Third Edition (GFTA-3) and the Khan-Lewis Phonological Analysis – Third Edition (KLPA-3), which is based on the same word productions and norming sample as the GFTA-3. In the lowest range of the score distribution, 1 and 2 standard deviations below the mean, and at ages up to 6 years, both the GFTA-3 and KLPA-3 showed a distinct articulation accuracy advantage for females. Although the causes of this discrepancy are not well understood, clear clinical implications emerge: When diagnosing children with speech sound disorders toward qualifying them for interventions, it is imperative to use sex-specific norms, as sex-averaged norms may lead to over-diagnosing boys and under-diagnosing girls. Similarly, more precise prevalence rates of speech sound disorders among females and males can be obtained from sex-specific norms than from sex-averaged norms.

Keywords: speech sound disorders, standardized assessment tools, sex-specific norms, sex differences, precision diagnostics

Introduction

Speech sound production is a skill that most children anywhere in the world acquire naturally without explicit instruction. Typically, children say their first word around their first birthday; by their second birthday, many children can say anywhere from 50 to 300 different words. As children learn to talk, they typically master vowels by about age 3 years (Pollock & Berni, 2003) and consonants by age 6 years (Crowe & McLeod, 2020; McLeod & Crowe, 2018).
Although there is considerable individual variability, most children acquire the different consonants in predictable patterns, with stops (/p, b, t, d, k, g/ [1]), glides (/j/ as in “yellow,” /w/), and nasals (/m, n, ŋ/; /ŋ/ as in “wing”) mastered first, followed by fricatives (e.g., /f, v, s, z/), then affricates (/ʧ/ as in “chime” and /ʤ/ as in “jungle”), and finally liquids (/l, r/). Sounds produced in the front of the mouth (e.g., bilabials /p, b, m/ and alveolars /t, d, n/) are generally acquired before sounds produced in the back of the mouth (e.g., velars /k, g, ŋ/). These patterns have been observed in English-acquiring children as well as children acquiring many other languages (McLeod & Crowe, 2018).

The order of consonant acquisition may not be primarily influenced by chronological age but, rather, by mastery of basic motor patterns that form the foundation for acquiring successively more complex motor patterns. This hypothesis is based on a rare case of a child who did not begin to speak until age 10 years. She was born with a profound global motor disease that resolved after DNA sequencing revealed a loss-of-function variant in a gene in the dopamine pathway. Within three months of this diagnosis and the start of pharmaceutical intervention, the child began communicating orally and developed mastery of consonants in the same order as typical children at younger ages, with /k, g, ŋ/ emerging after age 12 years and /r, l/ after age 15 years (Peter et al., 2019).

Some children struggle with learning to produce speech sounds correctly by the expected ages, usually only regarding consonants, in the absence of known medical causes. The term speech sound disorder (SSD) is generally used for this scenario. Not being able to produce late-developing sounds like affricates and liquids until much later than typically expected is common among early school-age children and is usually diagnosed as articulation delay/disorder.
Leaving out parts of consonant clusters (e.g., “top” instead of “stop”) or word-final consonants (“ha” instead of “hat”), even though the individual components have been mastered, are examples of phonological delay/disorder, based on the position of sounds in words rather than the individual sound. Childhood apraxia of speech is a rare and severe subtype of speech sound disorder at the level of motor coordination, characterized, among other things, by a small consonant inventory, inconsistent word productions, mis-stressed syllables, vowel errors, mis-sequenced consonants in words, and speech that is difficult to understand (American Speech-Language-Hearing Association, 2007; Case et al., 2024; Shriberg et al., 2019).

Difficulties with learning to produce speech sounds are not rare. In a representative sample of 1,328 six-year-old children, 3.8% showed evidence of delayed speech sound production (Shriberg et al., 1999). This determination was made using the Word Articulation subtest of the Test of Language Development – 2 Primary (TOLD-2:P) (Newcomer & Hammill, 1988), consisting of 20 target words and normed for specific age bands but not separately for males and females. Interestingly, more males (4.5%) than females (3.1%) were identified as speech delayed, yielding a ratio of 1.5:1 (males:females). Similar sex differences in speech sound disorder prevalence have been reported elsewhere. For instance, 639 three-year-old children were assessed with the Speech Disorders Classification System (Shriberg et al., 1997) using phonetic transcriptions of play sessions. Similar to the TOLD-2:P, this classification system applies the same scoring rules to female and male probands. Of the assessed children, 100 (15.6%) met criteria for speech delay, and of these, 70% were males and 30% were females, yielding a ratio of 2.3:1 (males:females) among affected children.
Of the 539 children who did not qualify for a diagnosis of speech delay, 52% were males and 48% females, a ratio of 1.08:1 (males:females) (Campbell et al., 2003 ). This suggests that sex differences are most evident in the lower ranges of speech sound skills. Before moving further into sex-based characteristics of standardized tests of speech sound production, we offer two definitions and a disclaimer. The term “sex” is often used to describe maleness or femaleness on the basis of biological traits, for instance the presence or absence of a Y chromosome, whereas the term “gender” often describes maleness or femaleness as a social or behavioral construct (Bates et al., 2022). Regarding tests of speech sound production with separate norms for females and males where maleness and femaleness were treated as binary traits, we acknowledge here that this stringent binary framework may not fit all children. In addition, these assessment tools do not report on what basis participants were classified as male or female, e.g., school records, physical appearance, self-identification, or other means. Our analyses were based on the norming samples as provided in the test manuals, where data were reported in binary form for male and female participants. When a caregiver or teacher becomes concerned about a child’s ability to produce speech sounds with age-appropriate accuracy, a referral for a speech sound assessment can be made. Speech-language pathologists (SLPs) typically plan an assessment session that includes a language sample collected during a play interaction to gauge general intelligibility. For children age 2 years and up, the assessment also routinely includes a standardized test of articulation designed to elicit word productions such that all consonants can be checked for accuracy in all or most relevant word positions (e.g., /ŋ/ in medial and final but not initial positions in English). 
Vowels are not typically included in tests of articulation because incorrect speech sound productions are nearly always observed for consonants. Two exceptions to this rule are rhotic (“r-colored”) vowels (/ɝ, ɚ/ in “bird” and “teacher”) for certain dialects of English such as American English, and children with childhood apraxia of speech (CAS), who can have difficulty with a variety of vowels. Where indicated, tests of phonological patterns can be used as well. Most articulation and phonology tests are designed to be administered using pictures or objects and prompts asking the child to say the target words without providing a model. This approach allows the child to say the word as they typically would say it, rather than imitate the word as produced by the SLP, which may not be a true representation of the child’s habitual production of the word. Some tests of articulation also include subtests probing word production accuracy at the sentence level.

In the context of school-based therapy, standardized testing can play a key role in determining eligibility for services. In a survey of more than 500 school SLPs, one question was, “Which items are part of an assessment of SSD at your school?” The SLP survey responders could choose multiple items from a 10-item list, and using standardized tests ranked highest, with 73.5% of the SLPs selecting this option (Farquharson & Tambyraja, 2019).

Tests of speech sound production are normed based on representative populations (e.g., geographical region, parental SES, race, ethnicity) of children at selected age bands. The productions of a child are scored for any errors, which provide the basis for the raw score that is then compared to the norming sample at the child’s age and, where available, sex, to arrive at a standard score and percentile. The decision to provide speech therapy is based on the level of observed speech disorder severity.
Depending on the setting, a child may qualify for services based on thresholds of 1, 1.5, or 2 standard deviations (SD) below the mean or some other thresholds and/or criteria such as number of sounds impacted and percent intelligibility (i.e., percent of a child’s conversational speech judged to be understandable by others).

Tests of speech sound production vary in the composition of their norming samples in terms of inclusion/exclusion of children with disorders and, especially, in whether separate norms were calculated for male and female participants. An extensive review of psychometric characteristics of 10 widely used assessment tools (Flipsen & Ogiela, 2015) showed that eight of these tests discussed sex differences. Since then, several of these tests have been revised. The first goal of the present study, hence, was to provide an updated overview of demographic aspects of the norming samples used for the most widely used tests of speech sound production. Additional goals were to conduct detailed analyses of sex-based differences in a large norming sample, not only regarding speech sound production errors but also phonological errors.

Method

To update a previous survey of the psychometric properties of the most widely used tests of speech sound production (Flipsen & Ogiela, 2015), widely used, currently available tests were surveyed for size and sex-based attributes of their norming samples.
The following 12 assessment tools were examined: Arizona Articulation and Phonology Scale – Fourth Edition, Arizona-4 (Fudala & Stegall, 2017); Bankson-Bernthal Test of Phonology – Second Edition, BBTOP-2 (Bankson & Bernthal, 2020); Clinical Assessment of Articulation and Phonology – Second Edition, CAAP-2 (Secord & Donohue, 2013); Diagnostic Evaluation of Articulation and Phonology, DEAP-A (Dodd et al., 2006); Glaspey Dynamic Assessment of Phonology, GDAP (Glaspey, 2019); Goldman-Fristoe Test of Articulation – Third Edition, GFTA-3 (Goldman & Fristoe, 2015); Hodson Assessment of Phonological Patterns – Third Edition, HAPP-3 (Hodson, 2004); Khan-Lewis Phonological Analysis – Third Edition, KLPA-3 (Khan & Lewis, 2015); Linguisystems Articulation Test – Normative Update, LAT-NU (Bowers & Huisingh, 2018); Profiles of Early Expressive Phonological Skills, PEEPS (Williams & Stoel-Gammon, 2023); Photo Articulation Test – Third Edition, PAT-3 (Lippke et al., 1997); and Structured Photographic Articulation Test featuring Dudsberry – Third Edition, SPAT-D3 (Tattersall & Dawson, 2016).

To investigate sex differences in speech articulation in detail, the GFTA-3 was selected because of its large norming sample, extended age range, separate norms for females and males across the entire sampled age range, and the fact that the same norming sample facilitated a detailed sex-based analysis of phonological errors, described below. The GFTA-3 and its previous versions are widely used. For instance, in a 2007 survey of over 300 pediatric SLPs, the original and second GFTA versions that were available at the time (Goldman & Fristoe, 1986, 2000) were by far the most commonly used standardized tests for assessing SSD (Skahan et al., 2007).
The Sounds in Words subtest is administered by presenting pictured stimuli designed to elicit 60 words that sample the consonants of American English in their most relevant word positions (initial, medial, final) and additionally the rhotic vowel /ɚ/ (e.g., “brother”) in final position. Errors in designated words and positions are summed to arrive at a raw score that can range from 0 to 141, where high raw scores correspond to low standard scores and percentiles. A second subtest, Sounds in Sentences, samples speech sounds in word productions in the context of sentences and stories, with different stimuli selected for different ages. Here, we focus on the Sounds in Words subtest because the same stimuli are administered to all age groups, whereas the Sounds in Sentences subtest is administered only to children ages 4;0 to 6;11 (years;months).

As mentioned, the GFTA-3 was designed to sample American English dialects and, as such, covers postvocalic /r/ and the unstressed rhotic vowel /ɚ/. It was normed on 1,500 speakers of American English age 2;0 to 21;11, representative of the US population in terms of geographical region, parental SES, race, and ethnicity. Equivalent numbers of female and male participants were included so that separate norms could be constructed based on sex. The test manual does not report how the participants’ male or female designation was determined; the headers in the norms tables state that the standard scores and confidence intervals are given “by Age and Sex” (Goldman & Fristoe, 2015). In terms of general abilities, 6% of the sample was characterized as being gifted or talented, and 20% was characterized as having been diagnosed with one or more of certain conditions, e.g., 8% had a speech and/or language disorder, 4% had a disorder of attention, and 3% had a learning disability.
Regarding stratification by age, higher numbers of individuals were probed for the younger ages, likely because the greatest amount of variability in speech sound accuracy is observed while children are still in the process of acquiring their speech sound inventories. The distribution is heavily right-skewed, as most individuals age 6 years and older produce all speech sounds without errors. For the youngest ages (2;0 to 4;11), the normative sample consisted of 100 individuals for each 6-month age range. For ages 5;0 to 6;11, 200 individuals were included for each 12-month age range. For ages 7;0 to 7;11 and 8;0 to 8;11, 100 individuals were included for each 12-month range. In each of the age ranges 9;0 to 10;11 and 11;0 to 12;11, 100 individuals were included. For the oldest group (13;0 to 21;11), 100 individuals were sampled. Overall, the GFTA-3 is normed such that the population mean corresponds to a standard score (SS) of 100, with an SD of 15 points.

The KLPA-3 provides insights into phonological errors in the word productions elicited with the GFTA-3. It is based on the same norming sample as the GFTA-3. Whereas the GFTA-3 performance is calculated via the numbers of misarticulated consonants and rhotic vowels, the KLPA-3 provides a more pattern-based view of the nature of speech sound substitutions, deletions, insertions, or other error types. For instance, substituting a stop consonant for one that is made with continuous airflow (e.g., “doo” for “zoo” or “tumb” for “thumb”) is classified as stopping; leaving off final consonants regardless of their type (e.g., “sli” for “slide” or “ho” for “hot”) is classified as final consonant deletion; and substituting an alveolar consonant for a velar one (e.g., “tup” for “cup”) is classified as velar fronting. The KLPA-3 provides 12 common types of systematic speech sound errors and 12 additional ones that occur less frequently.
In some cases, a single incorrect production can trigger more than one phonological process. For instance, “bider” for “spider” counts as both cluster reduction (/sp/ → /p/) and prevocalic voicing (/p/ → /b/). Possible error points range from 0 to 160, with high raw scores corresponding to low SSs and percentiles. Corresponding SSs are normed to a mean of 100 and SD of 15, based on the same model as that used for the GFTA-3.

To investigate details of sex-based differences in speech sound production accuracy, we analyzed the GFTA-3 and KLPA-3 norming samples separately for female and male participants. Sex differences were evaluated for statistical properties in the normed distribution and for the ages at which they were greatest. Raw scores were tabulated corresponding to each of the following SSs: 130, 115, 100, 85, and 70, equivalent to 2, 1, 0, −1, and −2 SD. Where more than one raw score corresponded to the same standard score, the highest one was selected. For instance, for males in the 2;0–2;1 age band in the GFTA-3 norms table, raw scores of 69 and 70 both correspond to an SS of 100, and 70 was the raw score selected for further analysis. Conversely, if no raw score was listed in the norms as corresponding to a standard score selected for this study, an appropriate raw score was interpolated. For instance, for females in the 7;9–7;11 age band of the GFTA-3 norms table, a raw score of 5 corresponds to an SS of 87 and a raw score of 6 to an SS of 83. The interpolated raw score corresponding to an SS of 85 was 5.5. For all age bands where raw scores were available for both females and males, the difference between female and male raw scores was calculated and averaged across age bands. Peak differences were highlighted for the SS and age range in which they were observed.
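The tabulation rules described above (take the highest raw score when several map to the target SS; interpolate when none does) can be sketched in a few lines. This is only an illustrative sketch, not the authors' procedure as implemented: the function names `ss_to_sd` and `raw_for_target_ss` and the dictionary layout are hypothetical, linear interpolation is assumed because it reproduces the 5.5 example in the text, and the only data used are the two examples quoted from the GFTA-3 norms tables.

```python
def ss_to_sd(ss, mean=100.0, sd=15.0):
    """Convert a standard score to SD units (mean 100, SD 15 as stated in the text)."""
    return (ss - mean) / sd


def raw_for_target_ss(norms, target_ss):
    """Return the raw score paired with target_ss for one age band and sex.

    norms: dict mapping raw score -> standard score (hypothetical layout).
    Rule 1: if several raw scores map to target_ss, take the highest.
    Rule 2: if none does, interpolate linearly between the nearest listed
            standard scores above and below (assumed to be linear).
    Returns None when target_ss lies outside the listed range.
    """
    exact = [raw for raw, ss in norms.items() if ss == target_ss]
    if exact:
        return float(max(exact))

    above = [ss for ss in norms.values() if ss > target_ss]
    below = [ss for ss in norms.values() if ss < target_ss]
    if not above or not below:
        return None

    ss_hi, ss_lo = min(above), max(below)
    # Higher raw scores mean more errors and therefore lower standard scores,
    # so the raw score nearest the target from the high-SS side is the largest
    # raw score at ss_hi, and from the low-SS side the smallest at ss_lo.
    raw_at_hi = max(raw for raw, ss in norms.items() if ss == ss_hi)
    raw_at_lo = min(raw for raw, ss in norms.items() if ss == ss_lo)
    fraction = (ss_hi - target_ss) / (ss_hi - ss_lo)
    return raw_at_hi + fraction * (raw_at_lo - raw_at_hi)


if __name__ == "__main__":
    # Females, 7;9-7;11 (from the text): raw 5 -> SS 87, raw 6 -> SS 83.
    print(raw_for_target_ss({5: 87, 6: 83}, 85))      # 5.5
    # Males, 2;0-2;1 (from the text): raw 69 and 70 both -> SS 100.
    print(raw_for_target_ss({69: 100, 70: 100}, 100))  # 70.0
    print([ss_to_sd(ss) for ss in (130, 115, 100, 85, 70)])
```

Returning `None` outside the listed range mirrors the later observation that some standard scores, such as 130, are simply unavailable for older age bands.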
Results

Overview of Tests of Speech Sound Production

A review of the most commonly used norm-referenced tests of speech sound production (Table 1) indicates that approximately half of them provide scores based on sex, at least for some age groups. Of the 12 tests that were evaluated, six did not provide any sex-based scores. Two of these, the CAAP-2 and the GDAP, did not discuss sex differences, although they did report the sample sizes of the participants from each sex in the norming sample. The other four tests that did not provide different norms based on sex provided different levels of discussion and evaluation regarding sex differences. They primarily evaluated whether there was a sex/gender bias in the test. The BBTOP-2 also evaluated test performance by sex subgroups on three measures. The t-test results indicated that the score differences between males and females were significant for the Word index, Consonant Articulation index, and Error Pattern index scores, but that the magnitude of the effect size for each of them was small and did not indicate a sex/gender bias in the test. The LAT-NU included a study comparing the scores for male and female individuals and found that the mean difference score was −1.41 and that the magnitude of the effect size was trivial. The HAPP-3 provided mean scores for males and females but did not analyze whether the differences in the scores were significant, nor did it report the effect size of the differences.

Two of the tests use separate scores for males and females only at younger ages. The Arizona-4 provides separate scores up to the age of 5;11 and the SPAT-D3 provides separate scores up to age 6;11. After those ages, scores for male and female test takers are combined. The remaining tests, the DEAP-A, GFTA-3, KLPA-3, and PAT-3, provide separate test scores for males and females at all ages included on the test.
However, the scoring tables for males and females on the PAT-3 are not parallel to each other. The number of testing increments differs between males and females, and the increments themselves are not the same for the two sexes. There are 20 age-group increments for males and only 14 age-group increments for females. This suggests that there were marked differences in the norming sample between males and females, but this was not discussed in detail in the manual. The manual provides a table of mean male and female scores at six age intervals to demonstrate improvement in speech sound production with age, which indicates higher error scores for males than females at ages 3, 4, 5, and 6. Additional details or further discussion of the sex differences were not provided.

Thus, most speech sound production tests take sex into consideration at some level to control for demographic representation of both sexes, to evaluate the potential of sex bias, or both, but not all of them conducted detailed studies of sex differences or provided differential normative scores by sex. Table 1 summarizes key properties of the 12 selected tests of speech sound production.
Table 1
Norming samples of 12 tests of speech sound production by age range, sample size, age groups, number of single words elicited, and sex-based differences

| Test name | Age range | Sample size | Age groups | Number of single-word test items | Discussed differences in scores based on sex | Provides sex-based scoring tables |
|---|---|---|---|---|---|---|
| Arizona Articulation and Phonology Scale – Fourth Edition (Arizona-4; Fudala & Stegall, 2017) | 1;6–21;11 | 3,192 | 28 | 48 | Yes | Yes (a) |
| Bankson-Bernthal Test of Phonology (BBTOP-2; Bankson & Bernthal, 2020) | 3;0–9;11 | 770 | 13 | 80 | Yes | No |
| Clinical Assessment of Articulation and Phonology – Second Edition (CAAP-2; Secord & Donohue, 2013) | 2;6–11;11 | 1,486 | 13 | 44 | No | No |
| Diagnostic Evaluation of Articulation and Phonology – American Edition (DEAP-A; Dodd, Hua, Crosbie, Holm, & Ozanne, 2006) | 3;0–8;11 | 650 | 11 | 30 | Yes | Yes (b) |
| Glaspey Dynamic Assessment of Phonology (GDAP; Glaspey, 2019) | 3;0–10;11 | 880 | 34 | 49 | No | No |
| Goldman-Fristoe Test of Articulation – Third Edition (GFTA-3; Goldman & Fristoe, 2015)* | 2;0–21;11 | 1,500 | 49 | 60 | Yes | Yes (b) |
| Hodson Assessment of Phonological Patterns – Third Edition (HAPP-3; Hodson, 2004) | 3;0–7;11 | 886 | 10 | 50 | Yes | No |
| Khan-Lewis Phonological Analysis – Third Edition (KLPA-3; Khan & Lewis, 2015)* | 2;0–21;11 | 1,500 | 49 | 60 | Yes | Yes (b) |
| Linguisystems Articulation Test – Normative Update (LAT-NU; Bowers & Huisingh, 2018) | 3;0–21;11 | 3,037 | 20 | 49 | Yes | No |
| Profiles of Early Expressive Phonological Skills (PEEPS; Williams & Stoel-Gammon, 2023) | 1;6–3;0 | 67 | 7 | 60 | No | No |
| Photo Articulation Test – Third Edition (PAT-3; Lippke, Dickey, Selmar, & Sodor, 1997) | 3;0–8;11 | 800 | Males: 20; Females: 14 | 93 | Yes | Yes (b) |
| Structured Photographic Articulation Test featuring Dudsberry – Third Edition (SPAT-D3; Tattersall & Dawson, 2016) | 3;0–9;11 | 2,473 | 11 | 36 | Yes | Yes (c) |

* GFTA-3 and KLPA-3 are based on the same norming sample.
(a) Separate M/F norms up to age 5;11.
(b) Separate M/F norms for all ages.
(c) Separate M/F norms up to age 6;11.
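The counts drawn from Table 1 (six tests with sex-based scoring tables, four with separate norms at all ages, and two of those covering the longest age span) can be checked with a short tally. The sketch below simply re-encodes the age-range and scoring-table columns of Table 1; the tuple layout, the labels "none", "all", and "to 5;11", and the helper names are illustrative choices rather than anything taken from the test manuals.

```python
# (test, youngest age, oldest age, sex-based scoring tables) as listed in Table 1.
# Ages use the years;months notation of the text.
TABLE_1 = [
    ("Arizona-4", "1;6", "21;11", "to 5;11"),
    ("BBTOP-2", "3;0", "9;11", "none"),
    ("CAAP-2", "2;6", "11;11", "none"),
    ("DEAP-A", "3;0", "8;11", "all"),
    ("GDAP", "3;0", "10;11", "none"),
    ("GFTA-3", "2;0", "21;11", "all"),
    ("HAPP-3", "3;0", "7;11", "none"),
    ("KLPA-3", "2;0", "21;11", "all"),
    ("LAT-NU", "3;0", "21;11", "none"),
    ("PEEPS", "1;6", "3;0", "none"),
    ("PAT-3", "3;0", "8;11", "all"),
    ("SPAT-D3", "3;0", "9;11", "to 6;11"),
]


def to_months(age):
    """Convert 'years;months' (e.g. '21;11') to a number of months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


def span_months(row):
    """Length of the normed age range in months."""
    _, youngest, oldest, _ = row
    return to_months(oldest) - to_months(youngest)


# Tests with any sex-based scoring tables.
with_sex_tables = [row[0] for row in TABLE_1 if row[3] != "none"]

# Tests with separate female/male norms at every sampled age.
all_age_rows = [row for row in TABLE_1 if row[3] == "all"]
all_ages = [row[0] for row in all_age_rows]

# Among those, the tests covering the longest age span.
longest = max(span_months(row) for row in all_age_rows)
longest_span = [row[0] for row in all_age_rows if span_months(row) == longest]

if __name__ == "__main__":
    print(len(TABLE_1), len(with_sex_tables))  # 12 6
    print(all_ages)       # ['DEAP-A', 'GFTA-3', 'KLPA-3', 'PAT-3']
    print(longest_span)   # ['GFTA-3', 'KLPA-3']
```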
Sex Differences in the Goldman-Fristoe Test of Articulation – 3 Norming Sample

As mentioned, the GFTA-3 was selected for a detailed analysis of the norming tables due to its large norming sample, broad age range, and sex-specific norming tables for the entire age span covered. Raw scores can range from 0 to 141. For the SS levels defined for this study, the norms tables included multiple occurrences of raw scores of 0, and the single highest raw score, 136, was seen for males age 2;0–2;1 at SS = 70. The norms tables include the maximum raw score for females at SS = 66 and males at SS = 68, both at age 2;0–2;1. These SSs were outside the range of interest for the present study.

SSs of 130 were only available for ages 2;0 to 4;1, likely because speech sound accuracy is most variable at the youngest ages. Because near-complete speech sound mastery is expected by age 6 years, extremely high standard scores cannot be calculated past age 4;1. For SS 130, the raw scores were slightly higher for females than males, with an average difference across the available age bands of 1.1 raw score points. The greatest differences were seen during ages 2;0 to 2;6 and ranged from 1 to 4 raw score points. This indicates that the males in the norming sample outperformed females in the extremely advanced skill range and only during the earliest available age bands.

SSs of 115 were only available from ages 2;0 to 6;1, likely for similar reasons as those restricting the available age range for an SS of 130. Here, females had lower maximum raw scores than males, with an average difference across the available age bands of 1.5 raw score points. Peak differences ranged from 2 to 4 raw score points between ages 2;2 and 4;5. This indicates more advanced speech sound productions in females than males at this borderline advanced skill range and early age range.

For SSs of 100, a yet greater raw score advantage for females was evident.
Raw scores for females were lower by 2.2 points, averaged across the available age bands, compared to raw scores for males. The greatest raw score differences between females and males were seen between ages 2;0 and 6;1, ranging from 4 to 7 raw score points.

For SSs of 85, females were assigned this SS with a raw score that was 3.5 points lower, compared to males, averaged across all age bands. The greatest score differences were seen between ages 2;0 and 6;11 and ranged from 2 to 9 raw score points.

For SSs of 70, the average raw score advantage for females was at its largest. The difference, averaged across all age bands, was 5.0 raw score points. The largest differences were seen for ages 4;9 to 6;9, ranging from 4 to 10 raw score points.

Figure 1 shows raw scores corresponding to standard scores of 130, 115, 100, 85, and 70 as a function of child age, separately for females and males. Low raw scores indicate low numbers of speech sounds in error and, hence, higher speech sound accuracy.

Sex Differences in the Khan-Lewis Phonological Analysis – 3 Norming Sample

The possible raw scores for the selected SSs ranged from 0 to 160. Whereas the norms tables contained multiple occurrences of scores of 0, the single highest raw score of 102 was seen for females age 2;0–2;1 for SS = 70. The norms tables show raw scores up to 160 for SS = 40, which was not included in our investigation.

SSs of 130 were only available for the six youngest age brackets, up to the age bracket 2;10–2;11. Here, raw scores were higher for female than male probands by 2.5 points on average, with the greatest differences in the youngest age bands. This indicates overall fewer phonological errors among males than females in this advanced skill range and early age.

SSs of 115 were available up through the age bracket 6;6–6;7.
The overall raw score advantage for females, compared to males, was 1.6 points, but there was a developmental pattern: Higher raw scores occurred in females up to the age bracket 4;0–4;1, and generally higher raw scores were seen in males from that age forward. This finding shows a developmental shift from males leading females in phonological skill in the earlier ages, followed by the reverse pattern at later age brackets.

For SSs of 100, the female advantage over males was 1.4 points overall, with a male advantage for the earliest three age brackets and the female advantage setting in with the 2;8–2;9 age bracket. Peak differences for females ranged from 5.5 to 6.5 points at age brackets 5;2–5;3, 5;4–5;5, and 5;6–5;7.

For SSs of 85, the average female advantage over males was 3.1 raw score points. Here, the female advantage over males also set in with the 2;8–2;9 age bracket, with a peak raw score advantage of 10 at the age brackets 5;10–5;11 and 6;0–6;1.

For SSs of 70, the female advantage over males on average was 4.1 raw score points. It set in by age 2;6–2;7, peaking at a 10-point raw score difference at age brackets 5;4–5;5 and 5;6–5;7.

Figure 2 shows raw KLPA-3 scores as a function of child age in months for each of the sampled standard scores.

Discussion

Six of the currently most widely used tests of speech sound production provide separate standardization statistics for male and female test takers. Of these, only the Arizona-4, GFTA-3, and KLPA-3, which is based on the same norming sample as the GFTA-3, cover age ranges from toddlerhood through young adulthood. The GFTA-3 and KLPA-3 provide separate norms for the full age range of 2;0 through 21;11. These properties supported selecting the GFTA-3 and KLPA-3 for an investigation of sex-specific trajectories of articulation and phonological accuracy.

Developmental trajectories for numbers of errors corresponding to selected SSs were similar for the GFTA-3 and KLPA-3 norms.
In both tests, substantially higher error scores were generally seen in males, compared to females, with the greatest differences for SSs of 70 and 85 and substantial differences also evident for SSs of 100, roughly in the toddler, preschool, and early school ages. This finding replicates earlier findings describing disproportionate numbers of boys among 3-year-old children identified as speech delayed (Campbell et al., 2003).

Immediate clinical applications are evident for assessing and qualifying children for intervention services, as these decisions are often based on assessment findings in the low SS ranges. For example, if an SS of 85 from the GFTA-3 is used to qualify a child for services, raw scores corresponding to that SS can vary between females and males by 7 to 9 points between ages 4;9 and 6;1. At age 5;6 to 5;7, females would qualify with a raw error score of 13 or higher whereas males would qualify with a raw error score of 22 or higher. If, at that same age, an SS of 70 is used as a qualifying threshold, the corresponding raw error score would be 29 for females and 39 for males. The corresponding KLPA-3 raw error scores for females and males for an SS of 85 are 13 and 22, respectively, and for an SS of 70, 28 and 38, respectively. Intervention decisions based on sex-averaged norms will lead to over-qualifying males for services while females will be underserved. Both scenarios are problematic in their own way. Qualifying males for services when their articulatory or phonological errors are within normal limits for males not only consumes resources unnecessarily, it also sends a false message to males and their caregivers that delays or deficits were present when, compared to other males, performance was within normal limits.
Conversely, not qualifying females for services when their articulatory or phonological development is delayed relative to other females deprives them of services that have the potential to address delays early and effectively. For the age ranges in which sex-specific differences in normative speech and phonological development are evident, greater diagnostic precision can be achieved by using sex-specific norms instead of sex-averaged norms. Similarly, when assessing population prevalence rates of articulation or phonological disorders, using sex-averaged norms may lead to over-identification of males and under-identification of females. Future studies should re-evaluate previously reported male:female prevalence ratios (Campbell et al., 2003; Shriberg et al., 1997; Shriberg et al., 1999) in light of sex-specific developmental trajectories for the age ranges in which sex-based differences are most pronounced. The KLPA-3 raw scores showed a male advantage in the youngest ages, especially for the highest SSs. This pattern was seen in the GFTA-3 only for an SS of 130. It is unclear what drove this reversal in male/female advantage after the youngest age bands. Whereas nearly the full range of raw scores was represented in the GFTA-3 norms tables (0 to 136 out of 0 to 141) for the SS range from 70 to 130, the KLPA-3 raw score range was truncated (0 to 102 out of 0 to 160). A possible reason for this difference is that phonologically based speech sound errors are diverse in nature and represent only a subset of all speech sound errors. Generally, sex-based differences emerged at slightly earlier ages and persisted longer in the KLPA-3 norms, compared to the GFTA-3 norms. An in-depth analysis of the actual phonological processes in males and females would be needed to investigate the causes of these differences in phonological development.
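The qualification example given for the age bracket 5;6 to 5;7 can be sketched in a few lines of Python. This is only an illustration, not a procedure from the test manuals: the sex-specific cutoffs are the raw error scores quoted in this Discussion, whereas the sex-averaged cutoff is not reported in the text and is approximated here, as a loudly labeled assumption, by the midpoint of the female and male cutoffs. The function names and the example raw scores of 18 and 15 are hypothetical.

```python
# Raw error-score cutoffs quoted in the Discussion for age bracket 5;6-5;7.
# Key: (test, standard score) -> minimum raw error score that qualifies, by sex.
CUTOFFS = {
    ("GFTA-3", 85): {"female": 13, "male": 22},
    ("GFTA-3", 70): {"female": 29, "male": 39},
    ("KLPA-3", 85): {"female": 13, "male": 22},
    ("KLPA-3", 70): {"female": 28, "male": 38},
}


def qualifies_sex_specific(test, ss, sex, raw_errors):
    """True if the raw error score meets the sex-specific cutoff."""
    return raw_errors >= CUTOFFS[(test, ss)][sex]


def averaged_cutoff(test, ss):
    """ASSUMPTION: midpoint of the two sex-specific cutoffs stands in for a
    sex-averaged norm; the real sex-averaged value is not given in the text."""
    cut = CUTOFFS[(test, ss)]
    return (cut["female"] + cut["male"]) / 2


def qualifies_sex_averaged(test, ss, raw_errors):
    """True if the raw error score meets the assumed sex-averaged cutoff."""
    return raw_errors >= averaged_cutoff(test, ss)


if __name__ == "__main__":
    # Hypothetical children, both tested with the GFTA-3 against an SS of 85.
    for sex, raw in [("male", 18), ("female", 15)]:
        specific = qualifies_sex_specific("GFTA-3", 85, sex, raw)
        averaged = qualifies_sex_averaged("GFTA-3", 85, raw)
        print(sex, raw, "sex-specific:", specific, "sex-averaged:", averaged)
```

Under these assumptions, a male with 18 errors would qualify against the midpoint cutoff but not against the male norm, and a female with 15 errors would qualify against the female norm but not against the midpoint cutoff, which mirrors the over- and under-qualification scenarios described above.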
Whether sex differences in articulation or phonological development have genetic, environmental, social, or cultural causes cannot be determined with the available data. Sex-specific myelination patterns and associated cognitive performance may also be at play (Dean et al., 2015). Producing accurate speech requires intact skills on multiple levels, from accurate word form representations and motor schemas in long-term memory to intact motor planning and programming skills and motor execution skills. One possibility is that males and females have different trajectories in brain and neuromuscular development. Another possibility is that males and females have different linguistic environments due to parental interaction styles. The raw score sex difference for SSs of 70 and 85 diminished to < 1 point starting at the age brackets of 10;6–10;11 and 8;6–8;8, respectively, for the GFTA-3 and at 9;6–9;11 and 8;3–8;5, respectively, for the KLPA-3, indicative of males catching up to females. As mentioned, 8% of the GFTA-3 norming sample had a speech and/or language disorder. Whether the closure of the raw score gaps between ages 8 and 10 years reflects treatment effects in children whose low speech skills qualified them for services at younger ages or whether it reflects maturational processes cannot be determined with the available GFTA-3 and KLPA-3 information. The underlying reasons for sex-based differences in speech sound development were not addressed in the present study. Future studies should investigate biological, hormonal, cultural, educational, and other possible influences on speech sound development toward a more comprehensive understanding of sex-based differences.
Declarations
Acknowledgments
Many thanks to Peter Flipsen for his extensive and foundational work exploring the nature of norming samples in tests of speech sound production.
References
American Speech-Language-Hearing Association (2007) Childhood Apraxia of Speech [Position Statement].
Bankson NW, Bernthal JE (2020) Bankson-Bernthal Test of Phonology - Second Edition. PRO-ED
Bates N, Chin M, Becker T, National Academies of Sciences, Engineering, and Medicine, Committee on National Statistics, Committee on Measuring Sex, Gender Identity, and Sexual Orientation, Division of Behavioral and Social Sciences and Education (2022) Measuring sex, gender identity, and sexual orientation. National Academies
Bowers L, Huisingh R (2018) Linguisystems Articulation Test - Normative Update. PRO-ED
Campbell TF, Dollaghan CA, Rockette HE, Paradise JL, Feldman HM, Shriberg LD, Sabo DL, Kurs-Lasky M (2003) Risk factors for speech delay of unknown origin in 3-year-old children. Child Dev 74(2):346–357. http://www.ncbi.nlm.nih.gov/pubmed/12705559
Case J, Caspari S, Aggarwal P, Stoeckel R (2024) A Goal-Writing Framework for Motor-Based Intervention for Childhood Apraxia of Speech. Am J Speech-Language Pathol 33(4):1590–1607. https://doi.org/10.1044/2024_AJSLP-24-00014
Crowe K, McLeod S (2020) Children's English Consonant Acquisition in the United States: A Review. Am J Speech-Language Pathol 29(4):2155–2169. https://doi.org/10.1044/2020_AJSLP-19-00168
Dean DC 3rd, O'Muircheartaigh J, Dirks H, Waskiewicz N, Walker L, Doernberg E, Piryatinsky I, Deoni SC (2015) Characterizing longitudinal white matter development during early childhood. Brain Struct Function 220(4):1921–1933. https://doi.org/10.1007/s00429-014-0763-3
Dodd B, Hua Z, Crosbie S, Holm A, Ozanne A (2006) Diagnostic Evaluation of Articulation and Phonology. Pearson
Farquharson K, Tambyraja SR (2019) Describing How School-Based SLPs Determine Eligibility for Children with Speech Sound Disorders. Semin Speech Lang 40(02):105–112. https://doi.org/10.1055/s-0039-1677761
Flipsen P Jr., Ogiela DA (2015) Psychometric characteristics of single-word tests of children's speech sound production. Lang Speech Hear Serv Sch 46(2):166–178. https://doi.org/10.1044/2015_LSHSS-14-0055
Fudala JB, Stegall S (2017) Arizona Articulation Proficiency Scale - Fourth Edition. Western Psychological Services
Glaspey A (2019) Glaspey Dynamic Assessment of Phonology. Academic Therapy
Goldman R, Fristoe M (1986) Goldman-Fristoe Test of Articulation. AGS
Goldman R, Fristoe M (2000) Goldman-Fristoe Test of Articulation 2. American Guidance Service
Goldman R, Fristoe M (2015) Goldman-Fristoe Test of Articulation – 3. Pearson
Hodson B (2004) Hodson Assessment of Phonological Patterns - Third Edition. PRO-ED
Khan L, Lewis N (2015) Khan-Lewis Phonological Analysis-3. Pearson
Lippke BA, Dickey SE, Selmar JW, Soder AL (1997) Photo Articulation Test - Third Edition. PRO-ED
McLeod S, Crowe K (2018) Children's Consonant Acquisition in 27 Languages: A Cross-Linguistic Review. Am J Speech-Language Pathol 1–26. https://doi.org/10.1044/2018_AJSLP-17-0100
Newcomer P, Hammill D (1988) Test of Language Development – 2 Primary. PRO-ED
Peter B, Vose C, Bruce L, Ingram D (2019) Starting to Talk at Age 10 Years: Lessons About the Acquisition of English Speech Sounds in a Rare Case of Severe Congenital But Remediated Motor Disease of Genetic Origin. Am J Speech-Language Pathol 28(3):1029–1038. https://doi.org/10.1044/2019_AJSLP-18-0156
Pollock KE, Berni MC (2003) Incidence of non-rhotic vowel errors in children: data from the Memphis Vowel Project. Clin Linguist Phon 17(4–5):393–401. https://doi.org/10.1080/0269920031000079949
Secord W, Donahue JS (2013) Clinical Assessment of Articulation and Phonology – Second Edition. Western Psychological Services
Shriberg LD, Austin D, Lewis BA, McSweeny JL, Wilson DL (1997) The speech disorders classification system (SDCS): extensions and lifespan reference data. J Speech Lang Hear Res 40(4):723–740. http://www.ncbi.nlm.nih.gov/pubmed/9263939
Shriberg LD, Kwiatkowski J, Mabie HL (2019) Estimates of the prevalence of motor speech disorders in children with idiopathic speech delay. Clin Linguist Phon 33(8):679–706. https://doi.org/10.1080/02699206.2019.1595731
Shriberg LD, Tomblin JB, McSweeny JL (1999) Prevalence of speech delay in 6-year-old children and comorbidity with language impairment. J Speech Lang Hear Res 42(6):1461–1481. http://www.ncbi.nlm.nih.gov/pubmed/10599627
Skahan SM, Watson M, Lof GL (2007) Speech-Language Pathologists' Assessment Practices for Children With Suspected Speech Sound Disorders: Results of a National Survey. Am J Speech-Language Pathol 16(3):246–259. https://doi.org/10.1044/1058-0360(2007/029)
Tattersall PJ, Dawson JI (2016) Structured Photographic Articulation Test Featuring Dudsberry – Third Edition. PRO-ED
Williams AL, Stoel-Gammon C (2023) Profiles of Early Expressive Phonology. Brookes
Footnotes
The International Phonetic Alphabet denotes speech sounds using single symbols for each sound, separated from the surrounding text by slashes.
Additional Declarations
No competing interests reported.
Status: Under Review
Version 1 posted
Reviewers agreed at journal: 08 Jan, 2026
Reviewers invited by journal: 08 Jan, 2026
Editor assigned by journal: 23 Dec, 2025
Submission checks completed at journal: 22 Dec, 2025
First submitted to journal: 19 Dec, 2025
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8408784","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":571733494,"identity":"8010dd5d-7fcd-4cc1-b35f-04a8047a8739","order_by":0,"name":"Beate Peter","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA6klEQVRIiWNgGAWjYBACewj1D0QwPoBwEvBrMWxgYGxgYDjMwMPAwGxAlBaDAwgtbBLE2dLe/PzBB4bDcvbsvc8qfvw5zMDPnmOAV4s9zzHDxhkM/4x5eI6b3extO8wg2fMGvxbDGQmGzTwMBxJ7JNLYbvA2HGYwuEHAFoP7zz82/2E4UN8j/4yt8A/QYfYEtdzgMWxmYDiQwCPBxsbMwwa0RYKAFsOenMKZPQYHDHvOpDFLy7al80iceVaAV4s9+/ENH35UHJBnbz/G+PHNH2s5/vbkDXi1QJ2HYPIQoXwUjIJRMApGASEAAMdBSJjE+f4fAAAAAElFTkSuQmCC","orcid":"","institution":"Arizona State University","correspondingAuthor":true,"prefix":"","firstName":"Beate","middleName":"","lastName":"Peter","suffix":""},{"id":571733496,"identity":"479580ec-d5f1-4e90-b93b-3002276ab60e","order_by":1,"name":"Diane A. 
Ogiela","email":"","orcid":"","institution":"Idaho State University-Meridian Health Science Center","correspondingAuthor":false,"prefix":"","firstName":"Diane","middleName":"A.","lastName":"Ogiela","suffix":""},{"id":571733498,"identity":"83735631-9045-40b3-ac63-be708a934eda","order_by":2,"name":"Laurel Bruce","email":"","orcid":"","institution":"Arizona State University","correspondingAuthor":false,"prefix":"","firstName":"Laurel","middleName":"","lastName":"Bruce","suffix":""}],"badges":[],"createdAt":"2025-12-20 01:23:10","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8408784/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8408784/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":100070392,"identity":"982f65e6-62fb-4ec4-aee9-37b3f2423b84","added_by":"auto","created_at":"2026-01-12 16:17:37","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":338421,"visible":true,"origin":"","legend":"","description":"","filename":"Sexdifferencesinspeechdevelopment20251218.docx","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/0952c16198090e01eec5a234.docx"},{"id":100070411,"identity":"64b66fb1-47fc-4518-bf14-34daf2b33d28","added_by":"auto","created_at":"2026-01-12 16:17:44","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":5327,"visible":true,"origin":"","legend":"","description":"","filename":"eadab25478f1433886e4e569837be011.json","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/03005b0366d460050742ff27.json"},{"id":100070444,"identity":"60cb6350-5db8-4d22-b218-d2b96502e8e9","added_by":"auto","created_at":"2026-01-12 
16:17:46","extension":"xml","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":88854,"visible":true,"origin":"","legend":"","description":"","filename":"eadab25478f1433886e4e569837be0111enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/574e311679a531a24c1e4ee9.xml"},{"id":100070404,"identity":"b11cb115-2f4c-40e0-8806-856922d54a27","added_by":"auto","created_at":"2026-01-12 16:17:40","extension":"png","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":184414,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/5a3d6c08acec1914c80df714.png"},{"id":100070378,"identity":"184dcddd-9762-4341-bcd7-9b1ef1f30757","added_by":"auto","created_at":"2026-01-12 16:17:35","extension":"png","order_by":4,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":141593,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/c403319240d2786be48db405.png"},{"id":100070316,"identity":"4c0b50e1-2951-4944-95e5-6ae949690af2","added_by":"auto","created_at":"2026-01-12 16:17:30","extension":"png","order_by":5,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":55284,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/20ec176c7bcb9b950435e318.png"},{"id":100070274,"identity":"223ca870-ef00-4945-9312-8a8de30acc47","added_by":"auto","created_at":"2026-01-12 
16:17:21","extension":"png","order_by":6,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":44494,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/d819b6448af1fa2a9077a9b9.png"},{"id":100070286,"identity":"c63d821c-e5eb-4a5f-b03c-3305b6b4fa19","added_by":"auto","created_at":"2026-01-12 16:17:22","extension":"xml","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":86183,"visible":true,"origin":"","legend":"","description":"","filename":"eadab25478f1433886e4e569837be0111structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/5d6431f2a9219aef0afc6b78.xml"},{"id":100070484,"identity":"101e56ac-553a-43d1-95ab-59ca34bb0af9","added_by":"auto","created_at":"2026-01-12 16:17:53","extension":"html","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":93642,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/250de7ade26c5a33c0185156.html"},{"id":100070355,"identity":"06bea628-0f23-4459-9549-2ee8381060e7","added_by":"auto","created_at":"2026-01-12 16:17:32","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":184414,"visible":true,"origin":"","legend":"\u003cp\u003eRaw scores for standard scores of 130, 115, 100, 85, and 70, separately for female and male GFTA-3 norming sample participants, as a function of age in months\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/815834ca335b9aae1345ab17.png"},{"id":100070317,"identity":"390bfdbb-555a-4633-b3c4-50c3d920ab12","added_by":"auto","created_at":"2026-01-12 16:17:30","extension":"png","order_by":2,"title":"Figure 
2","display":"","copyAsset":false,"role":"figure","size":141593,"visible":true,"origin":"","legend":"\u003cp\u003eRaw scores for standard scores of 130, 115, 100, 85, and 70, separately for female and male KLPA-3 norming sample participants, as a function of age in months\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/efebec7f942cbacd6d28a0ef.png"},{"id":100071051,"identity":"b8f50ed2-635e-4e08-b261-da2998838d74","added_by":"auto","created_at":"2026-01-12 16:19:02","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":780040,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8408784/v1/f9f1d373-4096-4c35-849b-0e968eead1d2.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Sex differences in the speech sound development of young children","fulltext":[{"header":"Introduction","content":"\u003cp\u003eSpeech sound production is a skill that most children anywhere in the world acquire naturally without explicit instruction. Typically, children say their first word around their first birthday; by their second birthday, many children can say anywhere from 50 to 300 different words. As children learn to talk, they typically master vowels by about age 3 years (Pollock \u0026amp; Berni, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2003\u003c/span\u003e) and consonants by age 6 years (Crowe \u0026amp; McLeod, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; McLeod \u0026amp; Crowe, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). 
Although there is considerable individual variability, most children acquire the different consonants in predictable patterns, with stops (/p, b, t, d, k, g/[1]\u003ca class=\"FNLink\" href=\"#Fn1\" id=\"#FNLinkFn1\"\u003e\u003c/a\u003e), glides (/j/ as in \u0026ldquo;\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003ey\u003c/span\u003eellow,\u0026rdquo; /w/) and nasals /m, n, ŋ/ (/ŋ/ as in \u0026ldquo;wi\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003eng\u003c/span\u003e\u0026rdquo;) mastered first, followed by fricatives (e.g., /f, v, s, z/), then affricates (/ʧ/ as in \u0026ldquo;\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003ech\u003c/span\u003eime\u0026rdquo; and /ʤ/ as in \u0026ldquo;\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003ej\u003c/span\u003eungle\u0026rdquo;), and finally liquids (/l, r/). Sounds produced in the front of the mouth (e.g., bilabials /p, b, m/ and alveolars /t, d, n/) are generally acquired before sounds produced in the back of the mouth (e.g., velars /k, g, ŋ/). These patterns have been observed in English-acquiring children as well as children acquiring many other languages (McLeod \u0026amp; Crowe, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe order of consonant acquisition may not be primarily influenced by chronological age but, rather, by mastery of basic motor patterns that form the foundation for acquiring successively more complex motor patterns. This hypothesis is based on a rare case of a child who did not begin to speak until age 10 years. She was born with a profound global motor disease that resolved after DNA sequencing revealed a loss-of-function variant in a gene in the dopamine pathway. 
Within three months of this diagnosis and the start of pharmaceutical intervention, the child began communicating orally and developed mastery of consonants in the same order as typical children at younger ages, with /k, g, ŋ/ emerging after age 12 years and /r, l/ after age 15 years (Peter et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eSome children struggle with learning to produce speech sounds correctly by the expected ages, usually only regarding consonants, in the absence of known medical causes. The term speech sound disorder (SSD) is generally used for this scenario. Not being able to produce late-developing sounds like affricates and liquids until much later than typically expected is common among early school-age children, usually diagnosed as articulation delay/disorder. Leaving out parts of consonant clusters (e.g., \u0026ldquo;top\u0026rdquo; instead of \u0026ldquo;stop\u0026rdquo;) or word-final consonants (\u0026ldquo;ha\u0026rdquo; instead of \u0026ldquo;hat\u0026rdquo;), even though the individual components have been mastered, are examples of phonological delay/disorder, based on the position of sounds in words rather than the individual sound. Childhood apraxia of speech is a rare and severe subtype of speech sound disorder at the level of motor coordination, characterized, among other things, by a small consonant inventory, inconsistent word productions, mis-stressed syllables, vowel errors, mis-sequenced consonants in words, and speech that is difficult to understand (American Speech-Language-Hearing Association, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Case et al., \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Shriberg et al., \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eDifficulties with learning to produce speech sounds are not rare. 
In a representative sample of 1,328 six-year-old children, 3.8% showed evidence of delayed speech sound production (Shriberg et al., \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e1999\u003c/span\u003e). This determination was made using the Word Articulation subtest of the Test of Language Development \u0026ndash; 2 Primary (TOLD-2:P) (Newcomer \u0026amp; Hammill, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e1988\u003c/span\u003e), consisting of 20 target words and normed for specific age bands but not separately for males and females. Interestingly, more males (4.5%) than females (3.1%) were identified as speech delayed, arriving at a ratio of 1.5:1 (males:females). Similar sex differences in speech sound disorder prevalence have been reported elsewhere. For instance, 639 three-year-old children were assessed with the Speech Disorders Classification System (Shriberg et al., \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e1997\u003c/span\u003e) using phonetic transcriptions of play sessions. Similar to the TOLD-3:P, this classification system applies the same scoring rules to female and male probands. Of the assessed children, 100 (15.6%) met criteria for speech delay, and of these, 70% were males and 30% were females, arriving at an affectation ratio of 2.3:1 (males:females). Of the 539 children who did not qualify for a diagnosis of speech delay, 52% were males and 48% females, a ratio of 1.08:1 (males:females) (Campbell et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2003\u003c/span\u003e). This suggests that sex differences are most evident in the lower ranges of speech sound skills.\u003c/p\u003e \u003cp\u003eBefore moving further into sex-based characteristics of standardized tests of speech sound production, we offer two definitions and a disclaimer. 
The term \u0026ldquo;sex\u0026rdquo; is often used to describe maleness or femaleness on the basis of biological traits, for instance the presence or absence of a Y chromosome, whereas the term \u0026ldquo;gender\u0026rdquo; often describes maleness or femaleness as a social or behavioral construct (Bates et al., 2022). Regarding tests of speech sound production with separate norms for females and males where maleness and femaleness were treated as binary traits, we acknowledge here that this stringent binary framework may not fit all children. In addition, these assessment tools do not report on what basis participants were classified as male or female, e.g., school records, physical appearance, self-identification, or other means. Our analyses were based on the norming samples as provided in the test manuals, where data were reported in binary form for male and female participants.\u003c/p\u003e \u003cp\u003e When a caregiver or teacher becomes concerned about a child\u0026rsquo;s ability to produce speech sounds with age-appropriate accuracy, a referral for a speech sound assessment can be made. Speech-language pathologists (SLPs) typically plan an assessment session that includes a language sample collected during a play interaction to gauge general intelligibility. For children age 2 years and up, the assessment also routinely includes a standardized test of articulation designed to elicit word productions such that all consonants can be checked for accuracy in all or most relevant word positions (e.g., /ŋ/ in medial and final but not initial positions in English). Vowels are not typically included in tests of articulation because incorrect speech sound productions are nearly always observed for consonants. 
Two exceptions to this rule are rhotic (\u0026ldquo;r-colored\u0026rdquo;) vowels (/ɝ, ɚ/ in \u0026ldquo;b\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003eir\u003c/span\u003ed\u0026rdquo; and \u0026ldquo;teach\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003eer\u003c/span\u003e\u0026rdquo;) for certain dialects of English such as American English, and children with CAS who can have difficulty with a variety of vowels. Where indicated, tests of phonological patterns can be used as well. Most articulation and phonology tests are designed to be administered using pictures or objects and prompts asking the child to say the target words without providing a model. This approach allows the child to say the word as they typically would say it, rather than imitate the word as produced by the SLP, which may not be a true representation of the child\u0026rsquo;s habitual production of the word. Some tests of articulation also include subtests probing word production accuracy at the sentence level. In the context of school-based therapy, standardized testing can play a key role in determining eligibility for services. In a survey of more than 500 school SLPs, one question was, \u0026ldquo;Which items are part of an assessment of SSD at your school?\u0026rdquo; The SLP survey responders could choose multiple items from a 10-item list, and using standardized tests ranked highest with 73.5% of the SLPs selecting this option (Farquharson \u0026amp; Tambyraja, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eTests of speech sound production are normed based on representative populations (e.g., geographical region, parental SES, race, ethnicity) of children at selected age bands. 
The productions of a child are scored for any errors, which provide the basis for the raw score that is then compared to the norming sample at the child\u0026rsquo;s age and, where available, sex, to arrive at a standard score and percentile. The decision to provide speech therapy is based on the level of observed speech disorder severity. Depending on the setting, a child may qualify for services based on thresholds of -1, -1.5, or -2 standard deviation (SD) below the mean or some other thresholds and/or criteria such as number of sounds impacted and percent intelligibility (i.e., percent of a child\u0026rsquo;s conversational speech judged to be understandable by others).\u003c/p\u003e \u003cp\u003e Tests of speech sound production vary in the composition of their norming samples in terms of inclusion/exclusion of children with disorders but especially also in terms of whether separate norms were calculated for male and female participants. An extensive review of psychometric characteristics of 10 widely used assessment tools (Flipsen \u0026amp; Ogiela, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) showed that eight of these tests discussed sex differences. Since then, several of these tests have been revised. The first goal of the present study, hence, was to provide an updated overview of demographic aspects of the norming samples used for the most widely used tests of speech sound production. 
Additional goals were to conduct detailed analyses of sex-based differences in a large norming sample, not only regarding speech sound production errors but also phonological errors.\u003c/p\u003e"},{"header":"Method","content":"\u003cp\u003eTo update a previous survey of the psychometric properties of the most widely used tests of speech sound production (Flipsen \u0026amp; Ogiela, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2015\u003c/span\u003e), currently available tests used widely were surveyed for size and sex-based attributes of their norming samples. The following 12 assessment tools were examined: Arizona Articulation Proficiency Scale \u0026ndash; Fourth Edition, Arizona-4 (Fudala \u0026amp; Stegall, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2017\u003c/span\u003e); Bankson-Bernthal Test of Phonology \u0026ndash; Second Edition, BBTOP-2 (Bankson \u0026amp; Bernthal, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2020\u003c/span\u003e); Clinical Assessment of Articulation and Phonology \u0026ndash; Second Edition, CAAP-2 (Secord \u0026amp; Donahue, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2013\u003c/span\u003e); Diagnostic Evaluation of Articulation and Phonology, DEAP-A (Dodd et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2006\u003c/span\u003e); Glaspey Dynamic Assessment of Phonology, GDAP (Glaspey, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2019\u003c/span\u003e); Goldman-Fristoe Test of Articulation \u0026ndash; Third Edition, GFTA-3 (Goldman \u0026amp; Fristoe, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2015\u003c/span\u003e); Hodson Assessment of Phonological Patterns \u0026ndash; Third Edition, HAPP-3 (Hodson, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2004\u003c/span\u003e); Khan-Lewis Phonological Analysis \u0026ndash; Third Edition, KLPA-3 (Khan \u0026amp; Lewis, \u003cspan citationid=\"CR18\" 
class=\"CitationRef\"\u003e2015\u003c/span\u003e); Linguisystems Articulation Test \u0026ndash; Normative Update, LAT-NU (Bowers \u0026amp; Huisingh, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2018\u003c/span\u003e); Profiles of Early Expressive Phonological Skills, PEEPS (Williams \u0026amp; Stoel-Gammon, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2023\u003c/span\u003e); Structured Articulation Test \u0026ndash; Third Edition, PAT-3 (Lippke et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e1997\u003c/span\u003e); and Structured Photographic Articulation Test featuring Dudsberry-Third Edition, SPAT-D3 (Tattersall \u0026amp; Dawson, \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2016\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eTo investigate sex differences in speech articulation in detail, the GFTA-3 was selected because of its large norming sample, extended age range, separate norms for females and males across the entire sampled age, and the fact that the same norming sample facilitated a detailed sex-based analysis of phonological errors, described below. The GFTA-3 and its previous versions are widely used. For instance, in a 2007 survey of over 300 pediatric SLPs, the original and second GFTA versions that were available at the time (Goldman \u0026amp; Fristoe, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e1986\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2000\u003c/span\u003e) were by far the most commonly used standardized tests for assessing SSD (Skahan et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). 
The Sounds in Words subtest is administered by presenting pictured stimuli designed to elicit 60 words that sample the consonants of American English in their most relevant word positions (initial, medial, final) and additionally the rhotic vowel /ɚ/ (e.g., \u0026ldquo;broth\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003eer\u003c/span\u003e\u0026rdquo;) in final position. Errors in designated words and positions are summed to arrive at a raw score that can range from 0 to 141, where high raw scores correspond to low standard scores and percentiles. A second subtest, Sounds in Sentences, samples speech sounds in word productions in the context of sentences and stories, with different stimuli selected for different ages. Here, we focus on the Sounds in Words subtest because the same stimuli are administered to all age groups, whereas the Sounds in Sentences subtest is administered only to children ages 4;0 to 6;11 (years;months).\u003c/p\u003e \u003cp\u003eAs mentioned, the GFTA-3 was designed to sample American English dialects and as such, covers postvocalic /r/ and the unstressed rhotic vowel /ɚ/. It was normed on 1,500 speakers of American English age 2;0 to 21;11, representative of the US population in terms of geographical region, parental SES, race, and ethnicity. Equivalent numbers of female and male participants were included so that separate norms could be constructed based on sex. The test manual does not report how the participants\u0026rsquo; male or female designation was determined; the headers in the norms tables state that the standard scores and confidence intervals are given \u0026ldquo;by Age and Sex\u0026rdquo; (Goldman \u0026amp; Fristoe, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). 
In terms of general abilities, 6% of the sample was characterized as being gifted or talented, and 20% was characterized as having been diagnosed with one or more of certain conditions, e.g., 8% had a speech and/or language disorder, 4% had a disorder of attention, and 3% had a learning disability.\u003c/p\u003e \u003cp\u003eRegarding stratification by age, higher numbers of individuals were probed for the younger ages, likely because the greatest amount of variability in speech sound accuracy is observed while children are still in the process of acquiring their speech sound inventories. The distribution is heavily right-skewed as most individuals age 6 years and older produce all speech sounds without errors. For the youngest ages (2;0 to 4;11), the normative sample consisted of 100 individuals for each 6-month age range. For ages 5;0 to 6;11, 200 individuals were included for each 12-month age range. For ages 7;0 to 7;11 and 8;0 to 8;11, 100 individuals were included for each 12-month range. In each age range from 9;0 to 10;11 and 11;0 to 12;11, 100 individuals were included. For the oldest groups (13;0 to 21;11), 100 individuals were sampled. Overall, the GFTA-3 is normed such that the population mean corresponds to a standard score (SS) of 100, with an SD of 15 points.\u003c/p\u003e \u003cp\u003eThe KLPA-3 provides insights into phonological errors in the word productions elicited with the GFTA-3. It is based on the same norming sample as the GFTA-3. Whereas the GFTA-3 performance is calculated via the numbers of misarticulated consonants and rhotic vowels, the KLPA-3 provides a more pattern-based view of the nature of speech sound substitutions, deletions, insertions, or other error types. 
For instance, substituting a stop consonant for one that is made with continuous airflow (e.g., \u0026ldquo;doo\u0026rdquo; for \u0026ldquo;zoo\u0026rdquo; or \u0026ldquo;tumb\u0026rdquo; for \u0026ldquo;thumb\u0026rdquo;) is classified as stopping; leaving off final consonants regardless of their type (e.g., \u0026ldquo;sli\u0026rdquo; for \u0026ldquo;slide\u0026rdquo; or \u0026ldquo;ho\u0026rdquo; for \u0026ldquo;hot\u0026rdquo;) is classified as final consonant deletion; and substituting an alveolar consonant for a velar one (e.g., \u0026ldquo;tup\u0026rdquo; for \u0026ldquo;cup\u0026rdquo;) is classified as velar fronting. The KLPA-3 provides 12 common types of systematic speech sound errors and 12 additional ones that occur less frequently. In some cases, a single incorrect production can trigger more than one phonological process. For instance, \u0026ldquo;bider\u0026rdquo; for \u0026ldquo;spider\u0026rdquo; counts as both cluster reduction (/sp/ ➝ /p/) and prevocalic voicing (/p/ ➝ /b/). Possible error points range from 0 to 160, with high raw scores corresponding to low SSs and percentiles. Corresponding SSs are normed to a mean of 100 and SD of 15, based on the same model as that used for the GFTA-3.\u003c/p\u003e \u003cp\u003e To investigate details of sex-based differences in speech sound production accuracy, we analyzed the GFTA-3 and KLPA-3 norming samples separately for female and male participants. Sex differences were evaluated for statistical properties in the normed distribution and for the ages at which they were greatest. Raw scores were tabulated corresponding to each of the following SSs: 130, 115, 100, 85, and 70, equivalent to 2, 1, 0, \u0026minus;1, and \u0026minus;2 SD. Where more than one raw score corresponded to the same standard score, the highest one was selected. 
For instance, for males in the 2;0\u0026ndash;2;1 age band in the GFTA-3 norms table, raw scores of 69 and 70 both correspond to an SS of 100, and 70 was the raw score selected for further analysis. Conversely, if no raw score was listed in the norms as corresponding to a standard score selected for this study, an appropriate raw score was interpolated. For instance, for females in the 7;9\u0026ndash;7;11 age band of the GFTA-3 norms table, a raw score of 5 corresponds to an SS of 87 and a raw score of 6, to an SS of 83. The interpolated raw score corresponding to an SS of 85 was 5.5.\u003c/p\u003e \u003cp\u003eFor all age bands where raw scores were available for both females and males, the difference between female and male raw scores was calculated and averaged across age bands. Peak differences were highlighted for the SS and age range in which they were observed.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eOverview of Tests of Speech Sound Production\u003c/h2\u003e \u003cp\u003eA review of the most commonly used norm-referenced tests of speech sound production (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e) indicates that approximately half of them provide scores based on sex, at least for some age groups. Of the 12 tests that were evaluated, six did not provide any sex-based scores. Two of these, the CAAP-2 and the GDAP, did not discuss sex differences, although they did report the sample sizes of the participants from each sex in the norming sample. The other four tests that did not provide different norms based on sex provided different levels of discussion and evaluation regarding sex differences. They primarily evaluated whether there was a sex/gender bias in the test. 
The BBTOP-2 also evaluated test performance by sex subgroups on three measures. The \u003cem\u003et\u003c/em\u003e-test results indicated that the score differences between males and females were significant for the Word index, Consonant Articulation index, and the Error Pattern index scores, but that the magnitude of the effect size for each of them was small and did not indicate a sex/gender bias in the test. The LAT-NU included a study comparing the scores for male and female individuals and found that the mean difference score was \u0026minus;1.41 and that the magnitude of the effect size was trivial. The HAPP-3 provided mean scores for males and females, but did not analyze whether the differences in the scores were significant, nor did it report the effect size of the differences.\u003c/p\u003e \u003cp\u003eTwo of the tests use separate scores for males and females only at younger ages. The Arizona-4 provides separate scores up to the age of 5;11 and the SPAT-D3 provides separate scores up to age 6;11. After those ages, scores for male and female test takers are combined. The remaining tests, the DEAP-A, GFTA-3, KLPA-3, and PAT-3, provide separate test scores for males and females at all ages included on the test. However, the scoring tables for males and females on the PAT-3 are not parallel to each other. The number of age-group increments differs between males and females, and the increments themselves are not the same. There are 20 age-group increments for males and only 14 age-group increments for females. This suggests that there were marked differences in the norming sample between males and females, but this was not discussed in detail in the manual. The manual provides a table of mean male and female scores at six age intervals to demonstrate improvement in speech sound production with age, which indicates higher error scores for males than females at ages 3, 4, 5, and 6. 
Additional details or further discussion of the sex differences were not provided.\u003c/p\u003e \u003cp\u003eThus, most speech-sound production tests take sex into consideration at some level to control for demographic representation of both sexes, to evaluate the potential for sex bias, or both, but not all of them conducted detailed studies of sex differences or provided differential normative scores by sex. Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e summarizes key properties of the 12 selected tests of speech sound production.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eNorming samples of 12 tests of speech sound production by age range, sample size, age groups, number of single words elicited, and sex-based differences\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTest Name\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAge range\u003c/p\u003e \u003c/th\u003e 
\u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSample size\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eAge groups\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eNumber of single word test Items\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eDiscussed differences in scores based on sex\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eProvide sex-based scoring tables\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eArizona Articulation and Phonology Scale -Fourth Edition (Arizona-4; Fudala \u0026amp; Stegall, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2017\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1;6 \u0026ndash; 21;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3,192\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e28\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e48\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eYes\u003csup\u003ea\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBankson-Bernthal Test of Phonology (BBTOP-2; Bankson \u0026amp; Bernthal, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2020\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3;0\u0026ndash;9;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e770\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e13\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eClinical Assessment of Articulation and Phonology-Second Edition (CAAP-2; Secord \u0026amp; Donohue, 2013)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2;6\u0026ndash;11;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,486\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e13\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e44\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDiagnostic Evaluation of Articulation and Phonology-American Edition (DEAP-A; Dodd, Hua, Crosbie, Holm, \u0026amp; Ozanne, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2006\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3;0\u0026ndash;8;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e650\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e 
\u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eYes\u003csup\u003eb\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGlaspey Dynamic Assessment of Phonology (GDAP; Glaspey, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2019\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3;0\u0026ndash;10;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e880\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e34\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGoldman-Fristoe Test of Articulation-Third Edition (GFTA-3; Goldman \u0026amp; Fristoe, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2015\u003c/span\u003e)*\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2;0\u0026ndash;21;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,500\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eYes\u003csup\u003eb\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eHodson Assessment of Phonological Patterns-Third Edition (HAPP-3; Hodson, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2004\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3;0\u0026ndash;7;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e886\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKhan-Lewis Phonological Analysis-Third Edition (KLPA-3; Khan \u0026amp; Lewis, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2015\u003c/span\u003e)*\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2;0\u0026ndash;21;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,500\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eYes\u003csup\u003eb\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLinguisystems Articulation Test-Normative Update (LAT-NU; Bowers \u0026amp; Huisingh, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2018\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3;0\u0026ndash;21;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3,037\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e20\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eProfiles of Early Expressive Phonological Skills (PEEPS; Williams \u0026amp; Stoel-Gammon, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2023\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1;6\u0026ndash;3;0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e67\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePhoto Articulation Test-Third Edition (PAT-3; Lippke, Dickey, Selmar, \u0026amp; Sodor, 1997)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3;0\u0026ndash;8;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e800\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eMales-20 
Females-14\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e93\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eYes\u003csup\u003eb\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStructured Photographic Articulation Test featuring Dudsberry-Third Edition (SPAT-D3; Tattersall \u0026amp; Dawson, \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2016\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e3;0\u0026ndash;9;11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,473\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e36\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eYes\u003csup\u003ec\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e*GFTA-3 and KLPA-3 are based on the same norming sample.\u003c/p\u003e \u003cp\u003e \u003csup\u003ea\u003c/sup\u003e Separate M/F norms up to age 5;11.\u003c/p\u003e \u003cp\u003e \u003csup\u003eb\u003c/sup\u003e Separate M/F norms for all ages.\u003c/p\u003e \u003cp\u003e \u003csup\u003ec\u003c/sup\u003e Separate M/F norms up to age 6;11.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eSex Differences in the Goldman-Fristoe Test of Articulation – 3 Norming Sample\u003c/h3\u003e\n\u003cp\u003eAs mentioned, the GFTA-3 was selected for a detailed analysis of the norming tables 
due to its large norming sample, broad age range, and sex-specific norming tables for the entire age span covered. Of the maximally possible raw score range of 0 to 141 for the SS levels defined for this study, the norms tables included multiple occurrences of scores of 0. The single highest raw score of 136 was seen for males age 2;0\u0026ndash;2;1 and SS\u0026thinsp;=\u0026thinsp;70. The norms tables include the maximum raw score for females at SS\u0026thinsp;=\u0026thinsp;66 and males at SS\u0026thinsp;=\u0026thinsp;68, both at age 2;0\u0026ndash;2;1. These SSs were outside the range of interest for the present study.\u003c/p\u003e \u003cp\u003eSSs of 130 were only available for ages 2;0 to 4;1, likely because speech sound accuracy is most variable at the youngest ages. Because near complete speech sound mastery is expected by age 6 years, extremely high standard scores cannot be calculated past age 4;1. For SS 130, the raw scores were slightly higher for females than males, with an average difference across the available age bands of 1.1 raw score points. The greatest differences were seen at ages 2;0 to 2;6 and ranged from 1 to 4 raw score points. This indicates that the males in the norming sample outperformed females in the extremely advanced skill range and only during the earliest available age bands.\u003c/p\u003e \u003cp\u003eSSs of 115 were only available from ages 2;0 to 6;1, likely for similar reasons as those restricting the available age range for an SS of 130. Here, females had lower maximum raw scores than males, with an average difference across the available age bands of 1.5 raw score points. Peak differences ranged from 2 to 4 raw score points between ages 2;2 and 4;5. This indicates more advanced speech sound productions in females than males at this borderline advanced skill range and early age range.\u003c/p\u003e \u003cp\u003eFor SSs of 100, a yet greater raw score advantage for females was evident. 
Raw scores for females were lower by 2.2 points, averaged across the available age bands, compared to raw scores for males. The greatest raw score differences between females and males were seen between ages 2;0 and 6;1, ranging from 4 to 7 raw score points.\u003c/p\u003e \u003cp\u003eFor SSs of 85, females were assigned this SS with a raw score that was 3.5 points lower, compared to males, averaged across all age bands. The greatest score differences were seen between ages 2;0 and 6;11 and ranged from 2 to 9 raw score points.\u003c/p\u003e \u003cp\u003eFor SSs of 70, the average raw score advantage for females was at its largest. The difference, averaged across all age bands, was 5.0 raw score points. The largest differences were seen for ages 4;9 to 6;9, ranging from 4 to 10 raw score points. Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e shows raw scores corresponding to standard scores of 130, 115, 100, 85, and 70 as a function of child age, separately for females and males. Low raw scores indicate low numbers of speech sounds in error and, hence, higher speech sound accuracy.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eSex Differences in the Khan-Lewis Phonological Analysis – 3 Norming Sample\u003c/h3\u003e\n\u003cp\u003eThe possible raw scores for the selected SSs ranged from 0 to 160. Whereas the norms tables contained multiple occurrences of scores of 0, the single highest raw score of 102 was seen for females age 2;0\u0026ndash;2;1 for SS\u0026thinsp;=\u0026thinsp;70. The norms tables show raw scores up to 160 for SS\u0026thinsp;=\u0026thinsp;40, which was not included in our investigation.\u003c/p\u003e \u003cp\u003eSSs of 130 were only available for the six youngest age brackets, up to the age bracket 2;10\u0026ndash;2;11. Here, raw scores were higher for female than male probands by 2.5 points on average, with greatest differences in the youngest age bands. 
This indicates overall fewer phonological errors among males than females in this advanced skill range and early age.\u003c/p\u003e \u003cp\u003eSSs of 115 were available up through the age bracket 6;6\u0026ndash;6;7. The overall raw score advantage for females, compared to males, was 1.6, but there was a developmental pattern: Higher raw scores occurred in females up to the age bracket 4;0\u0026ndash;4;1, and generally higher raw scores were seen in males from that age forward. This finding shows a developmental shift from males leading females in phonological skill in the earlier ages, followed by the reverse pattern at later age brackets.\u003c/p\u003e \u003cp\u003eFor SSs of 100, the female advantage over males was 1.4 points overall, with a male advantage for the earliest three age brackets and the female advantage setting in with the 2;8\u0026thinsp;\u0026minus;\u0026thinsp;2;9 age bracket. Peak differences for females ranged from 5.5 to 6.5 points at age brackets 5;2\u0026ndash;5;3, 5;4\u0026ndash;5;5, and 5;6\u0026thinsp;\u0026minus;\u0026thinsp;5;7.\u003c/p\u003e \u003cp\u003eFor SSs of 85, the average female advantage over males was 3.1 raw score points. Here, the female advantage over males also set in with the 2;8\u0026thinsp;\u0026minus;\u0026thinsp;2;9 age bracket, with a peak raw score advantage of 10 at the age brackets 5;10\u0026thinsp;\u0026minus;\u0026thinsp;5;11 and 6;0\u0026ndash;6;1.\u003c/p\u003e \u003cp\u003eFor SSs of 70, the female advantage over males on average was 4.1 raw score points. It set in by age 2;6\u0026thinsp;\u0026minus;\u0026thinsp;2;7, peaking at a 10-point raw score difference at age brackets 5;4\u0026ndash;5;5 and 5;6\u0026thinsp;\u0026minus;\u0026thinsp;5;7. 
Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e shows raw KLPA-3 scores as a function of child age in months for each of the sampled standard scores.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eSix of the currently most widely used tests of speech sound production provide separate standardization statistics for male and female test takers. Of these, only the Arizona-4, GFTA-3, and KLPA-3, which is based on the same norming sample as the GFTA-3, cover age ranges from toddlerhood through young adulthood. The GFTA-3 and KLPA-3 provide separate norms for the full age range of 2;0 through 21;11. These properties supported selecting the GFTA-3 and KLPA-3 for an investigation of sex-specific trajectories of articulation and phonological accuracy.\u003c/p\u003e \u003cp\u003eDevelopmental trajectories for numbers of errors corresponding to selected SSs were similar for the GFTA-3 and KLPA-3 norms. In both tests generally, substantially higher error scores were seen in males, compared to females, with the greatest differences for SSs of 70 and 85 and substantial differences also evident for SS of 100, roughly in the toddler, preschool, and early school ages. This finding replicates earlier findings describing disproportional numbers of boys among 3-year-old children identified as speech delayed (Campbell et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2003\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eImmediate clinical applications are evident for assessing and qualifying children for intervention services, as these decisions are often based on assessment findings in the low SS ranges. For example, if an SS of 85 from the GFTA-3 is used to qualify a child for services, raw scores corresponding to that SS can vary between females and males by 7 to 9 points between ages 4;9 and 6;1. 
At age 5;6 to 5;7, females would qualify with a raw error score of 13 or higher whereas males would qualify with a raw error score of 22 or higher. If, at that same age, an SS of 70 is used as a qualifying threshold, the corresponding raw error score would be 29 for females and 39 for males. The corresponding KLPA-3 raw error scores for females and males for an SS of 85 are 13 and 22, respectively, and for an SS of 70, 28 and 38, respectively. Intervention decisions based on sex-averaged norms will lead to over-qualifying males for services while females will be underserved. Both scenarios are problematic in their own way. Qualifying males for services when their articulatory or phonological errors are within normal limits for males not only consumes resources unnecessarily but also sends a false message to males and their caregivers that delays or deficits were present when, compared to other males, performance was within normal limits. Conversely, not qualifying females for services when their articulatory or phonological development is delayed when compared to other girls deprives them of services that have the potential of addressing delays early and effectively. For the age ranges in which sex-specific differences in normative speech and phonology development are evident, greater diagnostic precision can be achieved by using sex-specific norms instead of averages.\u003c/p\u003e \u003cp\u003eSimilarly, when assessing population prevalence rates of articulation or phonological disorders, using sex-averaged norms may lead to over-identification of males and under-identification of females. 
Future studies should re-evaluate previously reported male:female prevalence ratios (Campbell et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Shriberg et al., \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e1997\u003c/span\u003e; Shriberg et al., \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e1999\u003c/span\u003e) in light of sex-specific developmental trajectories for the age ranges in which sex-based differences are most pronounced.\u003c/p\u003e \u003cp\u003eThe KLPA-3 raw scores showed a male advantage in the youngest ages, especially for the highest SSs. This pattern was seen in the GFTA-3 only for an SS of 130. It is unclear what drove this flip in male/female advantage after the youngest age bands.\u003c/p\u003e \u003cp\u003eWhereas nearly the full range of raw scores was represented in the GFTA-3 norms tables (0 to 136 out of 0 to 141) for the SS range from 70 to 130, the KLPA-3 raw score range was truncated (0 to 102 out of 0 to 160). A possible reason for this difference is that phonologically based speech sound errors are diverse in nature and represent only a subset of all speech sound errors. Generally, sex-based differences emerged at slightly earlier ages and persisted longer in the KLPA-3 norms, compared to the GFTA-3 norms. An in-depth analysis of the actual phonological processes in males and females would be needed to investigate the causes of these differences in phonological development.\u003c/p\u003e \u003cp\u003eWhether sex differences in articulation or phonological development have genetic, environmental, social, or cultural causes cannot be determined with the available data. Sex-specific myelination patterns and associated cognitive performance may also be at play (Dean et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). 
Producing accurate speech requires intact skills on multiple levels, from accurate word form representations and motor schemas in long-term memory to intact motor planning and programming skills and motor execution skills. One possibility is that males and females have different trajectories in brain and neuromuscular development. Another possibility is that males and females have different linguistic environments due to parental interaction styles.\u003c/p\u003e \u003cp\u003eThe raw score sex difference for SSs of 70 and 85 diminished to \u0026lt;\u0026thinsp;1 point starting at the age brackets of 10;6\u0026ndash;10;11 and 8;6\u0026ndash;8;8, respectively, for the GFTA-3 and at 9;6\u0026ndash;9;11 and 8;3\u0026ndash;8;5, respectively, for the KLPA-3, indicating that males catch up to females. As mentioned, 8% of the GFTA-3 norming samples represented children with speech and language delay. Whether the closure of the raw score gaps between ages 8 and 10 years reflects treatment effects in children whose low speech skills qualified them for services at younger ages or whether it reflects maturational processes cannot be determined with the available GFTA-3 and KLPA-3 information.\u003c/p\u003e \u003cp\u003eThe underlying reasons for sex-based differences in speech sound development were not addressed in the present study. 
Future studies should investigate biological, hormonal, cultural, educational, and other possible influences on speech sound development toward a more comprehensive understanding of sex-based differences.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eMany thanks to Peter Flipsen for his extensive and foundational work exploring the nature of norming samples in tests of speech sound production.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAmerican Speech-Language-Hearing Association (2007) Childhood Apraxia of Speech [Position Statement]. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e\u003c/span\u003e\u003cspan address=\"http://www.asha.org/policy\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBankson NW, Bernthal JE (2020) Bankson-Bernthal Test of Phonology - Second Edition. PRO-ED\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBates N, Chin M, Becker T, National Academies of Sciences Engineering and Medicine (U.S.), National Academies of Sciences Engineering and Medicine (U.S.). Committee on National Statistics, National Academies of Sciences Engineering and Medicine (U.S.). Committee on Measuring Sex Gender Identity and Sexual Orientation,. Division of Behavioral and Social Sciences and Education, \u0026amp; National Academies of Sciences Engineering and Medicine (U.S.). (2022). \u003cem\u003eMeasuring sex, gender identity, and sexual orientation\u003c/em\u003e. National Academies\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBowers L, Huisingh R (2018) Linguisystems Articulation Test - Normative Update. 
PRO-ED\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCampbell TF, Dollaghan CA, Rockette HE, Paradise JL, Feldman HM, Shriberg LD, Sabo DL, Kurs-Lasky M (2003) Risk factors for speech delay of unknown origin in 3-year-old children. Child Dev 74(2):346\u0026ndash;357. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.ncbi.nlm.nih.gov/pubmed/12705559\u003c/span\u003e\u003cspan address=\"http://www.ncbi.nlm.nih.gov/pubmed/12705559\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCase J, Caspari S, Aggarwal P, Stoeckel R (2024) A Goal-Writing Framework for Motor-Based Intervention for Childhood Apraxia of Speech. Am J Speech-Language Pathol 33(4):1590\u0026ndash;1607. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1044/2024_AJSLP-24-00014\u003c/span\u003e\u003cspan address=\"10.1044/2024_AJSLP-24-00014\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCrowe K, McLeod S (2020) Children's English Consonant Acquisition in the United States: A Review. Am J Speech-Language Pathol 29(4):2155\u0026ndash;2169. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1044/2020_AJSLP-19-00168\u003c/span\u003e\u003cspan address=\"10.1044/2020_AJSLP-19-00168\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDean DC 3rd, O'Muircheartaigh J, Dirks H, Waskiewicz N, Walker L, Doernberg E, Piryatinsky I, Deoni SC (2015) Characterizing longitudinal white matter development during early childhood. Brain Struct Function 220(4):1921\u0026ndash;1933. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s00429-014-0763-3\u003c/span\u003e\u003cspan address=\"10.1007/s00429-014-0763-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDodd B, Hua Z, Crosbie S, Holm A, Ozanne A (2006) Diagnostic Evaluation of Articulation and Phonology. Pearson\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFarquharson K, Tambyraja SR (2019) Describing How School-Based SLPs Determine Eligibility for Children with Speech Sound Disorders. Semin Speech Lang 40(02):105\u0026ndash;112. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1055/s-0039-1677761\u003c/span\u003e\u003cspan address=\"10.1055/s-0039-1677761\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFlipsen P Jr., Ogiela DA (2015) Psychometric characteristics of single-word tests of children's speech sound production. Lang Speech Hear Serv Sch 46(2):166\u0026ndash;178. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1044/2015_LSHSS-14-0055\u003c/span\u003e\u003cspan address=\"10.1044/2015_LSHSS-14-0055\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFudala JB, Stegall S (2017) Arizona Articulation Proficiency Scale - Fourth Edition. Western Psychological Services\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGlaspey A (2019) Glaspey Dynamic Assessment of Phonology. Academic Therapy\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoldman R, Fristoe M (1986) Goldman-Fristoe Test of Articulation. AGS\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoldman R, Fristoe M (2000) Goldman-Fristoe Test of Articulation 2. 
American Guidance Service\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoldman R, Fristoe M (2015) Goldman-Fristoe Test of Articulation \u0026ndash;\u0026thinsp;3. Pearson\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHodson B (2004) Hodson Assessment of Phonological Patterns - Third Edition. PRO-ED\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhan L, Lewis N (2015) Khan-Lewis Phonological Analysis-3. Pearson\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLippke BA, Dickey SE, Selmar JW, Soder AL (1997) Photo Articulation Test - Third Edition. PRO-ED\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcLeod S, Crowe K (2018) Children's Consonant Acquisition in 27 Languages: A Cross-Linguistic Review. Am J Speech-Language Pathol 1\u0026ndash;26. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1044/2018_AJSLP-17-0100\u003c/span\u003e\u003cspan address=\"10.1044/2018_AJSLP-17-0100\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNewcomer P, Hammill D (1988) Test of Language Development \u0026ndash;\u0026thinsp;2 Primary. Pro-Ed\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePeter B, Vose C, Bruce L, Ingram D (2019) Starting to Talk at Age 10 Years: Lessons About the Acquisition of English Speech Sounds in a Rare Case of Severe Congenital But Remediated Motor Disease of Genetic Origin. Am J Speech-Language Pathol 28(3):1029\u0026ndash;1038. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1044/2019_AJSLP-18-0156\u003c/span\u003e\u003cspan address=\"10.1044/2019_AJSLP-18-0156\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePollock KE, Berni MC (2003) Incidence of non-rhotic vowel errors in children: data from the Memphis Vowel Project. Clin Linguist Phon 17(4\u0026ndash;5):393\u0026ndash;401. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/0269920031000079949\u003c/span\u003e\u003cspan address=\"10.1080/0269920031000079949\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSecord W, Donahue JS (2013) Clinical Assessment of Articulation and Phonology \u0026ndash; Second Edition. Western Psychological Services\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShriberg LD, Austin D, Lewis BA, McSweeny JL, Wilson DL (1997) The speech disorders classification system (SDCS): extensions and lifespan reference data. J Speech Lang Hear Res 40(4):723\u0026ndash;740. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.ncbi.nlm.nih.gov/pubmed/9263939\u003c/span\u003e\u003cspan address=\"http://www.ncbi.nlm.nih.gov/pubmed/9263939\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShriberg LD, Kwiatkowski J, Mabie HL (2019) Estimates of the prevalence of motor speech disorders in children with idiopathic speech delay. Clin Linguist Phon 33(8):679\u0026ndash;706. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/02699206.2019.1595731\u003c/span\u003e\u003cspan address=\"10.1080/02699206.2019.1595731\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShriberg LD, Tomblin JB, McSweeny JL (1999) Prevalence of speech delay in 6-year-old children and comorbidity with language impairment. J Speech Lang Hear Res 42(6):1461\u0026ndash;1481. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.ncbi.nlm.nih.gov/pubmed/10599627\u003c/span\u003e\u003cspan address=\"http://www.ncbi.nlm.nih.gov/pubmed/10599627\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSkahan SM, Watson M, Lof GL (2007) Speech-Language Pathologists' Assessment Practices for Children With Suspected Speech Sound Disorders: Results of a National Survey. Am J Speech-Language Pathol 16(3):246\u0026ndash;259. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/doi:10.1044/1058-0360(2007/029)\u003c/span\u003e\u003cspan address=\"doi:10.1044/1058-0360(2007/029)\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTattersall PJ, Dawson JI (2016) Structured Photographic Articulation Test Featuring Dudsberry \u0026ndash; Third Edition. PRO-ED\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWilliams AL, Stoel-Gammon C (2023) \u003cem\u003eProfiles of Early Expressive Phonology\u003c/em\u003e. 
Brookes\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"},{"header":"Footnotes","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003e International Phonetic Alphabet denotes speech sounds using single symbols for each sound, separated from the surrounding text by slashes.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"behavior-genetics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bege","sideBox":"Learn more about [Behavior Genetics](https://www.springer.com/journal/10519)","snPcode":"10519","submissionUrl":"https://submission.nature.com/new-submission/10519/3","title":"Behavior Genetics","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"Speech sound disorders, standardized assessment tools, sex-specific norms, sex differences, precision diagnostics","lastPublishedDoi":"10.21203/rs.3.rs-8408784/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8408784/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003ePast studies have demonstrated higher prevalence rates of speech sound disorders among male than female children, based on standardized articulation tests with sex-averaged norms. Here, we survey the most commonly used standardized articulation tests for properties (size, age range, sex) of their norming samples. Based on the articulation assessment with sex-specific norms across the longest age span, we investigate sex differences in raw error scores corresponding to key standard score bands. Of 12 tests of speech sound production, six provided separate norms for females and males, but only two, the Goldman-Fristoe Test of Articulation \u0026ndash; Third Edition (GFTA-3) and the Khan-Lewis Phonological Assessment \u0026ndash; Third Edition (KLPA-3) that is based on the same word productions and norming sample as the GFTA-3, provided sex-specific norms for the entire sampled age span from 2 years 0 months to 21 years 11 months. 
In the lowest range of the score distribution, 1 and 2 standard deviations below the mean, and up to ages up to 6 years, both the GFTA-3 and KLPA-3 showed a distinct articulation accuracy advantage for females. Whereas the causes for this discrepancy are not well understood, clear clinical implications emerge: When diagnosing children with speech sound disorders toward qualifying them for interventions, it is imperative to use sex-specific norms, as sex-averaged norms may lead to over-diagnosing boys and under-diagnosing girls. Similarly, more precise prevalence rates of speech sound disorders among females and males can be obtained from sex-specific norms than from sex-averaged norms.\u003c/p\u003e","manuscriptTitle":"Sex differences in the speech sound development of young children","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-01-12 16:04:50","doi":"10.21203/rs.3.rs-8408784/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"reviewerAgreed","content":"319307764679121345044197724358963940032","date":"2026-01-08T22:38:12+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-01-08T14:34:28+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-12-23T20:53:47+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-12-22T12:38:04+00:00","index":"","fulltext":""},{"type":"submitted","content":"Behavior Genetics","date":"2025-12-20T01:10:58+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"behavior-genetics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bege","sideBox":"Learn more about [Behavior Genetics](https://www.springer.com/journal/10519)","snPcode":"10519","submissionUrl":"https://submission.nature.com/new-submission/10519/3","title":"Behavior Genetics","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"bd41325f-4dff-48d8-be7a-fe360349ba2d","owner":[],"postedDate":"January 12th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-01-12T16:04:50+00:00","versionOfRecord":[],"versionCreatedAt":"2026-01-12 16:04:50","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8408784","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8408784","identity":"rs-8408784","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}