Scores Reported
ALL scaled scores are based on grade and semester (fall or spring) rather than on a child’s age because specific reading skills are taught at each grade, and schools report children’s progress at grade level.
Subtest Scaled Scores
ALL subtest scaled scores provide measures of specific aspects of language and emergent literacy, depending on the subtest tasks and the child’s responses. Subtest scaled scores are normative scores: they compare the child’s performance to that of peers in the same grade and semester. These scores are derived from the subtest total raw scores.
Index Scores
ALL includes four index scores: Emergent Literacy, Language, Phonological, and Phonological-Orthographic. ALL index scores are composite scores that are formed from the scores of two or more subtests. They provide information about a child’s strengths and weaknesses across language and emergent literacy domains.
Percentile Ranks
ALL provides grade-based percentile ranks for subtest scores and index scores. Percentile ranks are easy to understand and useful for explaining a child’s performance on ALL relative to the performance of others.
Criterion-Referenced Subtest Scores
Criterion-referenced scores provide a way to compare a child’s performance to a standard (criterion) of performance. The criterion-referenced cut scores for ALL were established by examining the frequency distributions of subtest raw scores by grade and determining the effect of different cut points on the correct classification of children in the sample according to their a priori diagnostic classification (i.e., normal or at-risk).
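The cut-score procedure described above can be illustrated with a small sketch. The raw scores, the a priori classifications, and the function name `classification_accuracy` are hypothetical and are not taken from the ALL standardization data. The sketch also assumes that a raw score at or below the cut point flags a child as at-risk.

```python
def classification_accuracy(scores, at_risk, cut):
    """Return the fraction of children classified correctly at a given cut point.

    A child is flagged as at-risk when the raw score is at or below `cut`
    (an assumption made for this illustration).
    """
    correct = sum((score <= cut) == risk for score, risk in zip(scores, at_risk))
    return correct / len(scores)


# Hypothetical raw scores and a priori classifications (True = at-risk).
scores = [3, 5, 6, 8, 9, 11, 12, 14, 15, 17]
at_risk = [True, True, True, False, True, False, False, False, False, False]

# Evaluate each candidate cut point; ties are resolved in favor of the lowest cut.
results = {cut: classification_accuracy(scores, at_risk, cut) for cut in range(2, 18)}
best_cut = max(results, key=results.get)
```

In practice, overall accuracy is only one possible criterion. A test developer may prefer a cut point that favors sensitivity over specificity, or the reverse, depending on the purpose of the screening.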
Minimizing Item Bias
Precautions were taken to ensure that ALL items are appropriate for a wide range of children from diverse cultural, linguistic, and socioeconomic backgrounds. A panel of speech-language pathologists with expertise in the areas of language, literacy development, and assessment of diverse populations reviewed the ALL test items for content and cultural bias.
ALL items were also submitted to statistical studies of group performance differences with regard to sex, race/ethnicity, and socioeconomic status based on the educational level of the primary caregiver/parent. Traditional bias analysis was conducted using both Mantel-Haenszel (Holland & Thayer, 1998) and item response theory (IRT) methods (Hambleton, 1993). Items that were considered biased were dropped from consideration for the final item sets of the assessment.
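As a rough illustration of the Mantel-Haenszel approach, the sketch below computes the common odds ratio for a single item across strata of children matched on total score. The counts and the function name `mantel_haenszel_odds_ratio` are hypothetical, and this is not presented as the exact procedure used in the ALL studies. A value near 1 suggests that matched children in the two groups had similar odds of answering the item correctly.

```python
def mantel_haenszel_odds_ratio(strata):
    """Compute the Mantel-Haenszel common odds ratio for one item.

    `strata` is an iterable of tuples, one per matched score level:
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for ref_correct, ref_incorrect, focal_correct, focal_incorrect in strata:
        total = ref_correct + ref_incorrect + focal_correct + focal_incorrect
        if total == 0:
            continue  # an empty stratum carries no information
        numerator += ref_correct * focal_incorrect / total
        denominator += ref_incorrect * focal_correct / total
    return numerator / denominator


# Hypothetical counts for two score levels in which both groups have equal odds.
balanced_item = [(10, 10, 5, 5), (20, 5, 16, 4)]
balanced_ratio = mantel_haenszel_odds_ratio(balanced_item)

# Hypothetical counts in which the reference group has higher odds of success.
flagged_item = [(15, 5, 5, 5), (20, 5, 12, 8)]
flagged_ratio = mantel_haenszel_odds_ratio(flagged_item)
```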
Reliability and Validity
The standardization sample data were analyzed for evidence of reliability, including test-retest stability, internal consistency, and interscorer reliability. Validity was evaluated by examining test content, response processes, internal structure, relationship to other diagnostic instruments, and diagnostic accuracy.
Diagnostic Accuracy
The diagnostic accuracy of ALL was evaluated using two diagnostic validity statistics that describe how a test performs: sensitivity and specificity. Sensitivity is the probability that a child who has a language disorder will test positive for it, and specificity is the probability that a child who does not have a language disorder will test negative. The table that follows shows the proportion of children with specific language impairment who were correctly identified (sensitivity) and the proportion of children without specific language impairment who were correctly classified as unimpaired (specificity) when the ALL Language Index Score cutoff is set at 1 and at 1.5 standard deviations below the mean.
Classification of Specific Language Impairment by Language Index Score
Language Index Score SD | Sensitivity | Specificity |
---|---|---|
-1 SD | .98 | .89 |
-1.5 SD | .86 | .96 |
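The two statistics in the table can be computed from paired diagnostic and test outcomes, as in the minimal sketch below. The data and the function name `sensitivity_specificity` are hypothetical and do not reproduce the ALL sample.

```python
def sensitivity_specificity(has_disorder, tested_positive):
    """Return (sensitivity, specificity) from paired boolean outcomes."""
    pairs = list(zip(has_disorder, tested_positive))
    true_pos = sum(d and t for d, t in pairs)
    false_neg = sum(d and not t for d, t in pairs)
    true_neg = sum(not d and not t for d, t in pairs)
    false_pos = sum(not d and t for d, t in pairs)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical sample: five children with a diagnosis, five without.
has_disorder = [True] * 5 + [False] * 5
# Test result for each child: True if the score falls at or below the cutoff.
tested_positive = [True, True, True, True, False,
                   False, False, False, False, True]

sens, spec = sensitivity_specificity(has_disorder, tested_positive)
```

The table illustrates the usual trade-off: moving the cutoff from -1 SD to -1.5 SD lowers sensitivity (.98 to .86) and raises specificity (.89 to .96).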
Standardization Sample
Fall Standardization
The ALL fall standardization data were collected from August through November of 2004. The sample of 300 children closely represents the 2002 U.S. Census for race/ethnicity, socioeconomic status (based on the educational level of the primary caregiver), and geographic region. Equal numbers of males and females were included in the study.
Spring Standardization
The ALL spring standardization data were collected from March through May of 2004. The sample of 300 children closely represents the 2002 U.S. Census for race/ethnicity, socioeconomic status (based on the educational level of the primary caregiver), and geographic region. Equal numbers of males and females were included in the study.