HANDOUT FOR NORM-REFERENCED TEST SCORE INTERPRETATION

The purpose of this handout is to provide instruction on the interpretation of results from norm-referenced tests. When people think of a teacher's job, they seldom think of it as requiring the interpretation of results from standardized tests. However, interpreting such results is actually a very important part of a teacher’s yearly (versus daily) activities.

I know this to be true for two reasons. First, after working in the public schools for six years as a school psychologist, I saw how teachers reacted with puzzlement, confusion, and wonder when I presented results from norm-referenced psychological evaluations. Second, I have been teaching long enough at the University of Delaware to have had undergraduates return and take graduate-level measurement classes with me. After a few years of working in the public schools, these teachers see the impact norm-referenced tests have on children, and they emphasize that someone should have taught them more about norm-referenced test-score interpretation when they were undergraduates!

There is yet another way to demonstrate the importance of norm-referenced test interpretation to classroom teachers. Approximately 15% of all children in the public schools receive special education. To be eligible for special education, federal law (i.e., the Individuals with Disabilities Education Act [IDEA]) specifies that children must receive comprehensive, norm-referenced assessments from Multi-Disciplinary Teams (MDTs). Furthermore, another 5% of the children in the public schools are evaluated by MDTs, but do not qualify for special education. Therefore, around 20% of all children in the public schools are evaluated, at one time or another, by MDTs.

Given the large number of children evaluated by MDTs, the odds are approximately 1 in 5 (i.e., 20%) each year that you, a "regular" education teacher, will refer a child for evaluation. Once you refer a child, you will receive one or more reports about him or her (e.g., a psychologist’s report, an educational diagnostician’s report, etc.). Almost all of the scores in these reports are norm-referenced, and it is the results from these tests that determine whether children: (1) are eligible for special education and (2) are diagnosed as having a handicapping condition such as mental retardation (MR), a learning disability (LD), attention-deficit/hyperactivity disorder (ADHD), conduct disorder (CD), etc. Therefore, as you can see, the norm-referenced assessments conducted by MDTs are "high stakes" and have a significant impact on the lives of children and the regular-education teachers who instruct them.

Perhaps the best way to learn about norm-referenced test interpretation is to begin with a psychological evaluation. You will see one such psychological report just below. The report is fictitious. The child, the names of his parents, teacher, school, etc. are made up. Otherwise, the report is exactly what you would receive as a classroom teacher.

Read the report carefully. There are four major areas covered in a psychological report: IQ-test results (see the "WISC-III" section), adaptive-behavior inventory results (see the "ABAS" section), achievement-test results (see the "WIAT" section), and social-emotional adjustment results (see the "ASCA" section).

Try to determine whether the child is performing above average, average, or below average in each of the four areas. You probably will be able to make the determination based on what the psychologist says in the report (i.e., the report’s text presentation). However, look at the section of the report titled "Synopsis of Formal Test Scores". It is this section of the report that provides the actual, norm-referenced scores obtained by the child. Look at the test scores themselves and see if you can determine whether the child is performing above average, average, or below average based on the scores alone. You probably will not be able to make the determination without learning more about norm-referenced tests.

Also, as surprising as it may sound to you, the actual test scores and what is said about the test scores in a report (i.e., the report’s text presentation) sometimes do not agree with one another! For this reason, as a classroom teacher, you need to know something about norm-referenced test scores. Otherwise, you will be unable to determine whether the test results accurately portray how a child in your classroom is performing academically.

Once you finish reading the psychological report, the other sections of this document will teach you how to interpret norm-referenced test scores. At times, the document will refer back to the child (Billy) discussed in the psychological report and his test scores.

NOTE: Information in this report is fictitious. Any resemblance to real individuals is coincidental.

CONFIDENTIAL: THIS REPORT IS TO BE SHOWN ONLY TO PROFESSIONAL PERSONNEL WORKING WITH THE STUDENT

PSYCHOLOGICAL EVALUATION

NAME: William (Billy) Smith
GENDER: Male
DATE OF BIRTH: 12/12/95
CHRONOLOGICAL AGE: 6-11
RACE: Anglo
PARENTS: William and Susan Smith
ADDRESS: 411 Hanson Driver, Omaha, NE 17111
TELEPHONE: 807-555-1212
PRIMARY TEACHER: Mrs. Hopkins
SCHOOL: Happy Valley Elementary
GRADE: 1
EVALUATION DATES: 11/10/02, 11/12/02, 11/13/02

Evaluation Techniques:

Wechsler Intelligence Scale for Children-Third Edition (WISC-III), Wechsler Individual Achievement Test-Second Edition (WIAT-II), Adaptive Behavior Assessment System (ABAS), Adjustment Scales for Children and Adolescents (ASCA), Structured Developmental History Interview with Parent, Structured Teacher Interview, Review of School Records, Structured (Time Sampling) Classroom Observation, Unstructured Clinical Interview with Student

SYNOPSIS OF FORMAL TEST SCORES

Note to Educ451 students: To aid interpretation, please know that the WISC-III (IQs, factor indexes, and composites), WIAT-II (all scores), ABAS (all scores) are in the IQ metric, which means average=100 and SD=15. The ASCA is in T-scores, which means average=50 and SD=10 (the highest scores are the worst on this test.) The WISC-III subtest scores are average=10 and SD=3.

WISC-III IQs and Subtest Standard Scores

Full Scale IQ: 65     Verbal Scale IQ: 67     Performance Scale IQ: 68

Information      5          Picture Completion     4
Similarities     3          Coding                 6
Arithmetic       5          Picture Arrangement    4
Vocabulary       5          Block Design           5
Comprehension    3          Object Assembly        4
Digit Span       6          Symbol Search          6

WISC-III Factor Indexes

INDEX                           STANDARD SCORE
Verbal Comprehension            68
Perceptual Organization         67
Freedom from Distractibility    75
Processing Speed                80

WIAT-II Composites and Subtest Standard Scores

COMPOSITES                 STANDARD SCORE
Reading                    **
Mathematics                59
Written Language           **
Oral Language              62

**Not calculated prior to age 8

SUBTESTS                   STANDARD SCORE
Word Reading               74
Pseudoword Decoding        60
Numerical Operations       63
Math Reasoning             **
Spelling                   72
Written Expression         **
Listening Comprehension    69
Listening Comprehension    67
Oral Expression            67

**Not calculated prior to age 8

ABAS Composite and Subtest Standard Scores

                        STANDARD SCORE
Composite               70

SUBTESTS                STANDARD SCORE
Communication           60
Community Use           70
Functional Academics    66
Home Living             80
Health and Safety       70
Leisure                 **
Self-Care               85
Self-Direction          85
Social                  60
Work                    **

**Not Administered

Adjustment Scales for Children and Adolescents (ASCA)

COMPOSITES                                  STANDARD SCORE
Over-reactivity                             60
Under-reactivity                            52

SUBTESTS/SCALES
Attention-Deficit/Hyperactivity (ADHD)      60
Solitary Aggressive-Impulsive (SA-I)        50
Solitary Aggressive-Provocative (SA-P)      52
Oppositional Defiance (OD)                  53
Diffidence (DIF)                            50
Avoidance (Avoid)                           51

Reason for Referral:

William (Billy) was referred by his classroom teacher, Mrs. Hopkins. Billy tries hard in school, but he is struggling in all academic areas.

History:

Billy is approaching his seventh birthday (age = 6 years, 11 months). He lives with both of his biological parents, William (age 35) and Susan (age 33) Smith. William is an accountant and Susan works as a purchasing agent. Both Mr. and Mrs. Smith are college graduates. Neither reports having experienced learning difficulties in school. Mr. and Mrs. Smith have lived in the same community (Omaha) throughout their lives.

A developmental history was conducted with Mrs. Smith on 11/10/02. Two children besides Billy live in the home: Mary, age 10 and Ann, age 8. Mary and Ann are Billy’s biological siblings. Parent information and a review of school records reveal that both Mary and Ann are doing well in school.

Billy speaks only English, which he has been exposed to since birth and has been speaking since he first began talking. Mrs. Smith’s pregnancy with Billy, and her delivery, were unremarkable. Billy was born through a Cesarean section, as were his two siblings. However, Billy weighed less than 5 1/2 pounds at birth. His one-minute Apgar score was moderately depressed (score = 7), but the five-minute Apgar was in the healthy range (score = 8). Billy has never been hospitalized, and with the exception of measles, he experienced no childhood illnesses. He currently is taking no prescription medications.

A visual screening was conducted by the school nurse on 10/10/02. Results revealed Billy has normal visual acuity. Also, a hearing test was conducted in school by the speech therapist on 10/20/02 and showed normal auditory acuity.

According to his mother, Billy reached his motor milestones (sitting alone, crawling, standing alone, and walking) within the expected age ranges. Mrs. Smith is concerned because he reached his language milestones (speaking first words and speaking in short sentences) later than expected. Mrs. Smith describes Billy as a happy, cooperative child who gets along with his two older sisters. There are many children in Billy’s neighborhood. Mrs. Smith also indicated that Billy prefers playing with children younger than himself rather than with either his sisters or children his own age. Billy’s favorite activity is playing with trucks. His favorite food is ice cream.

Billy attended a preschool program at age 4 and a half-day kindergarten program last year. In addition, Billy’s first grade is in the same school (Happy Valley Elementary) as his kindergarten class. A review of school records shows he is maintaining good attendance this year and he had an excellent attendance record in kindergarten. A teacher interview was completed with Mrs. Hopkins, Billy’s current teacher. Mrs. Hopkins reports that Billy is very well-behaved. Likewise, Billy has an exemplary conduct record. Regarding academic performance, Mrs. Hopkins indicates that Billy tries very hard in class. At the same time, he is struggling and experiencing many academic difficulties. He is having problems with introductory reading and math skills, and in both areas, he is in Mrs. Hopkins’s lowest teaching groups. School records show the same pattern of academic performance was present in kindergarten. In October, standardized group-achievement tests were administered to all first graders at Happy Valley Elementary School. Results disclose Billy scored far below average in Reading, Math, and Language.

Several pre-referral interventions were attempted with Billy. For instance, Mrs. Hopkins provides one-to-one instruction whenever possible. Billy receives one-to-one tutoring from a community volunteer twice a week for one-half hour. Likewise, the school has a peer tutoring program. Once a week, Billy works with a fourth-grade student who helps him with sight-word identification.

Current Observations:

Billy was evaluated on two occasions in his school. Physically, he presented as appropriate in height and weight for his age. Billy’s dress was clean, and on each occasion, he was well groomed. It is obvious that Billy is well cared for at home. His articulation was clear, and his vision, hearing and gross-motor coordination appeared appropriate. He was somewhat nervous about leaving the classroom to work with the examiner. Nevertheless, Billy grew increasingly relaxed as the first test session progressed; he was cooperative; he regularly helped the examiner put away test materials; and he listened attentively to most test directions and questions. Similarly, Billy was equally relaxed and cooperative during the second test session.

Wechsler Intelligence Scale for Children – Third Edition (WISC-III)

One test administered to Billy was the WISC-III. This instrument evaluates a variety of abilities associated with school success and it is considered to be one of the best predictors of future achievement. The WISC-III does not assess all abilities; for example, it does not assess some specific mechanical aptitudes that may be important to certain occupations and trades. Likewise, it does not measure creativity or how well children get along with others.

The WISC-III provides a progression of scores that can be thought of as forming a triangle. At the top is the Full Scale IQ (FSIQ). This is the best single predictor of school achievement on the WISC-III. Underlying the FSIQ are two scores that permit further distinctions. The first is the Verbal Scale IQ (VIQ). It assesses the ability to think in words and apply language skills and verbal information to solve problems. The second is the Performance Scale IQ (PIQ), which requires fewer verbal skills. It evaluates the ability to think in terms of visual images and manipulate them fluently with relative speed. Another way to think of the PIQ is that it evaluates the ability to organize visually presented material against a time limit. When there is a difference between the VIQ and PIQ, the VIQ is usually the better predictor of school achievement.

Results from the WISC-III indicate Billy may have difficulty keeping up with peers on most tasks requiring age-appropriate thinking and reasoning. His general cognitive ability is within the lower extreme range of intellectual functioning (WISC-III FSIQ = 65).

Billy's ability to think with words is comparable to his ability to reason without the use of words (VIQ = 67, PIQ = 68). Both Billy’s verbal and nonverbal reasoning abilities are in the lower extreme range and align with his overall ability level.

A personal strength for Billy is his ability to process simple information quickly and efficiently (Processing Speed Index [PSI] = 80). Billy’s PSI was his highest result on the WISC-III. The PSI converts to performance at the ninth percentile. In other words, Billy is able to process simple information more quickly than 9 out of every 100 children his age.

Adaptive Behavior Assessment System (ABAS)

Billy’s adaptive functioning skills were assessed to determine his level of social and daily living skills. His mother completed the ABAS during the interview. The ABAS assesses an individual’s personal and community independence, as well as aspects of personal development outside the school setting across 10 areas. These areas include communication skills, self-direction, social interaction skills, health and safety awareness, etc. The 10 skills form a composite and are collectively referred to as “adaptive behavior”.

Results of the ABAS suggest that Billy has limitations in several adaptive skills. Results of this assessment indicate that his skills for personal care including eating, dressing, and bathing are a personal strength (Self-Care standard score = 85). Another personal strength is Billy's skills for independence and responsibility, such as starting and completing tasks, following time limits and directions, and making choices (Self-Direction standard score = 85). However, when compared to same age peers without disability, Billy's speech, language, and listening skills (Communication standard score = 60) and his skills needed for social interaction (Social = 60) appear to be somewhat limited. Billy's mother noted that his vocabulary is restricted compared to other children his age and that he has fewer friends. Likewise, Billy's overall level of adaptive behavior was in the lower extreme range (Composite standard score = 70). The latter three scores, and most of Billy's adaptive behavior results, are commensurate with his overall cognitive functioning, as measured by the WISC-III FSIQ.

Wechsler Individual Achievement Test – Second Edition (WIAT-II)

Billy completed the WIAT-II, which is an individually administered achievement test. The WIAT provides information about children's reading, mathematics, and language performance. Additionally, Billy's teacher provided detailed information on his current academic performance.

Billy's highest level of achievement functioning took place in pre-reading skills. His Word Reading results were at the 4th percentile. He demonstrated evenly developed pre-reading skills and identified some beginning and ending sounds for a few common words (e.g., hat), but had trouble with others (e.g., fish, dish, star). He did not read any common words and had difficulty using phonetic knowledge to sound out nonsense or unfamiliar words (Pseudoword Decoding = below 1st percentile).

Similar to his reading results, Billy's language skills were in the lower extreme range (Oral Language Composite = 1st percentile). In formal testing, he correctly identified pictures of many common objects when presented singly. But, when asked to describe scenes which contained many objects, Billy had difficulty naming more than one or two. His teacher reported that Billy can identify all primary colors, but has difficulty identifying common geometric shapes. The teacher also indicated that Billy occasionally has trouble following orally-presented directions, but that he has improved significantly since the beginning of the year.

Billy's lowest level of achievement functioning appears to be in the mathematics area, where his performance was measured in the lower extreme range (Mathematics Composite = less than 1st percentile). Billy wrote all of the single digit numbers presented; he counted objects up to 10; and he compared shapes according to size. However, he was unable to add 1+2. Billy did not correctly tell time, measure with a ruler, or do subtraction. His teacher reported that Billy identifies numbers and counts up to 10 consistently, but his performance becomes less consistent with higher numbers. She also noted that Billy performs simple addition using his fingers with adult help, but cannot do so for subtraction.

Billy's skills in reading, language, and math were measured commensurate with his estimated cognitive ability.

Social and Emotional Functioning

Billy’s behavior, as rated by his teacher, reflects adequate functioning in most areas. His behavior at school, as measured by the Adjustment Scales for Children and Adolescents (ASCA), was estimated to be predominantly in the adjusted range. However, his teacher rated Billy in the Borderline range for difficulty sustaining attention (ADHD scale standard score = 60). The teacher qualified her observation by noting that this difficulty was present when Billy was in situations where the teacher was addressing the whole class. The teacher reported that Billy is consistently cooperative and has made great strides in social interaction. She noted that Billy is very capable of working independently on academic tasks. During direct observation in the classroom, Billy was measured to be on-task approximately 90% of the time; he occasionally stared off into space and was distracted when another peer whispered to a girl seated next to him.

Similarly, his mother characterizes Billy as very well-behaved and affectionate. He gets along well with his sisters and cousins, and his mother noted that Billy speaks at great length about all the fun he has at school.

Summary:

Billy is approaching his 7th birthday. He is attending first grade. Billy was referred for an educational evaluation by his current teacher due to his minimal progress in attaining basic skills, including oral language and pre-reading skills.

The present evaluation suggests that Billy functions in the lower extreme range of general intellectual ability. Adaptively, delays commensurate with his measured cognitive ability were noted in several areas, including communication and social interaction. Academically, the results of formal testing indicate that Billy is performing at levels that would be expected, given his measured cognitive ability. Behavior-assessment results suggest generally appropriate levels of classroom adjustment.

Evaluated by,

Joseph J. Glutting, Ph.D.

School Psychologist

Happy Valley Public Schools

The direct numerical report of a child’s test performance is the child's raw score (e.g., the number of correct answers). Most often, we cannot interpret raw test scores as we do physical measures such as height, because raw scores in a psychological report have no inherent meaning. Likewise, raw scores are NOT measured in equal units along a line. Therefore, the way one can meaningfully talk about test scores is to bring in a referent. There are two major referents for tests: norm-referencing and criterion-referencing. We already discussed both types of referents earlier in the course. Now, as a result of the psychological report for Billy, we will pay particular attention to instruments that facilitate norm-referenced comparisons.

The basic difference between norm- and criterion-referenced tests is their interpretation; that is, how we derive meaning from a score. Norm-referenced tests are constructed to provide information about the relative status of children. Thus, they facilitate comparisons of a child's score with the score distribution (i.e., the mean and standard deviation) of some norm group. As a result, the meaningfulness of these scores depends on:

Before we learn how to interpret Billy’s test scores, we need to learn why the norm group in norm-referenced test interpretations is so important.

The American Psychological Association (APA), the American Educational Research Association (AERA), and the National Council on Measurement in Education (NCME) (1985) clearly state that it is the test publisher's responsibility to develop suitable norms for the groups on whom the test is to be used. There are four major types of norms. Which of the four types of norms is used by a psychologist (or a school district when conducting group testing) can have a radical impact on the interpretation of a child’s test results.

As we already know from our earlier lesson on statistics, the basic standard score is the z-score. We also know that once we obtain a z-score, it is a simple process to convert a z-score to a T-score, IQ score, and such.

The mean for a full set of z-scores is set at zero and the standard deviation is set at 1.0. Stated simply, z-scores are raw scores expressed in standard deviation units from the mean. Further, we know that a major advantage of standard scores is that they are measured in equal units.

I am going to give you a lot of help with problem 2 just above. The correct answer is z = -1.0. The answer shows that z-scores below the mean have negative values. In order to get enough precision when using z-scores, we must use at least one decimal place. This makes z-scores such as -1.0 awkward. Another drawback is that approximately half of all z-scores are negative.

Let’s consider again the case of Billy. He obtained a WISC-III FSIQ of 65. This number may not mean much to you yet, but it is a pretty low IQ. It is possible to convert his IQ to a z-score. When you do so, a WISC-III FSIQ of 65 converts to a z-score of -2.33. How would you like to tell Billy’s parents that his IQ was negative? I know I wouldn’t, and it’s for this reason that tests use metrics other than the z-score.
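The conversion from an IQ-metric score to a z-score can be sketched in a few lines of Python. This is only an illustration: it assumes the IQ metric given in the report synopsis (mean 100, SD 15), the function name `to_z` is hypothetical, and the two example scores are Billy's Verbal Scale IQ and Processing Speed Index from the report.

```python
def to_z(score: float, mean: float, sd: float) -> float:
    """Express a score in standard-deviation units from the mean."""
    return (score - mean) / sd

# IQ metric from the report synopsis: mean = 100, SD = 15.
IQ_MEAN, IQ_SD = 100, 15

# Scores taken from Billy's report.
viq_z = to_z(67, IQ_MEAN, IQ_SD)  # Verbal Scale IQ
psi_z = to_z(80, IQ_MEAN, IQ_SD)  # Processing Speed Index

print(round(viq_z, 2))  # -2.2
print(round(psi_z, 2))  # -1.33
```

Both results are negative and need decimals, which is exactly the awkwardness described above.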

Another way of saying all of the above is that we can avoid negative scores and decimals by simply using a standard score with a mean sufficiently greater than 0 to avoid minus score values, and a standard deviation sufficiently greater than 1 to make decimals unnecessary.

We already learned the general formula for converting z-scores to other standard scores:

New standard score = (z × new SD) + new mean

For example, Wechsler's intelligence tests (WISC-III, WAIS-III) use this form:

IQ = 15(z) + 100

Many behavior rating scales use T-scores. You can convert z-scores to this form as follows:

T = 10(z) + 50

Johnny obtains a z-score of -2.0. What numerical value would his score be if we converted it to Wechsler IQ units, SAT units, and T-score units?

As can be seen from this, IQs, SATs, and T-scores have all the properties of z-scores without the awkwardness resulting from negative scores and decimal points.
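Johnny's conversion can be worked through in Python. This is a sketch under stated assumptions: the IQ and T-score parameters come from the synopsis note in Billy's report, while the SAT parameters (mean 500, SD 100) are the traditional SAT section scale and are an assumption here, since this handout does not state them. The helper name `from_z` is hypothetical.

```python
def from_z(z: float, mean: float, sd: float) -> float:
    """Convert a z-score to a metric with the given mean and SD."""
    return z * sd + mean

# (mean, SD) for each metric. The SAT values are an assumption
# (traditional section scale); the others come from the report synopsis.
METRICS = {
    "Wechsler IQ": (100, 15),
    "SAT": (500, 100),
    "T-score": (50, 10),
}

johnny_z = -2.0
converted = {
    name: from_z(johnny_z, mean, sd) for name, (mean, sd) in METRICS.items()
}

for name, value in converted.items():
    print(f"{name}: {value:g}")
```

Under these assumptions, a z-score of -2.0 becomes 70 in Wechsler IQ units, 300 in SAT units, and 30 in T-score units.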

To find this out, we need to convert both scores to a common unit, the z-score. All we have to do is use the formula for a z-score.

We already discussed other common, standard-score metrics during our lesson on statistics. However, they are so important that I will present them again:

Wechsler subtest units (e.g., the Information, Similarities, and other subscales of the WISC-III): mean = 10, SD = 3

We need to discuss some other types of derived scores (i.e., converted from raw scores). There are several types of derived scores that give a child’s relative status. Like standard scores, these other relative-status scores are derived from raw scores. However, these other relative-status scores are not standard scores.

Remember, standard scores present everything in equal units. This means we can add, subtract, multiply, and divide standard scores. We cannot add, subtract, multiply, and divide the other types of relative-status scores.

Besides standard scores, three other types of relative-status scores are commonly used by MDTs: (a) percentiles, (b) grade equivalents, and (c) age equivalents. We will now discuss each type of relative-status score.

Percentiles. A percentile is the point in a score distribution BELOW which a certain percentage of the people fall. Thus, if a person obtains a percentile score of 50, it means that 50 percent of the population falls below this person. Likewise, if a person gets a percentile score of 75, it means that 75 percent of the population falls below this person.
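The definition can be made concrete by counting. The sketch below computes the percentage of a norm group that falls below a given score; the norm-group scores are made up for illustration, and the function name `percentile_rank` is hypothetical. Real norm groups contain hundreds or thousands of children.

```python
from typing import Sequence

def percentile_rank(score: float, norm_scores: Sequence[float]) -> float:
    """Percentage of the norm group scoring strictly below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)

# Hypothetical raw scores for a tiny norm group of 20 children.
norm_group = [3, 5, 6, 7, 8, 8, 9, 10, 10, 11,
              11, 12, 12, 13, 14, 15, 16, 17, 19, 20]

print(percentile_rank(11, norm_group))  # 45.0
print(percentile_rank(16, norm_group))  # 80.0
```

In this made-up group, a raw score of 16 sits at the 80th percentile because 16 of the 20 children scored below it.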

Percentiles are not standard scores. The reason is that percentiles are expressed in ordinal units (ranks). All that the term "ordinal" means is that the distance between units (i.e., percentile numbers) is not equal. In other words, the distance between the 49th and 50th percentiles is much smaller than the distance between the 1st and 2nd percentiles. The reason is that the 49th and 50th percentiles are near the middle of the bell-shaped curve and the 1st and 2nd percentiles are at one "tail" of the bell-shaped curve. As strange as it may seem (and I will show you this in class), the distance between the 2nd and 16th percentiles is about the same as that between the 16th and 50th percentiles, because each spans roughly one standard deviation!

Although widely used, percentiles suffer from two serious limitations. One limitation is that the size of percentile units is not constant in terms of standard-score units. We just covered this limitation above, but I will repeat it to be thorough. For example, if the distribution of test scores is a normal, bell-shaped curve, the distance between the 90th and 99th percentiles is much greater than the distance between the 50th and 59th percentiles. A change of one standard-score unit near the mean of a test may alter a percentile score by many units, while a change of one standard-score unit at the tail of the distribution may not change the percentile score at all!
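The unequal spacing can be checked numerically. The sketch below assumes a perfectly normal distribution and uses Python's standard-library `statistics.NormalDist`; the helper name `z_gap` is hypothetical. It reports how many standard deviations separate pairs of percentiles.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1, so results are in z-score units.
curve = NormalDist()

def z_gap(lower_pct: float, upper_pct: float) -> float:
    """Distance in SD units between two percentiles on the normal curve."""
    return curve.inv_cdf(upper_pct / 100) - curve.inv_cdf(lower_pct / 100)

for lower, upper in [(49, 50), (1, 2), (50, 59), (90, 99)]:
    gap = z_gap(lower, upper)
    print(f"percentiles {lower} to {upper}: {gap:.2f} SD")
```

Under the normality assumption, the gaps come out to roughly 0.03, 0.27, 0.23, and 1.04 standard deviations, so the same number of percentile points covers very different distances depending on where it falls on the curve.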

A second limitation of percentiles is that gains and losses cannot be compared meaningfully because percentiles are not measured in equal units. Thus, because the units are not equal, you cannot add, subtract, multiply, or divide percentiles.

Percentile scores can be very deceiving! Let’s consider the psychological report for a second student, Kelly. Her standard score in mathematics on the WIAT was 86. This score converts to a percentile score of 17. A standard score of 86 is in the Average range of achievement. However, most teachers would say that a child whose mathematics score is at the 17th percentile is having big trouble academically. This simply is not the case! Yes, like her classroom teacher, the psychologist would prefer to see Kelly have a much higher achievement level. However, a score at the 17th percentile is not all that low. Psychologists know this fact. It is not until a score is at about the 5th percentile, or lower, that it suggests a need for special education. Because percentiles are misinterpreted so often, I tell graduate students that, in general, it is best not to present them in their psychological reports. (Note: the psychological report did present percentiles for the case of Billy because I wanted to show you the problems they can pose.)

Standard scores are clearly better to interpret than percentiles. Furthermore, once you know a child’s standard score on a test it is relatively easy to translate the standard score to a percentile. You can use standard score-to-percentile conversion tables to do this without having to make any calculations of the sort described earlier. In class, I will show you how to use such tables. Here are four:
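A conversion table of this kind can also be approximated directly from the normal curve. The sketch below assumes normally distributed scores in the IQ metric (mean 100, SD 15) and uses the standard-library `statistics.NormalDist`; the name `to_percentile` is hypothetical. Published tables are built from each test's actual norm group and rounding rules, so they may differ by a point from these values.

```python
from statistics import NormalDist

# Assumes normally distributed scores in the IQ metric (mean 100, SD 15).
iq_metric = NormalDist(mu=100, sigma=15)

def to_percentile(standard_score: float) -> float:
    """Percentage of the norm group expected to fall below the score."""
    return 100 * iq_metric.cdf(standard_score)

# Approximate table for standard scores from 55 to 145 in steps of 5.
table = {score: to_percentile(score) for score in range(55, 146, 5)}
for score, pct in table.items():
    print(f"{score:>3}  {pct:5.1f}")

# Kelly's mathematics score from the example above.
print(f"{to_percentile(86):.1f}")
```

Under these assumptions, a standard score of 80 lands near the 9th percentile and Kelly's 86 lands between the 17th and 18th, in line with the figures quoted in this handout.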

Like percentiles, age- and grade-equivalents are two other types of derived scores. However, percentiles, age-equivalents, and grade-equivalents are not standard scores.

We already know that percentile scores can cause problems for interpretation. The truth of the matter is that age- and grade-equivalents are far worse to interpret than percentiles!

Age equivalents are intended to convey the meaning of test performance in terms of the typical child at a given age. Likewise, grade equivalents attempt to provide information in terms of the typical child at a given grade level.

Grade equivalents are the most common method for reporting results on standardized achievement tests prior to high school (Echternach, 1977). Although grade equivalents are very popular, they also are very problematic. Approximately 20 years ago, the APA, AERA, and NCME proposed that they be banned. Unfortunately, this never occurred.

Age- and grade-equivalents are essentially the same thing, except that age-equivalents compare children to other children who are at the same age level, whereas grade-equivalents compare children to others at their grade level. Therefore, because grade-equivalents are more popular than age-equivalents, the rest of the document will discuss grade equivalents.

Grade-equivalents can be explained best by an example. If a student obtains a raw score on a test that is equal to the median score for all the beginning sixth-graders (September testing) in the norm group, then that student is given a grade-equivalent of 6.0. A student who obtains a score equal to the median score of all beginning fifth-graders is given a grade equivalent of 5.0. If a student should score between these two points, "interpolation" would be used to determine the grade equivalent. Because most schools run for 10 months, successive months are expressed as decimals. Thus, 5.1 would refer to the average performance of fifth graders in October, 5.2 in November, and so on to 5.9 in June.
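The interpolation step can be sketched as follows. The median raw scores are made up for illustration, the function name `grade_equivalent` is hypothetical, and straight-line interpolation is an assumption; a test publisher's actual procedure may differ.

```python
def grade_equivalent(raw, lower_ge, lower_median, upper_ge, upper_median):
    """Linearly interpolate a grade equivalent between two norm-group medians."""
    fraction = (raw - lower_median) / (upper_median - lower_median)
    return lower_ge + fraction * (upper_ge - lower_ge)

# Hypothetical norm-group medians from September testing.
MEDIAN_GRADE_5 = 30  # median raw score of beginning fifth-graders (GE 5.0)
MEDIAN_GRADE_6 = 40  # median raw score of beginning sixth-graders (GE 6.0)

ge = grade_equivalent(34, 5.0, MEDIAN_GRADE_5, 6.0, MEDIAN_GRADE_6)
print(round(ge, 1))  # 5.4
```

In this made-up case, a raw score of 34 lies four-tenths of the way between the two medians, so it is assigned a grade equivalent of 5.4. Notice that no fifth-grader was actually tested in January to produce that value.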

Grade-equivalents have a great deal of intuitive appeal because parents, teachers, and many psychologists think the numbers actually mean something. However, this is not the case. By way of example, most parents and teachers (and many psychologists) would assume that a fifth-grade child who obtains a grade equivalent of 3.2 has the same reading skills as a third-grade child who obtains a grade equivalent of 3.2. This simply is not true! The fifth-grade child actually knows more about reading! Thus, this short example shows some of the problems associated with grade equivalents.

If you do not believe what I just said about grade equivalents (or even if you do), it would be worthwhile to read the question and answer section of "Hills Handy Hints" devoted to grade equivalents.

We are now going to discuss the limitations of grade equivalents, but the problems cited for grade equivalents also apply to age equivalents.

Grade equivalents suffer from at least 7 major limitations. I will now present each of these limitations.

Grade equivalents remain popular in spite of their inadequacies. Educators are under the impression that such scores are easily and correctly interpreted, which is an unfortunate assumption. At a minimum, it is appropriate to suggest that grade equivalents never be used alone without some other type of score, such as standard scores or percentile ranks. It may not even be too dogmatic to suggest that we stop using these scores altogether.

You probably noticed that the psychological report for Billy presented neither age- nor grade-equivalents. The reason, quite simply, is that their metrics are so bad.