Overheads for Unit 9--Chapter 14 (Assembling,
Evaluating a Test)
Review of Steps in the Test Construction Process
- Select specific learning outcomes
- Identify appropriate item types for those outcomes
- Design items to measure the intended outcomes
This week's topic
- Assemble test
- Prepare directions
- Administer the test
- Evaluate the test's results
Assembling the Test
- Record items on index cards
- See example on p. 354 for info to include
- File them in an item bank
Double-check all individual test items
- Review the checklist for each item type (pp. 178, 185, 190, 214,
Double-check the items as a set
- Still follows the table of specifications?
- Enough items for desired interpretations?
- Difficulty level appropriate?
- Items non-overlapping so they don't give clues?
Arrange items appropriately, which usually means:
- Keep all items of one type together
- Put lowest-level item types first (T/F, matching, short-answer, MC, interpretive, restricted-response (RR) essay, and then extended-response (ER) essay)
- Within item types, put easiest learning outcomes first (knowledge, comprehension, application, etc.)
- Administer time-consuming extended-response essays and performance-based tasks separately
- Why put items of a type together? Clearer and more efficient.
- Why put easiest items first? Motivational.
Prepare directions
- How to allot time
- How to respond (pick best alternative, etc.)
- How and where to record answers (circle, etc.; same vs. separate page)
- How guessing will be treated (or whether to answer all questions)
- How extended essays will be evaluated (accuracy, organization,
Reproduce the test
- Leave ample white space on every page
- List multiple choice options vertically
- Keep all parts of an item on the same page
- The introduction to an interpretive item may be on a facing page
- When not using a separate answer sheet, provide spaces for answering down one side of the page (preferably the left)
- When using a separate answer sheet, consult the example on p. 354
- Number items consecutively
Administering the Test
The guiding principle
- Provide conditions that give all students a fair chance to show what they know
- Light, ventilation, quiet, etc.
- Avoid inducing test anxiety
- Try to reduce test anxiety
- Don't give test when other events will distract
- Don't talk unnecessarily before the test
- Minimize interruptions
- Don't give hints to individuals who ask about items
- Discourage cheating (see hints on p. 358)
- Give students plenty of time to take the test.
Scoring the Test
- Prepare stencils when useful
- When using stencil with holes, make sure that students marked only one alternative
- When response wrong, put red mark through correct answer
- Apply formula for guessing only when a test is speeded
- Weight all items the same (doing otherwise seldom makes a difference and only confuses scoring)
- Use your carefully developed rubrics!
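For reference, the usual correction for guessing subtracts a fraction of the wrong answers from the number right: score = R - W/(n - 1), where n is the number of alternatives per item. A minimal sketch in Python, assuming omitted items count as neither right nor wrong; the function name is hypothetical and not taken from the textbook.

```python
def corrected_score(num_right: int, num_wrong: int, num_alternatives: int) -> float:
    """Correction for guessing: R - W / (n - 1).

    Assumption of this sketch: omitted items are excluded from num_wrong.
    """
    if num_alternatives < 2:
        raise ValueError("an item needs at least two alternatives")
    return num_right - num_wrong / (num_alternatives - 1)


# 30 right and 9 wrong on four-option multiple-choice items
print(corrected_score(30, 9, 4))  # prints 27.0
```

With two-option (T/F) items the penalty is one point per wrong answer, which is why the correction matters most when many students guess at unreached items on a speeded test.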
Appraising the Test
Post-administration appraisal: Selection-type tests
Post-administration appraisal: Essay and performance tests
- If possible, at least the shadow of an item analysis
To determine whether each item was effective
- Functioned as intended?
- Appropriate difficulty?
- Free of clues and technical defects?
- All distracters effective?
- Basis for efficient class discussion of results
Basis for remedial work
- Can better focus discussion to hit trouble spots in learning
- Can short-circuit complaints by pointing out (and rescoring) defective items
Basis for changing instruction
- May see a need to revisit some topics
Basis for greater skill in test construction
- When material consistently too easy or difficult
- Persisting errors
- Feedback on what worked and what didn't
How to conduct a simple item analysis
- See the instructions on the syllabus under Week 10
- Note: Those instructions cover material that the textbook does not. Specifically, they discuss maximum possible discrimination and how to use those numbers to evaluate an item's discrimination index.
- Note also, to avoid confusion, that the format for recording item analysis results is different in the textbook.
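The syllabus instructions are not reproduced here. As a rough illustration only, the sketch below uses the common upper-lower groups method: compare how many students in the top-scoring and bottom-scoring groups answered an item correctly. It assumes equal-sized groups, and the function and parameter names are hypothetical.

```python
from typing import Tuple


def item_analysis(upper_correct: int, lower_correct: int, group_size: int) -> Tuple[float, float]:
    """Return (difficulty, discrimination) for one item.

    difficulty     = proportion of both groups answering correctly
    discrimination = (upper correct - lower correct) / group size
    Assumes the upper and lower groups each contain group_size students.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    for count in (upper_correct, lower_correct):
        if not 0 <= count <= group_size:
            raise ValueError("counts must lie between 0 and group_size")
    difficulty = (upper_correct + lower_correct) / (2 * group_size)
    discrimination = (upper_correct - lower_correct) / group_size
    return difficulty, discrimination


# 8 of 10 upper-group and 4 of 10 lower-group students answered correctly
difficulty, discrimination = item_analysis(upper_correct=8, lower_correct=4, group_size=10)
print(f"difficulty = {difficulty:.2f}, discrimination = {discrimination:.2f}")
# prints: difficulty = 0.60, discrimination = 0.40
```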
Cautions in Interpreting Item Analysis Results
- Do not assume that better discriminating power means higher validity
- Because we almost never have proof that the total score is valid, so
we can't conclude that items that correlate with it are valid
Do not assume that low discriminating power means low validity.
- Because items don't have to be defective to have low discrimination indexes
- Because low indexes are the rule in classroom tests
- Why are so many indexes low?
- Because tests usually measure different learning outcomes (knowledge vs. understanding) that are not highly correlated. This means that items from the smaller segment of a test cannot possibly correlate highly with the total score (have high discrimination indexes) when the total score results mostly from other types of learning outcomes.
- Because an item's ability to discriminate depends on its difficulty level (we have discussed this in class)
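One way to see the link between difficulty and discrimination: with equal-sized upper and lower groups, the index is largest when every correct answer comes from the upper group, which caps it at 2p for hard items and 2(1 - p) for easy ones (p = proportion correct). The sketch below assumes the upper-lower index; the syllabus may define maximum possible discrimination differently, and the function name is hypothetical.

```python
def max_discrimination(difficulty: float) -> float:
    """Largest possible upper-lower discrimination index at a given difficulty.

    Assumes equal-sized upper and lower groups; difficulty is the
    proportion of both groups answering correctly (0.0 to 1.0).
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be between 0 and 1")
    return 2 * min(difficulty, 1 - difficulty)


for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"difficulty {p:.1f}: maximum discrimination {max_discrimination(p):.1f}")
```

The ceiling reaches 1.0 only when half the students answer correctly, so a very easy or very hard item can have a low index without being defective.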
Keep non-defective items with low discrimination indexes when they are
needed to cover the domain of learning outcomes. Discarding them
Don't put too much weight on results from small samples (chance
factors make them unstable)
Remember that discrimination and difficulty depend on many factors (the
mix and level of abilities in a classroom, instruction given, etc.) and may
not apply to your next group.
- An item is technically adequate if:
- It discriminates positively
- All its alternatives are functioning (they also discriminate)
- It has no apparent defects
But technical adequacy isn't enough
- What is most important?
- Whether it measures an important learning outcome!
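The first two technical checks can be approximated from response counts. The sketch below assumes equal-sized upper and lower groups and treats a distracter as functioning when more lower-group than upper-group students choose it. Names and data are hypothetical. It cannot detect wording defects or judge whether the outcome measured is important; those still require reading the item.

```python
from typing import Dict, List


def nonfunctioning_distracters(upper: Dict[str, int], lower: Dict[str, int], key: str) -> List[str]:
    """Return distracters that attract no more lower-group than upper-group students."""
    flagged = []
    for option in sorted(set(upper) | set(lower)):
        if option == key:
            continue  # the keyed answer is checked separately
        if lower.get(option, 0) <= upper.get(option, 0):
            flagged.append(option)
    return flagged


def passes_count_checks(upper: Dict[str, int], lower: Dict[str, int], key: str) -> bool:
    """True if the item discriminates positively and every distracter functions."""
    discriminates_positively = upper.get(key, 0) > lower.get(key, 0)
    return discriminates_positively and not nonfunctioning_distracters(upper, lower, key)


# Hypothetical counts of students choosing each alternative (keyed answer: B)
upper = {"A": 1, "B": 8, "C": 1, "D": 0}
lower = {"A": 3, "B": 4, "C": 1, "D": 2}
print(nonfunctioning_distracters(upper, lower, key="B"))  # prints ['C']
print(passes_count_checks(upper, lower, key="B"))         # prints False
```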
Applying Item Analysis to Performance-Based Assessments
Applicability is limited
- Because there are usually so few items (i.e., there is no separate total score to correlate an item's responses with)
What can be done, sometimes
- When tasks are given a score on a scale (say, from 1-6) and
- When there is more than one task (so there is a meaningful total score),
- Then see whether the averages on each task differ for low and high scorers on the full assessment
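The comparison described above might be sketched as follows. This is an illustration under assumptions, not the textbook's procedure: students are ranked by total score and split into a top and bottom half (the middle student is dropped when the count is odd), and all names and scores are hypothetical.

```python
from typing import Dict, List, Tuple


def task_means_by_group(scores: Dict[str, List[int]]) -> List[Tuple[int, float, float]]:
    """For each task, return (task number, high-group mean, low-group mean).

    scores maps each student to a list of task scores in the same task order.
    """
    ranked = sorted(scores.values(), key=sum, reverse=True)
    half = len(ranked) // 2
    if half == 0:
        raise ValueError("need at least two students")
    high, low = ranked[:half], ranked[-half:]
    results = []
    for task in range(len(ranked[0])):
        high_mean = sum(student[task] for student in high) / half
        low_mean = sum(student[task] for student in low) / half
        results.append((task + 1, high_mean, low_mean))
    return results


# Hypothetical scores on three tasks, each rated on a 1-6 scale
scores = {
    "s1": [6, 5, 6], "s2": [5, 5, 4], "s3": [4, 3, 4],
    "s4": [3, 2, 4], "s5": [2, 3, 3], "s6": [1, 2, 4],
}
for task, high_mean, low_mean in task_means_by_group(scores):
    print(f"task {task}: high {high_mean:.2f}, low {low_mean:.2f}, "
          f"difference {high_mean - low_mean:.2f}")
```

A task whose averages barely differ between the two groups (task 3 in this made-up data) is the one to look at more closely.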
Building an Item Bank
- Build one. Later, you'll be very happy you did.
- Item banks are especially useful for the hardest-to-construct items (understanding and application)
- Consult the websites listed on pp. 372-373 for useful information