- Record items on index cards
- See example on p. 354 for info to include
- File them in an item bank
- Double-check all individual test items
- Review the checklist for each item type (pp. 178, 185, 190, 214, 232, 248)
- Double-check the items as a set
- Still follows the table of specifications?
- Enough items for desired interpretations?
- Difficulty level appropriate?
- Items non-overlapping, so that no item gives clues to another?
- Arrange items appropriately, which usually means:
- Keep all items of one type together
- Put lowest-level item types first (T/F, matching, short-answer, MC, interpretive, restricted-response essay, and then extended-response essay)
- Within item types, put easiest learning outcomes first (knowledge, comprehension, application, etc.)
- Administer time-consuming extended-response essays and performance-based tasks separately
- Why put items of a type together? Clearer and more efficient.
- Why put easiest items first? Motivational.
- Prepare directions
- How to allot time
- How to respond (pick best alternative, etc.)
- How and where to record answers (circle, etc.; same vs. separate page)
- How guessing will be treated (or whether to answer all questions)
- How extended essays will be evaluated (accuracy, organization, etc.)
- Reproduce the test
- Leave ample white space on every page
- List multiple choice options vertically
- Keep all parts of an item on the same page
- The introduction to an interpretive item may be on a facing page
- When not using a separate answer sheet, provide spaces for answering down one side of the page (preferably the left)
- When using a separate answer sheet, consult the example on p. 354
- Number items consecutively
- Proofread
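The arrangement rule above (keep item types together, lowest-level types first, then easiest learning outcomes first within a type) amounts to a two-key sort. The sketch below is only an illustration: the `Item` fields and the two rank lists are hypothetical names introduced here, and only the three outcome levels the notes spell out are included.

```python
from dataclasses import dataclass

# Item types in the order given in the notes, lowest level first.
TYPE_ORDER = ["T/F", "matching", "short-answer", "MC",
              "interpretive", "RR essay", "ER essay"]
# Learning outcomes named in the notes, easiest first (assumed incomplete).
OUTCOME_ORDER = ["knowledge", "comprehension", "application"]


@dataclass
class Item:
    number: int      # draft item number (hypothetical field)
    item_type: str   # one of TYPE_ORDER
    outcome: str     # one of OUTCOME_ORDER


def arrange(items):
    """Sort by item type, then by learning outcome; ties keep draft order."""
    return sorted(
        items,
        key=lambda it: (TYPE_ORDER.index(it.item_type),
                        OUTCOME_ORDER.index(it.outcome)),
    )


draft = [
    Item(1, "MC", "application"),
    Item(2, "T/F", "comprehension"),
    Item(3, "MC", "knowledge"),
    Item(4, "T/F", "knowledge"),
]
order_numbers = [it.number for it in arrange(draft)]
```

After arranging, the items would be renumbered consecutively, as the reproduction checklist requires.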
OH 3
Administering the Test
The guiding principle
- Provide conditions that give all students a fair chance to show what they know
Physical conditions
- Light, ventilation, quiet, etc.
Psychological conditions
- Avoid inducing test anxiety
- Try to reduce test anxiety
- Don’t give test when other events will distract
Suggestions
- Don’t talk unnecessarily before the test
- Minimize interruptions
- Don’t give hints to individuals who ask about items
- Discourage cheating (see hints on p. 358)
- Give students plenty of time to take the test
OH 4
Scoring the Test
Selection-type items
- Prepare stencils when useful
- When using a stencil with holes, make sure that students marked only one alternative per item
- When a response is wrong, put a red mark through the correct answer
- Apply the correction-for-guessing formula only when a test is speeded
- Weight all items the same (doing otherwise seldom makes a difference and only confuses scoring)
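The notes do not state the guessing formula. The sketch below assumes the standard correction, score = R - W / (k - 1), where R is the number right, W the number wrong (omitted items are not counted), and k the number of alternatives per item. The function name is hypothetical.

```python
def corrected_score(right: int, wrong: int, n_alternatives: int) -> float:
    """Standard correction for guessing: R - W / (k - 1).

    Omitted items count as neither right nor wrong.
    """
    if n_alternatives < 2:
        raise ValueError("an item needs at least two alternatives")
    return right - wrong / (n_alternatives - 1)


# 30 right and 8 wrong on five-option multiple choice: 30 - 8/4
example = corrected_score(right=30, wrong=8, n_alternatives=5)
```

For true-false items (k = 2) the correction reduces to rights minus wrongs.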
Supply-type items
- Use your carefully developed rubrics!
OH 5
Appraising the Test
Pre-administration appraisal
Post-administration appraisal—Selection-type tests
Post-administration appraisal—Essay and performance tests
- If possible, at least the shadow of an item analysis
OH 6
Item Analysis
Aim
To determine whether each item was effective
- Functioned as intended?
- Appropriate difficulty?
- Free of clues and technical defects?
- All distracters effective?
Value
- Basis for efficient class discussion of results
- Can better focus discussion to hit trouble spots in learning
- Can short-circuit complaints by pointing out (and rescoring) defective items
- Basis for remedial work
- May see a need to revisit some topics
- Basis for changing instruction
- When material consistently too easy or difficult
- Persisting errors
- Basis for greater skill in test construction
- Feedback on what worked and what didn’t
How to conduct a simple item analysis
- See the instructions on the syllabus under Week 10
- Note: Those instructions cover material that the textbook does not. Specifically, they discuss maximum possible discrimination and how to use those numbers to evaluate an item’s discrimination index.
- Note also, to avoid confusion, that the format for recording item analysis results is different in the textbook.
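The syllabus instructions are not reproduced in these notes. As a rough sketch only, the code below assumes the common upper-group/lower-group procedure with equal group sizes: difficulty is the proportion of both groups answering correctly, the discrimination index is the difference between the groups' correct counts divided by the group size, and the maximum possible discrimination is the index an item would get if as many of its correct answers as possible fell in the upper group. All function names are hypothetical.

```python
def item_difficulty(upper_correct, lower_correct, group_size):
    """Proportion of the two groups combined that answered correctly."""
    return (upper_correct + lower_correct) / (2 * group_size)


def discrimination_index(upper_correct, lower_correct, group_size):
    """Upper-group minus lower-group correct counts, per group member."""
    return (upper_correct - lower_correct) / group_size


def max_possible_discrimination(upper_correct, lower_correct, group_size):
    """Largest index attainable with the same total number correct.

    Reached when the upper group holds as many of the correct
    answers as it can, and the lower group holds the remainder.
    """
    total_correct = upper_correct + lower_correct
    best_upper = min(total_correct, group_size)
    best_lower = total_correct - best_upper
    return (best_upper - best_lower) / group_size


# Example: groups of 10; 8 upper and 4 lower students answered correctly.
p = item_difficulty(8, 4, 10)
d = discrimination_index(8, 4, 10)
d_max = max_possible_discrimination(8, 4, 10)
```

Comparing `d` with `d_max` shows how much of the attainable discrimination an item achieved at its difficulty level. The maximum is highest for items of middle difficulty and shrinks toward zero as items become very easy or very hard.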
OH 8
Cautions in Interpreting Item Analysis Results
Cautions
- Do not assume that better discriminating power means higher validity
- Why?
- Because we almost never have proof that the total score is valid, so we can’t conclude that items that correlate with it are valid either.
- Do not assume that low discriminating power means low validity
- Why?
- Because items don’t have to be defective to have low discrimination indexes
- Because low indexes are the rule in classroom tests
- Why are so many indexes low?
- Because tests usually measure different learning outcomes (knowledge vs. understanding) that are not highly correlated. This means that items from the smaller segment of a test cannot possibly correlate highly with the total score (have high discrimination indexes) when the total score results mostly from other types of learning outcomes.
- Because an item’s ability to discriminate depends on its difficulty level (we have discussed this in class)
- Keep non-defective items with low discrimination indexes when they are needed to cover the domain of learning outcomes. Discarding them lowers validity.
- Don’t put too much weight on results from small samples (chance factors make them unstable)
- Remember that discrimination and difficulty depend on many factors (the mix and level of abilities in a classroom, instruction given, etc.) and may not apply to your next group.
Bottom line
- An item is technically adequate if:
- It discriminates positively
- All its alternatives are functioning (they also discriminate)
- It has no apparent defects
- But technical adequacy isn’t enough
- What is most important?
- Whether it measures an important learning outcome!
OH 9
Applying Item Analysis to Performance-Based Assessments
Applicability is limited
- Why?
- Because there are usually so few items (i.e., there is no separate total score to correlate an item’s responses with)
What can be done, sometimes
- When tasks are given a score on a scale (say, from 1-6) and
- When there is more than one task (so there is a meaningful total score),
- Then see whether the averages on each task differ for low and high scorers on the full assessment
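The comparison above can be sketched as follows. The notes do not say how to form the low and high groups; this illustration assumes a simple split into bottom and top halves by total score (a middle student is left out when the class size is odd). The function name and sample scores are hypothetical.

```python
from statistics import mean


def task_means_by_group(scores):
    """Return a (low_group_mean, high_group_mean) pair for each task.

    scores: one list of task scores per student, all the same length.
    Students are ranked by total score and split into bottom and top halves.
    """
    ranked = sorted(scores, key=sum)
    half = len(ranked) // 2
    if half == 0:
        raise ValueError("need at least two students")
    low, high = ranked[:half], ranked[-half:]
    n_tasks = len(scores[0])
    return [
        (mean([s[t] for s in low]), mean([s[t] for s in high]))
        for t in range(n_tasks)
    ]


# Four students, three tasks, each task scored on a 1-6 scale.
class_scores = [[2, 3, 1], [5, 6, 4], [3, 2, 2], [6, 5, 5]]
comparison = task_means_by_group(class_scores)
```

A task whose high-group average is not clearly above its low-group average would deserve a closer look.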
OH 10
Building an Item Bank
Simple advice
- Build one. Later, you’ll be very happy you did.
- Item banks are especially useful for the hardest-to-construct items (understanding and application)
- Consult the websites listed on pp. 372-373 for useful information
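An item bank can also be kept electronically. The sketch below treats each record as the equivalent of one index card. The field names are assumptions made for illustration and are not the list of information shown in the textbook example; `BankItem` and `ItemBank` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class BankItem:
    text: str        # the item as presented to students
    item_type: str   # e.g. "MC", "T/F"
    outcome: str     # learning outcome the item measures
    answer: str      # keyed answer or scoring note
    # One (difficulty, discrimination) pair per administration.
    history: list = field(default_factory=list)


class ItemBank:
    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def find(self, item_type=None, outcome=None):
        """Return items matching every filter that is not None."""
        return [
            it for it in self._items
            if (item_type is None or it.item_type == item_type)
            and (outcome is None or it.outcome == outcome)
        ]


bank = ItemBank()
bank.add(BankItem("2 + 2 = 4", "T/F", "knowledge", "T"))
bank.add(BankItem("Which design controls for ...?", "MC", "application", "b"))
bank.find(outcome="application")[0].history.append((0.6, 0.4))
application_items = bank.find(outcome="application")
```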