By Peter Greene
Is there such a thing as a useful standardized test?
To have this conversation, we have to get one thing out of the way. If you believe (and I think some school reformers sincerely do) that the only reason that teachers oppose the current high stakes test-and-punish status quo is because their self-serving union tells them to, you are blinding yourself to some real issues.
First, there is a real gulf between national union leadership and rank-and-file teachers precisely because union opposition to reformer policies has usually been tepid. Teacher opposition to testing comes first and foremost from teachers who have been watching testing become a toxic, destructive element in our classrooms that interferes with our ability to deliver real education. It’s detrimental to our students. And it is used in many places to deliver a professional verdict on our schools and ourselves with an accuracy no greater than a roll of the dice.
Opposition to testing also comes from other people who see how it plays out on the ground: parents. The Opt Out movement — in which hundreds of thousands of parents have not allowed their children to take state-mandated accountability standardized tests — was not created by teachers. It is not led by teachers — and in some places, it is actually potentially damaging to teachers under the current bizarre test-driven accountability system.
So if you imagine that test opposition is some sort of political ploy engineered top-down by unions, you are kidding yourself.
None of That Answers the Question, so Let’s Get Back To It
If I am such a dedicated opponent of standardized testing, what do I propose as an acceptable substitute?
Let’s first clarify our rather fuzzy terms.
“Standardized” when applied to a test can mean any or all (well, most) of the following: mass-produced, mass-administered, simultaneously mass-administered, objective, created by a third party, scored by a third party, reported to a third party, formative, summative, norm-referenced or criterion-referenced.
Come to think of it, we’d better clarify “test” as well. For many folks, it’s only a “test” if the student is answering questions. A five-page paper assignment, for instance, is usually not called a test. In fact, the more open-ended the assessment, the less likely folks are to call it a “test.”
This broad palette of definitions means that conversations about standardized testing often run at cross-purposes. A teacher talking about performance assessment task piloting in New Hampshire may think she is making a case for standardization, while I think that performance assessment is pretty much the opposite of standardized testing. There’s a lot of this happening in testing debates: people arguing unproductively because they have very different things in mind.
Acceptable substitute for what purpose?
The confusion is further exacerbated by a myriad of stated and unstated purposes for standardized testing. This confusion about purpose has emerged as a huge issue in the ed debates because far too many of the amateurs designing testing policy don’t understand this at all. At. All.
It’s not just that corporate school reformers argue that you can make the pig gain weight by measuring it. It’s that they also assert that the scales used for weighing the pig can be used to measure the voltage of your house’s electrical system and the rate of water flow in the Upper Mississippi.
If we want to find an acceptable test, we have to first declare what the test is going to be used for.
Ranking schools, students and teachers
This is where purpose becomes important. I can’t think of a good test for achieving the goal of ranking students, teachers and schools — for which many of today’s “accountability” tests are used — because I don’t think these goals are worth achieving. As a teacher, I don’t need to know how my student compares to students in Idaho. I don’t need to know that as a parent, either.
It’s a fool’s game to compare teachers to other teachers, schools to other schools, and students to other students. First of all, I can only make the comparison based on narrowly defined criteria. Otherwise I’m reduced to deciding if my insensitive smart flabby artist student ranks lower or higher than my sensitive tall winning cross-country racer student. The comparison only has meaning if it is based on narrow criteria (which student answered the most math problems correctly on Tuesday), but what good is a narrowly defined comparison?
If I find that my smart, funny wife is not as smart and funny as some other woman, should I be unhappy in my marriage? If this delicious steak is not as delicious as the steak I had last night, should I spit it out? If all the teachers in my school are great, should it be closed down because some other school has greater ones?
The signature feature of a ranking system is that it locates losers. But what decent teacher would stand in front of a class of 30 on the first day of school and say, “Five of you will turn out to be losers”?
Ranking and rating means that even if everyone is excellent, the least excellent must be marked Below Basic or Underperforming or Just Not Good Enough. A system based on ranking and rating is a system that assumes that in every endeavor, there are people who just aren’t good enough. I reject that view of the world, and so I reject any testing system designed to reinforce that view. If everybody in my classroom does a great job, everybody in my classroom gets an A.
Providing feedback for parents
Here we have a standardization problem because not all parents want the same feedback. Is she getting an A? Is she passing? Is she developing a better grasp of abstract language particularly as used in classic literature? Is she okay? Does she seem happy? These are all types of feedback I’ve been asked for by some parents. What one measuring tool would satisfy all those questions?
Standardized testing is repeatedly sold with the myth of the clueless parents, the parents who have no idea how their students are doing. But the solution to this problem is transparency, the levels of which can be controlled by the parents.
Take, for example, the electronic gradebook. Our parents can look up their students any time and see exactly what I see when I pull up the gradebook. Some of my parents look every day. Some look never. Some look and then call or email me to ask, “So what exactly was this one assignment?”
When we control the available information, we do parents a disservice. Only revealing the grade at report card time is a disservice. But anyone who has taught at a school with big detailed portfolio gradeless systems can tell the story of the parent who looked at all that data and said, “Look, can you just tell me what grade she’s getting?”
Parents deserve just as much feedback as they want. Standardized testing has nothing to do with providing that.
Feedback for teachers
Any decent teacher generates this kind of data daily. Any lousy teacher will have no use for standardized test data even if it arrives on gold-clothed ponies.
You are dodging the question
Okay, yes. I’ve laid out my usual assortment of objections to standardized testing, but I still haven’t said what would be an acceptable substitute. If you’re still here, I’ll try to address that now.
What qualities would an acceptable-to-me standardized test have?
If I ever were to find a standardized test that I could live with (or even date regularly), this is what it would look like.
Criterion-based (and so, objective)
If I’m going to measure my students against a standard, not against each other, I can use the test to answer the question, “Do my students know how to find verbs” or “Can my students identify dependent clauses?” If every student in my class can’t potentially get a top score, I’m not interested. And if it’s not objectively scoreable, it’s no help. That means that no standardized test is going to be used for any higher-order critical thinking-type skills.
(This is part of the whole point of Depth of Knowledge testing love: it creates the illusion that higher order stuff can be scored objectively. But it can’t.)
It is possible to come up with standardized questions. I once had a textbook with great literature questions — but I still had to evaluate the answers myself.
In fact, I can only see using a standardized test for checking the lowest levels of simple operations — simple recall, basic application.
As Close to Authentic as Possible
I want a task that actually assesses what it claims to assess. Multiple-choice questions don’t assess writing skills. Click-and-drag questions don’t assess critical thinking.
This ought to go without saying, but if I, the teacher, don’t get to see the questions, the answers, and the exact results from my students, then, no, thank you. I can do better myself.
I rarely re-use my own test-like assessments; instead, I make new ones each year to fit the class and the instruction. Particularly when I’m working on summative assessments, I’ll create something that focuses on the issues we are addressing. For instance, if we’re solid on spotting infinitive phrases but have trouble picking out gerunds used as direct objects, I can design a test that will help both me and my students.
Expertise and Convenience
There are lots of things I don’t know. Materials prepared by people who are experts in particular areas are a necessary aid, and those sometimes include assessments. I’m happy to have an expert in a particular field in my classroom.
And at some points, I can use the convenience of having something pre-built to save me some time.
So, the acceptable alternative…?
Do I really think that there are no necessary standardized tests?
Well, it is true that we all use standardization because we don’t completely individualize everything from assessment through evaluation — but that’s a hugely broad definition of “standardized.”
By that standard, everything used with two or more students is a standardized test, and it may be useful to think of standardization as a sliding scale. The more we broaden the reach of the assessment, the more students we try to make it reach, and the more we try to make the grading of the test quick and uniform, the less useful the assessment becomes. A test that you can give to every student in America and which can be scored in just a week will by necessity be inauthentic and measure little.
So for best classroom assessment, we stay as close to the individualized specifics end as we possibly can. The more that an assessment is developed in response to specific instruction by a specific teacher of specific students, the more useful that assessment will be in performing the most useful function of any test — telling students and teachers where their strengths and weaknesses lie.
Yes, that information is not what the policymakers would really like to have. But the information they would like to have is completely useless to me in the classroom (and so far, they’ve found no reliable method for either collecting or using such information anyway). I’m not convinced that information can be collected by standardized tests at all.
I’m not saying that all standardized tests are evil. And stripped of baseless high-stakes consequences, their awfulness is greatly reduced. There are standardized tools that are tolerable, and a few that might rise to the level of useful.
Peter Greene, a veteran teacher of English in a small town in Pennsylvania, wrote on his lively Curmudgucation blog that he has found himself in conversations about standardized testing that go something like this: people who like standardized testing defend it to the max while he counters that the number of standardized tests necessary for students to take is zero.
For taking that position, he writes, he has been called a “union shill,” lectured that data from these tests are the lifeblood of education, and asked to explain what the alternative to standardized testing is. Here in this post, he explains his thinking. This is a shortened version of the original.