9/22/09

The Lack Of Usefulness In Testing

Interesting and somewhat revealing look at the efficacy of standardized testing in education.

Testing Vet Reveals How to Fix Standardized Tests

Todd Farley has a new book: "Making the Grades: My Misadventures in the Standardized Testing Industry." It was an intriguing read, but I told him it didn't go far enough. He had dramatized the weaknesses in the many tests he graded, but did not explain to us poor realists what we should put in their places. At first he resisted my suggestion, but I told him I was sure, if he thought about it, he would come up with something. He did:

1) The reason I wrote my book is that I think no one has any idea how totally ridiculous large-scale assessment is (especially the open-ended items). That's what I hope my book reveals: a system that is just staggeringly, laughably ineffective. I think the efforts to make that process "standardized" or "objective" have taken all meaning away from the work, and the end result is that now all the testing industry produces is numbers, random numbers. I really do think that information is the most important thing I bring to the debate about testing. Having said that, I do believe there are some things that can be done to make that assessment much more effective.

2) This is simply a logistical issue, but I think from now on student tests need to be scored by one person at one time. Currently, most large-scale assessments (including the vaunted NAEP) are chopped into bits, with a student's multiple-choice answers going one place, their short answers going another, and their long answers somewhere else. That means if a student answers ten questions about "Charlotte's Web," for example, question 1 might be read and scored by Bob on Monday, question 2 by Mary on Wednesday, question 3 by George on the NEXT Thursday, and so on. Sometimes weeks pass between the scoring of questions 1 and 2 or 3 and 4, which seems to me to take so much away from what a student might be trying to say. While this is done for various reasons in the testing industry (training, money, deadlines, etc.), it also means a student's test is scored almost entirely without context.

Surely this is done so that each answer is given an "objective" read by some dispassionate employee (not to mention that you can then train unqualified people to do the simple task of scoring by having them search for random words), but it also means we are reading an answer to question 2 without knowing how a student answered question 1. In my opinion, this totally takes away from a broad understanding of a student's knowledge. Decisions end up being made based more on picayune things like what words show up on the paper, not so much what those words might mean (i.e., we accept "bubbles" but not "sizzles" in a question about the definition of "boiling"). In the current system that means 5 or 10 or 15 or more people might all end up doing some of the scoring/grading on each kid's test. That is not being "objective." It is an unrealistic assessment of a child's understanding.

In the scoring centers, it also takes away from the sense of responsibility that we feel about kids: if I were scoring one student's entire test, I'd become invested in it, but in the current set-up I'd just be scoring question 2 (i.e., "What is the theme of this story?") for about three straight days and would completely lose any feel that I was assessing actual children. At that point it all becomes a muddled mess of words, not students. Ergo, I think what has to happen is that a person completely qualified in a subject area (an English teacher reading English tests, a math teacher reading math tests, etc.) should read and score each test in its entirety, not have it chopped up into bits. If it costs more to hire actual educators instead of random people off the street, that's still what I think makes more sense.

3) And so, what I think SHOULD happen is what happened on the best assessment I ever worked on: the state of Washington's Goal 2 classroom-based assessments. Interestingly enough, Washington has a state test (the Washington Assessment of Student Learning) for reading and math, the usual high-pressure, mandatory tests that many teachers/parents argue against AND that also happen to be part of the horrible system my book and I impugn (in fact, a lot of my early scoring career was spent working on WASL reading and writing). Of less importance to the state are the Goal 2 tests (for History, Civics, Health/PE, and the Arts--Music, Theatre, Visual Arts, Dance), but those tests were classroom-based assessments that were written and scored by the state's teachers in conjunction with Riverside Publishing--they weren't just handed off to the test company with no idea what was really happening next. For those tests, scoring systems were established at the state level, and then local teachers in those subject areas were entrusted to read the tests, view the performances, and assess the results.

It seemed to me that this way you had some sort of central management (state gov't providing standards on what should be learned and what constituted acceptable and unacceptable results), plus teacher participation in the scoring process, which to me means qualified people giving serious reviews of student work, a massive improvement on the current state of bored, unqualified temps making snap judgments based on the fleeting glances they give student work. Even if we don't think it's a good idea for teachers to assess their own students' work, teachers can cross-grade within a district (which happens at the college level).

Jay, I don't know that my suggested system is perfect, but it is a massive improvement on the foolishness that now occurs.
