The Heart of the Matter: The Good Test
Lou Spaventa, US
Lou Spaventa teaches and trains in California, USA. He is a regular contributor to HLT - The Heart of the Matter series. E-mail: spaventa@cox.net
“Those are my principles. If you don’t like them, I’ve got others.”
Groucho Marx
The other day I was talking to a colleague whose child attends the same public elementary school that two of my own children attended. My colleague expressed her frustration with the school’s lack of enrichment and child-friendly activities. Such activities used to be part of the general positive atmosphere of this particular elementary school. My colleague wanted her child to be part of a creative project such as the end-of-year musical revue that used to be a great hit with parents and family and brought the community into the school. She broached this with the principal of the school. His reply was that since they had stopped doing the musical revue and started testing, test scores had risen. For him, a rise in test scores represented achievement. What exactly the test scores showed, or why they would be important, was likely not as relevant as the fact that the scores had gone up. Students had improved. In what? In choosing answers on a standardized test pegged to grade level, from a curriculum that was designed to be “teacher proof,” as if individuality and variety were the enemy of education. Quite the contrary, I’d say.
For many years, testing has been dominated by the measurers, test makers whose desire is to maximize variation on tests to reflect the theory of the normal curve in test making: that for any group of people, there will be a spread of scores such that most are in the middle, and fewer lie on either side of that middle range. Fewest are the outliers, the As and the Fs. Scoring against a base group and among the test-takers, normative scoring, has been part of education for a very long time. Criterion-focused testing, in which test takers must achieve an agreed-upon score in order to be certified as successful on a particular criterion, has also been a part of testing. For the recent spate of Bush-era tests associated with the “No Child Left Behind” federal policy in education, each state in the United States established criteria for minimum achievement on standardized tests. School children in the U.S. are used to tests being high-stakes games and serving a gatekeeping function. If a child’s parents wanted the child to enter the Gifted and Talented Education Program, known as GATE, in a California school, the child needed to achieve a cutoff score or above. Parents became good at gaming the system; they asked for and were granted the right to present other kinds of evidence for their children if the children did not qualify for the GATE Program: for example, a history of good grades and good academic citizenship – turning in homework, cooperating in class, taking part in group activities.
In the workaday world of ESOL classes, teachers usually have little control over testing on a grand scale, but they do have control over their own teacher-made tests in the language classroom, and they have control over how they arrive at grades. Teacher-made tests are most often achievement tests, meant to measure learning over a certain period of time or for a certain part of the curriculum. Teachers also have the option, in most cases, of creating criterion-based tests rather than tests based on maximizing variation in scores, i.e. following the normal curve. Such criterion-based testing makes it clear to students what level of mastery they must achieve in order to control the language they are learning. A common percentage for asserting that a student has mastered a given structure, lexical set, pragmatic usage or the like is 80%. In other words, a student must use language correctly eight out of ten times in order to be said to have mastered it. This way of thinking about testing simply eliminates the expectation of a spread of scores in imitation of a normal curve. It posits that the criterion to be measured is attainable by students within the time frame given for learning. It also assumes that all students can meet the criterion, but perhaps not all at the same point in time. Therefore, criterion-based testing also lends itself to assessment based upon student progress in the language. In that way, it more accurately reflects a student’s history in a given class. A student’s final score or grade for a course can be summative, based on progress over time, that is, on the student’s developmental history in the language.
Another aspect of testing which is crucial for a good test is that of backwash. Backwash is a hydra of phenomena, all based upon the effect that the test has on those who take it, those who give it, the program administration, and any other interested parties such as parents or employers. A good test should make both the student and the teacher, at a minimum, feel that something worthwhile has happened in terms of teaching and learning, that the test has been a fair and accurate challenge for the student, that it gives good and usable information to both student and teacher, and that it assures the student that the teacher is testing him or her on representative content in proportion to how much time is spent on that content. These all contribute to positive backwash. Positive backwash should boost student confidence in the teacher. An example is perhaps needed here to make this point clearer. If a class is labeled English for Everyday Communication and is orally based, the student should be tested both in the manner in which he or she was taught and in proportion to the time spent on each part of the test content – if most of class time was devoted to conversation, then students should be tested by conversing in English – as opposed to a written test of conversational ability. Positive backwash is perhaps the most desirable outcome for any single test, with the caveat that it is not the reason for testing but a by-product of a good test.
A practical problem in the language classroom, for example how a single teacher teaches conversation to a group of fifteen English language learners, has its analogue in a practical problem of testing language achievement. The retreat to paper and pen to test the four skills (reading, writing, listening, and speaking) is seldom wise and seldom well thought out. We teach how we speak, listen, read and write, and we test how we teach. Oral lessons demand oral tests, not pencil and paper tests. Additionally, it is somewhat contradictory to make a test of reading into a test of memory by having students answer questions from memory with books closed rather than from a text in front of them. We are living in a world of massive information availability. What we need to do is not to remember everything, but to remember how to get what we need.
So, for me, a good test is one that tests directly the skill being measured. A good test is one that has a positive effect on those who give it and those who take it. A good test is criterion-based and allows for assessing student development over time. Finally, quite often a good test is one that involves students in its creation. I recently gave a reading test on Jeannette Walls’ memoir The Glass Castle. I asked students to create five questions each that they would want to write about on a five-question short-essay test – each answer being a paragraph of around ten sentences. From the long list of questions created by a class of 28 students, students discussed and chose five questions for the test. I used as much of the students’ language as I could while I reshaped the questions grammatically or clarified content. In the formation of the questions themselves, I was able to understand where students were in their level of understanding of the book. This proved to be positive backwash for me as a teacher of the book, and positive backwash for the students as shapers of their own inquiry. Students took the test with books open, journals open, notes available, and dictionaries at the ready. Oh, and 27 of 28 students met the criterion standard of 80%.