HLT Magazine (January 2002)

Humanising Language Teaching
Year 4; Issue 1; January 2002

Sharing the power: action research into learner and teacher co-evaluation

by Christoph Rühlemann, Bavaria, Germany
Secondary and general

This paper presents action research into criteria-based learner and teacher co-evaluation of writing, conducted with 22 eighth-graders and 18 tenth-graders. It outlines and analyses three different co-evaluation tasks and investigates the learners' ability to effectively take part in assessment, their rating attitudes and their acceptance of the teacher role. The paper concludes that co-evaluation benefits learners and teachers alike.

Introduction

'Being a teacher could be so much fun – if only there weren't these dull corrections. And dealing out Es and Fs doesn't make you feel happy either.' Sighs like these are common in staff rooms. The loneliness, boredom and bad conscience that may accompany our marking system seem to be the price teachers pay for the joys of teaching. Those, however, who have had the rare occasion to assess student work in a team, e.g. in the evaluation of leaving exams, have certainly felt the relief of sharing the work with a peer. Here the business of correcting and assessing, normally attended to with aversion and in seclusion, is turned into an act of cooperation and communication. Unfortunately, peer co-evaluation is practised only under a few exceptional circumstances. Yet co-evaluation is possible more often – with the learners as co-evaluators.

Co-evaluation and learner autonomy

The concept of learner autonomy requires responsible learners who are able to 'consciously monitor their own learning processes' (Scharle et al. 2000:3). Do learning processes comprise grades? In practice, the grade often marks the end of one learning episode and the beginning of a new one; learning and learning monitoring efforts vary accordingly: they are high in pre-test phases, low in post-test phases. As long as grades are seen as sober assessments of past learner performances, this up and down of learner commitment is inevitable. Grades, however, need not merely provide retrospective evaluation. Grades can, and preferably should, function as prospective evaluation, providing conclusive assessment as well as constructive information the learners can capitalize on to improve their performances. To achieve this, the evaluation needs to be based either on comprehensive and supportive comments (Berg 1999) or on a well-differentiated set of evaluation criteria. Such comments or criteria-based evaluations have the great advantage of identifying weaknesses and strengths in detail and are therefore, rather than the closure of a past chapter, motivation for sustained present and future learning efforts. As such, grades form a pivotal part of the learning process. Consequently, we cannot exempt marks from learner involvement. Taking learner autonomy seriously forces us to allow learners even into the evaluation and grading process.

Concerns about co-evaluation

Evidently, there will be considerable opposition to this. Involving learners in evaluating and grading? Absurd, some will say: learners aren't competent to do this; only the teacher is. Others will object that learner evaluators are likely to let their judgements be influenced by personal likes and dislikes of learner evaluees. A few will suspect that learners may be tempted to misuse their say in grading as an easy way to get good marks. And, finally, many may doubt that learners are willing to share the teacher's role in the first place. These concerns need to be taken seriously. Therefore, the central questions this paper investigates are the following: Are learners qualified to evaluate each other? Can learners be trusted to handle the freedom and power given through co-evaluation in a responsible way? Do learners accept the transfer of roles? Do teachers and learners benefit from co-evaluation?

The design of the co-evaluation project

Given the above uncertainties, the design of the co-evaluation project grants the learners different degrees of self-determination within a teacher-controlled framework. Three different co-evaluation (CE) tasks were designed: 'equally shared CE', 'learner-dominated CE' and 'learner-autonomous CE'.

Figure 1: degrees of learner and teacher-say in CE tasks

Degrees of learner self-determination

The extent to which the teacher and the learners shared influence in the co-evaluation process varied (see Figure 1).
'Equally shared CE': Text 1 was evaluated twice, in that the learners and the teacher evaluated the text using the same number and types of criteria; the peer-evaluator's total score was then added to the teacher's total score and divided by 2, thus yielding an average score and granting the learners and the teacher a 50% say each in the co-evaluation.
'Learner-dominated CE': The criteria to be used with text 2 were split: 3 out of 4 evaluation criteria were allotted to the peer-evaluators, while the teacher's evaluation scope was restricted to 1 criterion, leaving him with 25% of the influence.
'Learner-autonomous CE': The learners' say was increased to 100% in evaluating text 3, since all of the criteria were negotiated and applied by the peer-evaluators completely independently, thus granting the learners full self-determination.
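The arithmetic behind the three tasks can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the project: the function names and the sample scores are hypothetical, and the assumption that the final score in the split-criteria tasks is the plain sum of all criterion scores is ours, since the article states only how many criteria each party rated.

```python
def equally_shared_score(peer_scores, teacher_scores):
    """Average the peer total and the teacher total (50% say each).

    Both arguments map criterion name -> points (0 to 5) and must cover
    the same criteria, as in 'equally shared CE'.
    """
    if peer_scores.keys() != teacher_scores.keys():
        raise ValueError("both raters must use the same criteria")
    return (sum(peer_scores.values()) + sum(teacher_scores.values())) / 2


def split_criteria_score(peer_scores, teacher_scores):
    """Add up scores when each criterion is rated by exactly one party.

    Assumption: the final score is the plain sum of all criterion scores.
    """
    if peer_scores.keys() & teacher_scores.keys():
        raise ValueError("a criterion may be rated by one party only")
    return sum(peer_scores.values()) + sum(teacher_scores.values())


def teacher_share(peer_scores, teacher_scores):
    """Fraction of the criteria that are rated by the teacher."""
    total = len(peer_scores) + len(teacher_scores)
    if total == 0:
        raise ValueError("at least one criterion is required")
    return len(teacher_scores) / total


# Hypothetical 'learner-dominated CE' ratings: 3 peer criteria, 1 teacher criterion.
peer = {"coherence": 4, "cohesion": 3, "content": 5}
teacher = {"accuracy": 2}
print(split_criteria_score(peer, teacher), teacher_share(peer, teacher))
```

For 'learner-autonomous CE' the teacher's dictionary is simply empty, so the teacher's share drops to zero.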

Writing tasks and co-evaluation criteria

'Equally shared CE': The text co-evaluated with a group of 10th-graders was a letter of application. Seven criteria were used: coherence, cohesion, content, convention, conviction, accuracy and variety. For each of these criteria – as well as those chosen in the other co-evaluations – a maximum score of 5 points could be achieved (see appendix).

'Learner-dominated CE': The 8th-graders were dealing with the gerund at the time. Accordingly, their writing task – a portrait of a person – required them to use gerund-related language material. Since the gerund had only been introduced some lessons before, it was felt appropriate this time to reserve accuracy for the teacher. The learners were asked to peer-evaluate the texts by coherence, cohesion and content.

'Learner-autonomous CE': The same class was given a selection of theatrical nuclei and asked to expand them into full-bodied theatrical scenes. This time, the evaluation criteria, instead of being prescribed by the teacher, were negotiated by the learners themselves. The negotiation process proved to be unexpectedly lengthy, taking up almost two full lessons and requiring teacher help in coining the criteria. The class eventually agreed to use four familiar-looking criteria – coherence, correctness, conviction and content – and feeling description, the only unfamiliar criterion, meant to cover information on emotional states as provided by stage directions.

Co-constructing grades

To embark on co-evaluation in the belief that learners will evaluate texts in just the same way as the teacher does is futile. Realistically, they won't. Consequently, co-evaluation entails compromise. This concerns the rating of the criteria and, potentially, the grades obtained. Thus, co-evaluation presupposes the teacher's readiness to accept the learners' differing conclusions. Such power-sharing readiness seems justified considering that, while some evaluation criteria, such as accuracy, are unambiguous, numerous others clearly are not, such as content, conviction and convention, to name just a few highly relevant to text quality. Here, very little remains of the unambiguity typical of accuracy problems. Here the ground we tread is shaky and, like anybody, we teachers might fall – or, at least, be just as right as the learners who have concluded differently.

Safety device: teacher-'veto'

However, a safety device was felt indispensable to prevent flawed learner evaluation from distorting the grading outcome. Therefore the learners dealing with texts 2 and 3 were informed that the teacher would 'veto' and overturn the peer-evaluation in cases where substantial rating disagreements occurred. The margins of rating disagreement at which the teacher veto was to come into effect varied. Whenever a rating disagreement of 2 points or more between the peer-evaluators and the teacher emerged in the co-evaluation of text 2, the learners' evaluation of this individual criterion was replaced by the teacher's evaluation; all discrepancies below 2 points, by contrast, were accepted. In co-evaluating text 3, however, it was not only the points allotted by the peer-evaluators for an individual criterion that were rejected: the entire learner evaluation was vetoed and replaced by the teacher's whenever a rating disagreement of 3 points or more occurred; all rating differences of 1 or 2 points, however, went untouched. With text 1 no vetoing procedure was thought necessary, as the high number of criteria to be used (7) and the equal share for the teacher and the peer-evaluators in the final score guaranteed that serious rating discrepancies would be considerably mitigated or neutralized.
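The two veto rules can be expressed as a short Python sketch. It is an illustration under stated assumptions rather than a record of how the veto was actually administered: the function names and the sample ratings are hypothetical, and the sketch assumes that the teacher has rated every criterion the peer-evaluators rated, so that the two ratings can be compared.

```python
def veto_per_criterion(peer_scores, teacher_scores, threshold=2):
    """Rule for text 2: replace single criteria on strong disagreement.

    A peer rating is replaced by the teacher's rating for that criterion
    whenever the two differ by `threshold` points or more.
    """
    return {
        criterion: (
            teacher_scores[criterion]
            if abs(peer_score - teacher_scores[criterion]) >= threshold
            else peer_score
        )
        for criterion, peer_score in peer_scores.items()
    }


def veto_whole_evaluation(peer_scores, teacher_scores, threshold=3):
    """Rule for text 3: replace the entire peer evaluation.

    If any single criterion differs by `threshold` points or more, the
    teacher's complete evaluation is used; otherwise the peers' stands.
    """
    vetoed = any(
        abs(peer_score - teacher_scores[criterion]) >= threshold
        for criterion, peer_score in peer_scores.items()
    )
    return dict(teacher_scores) if vetoed else dict(peer_scores)


# Hypothetical ratings: only 'cohesion' differs by 2 points or more.
peer = {"coherence": 4, "cohesion": 2, "content": 5}
teacher = {"coherence": 3, "cohesion": 4, "content": 4}
print(veto_per_criterion(peer, teacher))
print(veto_whole_evaluation(peer, teacher))
```

With these sample ratings the first rule overturns only the cohesion score, while the second rule leaves the peer evaluation untouched because no difference reaches 3 points.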

Findings

Are learners qualified to evaluate each other?

All three co-evaluations yielded persuasive rating agreement between the teacher and the peer-evaluators: complete or almost complete rating harmony, i.e. rating discrepancies of 0 or 1 point, invariably exceeded the 70% threshold (see Table 1) (1).

Table 1: Rating differences

High rating discrepancies were, on average, scarce, with a noticeable rise in 3-point rating differences occurring in 'learner-autonomous CE'; an investigation of rating disagreement per criterion, together with learner feedback, identified insufficient clarity of the criteria as the cause of these differences.

In 'learner-autonomous CE', coherence (11 points total rating difference) and conviction (9) saw the greatest rating disagreement, followed by content and correctness (7 and 6 respectively); feeling description proved the least problematic (3). The immediate inference was that the class and the teacher shared the concept of feeling description to a high extent, whereas the concepts of coherence and conviction were either unclear to many learners or the concept they had in mind deviated distinctly from the teacher's. This assumption was confirmed by learner feedback. Asked which of the criteria they had had most difficulty with, 8 respondents ticked coherence, closely followed by conviction (7). The lead of these two criteria over the remaining criteria, feeling description (3), correctness (2) and content (1), is noticeable. This remarkable concurrence of data from rating differences and from learner feedback clearly supports the view that the rise in 3-point rating differences is due to the learners feeling unsure about their own evaluation criteria. Apparently, the negotiation process had failed to generate clear enough criteria.

In conclusion, the answer to the above question is yes: learners are competent co-evaluators provided they have acquired distinct concepts of their criteria – either through solid preparation or successful negotiation.
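Agreement figures of the kind reported in this section can be derived from paired ratings as in the following Python sketch. The function names and the sample pairs are hypothetical and do not reproduce the project data; each pair holds the peer rating and the teacher rating given to one criterion of one text.

```python
from collections import Counter


def discrepancy_distribution(pairs):
    """Share of rating pairs at each absolute point difference."""
    counts = Counter(abs(peer - teacher) for peer, teacher in pairs)
    total = sum(counts.values())
    return {diff: counts[diff] / total for diff in sorted(counts)}


def agreement_share(pairs, tolerance=1):
    """Share of pairs that differ by at most `tolerance` points."""
    if not pairs:
        raise ValueError("at least one rating pair is required")
    close = sum(1 for peer, teacher in pairs if abs(peer - teacher) <= tolerance)
    return close / len(pairs)


# Hypothetical (peer, teacher) ratings on the 0 to 5 point scale.
pairs = [(4, 4), (3, 4), (5, 2), (2, 2)]
print(discrepancy_distribution(pairs))
print(agreement_share(pairs))
```

Setting `tolerance=1` corresponds to counting 0-point and 1-point discrepancies together as complete or almost complete agreement.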

Can learners be trusted to handle the freedom and power given by co-evaluation in a responsible way?

One prediction was that learners would let potential negative feelings towards their peers influence their peer-evaluations. Conclusive evidence emerged from 'equally shared CE', conducted in a socially dysfunctional 10th-grade class. The class had earned itself an extremely bad reputation, as several incidents of deviant behaviour had occurred both outside and inside the school, and relationships were characterised by resentment. Therefore the obvious concern was that learners who disliked others would 'down-grade' their opponents. Against the odds, the learners behaved supportively towards their peers, as shown by the total minus-point rating difference (55) distinctly outweighing the total plus-point rating difference (31) (see Table 2). This clearly indicates that, on average, the peer-evaluators applied noticeably 'evaluee-friendlier' standards than the teacher did. This supportive rating attitude was confirmed in questionnaires. The overwhelming majority of the class gave their peer-evaluators credit either for having 'tried hardest' (13, 64%) or for having 'tried to' evaluate their text 'justly' (55.55%). These figures clearly confirm that the learners had successfully resisted the temptation to misuse co-evaluation to act on their interpersonal dislikes.

Another concern was that co-grades might be mistaken for easy grades. Here the results are not entirely unambiguous. The 10th-graders clearly adopted low standards, which raised the suspicion that they indeed misunderstood co-evaluation as a cheap way to boost their pre-exam marks. The 8th-graders, on the other hand, invariably went by very high standards, mostly rating their peers more strictly than the teacher in both co-evaluation tasks.

Table 2: Total rating differences

Hence, it seems to depend on the learners and their level of responsibility whether or not they take undue advantage of co-evaluation. Clearly, however, it can be concluded that misuse is by no means inevitable.
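The comparison of minus-point and plus-point totals can be sketched in Python as follows. The sign convention is an assumption, since the article does not define it: a 'minus' difference is taken here to mean that the teacher rated lower than the peer-evaluator (the peers were more lenient), and a 'plus' difference that the teacher rated higher (the peers were stricter). The function names and sample ratings are hypothetical.

```python
def signed_totals(pairs):
    """Return (minus_total, plus_total) over (peer, teacher) rating pairs.

    Assumed convention: minus points accumulate where the teacher rated
    lower than the peer, plus points where the teacher rated higher.
    """
    minus_total = sum(peer - teacher for peer, teacher in pairs if teacher < peer)
    plus_total = sum(teacher - peer for peer, teacher in pairs if teacher > peer)
    return minus_total, plus_total


def rating_attitude(pairs):
    """Classify the peers' overall standard relative to the teacher's."""
    minus_total, plus_total = signed_totals(pairs)
    if minus_total > plus_total:
        return "lenient"
    if plus_total > minus_total:
        return "strict"
    return "balanced"


# Hypothetical (peer, teacher) ratings.
pairs = [(5, 3), (4, 4), (2, 3)]
print(signed_totals(pairs), rating_attitude(pairs))
```

Under this convention, a minus total of 55 against a plus total of 31 would be classified as a lenient rating attitude.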

Do learners accept the transfer of roles in co-evaluation?

When asked how they perceived co-evaluation, the classes widely expressed satisfaction. The overwhelming majority of both classes either rated co-evaluation 'very good' or displayed cautious appreciation ('OK'). A minority expressed hesitation or rejection (see Table 3).

Table 3: Learner acceptance of co-evaluation

The learners also came up swiftly with numerous advantages of co-evaluation such as improving one's own writing, and seeing other learners' and one's own mistakes. Some learners expressed enjoyment from having a say in the evaluation and grading of writing, others responded they liked getting their peers' texts to read and noticed an increase in their understanding of evaluation criteria: 'I found the learner-autonomous CE project good because now I understand the criteria better than before ...'.

Interestingly, while all the 8th-graders could think of advantages, 7 in 18 respondents explicitly noted they could see no disadvantages at all. Those who did feel that co-evaluation had downsides mostly feared that interpersonal troubles might distort their peers' evaluations: 'I liked the evaluation of the Nuclei but if 'X' doesn't like 'Y' he or she can give 'Y' a bad mark'; others criticised careless rating: 'We read the story fast one time and give a grade of our first impression. We do not really care, I think!' Perceptions of this kind should be taken very seriously; however, they mark just the extreme negative end of a continuum of reactions which were decidedly positive.

However welcoming the overall reaction of these learners, it should not be overlooked that other learners may well reject co-evaluation. As research with learners in Hong Kong has shown (Sengupta 1998), strictly teacher-centered classrooms tend to reduce dramatically the learners' eagerness to take on the role of co-evaluators. So, acceptance of co-evaluation depends not only on the learners themselves but on the learning environment being conducive to learner autonomy as a whole.

Do teachers and learners benefit from co-evaluation?

The answer is a clear yes. The obvious benefit for the teacher lies in the diagnostic exploitability of rating disagreements. In 'equally shared CE' the greatest rating deviations between the learners and the teacher were expected for accuracy. Astonishingly, however, accuracy turned out to be an area of relative rating harmony. For accuracy, and content as well, the average rating difference was 0.83 points, just slightly above coherence, where the peer-evaluators' and the teacher's judgements converged the most with 0.73 points of average rating difference. The rather unsuspicious-looking criterion of variety ranked first (1.33), followed by convention and conviction (1.11) and cohesion (1.06) in the midfield. This surprising turn of events helped diagnose the real learner needs. It became evident that the criterion of variety had not yet been sufficiently well taught and learnt – an insight which contrasted sharply with the teacher's expectations. So, investigating these rating differences may greatly help identify learner weaknesses and define areas of additional learning and teaching.
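A teacher wishing to run the same diagnosis could compute the average rating difference per criterion along the lines of this Python sketch. It assumes that the average is the mean absolute difference between peer and teacher ratings across all texts, which the article does not state explicitly; the function names and the sample ratings are hypothetical.

```python
def mean_difference_per_criterion(peer_ratings, teacher_ratings):
    """Mean absolute peer/teacher difference for each criterion.

    Each argument is a list with one dict per text, mapping criterion
    name -> points; the two lists must be in the same text order.
    """
    if len(peer_ratings) != len(teacher_ratings):
        raise ValueError("one peer and one teacher rating per text is required")
    totals = {}
    for peer, teacher in zip(peer_ratings, teacher_ratings):
        for criterion, peer_score in peer.items():
            difference = abs(peer_score - teacher[criterion])
            totals[criterion] = totals.get(criterion, 0) + difference
    return {criterion: total / len(peer_ratings) for criterion, total in totals.items()}


def rank_by_disagreement(means):
    """Sort criteria from greatest to smallest average difference."""
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ratings for two texts and two criteria.
peer_ratings = [{"accuracy": 4, "variety": 5}, {"accuracy": 3, "variety": 2}]
teacher_ratings = [{"accuracy": 4, "variety": 3}, {"accuracy": 2, "variety": 4}]
means = mean_difference_per_criterion(peer_ratings, teacher_ratings)
print(rank_by_disagreement(means))
```

The criteria at the top of the ranking are the ones whose concepts learners and teacher share least, and so are candidates for additional teaching.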

Moreover, less obvious but by no means less important, co-evaluation provides an occasion of genuine learner and teacher cooperation in a field where, traditionally, teacher-autonomy is paid for by teacher-seclusion.

Co-evaluation benefits learners too. Firstly, getting to read their classmates' texts puts them in the place of the audience, which establishes writing as a communicative act – rather than a language exercise – and creates a real purpose to write for (Porto 2001). Interestingly, for learners to accept their peers as 'real readers' it is a prerequisite that the evaluating and grading is not the prerogative of the teacher but shared by the classroom community (Sengupta 1998).

Secondly, co-evaluation constitutes an 'authentic task' (Guariento et al. 2001:350) in two respects: the learners clearly realize the relevance of their writing and peer-evaluations for their school careers, and the communication arising in co-evaluation-related group work and negotiation will not fail to 'engage' (ibid.) the learners.

Thirdly, the learners' internalization of the evaluation criteria is deepened: 'By becoming proficient evaluators of others' work, the students are better able to critically, thoroughly, and objectively evaluate their own writing' (Moore 1986:23). This suggests that the quality of the learners' own writing is enhanced too.

Finally, co-evaluation greatly contributes to learner autonomy and responsibility. Learners construct the evaluation of learning progress as a classroom community. Paradoxically, granting the learners this considerable responsibility contributes to the learners developing responsible attitudes – with responsibility being 'the means as well as the aim for the development of learner autonomy' (Dam 2000:49). It also helps them 'develop good relationships with their classmates' (Moore 1986:23) – a benefit we could observe with the trouble-stricken group of 10th-graders.

In conclusion, it seems legitimate to state that, although much further research needs to be done, co-evaluation opens up a promising path towards learner involvement in an area that seemed, and to many still seems, sacrosanctly learner-free and teacher-owned: evaluating and grading. One decisive question remains to be answered: Are we teachers ready to share our grading power? True, allowing learners a say in evaluating and grading would mean a loss of teacher autonomy. This loss, however, is outweighed by far by what we would gain: a relationship with learners that is beginning to resemble a partnership.

Notes:

(1) Similarly high rating agreement emerged in research by Stern (2001), albeit with respect to learner self-evaluation.

References:

Becker, G., C. von Ilsemann and M. Schratz (eds.). 2001. Qualität entwickeln: evaluieren. Friedrich Jahresheft XIX 2001

Berg, E.C. 1999. 'Preparing ESL students for peer response'. TESOL Journal 8/2: 20-25

Dam, L. 2000. 'Evaluating autonomous learning' in B. Sinclair, I. McGrath and T. Lamb (eds.). Learner Autonomy, Teacher Autonomy: Future Directions. Harlow: Longman

Guariento, W. and J. Morley. 2001. 'Text and task authenticity in the EFL classroom.' ELT Journal 55/4: 347-353, Oxford: Oxford University Press.

Moore, L.K. 1986. 'Teaching students how to evaluate writing'. TESOL Newsletter 10/86: 23-24

Porto, M. 2001. 'Collaborative writing response groups and self-evaluation.' ELT Journal 55/1: 38-46, Oxford: Oxford University Press.

Rea-Dickins, P. and K. Germaine. 1992. Evaluation. Oxford: Oxford University Press

Scharle, A. and A. Szabó. 2000. Learner Autonomy. A guide to developing learner responsibility. Cambridge: Cambridge University Press

Sengupta, S. 1998. 'Peer evaluation: "I am not the teacher"'. ELT Journal 52/1: 19-28, Oxford: Oxford University Press.

Sinclair, B., I. McGrath and T. Lamb (eds.). 2000. Learner Autonomy, Teacher Autonomy: Future Directions. Harlow: Longman

Stern, T. 2001. 'Beurteilungsmaßstäbe aushandeln. Erfahrungen mit einem Notenvertrag' in G. Becker, C. von Ilsemann and M. Schratz (eds.). Qualität entwickeln: evaluieren. Friedrich Jahresheft XIX 2001

Appendix: evaluation grids

'equally shared CE'

'learner-dominated CE'

'learner-autonomous CE'