Linguistic ideas from the corpora

IDEAS FROM THE CORPORA

On Teaching about the Corpus - Getting the Message Across

Hanna Kryszewska

Hanna Kryszewska is a teacher, teacher trainer and trainer of trainers. She is co-author of the resource books Learner Based Teaching (OUP), Towards Teaching (Heinemann), The Standby Book (CUP) and Language Activities for Teenagers (CUP), and of a course book series for secondary schools, ForMat (Macmillan). She is also co-author of a video-based teacher training course, Observing English Lessons. She is based at the University of Gdańsk, Poland. Hania is a Pilgrims trainer and editor of HLT Magazine. E-mail: hania.kryszewska@pilgrims.co.uk

Background

In this short article I would like to share some of my observations and concerns about introducing the idea of the Corpus to teachers, both novice pre-service teachers and in-service teachers. I would also like to invite you to take part in a discussion on how to bring developments in corpus analysis of the language, and their pedagogical implications, into teacher training.

Some time ago, at Pilgrims, I taught a course on, among other things, the Lexical Approach. As time went by, the teachers on the course seemed more and more worried. They felt that much of the expertise they had brought to the course was being questioned, and they were not sure they understood the implications and relevance of what they were being presented with. Then one brave teacher said: "Hania, what's a chunk?" So I explained yet again about word partnerships, the possible lengths of chunks and so on. But the question bounced back: "What's a chunk?" I explained, and the question bounced back again, and again, and again. I realized I was not communicating, not getting the message across, although we had spent much time exploring chunks in a practical way that could be used in class (see my articles, some written with Paul Davis, in past issues of HLT).

I suppose I had the same feeling that Richard Feynman, the Nobel prize winner in physics, must have experienced in one of his famous lectures. Having spent a lengthy period of time on an aspect of physics, he asked the audience if they understood. Everyone nodded except for one student. Feynman explained the problem again for this particular student, yet the student still did not understand. The situation repeated itself a few times. Feynman was getting more and more annoyed when finally, explaining for the umpteenth time, he froze at the blackboard and said "Oh, now I understand". The same thing happened to me. The insistent question from this one teacher at Pilgrims made me understand the idea of a 'chunk', and on the spur of the moment the answer came. I asked the group to imagine a loaf of bread or a piece of cheese and the chunks it can be broken into: a big handful, a mouthful, a crumb, each still being bread or cheese. When I said that, all the teachers in the group got it, and I am grateful to the insistent teacher who had the courage to keep voicing the question that was on everybody's mind.

Having learned my lesson, I now remember to check what message I am getting across, especially when presenting new trends and approaches in ELT. In the Teacher Training College where I work with undergraduates, future ELT teachers, I spent four 90-minute sessions on the Corpus and corpus analysis of the language. These were very practical classes and everyone seemed to enjoy the experience. The students also studied the relevant chapters in Jeremy Harmer's "The Practice of English Language Teaching" (Longman Pearson 2001). Anticipating basic questions from the students such as "So what is the Corpus?" or "What is a concordance?", at a certain point I pre-empted them and asked the pre-service teachers to write definitions of these two major concepts in computer analysis of the language, to show what they had understood. Below you will find their definitions. Please note that the original grammar and spelling have been kept, and that the students were given limited space and time to coin their own definitions.

Task for the HLT Reader

Before you read the students' replies, please write your own definitions of the two terms in question: 'Corpus' and 'Concordance'. Try to limit each definition to 2-3 lines, the space the students were given. Then compare yours with the definitions the undergraduates wrote on their course.

Corpus

huge language database which comprises of different sources

a computer program that language from a source spoken and written, not involving poems or songs

examples of languages from different sources stored in the computer produced by native speakers

body of language that shows authentic use of words, is encoded, taken from various sources

a 'bank' of words in language stored in the computer as data and consisting of over 100 millions of words from different resources: books, articles, speeches, shows how frequent words are used and in what linguistic context

the large language bank of data stored on computer, consisting of various kinds of text, e.g. British National Corpus

contains millions of words, it says when to use one word, how many meanings it has, how it works in longer phrases, etc. It's a database in computers

collection of words from various sources like books, magazines, newspapers, they provide evidence of how language is really used

it's a computer database of a language

body of language, group of words collected from the spoken and written language stored on computers

BNC British National Corpus, words taken from scientific articles, magazines, recorded conversation and put together in corpus, we find a word and different usages of the word there

a bunch of words / sentences taken from various sources : magazines, nooks, articles or everyday speech

a collection of lg from different sources; newspapers, magazines, books, ( not poetry or songs) stored on computers. Lg is shown in real context

a base of great number of words, taken from books, articles, typed speaking etc.

a collection of lang. data on the computer from newspapers, books, poems, songs

a bank of words taken from different resources

list of sentences with a searched word that gives the meaning of the word in context

around 100 mln words combined together - you use it, e.g. to check which meaning of the word appears the most often

collection of language from different sources, stored on the computer, produced by native speakers ( without songs and poems)

a collection of millions of texts, scanned from poetry, scientific articles, books etc.

Concordances

similar things / meanings for example two words

a corpus shown in a way that the key word is in the middle of a screen, page

words - different meaning of words with part of the sentence in from and after the words, words are in columns, there are no whole sentences

similar to corpus but doesn't show full sentences and layout is different, it is put in alignment

shows 'words in context', a part of corpus showing frequency and context in which certain word is used

search engine which after asking about particular word or phrase gives us a set of sentences or parts of the text where we see our entry in the context

that's what appears on the screen after we insert a certain word into the corpus, we can see different use of that word

we have them on corpus, the collection of words in different contexts

a set of words with their immediate contexts
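
For readers who would like to see the idea in concrete form, the layout that several of these definitions describe (the key word in the middle, with part of the sentence before and after it) can be sketched in a few lines of Python. This is only an illustrative sketch and not the software used in the sessions: the function name `concordance`, the `width` parameter and the three-sentence sample text are all invented for the example, and a real corpus such as the British National Corpus would be searched with dedicated tools.

```python
import re


def concordance(text, keyword, width=30):
    """Return keyword-in-context lines for every hit of `keyword` in `text`.

    Hypothetical helper for illustration: each line shows up to `width`
    characters of context on either side of the key word.
    """
    lines = []
    # Match the key word as a whole word, ignoring upper/lower case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the key words line up in a column.
        lines.append(f"{left:>{width}} {match.group()} {right}".rstrip())
    return lines


# A tiny made-up "corpus"; a real one would hold millions of words.
corpus = (
    "She broke off a chunk of bread. He ate a chunk of cheese for lunch. "
    "A large chunk of the budget went on books."
)

for line in concordance(corpus, "chunk", width=20):
    print(line)
```

Run on the sample text, the sketch prints one line per occurrence of "chunk", with the key word in the same column each time. In these terms the corpus is the stored text, and the concordance is the aligned list of hits that a search on it produces.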

Comments

I cannot comment on your own definitions from the 'Task for the HLT Reader' section, but analysing the students' definitions as a whole we can see that the main key words and concepts have been captured. While in some cases the understanding is spot on, in others it is wishy-washy, partially wrong or simply wrong. Bearing in mind that the sessions were highly practical and experiential, with an element of theoretical background from the book by Jeremy Harmer, I would like to raise a pedagogical question: how should we, as trainers, introduce the concept of the Corpus to novice teachers so that they see the benefits of this development in language analysis and its pedagogical implications? Your voices on the subject will be more than welcome.