Humanising Language Teaching Magazine for teachers and teacher trainers

IDEAS FROM THE CORPORA

A Film Corpus, A Mad Corpus?

Hanna Kryszewska, Poland

Hanna Kryszewska is a teacher, teacher trainer and trainer of trainers. She is a senior lecturer at the University of Gdańsk, Poland. She is co-author of the resource books Learner Based Teaching (OUP), Towards Teaching (Heinemann), The Standby Book (CUP), Language Activities for Teenagers (CUP) and The Company Words Keep (DELTA Publishing), and of a course book series for secondary schools, ForMat (Macmillan). She is also co-author of a video-based teacher training course, Observing English Lessons. Hania is a Pilgrims trainer and editor of HLT Magazine. E-mail: hania.kryszewska@pilgrims.co.uk

A language corpus is a body of language taken from a single text or compiled from a number, sometimes a great number, of texts. A corpus can be built from religiously or culturally significant texts, for example the Bible, the Quran or Shakespeare's works, or from a collection of texts of various genres, usually produced by native speakers of a given language, for example the British National Corpus or the Brown Corpus.

This body of language is then used to analyse various aspects of language such as lexis and lexical patterns, grammar or sound patterns. The first corpora, or corpuses, were compiled in ancient times to study the Vedas, Hindu texts written in Sanskrit. In the mid-20th century there was a distinct revival of interest in corpus analysis, which was boosted by access to computational analysis. At first written texts were analysed; later, transcribed spoken texts became the subject of studies, leading to the distinction between Spoken and Written Grammars.
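For readers curious about what computational analysis of lexis can look like in its simplest form, here is a minimal sketch in Python that counts how often each word occurs in a text. The function name word_frequencies and the two-sentence mini_corpus are invented for illustration; a real corpus study would use far larger texts and more careful tokenisation.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count how often each word form occurs, ignoring case."""
    # Assumption: a "word" is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Invented mini corpus, for illustration only.
mini_corpus = "The cat sat on the mat. The dog sat too."
freq = word_frequencies(mini_corpus)

# Show the two most frequent word forms with their counts.
print(freq.most_common(2))
```

A frequency list like this is usually only the starting point; concordance lines and collocations build on the same kind of token list.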

A relatively recent development is the availability of a film corpus used to analyse dialogue systems. It contains 1068 text files and 960 dialogue scripts. This corpus is subdivided into film genres such as action, adventure, animation, biography, thriller, and western. (More at https://nlds.soe.ucsc.edu/fc2)

However, a very unusual treatment of a film corpus comes from Matt Bucy, who took a well-known classic, The Wizard of Oz, and turned it into Of Oz the Wizard, which some viewers might find a most ingenious, entertaining or ludicrous idea. What he did is the following: he took the film, painstakingly cut it up and rearranged the utterances in alphabetical order from A to Z. In this way he created a completely new film, an alphabetised version which is intriguing, to say the least. You can watch the film at https://vimeo.com/150423718

I really wonder what you will think of it… You can also read a review of the outcome at
www.cinemablend.com/new/Some-Madman-Edited-Wizard-Oz-Alphabetical-Order-I-Can-t-Look-Away-103867.html
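
The alphabetising idea can be tried out on a written transcript with a few lines of Python. This is only a rough sketch of the principle, not the film-maker's actual method, which involved cutting the film itself. The function name alphabetise and the sample line are invented for illustration; the sample is not a verified quotation from the film.

```python
import re


def alphabetise(transcript: str) -> list[str]:
    """Return every word of the transcript in alphabetical order, keeping repeats."""
    # Assumption: a "word" is any run of letters or apostrophes.
    words = re.findall(r"[A-Za-z']+", transcript)
    # Sort without regard to capital letters, so "We" is filed under "w".
    return sorted(words, key=str.lower)


# Illustrative sample line (hypothetical, not checked against the script).
sample = "We are off to see the Wizard, the wonderful Wizard"
print(" ".join(alphabetise(sample)))
```

Repeated words stay next to each other in the result, which is what makes an alphabetised text useful in class: learners can see at a glance which words a script relies on most.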
