Mark Hancock has a nice write-up of a talk given by Mike McCarthy on spoken English. The write-up concludes with an interesting metaphor of a corpus being a corpse, language that is no longer alive. It asks whether only using corpus examples is the best way of trying to improve a learner’s use of English.
Very few corpus folks would suggest only using corpus examples, and furthermore a lot of corpus work goes beyond the purely quantitative to also consider the teaching implications.
For example, there is a great paper, "Listening for needles in haystacks: how lecturers introduce key terms" by Ron Martinez, Svenja Adolphs, and Ronald Carter, on the spoken language of academic lecturers.
They extracted lexical bundles from a spoken corpus of 1.7 million words and then went through them manually to keep only the pedagogically interesting ones, e.g. "in other words" (kept) vs "er this is a" (discarded).
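To give a feel for what extracting bundles involves, here is a minimal Python sketch that counts recurring word sequences. It is not the authors' procedure: the function name `extract_bundles`, the toy utterances, and the frequency threshold are all made up for illustration, and real bundle studies typically also check how widely a sequence is spread across texts. The manual step of deciding what is pedagogically interesting still falls to a human reader.

```python
from collections import Counter


def extract_bundles(utterances, n=3, min_count=2):
    """Count n-word sequences and keep those seen at least min_count times.

    Hypothetical helper for illustration only, not the paper's method.
    """
    counts = Counter()
    for utterance in utterances:
        tokens = utterance.lower().split()
        # Slide a window of n tokens across each utterance.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return {bundle: c for bundle, c in counts.items() if c >= min_count}


# Invented toy data, not taken from the corpus in the paper.
sample = [
    "in other words the model fails",
    "er this is a problem",
    "so in other words we stop",
]
bundles = extract_bundles(sample, n=3, min_count=2)
print(bundles)
```

On this toy sample only "in other words" recurs, so it is the only sequence that survives the threshold. On a real corpus a sequence like "er this is a" would pass the frequency test too, which is why the manual filter matters.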
Manual review of the list also showed them a hitherto under-emphasized aspect of spoken lectures – the introduction and definition of new terms.
Their analysis split these cues into two groups: the more transparent but less frequent ones, such as call and mean (e.g. "…what theorists call…", "…what do we mean by…"), and the less transparent but more frequent ones, such as basically and essentially (e.g. "…which are basically…", "…so it’s essentially…").
They also showed how complex the delivery of many definitions and concepts was. There was a lot of rephrasing, sometimes signalled by the word or but often with no signposting language at all, and key definitions usually came at the end of a series of connected points (back-loading).
In addition, they found that lecturers often did not explicitly refer to their PowerPoint slides, which could make it difficult for students to pick out the key terms.
A corpus may be like a corpse but, as on the crime show CSI, there is an awful lot that dead bodies can reveal.
Habeas corpus, you should have the body! 🙂