Mark Hancock has a nice write-up of a talk given by Mike McCarthy on spoken English. The write-up concludes with an interesting metaphor of a corpus being a corpse, language that is no longer alive. It asks whether only using corpus examples is the best way of trying to improve a learner’s use of English.
Very few corpus folks would suggest only using corpus examples, and furthermore a lot of corpus work goes beyond the purely quantitative to also consider the teaching implications.
For example, there is a great paper, ‘Listening for needles in haystacks: how lecturers introduce key terms’ by Ron Martinez, Svenja Adolphs, and Ronald Carter, on the spoken language of academic lecturers.
They extracted lexical bundles from a spoken corpus of 1.7 million words and then went through those manually to keep only the pedagogically interesting ones, e.g. in other words (kept) vs er this is a (discarded).
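To give a feel for the extraction step, here is a minimal sketch of counting recurring word sequences. It is not the authors' actual procedure: the function name `lexical_bundles`, the tokenisation, the thresholds and the sample text are all assumptions made for illustration.

```python
from collections import Counter
import re


def lexical_bundles(text, n=3, min_count=2):
    """Count recurring n-word sequences, a rough stand-in for lexical bundles."""
    # Lowercase and keep simple word tokens, including straight apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n tokens across the text.
    ngrams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in ngrams)
    # Keep only sequences that recur at least min_count times.
    return [(bundle, c) for bundle, c in counts.most_common() if c >= min_count]


# Invented sample text, not taken from the lecture corpus.
sample = (
    "in other words the cell divides. er this is a cell. "
    "in other words it copies itself. er this is a membrane."
)
bundles = lexical_bundles(sample, n=3, min_count=2)
for bundle, count in bundles:
    print(count, bundle)
```

A frequency list like this mixes useful bundles with filler such as er this is, which is why the manual pass matters.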
Manual review of the list also showed them a hitherto under-emphasized aspect of spoken lectures – the introduction and definition of new terms.
Their analysis split these up into the more transparent but less frequent cues such as call and mean, e.g. …what theorists call.., …what do we mean by… and the less transparent but more frequent cues like basically and essentially e.g. …which are basically…, …so it’s essentially…
Further, they showed how complex the delivery of many of the definitions or concepts was: there was a lot of rephrasing, sometimes using the word or but often with no signposting language, and key definitions usually came at the end of a series of connected points (back-loading).
In addition, they found that lecturers often did not explicitly refer to their PowerPoint slides, which could make it difficult for students to pick out the key terms.
A corpus may be like a corpse but like on the crime show CSI there is an awful lot that dead bodies can reveal.
Habeas corpus, you should have the body! 🙂
Agree with you, Mura, that Needles in Haystacks was a good example of how corpora can be judiciously used. My issue is only with corpus rigor mortis – does that over-stretch the metaphor? I guess corpus is evidence, not blueprint.
hi Mark
you have a case in the sense that we don’t see many papers like the needles in haystack one. though i do love Ray Carey’s interpretation below of corpus as heavenly choir.
ta
mura
hi Mura
Needles in haystacks is a very good metaphor for the problem of spotting definitions in lectures, and the research paper illuminates a complex but crucial area in EAP. There is a similar problem with written academic texts, although here of course the reader is not forced to process the information in real time. I found definitions were ubiquitous but often cryptic in written academic texts in the Heriot Watt University Science and Engineering (HWUSE) and Business Studies (HWUBS) corpora. Yet they were crucial to understanding the unfolding explanations of key concepts.
I found be + predicate by far the most common way of presenting definitions, whereas called, termed, named, defined as, said to be and refers to, although relatively unambiguous signals, were less commonly used to cue them. To add to the problem, language exponents that generally signal description, such as consists of, involves and is concerned with, were also used to define, adding to the likelihood that key definitions would escape the reader’s notice. On top of this, can be described as was quite commonly used to define! Easiest to miss were running, or parenthetical, definitions, where the definition is slipped in almost as an aside – sometimes in brackets or commas, sometimes following or. All this leaves EAP students without much lexical support for spotting definitions.
However, by working with written text (as well as transcripts of spoken text) students can learn to identify definitions in academic discourse and to produce their own definitions. When we were writing an EAP course to support students on distance / blended learning courses at Heriot Watt, we developed some strategies to help, and we also used these in tasks in Access EAP Frameworks.
1. Students can be alerted to the commonalities in the structure of definitions, for example class nouns are almost always used in definitions:
an X is a [class noun] + defining quality
a [class noun] + defining quality is termed an X
2. They can practise identifying common types of definitions. An extremely important category is stipulative (or working) definitions, signalled by such phrases as in the context of this research. There are also negative and contrastive definitions and definition by constituents, by purpose or function.
3. The structures and lexical components of definitions will vary between disciplines, so it’s always useful to get students to bring in their own texts to analyse in this way.
4. Students can explain to the teacher and to each other the key concepts in their fields.
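As a rough illustration of how the cues and the class-noun pattern described above could be searched for in a text, here is a minimal sketch. The regular expressions, the short class-noun list, the function name `find_definition_candidates` and the sample sentences are all assumptions for illustration, not part of the HWU work. Parenthetical definitions and bare be + predicate definitions would mostly slip past it.

```python
import re

# Explicit cues named in the comment above; the regex itself is an assumption.
EXPLICIT_CUES = re.compile(
    r"\b(?:called|termed|named|defined as|said to be|refers to|can be described as)\b",
    re.IGNORECASE,
)

# Hypothetical short list of class nouns; a real list would come from a corpus.
CLASS_NOUNS = ("measure", "process", "method", "type", "property", "substance")

# Matches "is/are a|an|the [class noun]", as in "an X is a [class noun] ...".
BE_CLASS_NOUN = re.compile(
    r"\b(?:is|are)\s+(?:a|an|the)\s+(?:%s)\b" % "|".join(CLASS_NOUNS),
    re.IGNORECASE,
)


def find_definition_candidates(sentences):
    """Return (sentence, reason) pairs for sentences that look like definitions."""
    hits = []
    for sentence in sentences:
        if EXPLICIT_CUES.search(sentence):
            hits.append((sentence, "explicit cue"))
        elif BE_CLASS_NOUN.search(sentence):
            hits.append((sentence, "be + class noun"))
    return hits


# Invented sentences, not drawn from the HWUSE or HWUBS corpora.
sentences = [
    "A catalyst is a substance that speeds up a reaction.",
    "This process is termed oxidation.",
    "The experiment was repeated three times.",
]
candidates = find_definition_candidates(sentences)
for sentence, reason in candidates:
    print(f"{reason}: {sentence}")
```

Students could run something like this over their own subject texts and then check the hits by hand, since any such pattern will both over- and under-match.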
hi Sue
thank you for this very edifying comment 🙂
are the HWUSE or HWUBS corpuses available to the public?
are class nouns like shell nouns?
is there any more info on your work available to read? i only found this BALEAP handout from 2008: http://www.baleap.org.uk/media/uploads/pim/sargent-handout.pdf
ta
mura
Hi Mura, Sorry, I got so excited by the paper you cited that I didn’t pay attention to the style of the forum you were posting in — I come across as very dry and dusty! But it’s interesting to see how corpus linguistics is developing. Unfortunately, very little published teaching material seems to apply the research in tasks.
How I got into it was that in 2001 I was given a temporary post at Heriot Watt University to write support EAP materials for their very large distance learning degree project. Because of the nature of the courses, there was a vast amount of written lecturer input in electronic form, and I had been using MicroConcord (Tim Johns’s MS-DOS search tool from the mid-80s) for many years in my previous teaching. So I and some colleagues used bits of the texts as authentic reading for the EAP course and also made the whole collection of first year texts into a corpus to research the usage and lexis. I think lots of people are doing this kind of thing now.
In those days we couldn’t access any of the big ones, like BNC, and anyway it was hard to find academic corpora. The 2 HWU corpora still exist. I suppose they are dated, but they were spot on for what we wanted at the time. They aren’t available to the public, but anyway, I now tend to use BYU BNC.
We didn’t publish in any journals (too busy writing the materials), but there were eventually three of us: me, Olwyn Alexander and Jenifer Spencer, and we all spoke at conferences and BALEAP PIMs. The bit that you have found was from a 2008 PIM ‘Putting the E back in EAP’ and I was talking about developing a critical voice as a writer (I’ll attach the PowerPoint, I don’t know why it wasn’t put up with the other stuff). Most of what we learned found its way into the books we published with Garnet Education: EAP Essentials, Access EAP Foundations and Access EAP Frameworks.
General or class nouns have lots of different names, shell nouns being one of them. In these definition examples the class noun is ‘measure’:
pH, a measure of the concentration of H+ ions in solution, is the negative log10 of the H+ ion concentration.
Hardness is a measure of the resistance of a material to localised plastic deformation.
I’ll also attach a list from the HWU corpora.
Best wishes, Sue
hi Sue
hehe yes well i understand your excitement about the Martinez et al paper, i think he is one of the most interesting researchers and writers at the moment in this area
and i did not mean to imply your previous comment was dry or dusty, simply very informative!
i just finished watching your BC seminar here http://englishagenda.britishcouncil.org/seminars/eap-how-it-different-other-forms-elt , the notion of graduate attributes is very useful
ta
mura
and speaking of Tim Johns u may be interested in this about his Contexts program https://plus.google.com/104940199413423400545/posts/EdaXQdUQcJv
ta
mura
Good stuff, clunky technology but excellent content
Hi Mura,
The corpus as corpse analogy is interesting, but considering that a corpus is made up of many language producers, it’s more like a mass grave. On the other hand, I don’t think the language is any more dead than the people in a photograph; it’s not as if the capturing of a moment in time causes its subjects to cease to exist.
I see a corpus as a linguistic snapshot in time, but since we know that a great deal of language is formulaic and a reflection of our cumulative experience of it, a corpus can’t really be seen as “dead”, in the sense of “never to be heard from again”. Corpus speakers keep talking and writers keep writing, forming the input for others’ cumulative experience of language.
Corpus as corpse might make sense to a generativist, but from an emergentist perspective, it consists of the trails of many individuals’ always-shifting language experience, but fixed in time and immortalized. So it’s just as easy to see a corpus as a heavenly choir instead of a pile of death. 🙂
hi Ray
that is a great point of view thanks for that! 🙂
ta
mura
Hi Mura,
Some interesting thoughts (both in your post and Mark’s). For me, I think it comes down to the skill of the corpus researcher. I’ve certainly come across some corpus researchers (dare I say at the more academic end of the scale) who seem to be rather forensically picking through the evidence in a way that may not be particularly relevant to the average language learner. But then, to be fair, that’s often not their primary audience. As a corpus researcher who’s also a language teacher and materials writer, I certainly look at the corpus with two hats on – I want to look objectively at the data to see what insights it can reveal, but then I also consider what might be useful to transfer to a teaching context and I make professional judgements. And of course, how you treat corpus data will also vary depending on the level of your target learners – for lower levels you’ll take quite a broad brush approach, whereas for advanced learners, you’ll get closer to ‘telling it like it is’. It’s all down to interpretation …
Julie
hi Julie
yes important point about adapting to learner levels, there is some good work relevant to language teaching going on, i hope to summarise another one in a bit more depth that is about DDL soonish
ta
mura
Hello everyone,
From what I gathered from talks by Nick Ellis and Averil Coxhead a few months back, the trend with some corpus research lately is to take the results of the study and interview teachers directly about what they think of the results, to see if they are applicable to the context of the classroom. I think this is a fine practice to put life into a corpse, ooops I mean corpus. 😉
hi Peter thanks that’s interesting to note, looking forward to any papers they write related to this
ta
mura