Guy Aston talks speech corpora

I had the pleasure of chatting to Guy Aston as he was staying in Paris on his way back to Italy, where he works at the University of Bologna. Guy has been an active researcher in corpora over the years. Here he recollects one significant event that encouraged him to pursue his interest in corpora and mentions his current area of investigation (best heard using headphones):

Regular readers may know that I have used the TED Corpus Search Engine a few times recently to get my students to work on phonetic transcriptions. Multimedia corpora make it possible to examine the prosodic features of language, and this is what interests Guy about speech corpora.

For example, the phrase “Thank you” was found to have a falling tone most of the time, and frequently occurring phrases such as “don’t you”, “last year”, “I don’t know” and “I don’t care” are expected to have a fast rhythm (examples taken from The prosody of formulaic expressions in the IBM Lancaster Spoken English Corpus by Phoebe Lin).

Guy went on to detail some requirements and challenges involved when setting up speech corpora:

In the next audio Guy gives some examples of a learner using a speech corpus:

The following phone camera video shows Guy’s TED speech corpus in Mike Scott’s WordSmith (version 5 or later) and illustrates listening to concordances of “matter of fact”:

Finally, I asked Guy the old favourite about the misplaced early optimism about using corpora in the language classroom:

Guy hinted that a version of his corpus may become available for AntConc (it is currently only compatible with WordSmith), but at the same time he hinted that you should not hold your breath waiting for one 🙂.

Once again, thanks to Guy for sharing some of his current work. Check out some of his publications.

Do also check out an interview with another corpus linguist, Costas Gabrielatos.

Thanks for reading.

