HVPT or minimal pairs on steroids

It was by chance, as these things tend to happen on the net, that I read about High Variability Pronunciation Training (HVPT). What are the odds that language teachers know about HVPT?

My extremely representative and valid polling on Twitter and G+ gave me a big fat 2 out of 24 teachers who knew the acronym. Of the two who said yes, one had looked up the acronym and the other is an expert in pronunciation.

I would put good odds that most language teachers have heard of and use minimal pairs, i.e. pairs of words which differ by one sound, the famous ship/sheep for example.

HVPT can be seen as souped-up minimal pairs in which different speakers are used and sounds are presented in different contexts. Learners are then required to categorize each sound by picking a label for it. Feedback is then given on whether they are correct.

Pronunciation research has shown that providing a variety of input in terms of speakers and phonetic contexts helps learners categorize sounds. That is the V of variability in the acronym. Furthermore, such training focuses learners on phonetic form and so reduces any effect of semantic meaning, since it has been shown that attending to both meaning and form reduces performance.
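The procedure described above (varied speakers and contexts, a forced choice of label, then feedback) can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how EnglishAccentCoach or any other program works: the stimulus list, the speaker names, the labels "short i" and "long ee", and all function names are hypothetical, and a real trainer would play audio rather than pass strings around.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Stimulus:
    speaker: str  # who recorded the item (variability across voices)
    word: str     # carrier word, i.e. the phonetic context
    label: str    # sound category the learner has to pick


# Hypothetical stimulus set: two vowel categories, each recorded by
# several speakers in more than one word.
STIMULI = [
    Stimulus("speaker_1", "ship", "short i"),
    Stimulus("speaker_2", "ship", "short i"),
    Stimulus("speaker_3", "fit", "short i"),
    Stimulus("speaker_1", "sheep", "long ee"),
    Stimulus("speaker_2", "feet", "long ee"),
    Stimulus("speaker_3", "sheep", "long ee"),
]


def build_session(stimuli, n_trials, rng):
    """Draw trials at random so speaker and context change between trials."""
    return [rng.choice(stimuli) for _ in range(n_trials)]


def give_feedback(stimulus, answer):
    """Return (is_correct, message) for the label the learner picked."""
    if answer == stimulus.label:
        return True, "Correct"
    return False, f"Not quite, that was '{stimulus.label}'"


def accuracy(outcomes):
    """Share of correct answers in a list of True/False outcomes."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    session = build_session(STIMULI, n_trials=5, rng=rng)
    # Simulated learner who always answers "short i"; a real program would
    # play the recording and wait for the learner's choice.
    outcomes = [give_feedback(s, "short i")[0] for s in session]
    print(f"Accuracy: {accuracy(outcomes):.0%}")
```

Note that the learner only ever chooses a category label and never has to process what the word means, which mirrors the focus on form over meaning.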

Currently there is one free (with registration) program that helps with Canadian pronunciation, called EnglishAccentCoach.[1] This web and iOS program is developed by Ron Thomson, a notable researcher in this field. It is claimed that it can significantly help learners in only 8 short training sessions and that the effects last for up to a month. There is also a paid program called UCL Vowel Trainer,[2] which claims learners improved from 65% accuracy to 85% accuracy over 5 sessions.

Another (open source) program in development is Minimal Bears, which is based on PyPhon.[3] Minimal Bears aims to build up a crowdsourcing feature so that many languages can be accommodated. Interested readers may like to see a talk about HVPT from the developers.[4]

So it is quite amazing, as Mark Liberman from Language Log pointed out, how little language educators know about HVPT. One of the commenters on the Language Log post suggested that an association with drill-and-kill stereotypes of language learning may have tainted it. No doubt more research is required to test the limits of HVPT. Hopefully this post will pique readers' interest in investigating these minimal pairs on steroids.

Many thanks to Guy Emerson for additional information and to the poll respondents.


1. EnglishAccentCoach
2. UCL Vowel Trainer
3. PyPhon (I have yet to be able to get this working)
4. (video) High Variability and Phonetic Training – Guy Emerson and Stanisław Pstrokoński

Further reading:

Thomson, R. I. (2011). Computer assisted pronunciation training: Targeting second language vowel perception improves pronunciation. Calico Journal, 28(3), 744-765. Retrieved from http://www.equinoxpub.com/journals/index.php/CALICO/article/viewPDFInterstitial/22985/18991
Liberman, M. (2008, July 6) HVPT [Blog post]. Retrieved from http://languagelog.ldc.upenn.edu/nll/?p=328

What to teach from corpora output – frequency and transparency

Frequency of occurrence is the main way for teachers to choose what to teach when using corpora. However, as Andrew Walkley discusses in “Word choice, frequency and definitions”, using frequency alone is not without limitations. In addition to frequency we can use semantic transparency/opacity, that is, how far the meaning of the whole differs from that of its individual parts. This is also sometimes referred to as how idiomatic a phrase is. Martinez (2013) offers a Frequency-Transparency Framework that teachers can use to help them choose which phrases to teach. Using four collocates of take, he presents the following graphic:

The Frequency-Transparency Framework (FTF) using four collocates of the verb take (Martinez, 2013)

The numbered quadrants give the suggested priority of the verb+noun pairs, i.e. the most frequent and most opaque phrase would be taught first (1), then the most frequent and transparent phrase (2), followed by the less frequent but opaque phrase (3) and last the least frequent and most transparent phrase (4). This is only a suggested priority, which can be changed according to the teaching context. For example, a further two factors (in addition to word-for-word decoding) can be considered when evaluating transparency:

  • Is the expression potentially deceptively transparent? – “every so often” can be misread as often; “for some time” can be misunderstood as a short amount of time (Martinez & Schmitt, 2012, p.309)
  • Could the learner’s L1 negatively influence accurate perception?
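The quadrant ordering can be written down as a tiny decision rule. The sketch below is my own reading of the framework, not code from Martinez (2013): the function name ftf_priority is hypothetical, and reducing frequency and opacity to yes/no judgements is a simplifying assumption, since in practice both are matters of degree. The example phrases are the binomials from my webmonkey corpus, with the judgements implied by the quadrants I assign them in the next paragraph.

```python
def ftf_priority(frequent: bool, opaque: bool) -> int:
    """Map the two judgements to the quadrant number (1 = teach first)."""
    if frequent and opaque:
        return 1
    if frequent:
        return 2
    if opaque:
        return 3
    return 4


# (phrase, frequent?, opaque?) as implied by the quadrants chosen in the post
binomials = [
    ("layout and design", False, False),
    ("tried and true", False, True),
    ("up and running", True, True),
    ("latest and greatest", True, False),
]

# Sort the phrases into the suggested teaching order.
teaching_order = sorted(binomials, key=lambda item: ftf_priority(item[1], item[2]))
for phrase, frequent, opaque in teaching_order:
    print(ftf_priority(frequent, opaque), phrase)
```

The two extra factors listed above (deceptive transparency and L1 influence) are not modelled here; they would be reasons for a teacher to override the default order.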

Applying the framework to the binomials list from my webmonkey corpus, I would place up and running in quadrant 1, latest and greatest in quadrant 2, tried and true in quadrant 3 and layout and design in quadrant 4. Note that I did not place drag and drop, the most frequent and somewhat opaque phrase, since it is so well known to my multimedia students (similar to cut and paste) that it would not need teaching. Thanks for reading.


Martinez, R. (2013). A framework for the inclusion of multi-word expressions in ELT. ELT Journal 67(2): 184-198.

Martinez, R. & Schmitt, N. (2012). A Phrasal Expressions List. Applied Linguistics 33(3): 299-320.

This corpora-bashing parrot has ceased to be

Hugh Dellar’s recent What have corpora ever done for us post dismisses the hype behind corpora that was prevalent a few years back with typical gusto. I would like to look at some of the issues raised.

It is curious that his support of teacher intuition over the use of corpora seems to contrast with his support of coursebooks over teacher intuition in his dogme posts. Gabrielatos (2005) describes the example of when a teacher’s intuition that tag questions belonged to the “bowler-hat” past of English use clashed with a finding that one in four questions in dialogues was a question tag.

Another of Dellar’s objections echoes Widdowson’s dichotomy between genuine texts and authentic texts, as cited in Tribble (1997). Concordance lines from corpora represent instances of genuine language use, the products of language communication. This language contrasts with discourse texts which are authentic and represent the process of language communication. Learners need to construct a relationship with language materials so concordance lines need to be filtered so as to be useful in the classroom, what Widdowson calls pedagogic mediation.

A related concern is the distinction between indirect uses of corpora by commercial publishers and direct uses by learners and teachers.

Both of these concerns are being addressed by specialised corpora, such as the Backbone pedagogic corpora for content and language integrated learning and the MICASE corpus of academic spoken English, and by the wider availability of general corpora such as COCA (Corpus of Contemporary American English) and the BNC (British National Corpus).

For instance, Dellar’s question regarding [get on with it] and [let’s get down to business] can be answered by using the Phrases in English tool, which uses the BNC. Here we find that [get on with it] appears 401 times (4.11 instances per million words) vs 2 times (0.02 instances per million words) for [let’s get down to business].
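For readers unfamiliar with normalised frequencies, the arithmetic behind the per-million figures is simple enough to check. The sketch below is mine, not part of the Phrases in English tool: the function names are hypothetical, and the corpus size is not quoted from the tool but back-calculated from the 401 hits and 4.11 per million reported above, so treat it as an approximation.

```python
def per_million(count: int, corpus_size: float) -> float:
    """Normalised frequency: occurrences per one million words."""
    return count / corpus_size * 1_000_000


def implied_corpus_size(count: int, rate_per_million: float) -> float:
    """Corpus size (in words) implied by a raw count and its per-million rate."""
    return count / rate_per_million * 1_000_000


# Assumption: size inferred from the figures in the post, not taken from
# the tool's documentation.
size = implied_corpus_size(401, 4.11)

print(f"Implied corpus size: about {size / 1_000_000:.1f} million words")
print(f"[let's get down to business]: {per_million(2, size):.2f} per million")
print(f"Ratio of raw counts: {401 / 2:.1f} to 1")
```

The rarer phrase comes out at about 0.02 per million, which matches the tool's figure, and the raw counts show [get on with it] to be roughly 200 times as frequent.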

The Backbone collection is very interesting as it provides a thematically focused database of spoken text for 5 languages plus English as lingua franca, backed up with an assortment of learning resources. The English corpus includes 50 interviews which are annotated for topic, grammar and lexis. This annotation goes some way to address the problem of the way text is coded.

Braun (2005) describes using a small corpus as a way to mediate pedagogically between corpora and learners using “coherent and relevant content, a restricted size, a multimedia format and a pedagogic annotation of the corpus” (Braun, 2005, p. 61).

The use of home-made corpora is another way to attack the issue of authenticity. I will detail my use of the TextSTAT tool and similar software to build up a corpus of material for multimedia students in a later post (Update: see this series of posts). Although it takes some work, teachers can build up formal databases to complement their experience-based intuitive database.
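To give a flavour of what a home-made corpus offers, here is a minimal sketch of the two most basic outputs, a word frequency list and a count of recurring phrases. It is not TextSTAT and does not reproduce its features: the function names, the simple regular-expression tokenizer and the two sample sentences are all my own assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(texts):
    """Count how often each word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def ngram_counts(texts, n):
    """Count every run of n consecutive words within each text."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


# Hypothetical mini-corpus; a real one would be read from collected files.
texts = [
    "Drag and drop the file.",
    "Drag and drop works. The site is up and running.",
]

print(frequency_list(texts).most_common(3))
print(ngram_counts(texts, 3).most_common(1))
```

Even at this toy scale the three-word count surfaces a recurring phrase, which is the kind of evidence a teacher can set against intuition.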

Two other criticisms not mentioned by Dellar are that corpora promote both a bottom-up processing of text (vs a top-down processing) and an inductive (vs deductive) approach to learning. Flowerdew (2009) discusses these and concludes that top-down processing can be used with corpus data and that a mixed approach can be used, combining elements of a deductive approach with the inductive approach.

Finally, turning to learning effects, Oghigian and Chujo (2010) found that beginner students in a class using a contrastive (Japanese/English) corpus improved significantly on all six question types in pre/post test scores, whereas a class using a listening CD improved on only three question types.

Hopefully this short response shows that the corpora-bashing parrot has shuffled off this metaphorical coil. 🙂


Braun, S. (2005). From pedagogically relevant corpora to authentic language learning contents. ReCALL 17(1): 47-64.

Gabrielatos, C. (2005). Corpora and language teaching: Just a fling, or wedding bells? TESL-EJ 8(4), A1, 1-37.

Flowerdew, L. (2009). Applying corpus linguistics to pedagogy: A critical evaluation. International Journal of Corpus Linguistics 14(3): 393–417.

Oghigian, K. & Chujo, K. (2010). An Effective Way to Use Corpus Exercises to Learn Grammar Basics in English. Language Education in Asia 1(1): 200-214.

Tribble, C. (1997). Improvising corpora for ELT: Quick-and-dirty ways of developing corpora for language teaching. In J. Melia. & B. Lewandowska-Tomaszczyk (Eds.), PALC ’97 Proceedings, Practical Applications in Language Corpora (pp. 106-117). Lodz: Lodz University Press.


Be sure to read Leo Selivan’s response What corpora have done for us to Hugh Dellar’s post.

What the research says – Feedback

The first and very likely last of my attempts to see how easy/difficult it is for a classroom teacher to research evidence using only online resources on various ELT questions.

Many teachers are interested in the question posed by Chris Wilson/@MrChrisJWilson – What makes feedback good? (Wilson, 2013).

The commenters to the post all more or less agreed that generally feedback should aim to improve the learner – what is known as formative feedback.

What does the research tell us?

A 2008* review article by Valerie Shute, Focus on formative feedback, drew on between 170 and 180 sources and produced some interesting tables of recommendations based on the unequivocal results that were reviewed. For example, one recommendation discourages the use of praise:

from Table 3, p 31, Shute (2007)

Another advises against interrupting a student whilst they are engaged in a task which contrasts somewhat with Adrian Underhill on giving feedback during tasks (Underhill, 2012):

from Table 3, p 31, Shute (2007)

One table lists some guidelines related to learner characteristics. It is difficult to tell Chris’s learner’s characteristics, but let’s assume that this learner’s goal is to show her competence rather than increase it. One of the guidelines states:

from Table 5, p 33, Shute (2007)

There is a lot to dig into in Shute’s article, which I may come back to in updates to this post.

A more immediate classroom conceptualization is provided by Tony Lynch** who talks about making the distinction between slips and errors and getting students to notice the difference (Lynch, ?).

Students are able to correct slips on their own whereas errors need the help of a teacher.

His work recommends the use of student self-transcription of speaking tasks and student recorded audio logs. Teachers can then use the transcriptions and logs to give feedback.

*I initially found the Shute 2008 article on the JSTOR repo, but during my search I came across the Shute 2007 report for ETS, which the screenshots of the tables are taken from.

**I found initial references to work by Lynch from a search on the British Council directory of UK ELT research 2005-10. It’s a shame that I can get access to a US researcher’s paper but not to a UK one. My google-fu is weak, but I found the article eventually.


Lynch, T. (?) Tips from the Classroom: Student-responsible correction of spoken English. Retrieved 10 January 2013, from http://www.sfu.ca/heis/archive/20-2_lynch.pdf

Shute, V. J. (2007). Focus on formative feedback. ETS Research Report, RR-07-11 (pp. 1-47), Princeton, NJ.

Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78(1),153-189.

Underhill, A. (2012 Dec 19) Demanding higher in a conversation class [Web Log Post]. Retrieved from http://demandhighelt.wordpress.com/2012/12/19/demanding-higher-in-a-conversation-class/

Wilson, C. (2013 Jan 7) Your accent is terrible – Destructive feedback [Web log Post]. Retrieved from http://www.eltsquared.co.uk/your-accent-is-terrible-destructive-feedback/

Update 1:

In the comments to this post by Chiew Pang/@aClilToClimb, Dale Coulter/@dalecoulter gives some solid reasons why immediate feedback is necessary.

This reminded me that I did not include any information on timing in my initial post.

The review found that immediate feedback is the preferred choice, particularly for relatively difficult tasks, but research has also shown that delayed feedback helps with transfer of learning, so one should match feedback timing with learning goals:

from Table 4, p 32, Shute (2007)

Update 2:

Thanks to a tweet by @cdelondon that linked to an article on Where are university websites hiding all their research, I learnt of this great tool, Institutional Repository Search, which claims to look through 130 UK repos. Fab!

Update 3:

Yazikopen, an online directory linking to more than 4000 modern languages articles, was brought to my attention by Alannah Fitzgerald/@AlannahFitz.