#IATEFL 2016 – Corpus Tweets 2

This is a storify of tweets by Sandy Millin, Dan Ruelle and Leo Selivan on the talk Answering language questions from corpora by James Thomas. Hats off to the tweeters; I know it’s not an easy task!

IATEFL 2016 Corpus Tweets 2

Answering language questions from corpora by James Thomas as reported by Sandy Millin, Dan Ruelle & Leo Selivan

  1. James Thomas on answering language questions from corpora. Did not know Masaryk uni was home of Sketch Engine!
  2. JT has written a book about discovering English through SketchEngine with lots of ways you can search and use the corpus
  3. JT trains his trainees how to use SketchEngine, so they can teach learners how to learn language from language
  4. JT Need to ensure that tasks have a lot of affordances of tasks and texts
  5. We live in an era of collocation, multi-word units, pragmatic competence, fuzziness and multiple affordances – James Thomas
  6. JT Why do SS have language questions? Are the rules inadequate? It’s about hierarchy of choice…
  7. JT Not much choice in terms of letters or morphemes, but lots of choice at text level
  8. JT Patterns are visible in corpora. They are regular features and cover a lot of core English
  9. JT What counts as a language pattern? Collocation, word grammar, language chunks, colligation (and more I didn’t get!)
  10. JT Students have questions about lexical cohesion, spelling mistakes, collocations: at every level of hierarchy
  11. JT Examples of q’s: Does whose refer only to people? Can women be described as handsome? Any patterns with tense/aspect clauses?
  12. JT q’s: Does the truth lie? What is friendly fire? What are the collocations of rule?
  13. JT introduces SKELL: Sketch Engine for Language Learning http://skell. (don’t know!)
  14. “Rules don’t tell whole story” – James Thomas making an analogy w/ Einstein who said same about both the wave & the particle theory
  15. JT SKELL selects useful sentences only, excludes proper nouns, obscure words etc. 40 sentences
  16. http://skell.sketchengine.co.uk

    Nice simple interface – need to play with it more. #iatefl

  17. JT searched for mansplain in SKELL and it already has 7 or 8 examples in there
  18. JT Algorithm to reduce amount of sentences only works when there are a lot of examples. With a few, sentences often longer
  19. Sketch Engine is a pretty hardcore linguistic tool, but I can see the use of Skell for language learners. #iatefl
  20. JT Corpora can also teach you more about grammar patterns too, for example periphrasis (didn’t get definition fast enough!)
  21. JT Can search for present perfect continuous for example: have been .*ing
  22. JT You can search for ‘could of’ in SKELL – appears fairly often, but relatively insignificant compared to ‘could have’
  23. Can use frequency in corpus search results to gauge which is “more correct” / “the norm”. #iatefl
  24. JT SKELL can sort collocations by whether a noun is the object or subject of a word for example. Can use ‘word sketch’ function
  25. Unclear whether collocation results in Skell are sorted according to “significance” / frequency or randomly #iatefl
  26. JT See @versatilepub for discounts on book about SKELL
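
Tweets 21 to 23 describe two corpus techniques: a pattern search for the present perfect continuous (have been .*ing) and a frequency comparison of ‘could of’ against ‘could have’. The sketch below imitates both in plain Python. It is only an approximation under stated assumptions: the sample sentences are invented for illustration and are not SKELL data, and the pattern uses Python regular expressions rather than Sketch Engine’s own query syntax. In a token-based corpus tool .*ing matches a single word; in a plain regex .* would run across word boundaries, so \w+ing is used here instead.

```python
import re

# Invented sample sentences for illustration only; NOT taken from SKELL.
SAMPLE_SENTENCES = [
    "I have been waiting for an hour.",
    "They have been working on the report all week.",
    "She could have called earlier.",
    "We could have taken the train.",
    "You could of told me sooner.",
    "We have been to Prague twice.",
]

# Plain-regex approximation of the tweeted query "have been .*ing":
# "have been" followed by one word ending in -ing.
# Assumption: "has been" forms are deliberately left out, as in the tweet.
PRESENT_PERFECT_CONTINUOUS = re.compile(r"\bhave been \w+ing\b", re.IGNORECASE)


def find_matches(pattern, sentences):
    """Return the sentences that contain at least one match of pattern."""
    return [s for s in sentences if pattern.search(s)]


def count_phrase(phrase, sentences):
    """Count whole-word, case-insensitive occurrences of phrase."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(s)) for s in sentences)


ppc_hits = find_matches(PRESENT_PERFECT_CONTINUOUS, SAMPLE_SENTENCES)
could_have = count_phrase("could have", SAMPLE_SENTENCES)
could_of = count_phrase("could of", SAMPLE_SENTENCES)

for sentence in ppc_hits:
    print(sentence)
print("could have:", could_have, "| could of:", could_of)
```

On a real corpus the raw counts would be far larger, and a pattern this simple would also pick up strings such as “could have been sleeping”, so any hits still need reading in context before drawing conclusions about “the norm”.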

 

IATEFL 2016 – corpus related mini-interviews

Following on from what could be described as a corpus carnival this year, some of those presenters kindly answered 5 questions. Oh and if any other corpus related presenters want to be added let me know. I list the mini-interviews in approximately chronological order:

Teaching the pragmatics of spoken requests in EAP
Christian Jones (University of Liverpool, UK)

Answering language questions from corpora
James Thomas (Masaryk University), @versatilepub

Using English Grammar Profile to improve curriculum design
Geraldine Mark (Gloucestershire College/Cambridge University Press) & Anne O’Keeffe (Mary Immaculate College, Limerick/Cambridge University Press), @TEFLclass

Electronic theses online – developing domain-specific corpora from open access
Alannah Fitzgerald (Concordia University) & Chris Mansfield (Queen Mary University of London)

Guiding EAP learners to autonomously use online corpora: lessons learned
Daniel Ruelle (RMIT University Vietnam), @danrmitvn

Teacher-driven corpus development: the online restaurant review
Chad Langford & Joshua Albair (University Lille 3, France)

Christian Jones
1. Who are you?
I am a Senior Lecturer in Applied Linguistics and TESOL at the University of Liverpool.
2. Who should come to your talk?
EAP or EFL teachers interested in research into spoken language (in this case the speech act of requesting) and the implications for teaching.
3. Why should they come?
Well, I hope it will be interesting (!) and will make people think about their own teaching in regard to spoken language. In EAP in particular, a lot of attention is given to writing and reading and while this is understandable, I also think that the way learners interact when they speak in academic settings is important. I’m not giving a workshop but I hope I will apply theory to practice in a useful way.
4. Which talks are you looking forward to?
I can only attend on the day I am speaking but two things I would like to see are Mike McCarthy talking about spoken language in EAP and my ex-colleague Tania Horak talking about lexical profiling in tests. I will catch up with things I miss online.
5. Top tip
I don’t go to that many conferences so can only give fairly obvious advice: 1) Don’t try to see everything – pick 3 or 4 key talks a day and go from there 2) The conversations you have in the breaks and the people you meet are a key part of the experience 3) Caffeine is vital!

James Thomas
1. Who are you?
I’m a university teacher trainer, who doesn’t only talk about aspects of teacher development, but in our department, we actually do it: our trainees working with real live students for a whole semester. Internal Practice Teaching’s a buzz for everyone concerned. I’m also the author of a book that does something no other does. This should be my big expose at IATEFL.
2. Who should come to your talk?
Teachers who are interested in the L in ELT and TEFL and TESOL, etc. Everyone knows that dictionaries, grammars and intuition are not enough to answer every language question. By searching for answers in corpus data, we are in effect, asking thousands of native speakers at the same time.
3. Why should they come?
– The audience will observe language activities that involve learning about language, as well as learning language.
– They will see guided discovery activities in action.
– They will see another avenue for using internet tools in the classroom.
– We will develop strategies for dealing with students saying "but I’ve seen it somewhere".
4. Which talk(s) are you looking forward to?
Well, the last time I heard Mr Crystal speak, he impressed a lot of people, so I’m looking forward to being reimpressed. The work of Diane Larsen-Freeman has been quite pivotal in our field, so hearing it straight from the horse’s mouth … And my colleague, Nikki Fortova, has a poster about our Internal Practice Teaching, which uses a bit of augmented reality, so I’m keen to see how people react to the medium as well as the message!
5. Top conference going tip?
I saw Jan Blake perform about 10 years ago at a NILE event – outstanding.

Anne O’Keeffe
1. Who are you?
I am an academic who has an EFL background. As an academic, I work in the area of Corpus Linguistics, at Mary Immaculate College, University of Limerick, Ireland. I am particularly interested in the applications of Corpus Linguistics to language learning.

I regularly give talks about the application of Corpus Linguistics to language teaching as I think that it is important to spread the word to those who do the real work of teaching languages. If research is to have any impact, then we need to think about what our findings mean for the classroom. My most recent work has been with Cambridge University Press. This has involved working with Geraldine Mark on a four-year research project which entailed looking in great detail at learner grammar, across the CEFR, using the 55 million word Cambridge Learner Corpus. This has led us to create the open resource English Grammar Profile http://www.englishprofile.org/english-grammar-profile/egp-online – Please check it out and let us know what you think!
2. Who should come to your talk?
My talk, which is co-presented with Geraldine Mark, is about the English Grammar Profile resource. We will talk about how its findings can help inform syllabus design. Essentially, we looked at the Cambridge Learner Corpus and identified over 1,200 different grammar competencies in the learners’ writing across the six levels of the CEFR. The database has some surprises about what learners know and when they know it. It also sheds light on what advanced (C level) students can do with grammar and pragmatics.

This talk will be of interest to 1) anyone who is interested in researching learner grammar competency; 2) anyone who is interested in findings about what grammar learners know at different levels; 3) language teachers who want to hear about this new resource which might be of use to them in their syllabus design.
3. Why should they come?
If you are interested in knowing more about the English Grammar Profile and how it can help you think more strategically about what grammar you teach, this talk is for you. If you are doing MA or PhD research into learner grammar or learner corpora, this talk might give you some ideas and if you are interested in getting a different perspective of grammar syllabi, there is something in this talk for you too.
4. Which talk(s) are you looking forward to?
The plenaries include some really big names! They are definitely not to be missed. The programme looks so interesting. There are so many talks and so little time!
5. Top conference going tip?
Don’t try to overdo it by attending a session at every slot. Allow time to just mingle around the exhibition area and meet people. IATEFL is a very friendly conference and you can make some new friends from around the world.

Alannah Fitzgerald
1. Who are you?
I am an open education practitioner and researcher working in the area of technology-enhanced English language education. Being somewhat nomadic, I have gained experience and understanding from learning, teaching and researching across different educational contexts, including Higher Education institutions in the United Kingdom, Canada, Korea, and New Zealand (my country of origin). Increasingly, I have been drawn to devising and delivering online language learning interventions that can be scaled and assessed across both formal and informal education.
2. Who should come to your talk?
Language teachers, language learners, subject specialists, instructional design and e-learning support teams who want to build their own language collections.
3. Why should they come?
See what you can do with open content for building dynamic online English language collections for any target learner group. Our latest open collection in collaboration with the British Library is made up of 50,000 PhD abstracts for learning English for Specific Academic Purposes.
4. Which talk(s) are you looking forward to?
Unfortunately, I’m only going to be there for the interactive fair as I have two open education conferences on either side of that day. If I were going to be there for the whole gig I’d most likely want to attend a variety of sessions to get a sense of what the wider ELT community is currently concerned with.
5. Top conference going tip?
My greatest experience and tip for conference going is to find people you can work with on projects who are at different schools or institutions. This will help you to get a wider sense of your field or how different fields can intersect in interesting ways, for example, FLAX brings computer science and language education together.

Daniel Ruelle
1. Who are you?
I’m an EAP / IELTS preparation teacher and program coordinator at the Vietnam campus of an Australian university called RMIT – Royal Melbourne Institute of Technology. It’s actually one of the – if not the – largest offshore universities in the world with two campuses in Vietnam and over 5,000 students. During my time here I have become quite interested in vocabulary acquisition, especially using corpora in the classroom to encourage learners to autonomously use English more naturally and focus on collocations.
2. Who should come to your talk?
Those who are interested in hearing about my experience training learners to autonomously use free online corpora. Actually I am one of three presenters in a forum on corpora on Friday, April 15th at 14:10 – 15:15 in Hall 11a, and my co-presenters will be presenting about two complementary themes: "Learning academic vocabulary through a discovery-based approach" and "Exploring EAP teachers’ familiarity and experiences of corpora".
3. Why should they come?
There has been a lot of research on corpora and they are frequently used by researchers, but to date there has been quite a disconnect between research and practice. These tools have come a long way and are much more user-friendly and intuitive than what many teachers may have experienced in the past. This forum on corpora will hopefully demystify these incredibly useful tools for both teachers and learners, and give teachers some ideas on how they can use them with their learners.
4. Which talk(s) are you looking forward to?
The keynote speakers this year all look incredible and of them, I’m most excited to see Dr. David Crystal and Scott Thornbury – both legends in the field! I’m still perusing the myriad parallel sessions and it’s proving very difficult to choose just one in each slot. I plan to finalise my choices on the long flight to Birmingham.
5. Top conference going tip?
As a presenter – prepare handouts (paper or digital – I often share a link to a Google Drive folder with materials that I can update after the conference) and try to stick around after your presentation for questions.
As an attendee – don’t feel the need to attend every single session. It can be quite fatiguing rushing from one room to another and sometimes 45 minutes of idle time to check out poster sessions or publishers’ offerings, or just grab a coffee, can really rejuvenate you.

Chad Langford & Joshua Albair
1. Who are you?
We are both linguists working in France; we work primarily as EFL teachers in a university setting with different kinds of learners of all levels. We’ve recently become more interested in developing in-house materials and in moving towards our learners’ needs and away from external syllabi and course books. We find corpora fascinating and really useful.

We teach a variety of courses, but it is our experience in teaching General English to adult learners in the Adult Education department of our university that has provided us the greatest stimulation for this project. Our experience as linguists means that we’re interested in research, and we both have had experience working with large corpora. This is the first time we’ve embarked on creating a corpus of our own, and we’re really excited about it and what it can bring to the profession.
2. Who should come to your talk?
Anyone who is interested in how the manipulation of data can give teachers a better idea of how the language they teach works will find our talk interesting, as will anyone interested in genre studies and in exposing learners to different genres as readers and writers.

The online restaurant review is an interesting genre to look at: on the one hand, some of our more basic intuitions about it are borne out; on the other hand, an actual searchable corpus reveals important characteristics of the genre that might otherwise not be so obvious. Finally, anyone interested in corpora and in the possibility of creating their own corpus should come – we would definitely like to encourage others to pursue the same sort of project.
3. Why should they come?
The best reason to come would be for us to meet each other. We would love to meet, and then stay in contact, with other people who are interested in what we’re interested in and in developing a network where we can help each other and share experiences and materials.
4. Which talk(s) are you looking forward to?
We are looking forward to the plenaries, of course. Other corpus-related talks are at the top of our list, too. Grammar and discourse are two other areas we’re interested in. It’s going to be hard to choose, and I think we’ll be pretty busy listening to other people for a lot of the time we’re there.
5. Top conference going tip?
You can’t do everything. Take the time to meet people, and exchange contact information with them. And do it when you meet them – you’re never sure to run into them later, and there will be a huge number of delegates this year.

A huge thanks to all the presenters, break a leg folks : )

Discovering English with SketchEngine – James Thomas interview

2015 seems to be turning into a good year for corpus linguistics books on teaching and learning; you may have read about Ivor Timmis’s Corpus Linguistics for ELT: Research & Practice. There is also a book by Christian Jones and Daniel Waller called Corpus Linguistics for Grammar: A guide for research.

This post is an interview with James Thomas on Discovering English with SketchEngine.

1. Can you tell us a bit about your background?

2. Who is your audience for the book?

3. Can your book be used without Sketch Engine?

4. How do you envision people using your book?

5. Do you recommend any other similar books?

6. Anything else you would like to add?

1. Can you tell us a bit about your background?

Currently I’m head of teacher training in the Department of English and American Studies, Faculty of Arts, Masaryk University, Czech Republic. In addition to standard teacher training courses, I am active in e-learning, corpus work and ICT for ELT. In 2010 my co-author and I were awarded the ELTon for innovation in ELT publishing for our book, Global Issues in ELT. I am secretary of the Corpora SIG of EUROCALL, and a committee member of the biennial conference, TALC (Teaching and Language Corpora).

My work investigates the potential for applying language acquisition and contemporary linguistic findings to the pedagogical use of corpora, and training future teachers to include corpus findings in their lesson preparation and directly with students.

In 1990, I moved to the Czech Republic for a one year contract with ILC/IH and have been here ever since. Up until that time, I had worked as a pianist and music teacher, and had two music theory books published in the early 1990s. Their titles also begin with "Discovering"! 🙂

2. Who is your audience for the book?

The book uses the acronym DESKE. Quite a broad catchment area:

  • Teachers of English as a foreign language.
  • Teacher trainees – the digital natives – whether they are doing degree courses or CELTA TESOL Trinity courses.
  • People doing any guise of applied linguistics that involve corpora.
  • Translators, especially those translating into their foreign language. (Only yesterday I presented the book at LEXICOM in Telč.)
  • Students and aficionados of linguistics.
  • Test writers.
  • Advanced students of English who want to become independent learners.

3. Can your book be used without Sketch Engine?

No (the answer to the next question explains why not).

Like any book it can be read cover to cover, or aspects of language and linguistics can be found via the indices: (1) Index of names and notions, (2) Lexical focus index.

4. How do you envision people using your book?

It is pretty essential that the reader has Sketch Engine open most of the time. Apart from some discussions of features of linguistics and English, the book primarily consists of 342 language questions/tasks which are followed by instructions – how to derive the data from the corpus recommended for the specific task, and then how to use Sketch Engine tools to process the data, so that the answer is clear.

Example questions:
About words
Can you say handsome woman in English?
Do marriages break up or down?
How is friend used as a verb?
Which two syllable adjectives form their comparatives with more?
Do men say sorry more than women?

About collocation
I’ve come across boldly go a few times and wonder if it is more than a collocation.
It would be reasonable to expect the words that follow the adverb positively to be positive, would it not?
Is there anything systematic about the uses of little and small?
What are some adjectives suitable for giving feedback to students?

About phrases and chunks
Does at all reinforce both positive and negative things?
What are those phrases with last … least; believe … ears; lead … horse?
How do the structures of to photograph differ from take a photo(graph), guess with make a guess, smile with give a smile?
Which –ing forms follow verbs like like?

About grammar
How do sentences start with Given?
Who or whom?
Which adverbs are used with the present perfect continuous?
Do the subject and verb typically change places in indirect questions?
How new and how frequent is the question tag, innit?

About text
Are both though and although used to start sentences? Equally?
How much information typically appears in brackets?
Does English permit numbers at the beginning of sentences?
Is it really true that academic prose prefers the passive?
In Pride and Prejudice, are the Darcies ever referred to with their first names?

There is an accompanying website with a glossary – a work eternally in progress, and a page with all the links which appear in the footnotes (142 of them), and another page with the list of questions, which a user might copy and paste into their own document so that they can make notes under them.

5. Do you recommend any other similar books?

The 223 page book has three interwoven training goals, the upper level being SKE’s interface and tools, the second being a mix of language and linguistics, while the third is training in deriving answers to pre-set questions from data.

AFAIK, there is nothing like this.

6. Anything else you would like to add?

In all the conference presentations and papers and articles that I have seen and heard over the years in connection with using corpora in ELT, with very few exceptions teachers and researchers focus on a very narrow range of language questions. When my own teacher trainees use corpora to discover features of English in the ways of DESKE, they realise that the steep learning curve is worth it. They are being equipped with a skill for life. It is a professional’s tool.

Sketch Engine consists of both data and software. Both are being constantly updated, which augurs well for print-on-demand. It’ll be much easier to bring out updated versions of DESKE than through standard commercial publishers. I’m also expecting feedback from readers, which can also be incorporated into new editions.

My interests in self-publishing are partly related to my interest in ICT. This book is printed through the print-on-demand service, Lulu.com. One of the beauties of such a mode of publishing is the relative ease with which the book can be updated as the incremental changes in the software go online. This is in sharp contrast to the economies of scale that dictate large print runs to commercial publishers and the standard five-year interval between editions.

There is a new, free, student-friendly interface with its own corpus, known as SKELL, which has been available for less than a year. It is also undergoing development at the moment, and I will be preparing a book of worksheets for learners and their teachers (or the other way round). I see it as a 21st-century replacement of the much missed "COBUILD Corpus Sampler".

Lastly, I must express my gratitude to Adam Kilgarriff, who owned Sketch Engine until his death from cancer on May 16th, at the age of 55. He was a brilliant linguist, teacher and presenter. He bought 250 copies of my book over a year before it was finished, which freed me up from other obligations – a typical gesture of a wonderful man, greatly missed.

Many thanks to James for taking the time to be interviewed but pity my poor wallet with some very neat CL books to purchase this year. James also mentioned that, for a second edition file, Chapter 1 will be re-written to be able to use the open corpora in SketchEngine.