Recreate Live Air Traffic Control

Live Air Traffic Control recordings are available from sites such as LiveATC. In the Aviation English literature, however, transcripts of traffic are usually published without the recordings being made generally available. There is a way to recreate these.

First, use a text-to-speech site to get an audio file of the transcript you want to use. This one, called FreeTTS, for example, is free and offers a number of voice options. Note that it does limit the number of times you can use it in any one week. Another thing to note is that words with more than one pronunciation may be a sticking point. For example, “wind” was pronounced as the verb when I wanted the noun, so I had to write “winnd” in the text.

Then use an audio program such as Audacity to process the audio: first apply a high-pass filter, then a distortion effect. If you want to add polish, you can layer in a radio beep/bleep noise and/or a static background. You may also need to normalise the audio if the volume is too low after the high-pass filter and distortion effects.
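If you would rather script the whole chain, here is a minimal Python sketch of the same three steps (assuming scipy and numpy are installed and the input is a mono 16-bit WAV; the file names and cutoff/boost values are placeholders to adjust by ear):

import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, lfilter

rate, audio = wavfile.read("tts_output.wav")       # mono 16-bit WAV assumed
audio = audio.astype(np.float64) / 32768.0         # scale to -1..1

# 1. High-pass filter: cut below ~300 Hz to mimic a narrow radio band
b, a = butter(4, 300 / (rate / 2), btype="highpass")
audio = lfilter(b, a, audio)

# 2. Distortion: boost the signal, then hard-clip it
audio = np.clip(audio * 4.0, -0.5, 0.5)

# 3. Normalise back up to near full scale
audio = audio / np.abs(audio).max() * 0.95

wavfile.write("atc_style.wav", rate, (audio * 32767).astype(np.int16))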

Here is an example of a recreated audio (transcript from Prado et al., 2019):

References

Prado, M., Roberts, J., Tosqui-Lucks, P., & Friginal, E. (2019). The development of aviation English programs. In E. Friginal, E. Mathews, & J. Roberts (Eds.), English in global aviation: Context, research, and pedagogy (pp. 215–246). Bloomsbury.


H5P, image choice, motion verbs

RIP Hot Potatoes

During lessons with (general aviation) air traffic controllers, students were often puzzled by the verbs plummet and hurtle. Although some managed to guess their meanings after examining the context carefully, most could not:

“having plummeted down in a deadly spiral, flight KAL 007 slams into the ocean”
“the airliner continues to hurtle through the skies above the Sea of Japan”

David Rooney, About Time: A History of Civilization in Twelve Clocks, 2021: 3

H5P has a content type called Image Choice – this allows you to make quizzes involving images.

I had also seen an interesting presentation titled Satellite or Verb Framed: How to improve manner of motion verb dictionary entries? by Tan Arda Gedik, who found some benefits of using animated GIFs in definitions of (manner of) motion verbs over dictionary definitions and concordance examples.

Although the extent to which the typology of manner-of-motion verbs applies in French is debatable, using animated GIFs to illustrate motion verbs seems worth exploring.

And this is what I came up with:

Image Choice example 1

I tested an early version of the above with a student, and they seemed to appreciate the format. Apologies for not having a re-use option on the above (when you click through), as the Lumi app does not seem to support exporting with re-use enabled. I can supply the H5P file if required.


Some points to note regarding use of H5P:

I used Iframe Embedder to integrate Image Choice into a Column content. The limitation with this is that the iframe embedder is not responsive on mobile phones, so you need to switch your phone to desktop mode; nor is it good for accessibility.

You can compress animated GIFs to reduce the image memory load; this one is good as it allows you to skip frames.
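If you prefer to script the compression, a hedged Python sketch with Pillow that drops every other frame might look like this (file names are placeholders; check the result still animates acceptably):

from PIL import Image, ImageSequence

im = Image.open("input.gif")
# Keep only the even-numbered frames
frames = [f.copy() for i, f in enumerate(ImageSequence.Iterator(im)) if i % 2 == 0]

# Double each remaining frame's duration so overall timing stays roughly the same
duration = im.info.get("duration", 100) * 2
frames[0].save("output.gif", save_all=True, append_images=frames[1:],
               loop=0, duration=duration)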

I used the Lumi H5P app to generate the HTML file. If you are a beginner with H5P, this resource can get you started: H5P sample activities for language instruction.

Some content types that are not yet official can be found here.

Here is another example using Image Choice (plus Course Presentation and Column):

Image Choice example 2

I would be interested in seeing what people are doing with H5P. Do please share.

Here is an example using Accordion, Drag & Drop, Multiple Choice, Column.
An example using Agamotto, Question Set, Column.

A Virtual Tour (360) example; it works best in the Chrome browser, but other browsers should theoretically work.

An example with Image Juxtaposition, Column.

Using Memory Game with audio for decoding practice (plus Dictation, Column).

Thanks for reading.

Update:

Good folk writing about H5P to check out include Vedrana Vojković Estatiev, at her blog posts tagged H5P.

Neil McMillan describes some inventive ways to use Drag & Drop.

Picture labelling with Hot Potatoes

This post adds to what has already been written on how to do picture-labelling exercises with Hot Potatoes. It assumes you know how to make a JCloze exercise.

The following video shows a sample of my current favourite kind of exercise to make with Hot Spuds:

The maker of the software calls this Smart positioning, or How to overlay drop-down lists on a background picture.

You can copy and paste the following HTML code into a JCloze exercise; it creates a picture labelling of 6 terms.

<table style="border-style: solid; border-width: 0px;  width: 640px; "><tbody>
<tr>
<td style=""height: 80px; text-align:right; ">Label 1</td>
<td style="width: 480px; height:430px;" rowspan="3" ><img src="name-of-image" alt="name-of-image" title="image-title" width="416" height="352" style="display: block; margin-left: auto; margin-right: auto; text-align: center;"/></td>
<td style="height: 80px; text-align:left; ">Label 2</td>
</tr>
<tr>
<td style="height: 80px; text-align:right; ">Label 3</td>

<td style="height: 80px; text-align:left; ">Label 4</td>
</tr>
<tr>
<td style="height: 80px; text-align:right; ">Label 5</td>

<td style="height: 80px; text-align:left; ">Label 6</td>
</tr>
</tbody></table>

If you want to label 8 terms, just add another row and increase rowspan to 4, as shown below. You will also need to adjust the image dimensions appropriately depending on your picture.
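The extra row to add before the closing </tbody> would look like this (with the image cell in the first row changed to rowspan="4"):

<tr>
<td style="height: 80px; text-align:right; ">Label 7</td>

<td style="height: 80px; text-align:left; ">Label 8</td>
</tr>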

The following 2 videos detail how I go about cropping the image using screenshotting and picture editing to add arrows on OSX; you can find similar tools for your system:

You may well have to go back to your picture-editing software to modify your image until you are satisfied that the arrows match up with the drop-down boxes. You can also try changing the height: 80px value for any cell that is not aligned, but I just fiddle with the image-editing program.

The following code makes the table display better on phones, i.e. the table becomes horizontally scrollable rather than overflowing the screen:

<div style="overflow-x:auto;">
<!-- put the table code here -->
</div>

Hope this post is of help, as information on creating this kind of exercise is now hard to find on the web. Happy to take questions.

Create your own interactive transcript

Interactive transcripts are where text appears next to a video or audio, and the words being spoken are highlighted in the text as the video or audio plays.

This video-post is about making your own. I assume you 1) have your own website, 2) have already transcribed your media and 3) use OSX (Mac) or Linux.

The programs, in order of use, are: Gentle forced aligner (https://github.com/lowerquality/gentle), the Hyperaudio converter (https://hyperaud.io/converter/) and Hyperaudio Lite (https://github.com/hyperaudio/hyperaudio-lite).

Note that if you don’t use OSX/Linux then, in order to get an appropriate file to feed into the Hyperaudio converter, you can use one of the online transcription services that offer free minutes, such as Maestra (https://maestrasuite.com/).

Or there is an online demo of the Gentle forced aligner at https://lowerquality.com/gentle/, though I am not sure what the file size limit is for that.
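For reference, the alignment step can also be scripted. Here is a minimal Python sketch, assuming you have cloned and installed Gentle per its README (the file names are placeholders, and it is worth checking python3 align.py --help for the exact options):

import subprocess

# Run Gentle's aligner on an audio file plus its plain-text transcript
result = subprocess.run(
    ["python3", "align.py", "my_audio.mp3", "my_transcript.txt"],
    cwd="/path/to/gentle", capture_output=True, text=True, check=True)

# Gentle emits word-level timings as JSON, which the Hyperaudio converter
# can then turn into a transcript for Hyperaudio Lite
with open("aligned.json", "w") as f:
    f.write(result.stdout)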

I apologise for the noise that appears later in the video! Thanks for watching and don’t hesitate to ask me any questions.

PirateBox is dead! Long live PirateBox!

The main developer of PirateBox (Matthias Strubel) recently announced the shutting down of the PirateBox forums. Fortunately, PirateBox is still being developed for the wrong router. This is great news. I believe the wrong router uses a GL.iNet Mango GL-MT300N-V2, which connects at 300 Mbps (three times as fast as the TP-Link MR3020 router), which means video sharing is very fast now.

Note: please do order a wrong router (and support development of PirateBox) if you are not comfortable with digging into router specifics.

One of the advantages of the wrong router mod of PirateBox is the use of HTML pages to serve files. Even though this was possible with the original PirateBox, several other steps had to be taken to disable features that were not needed (e.g. I rarely used the upload or chat facility). And with HTML5 one can now share videos with subtitles (in .vtt format), something that is very useful when sharing videos in a language learning class.
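For illustration, serving a subtitled video from one of those HTML pages needs nothing more than the standard HTML5 video and track elements (the file names are placeholders):

<video width="640" controls>
  <source src="lesson.mp4" type="video/mp4">
  <track kind="subtitles" src="lesson.vtt" srclang="en" label="English" default>
</video>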

In order to get the wrong router version working, one needs to flash a PirateBox image first so that the auto-install function can work (for the Mango router, the image to use is found here: http://development.piratebox.de/target_thewrong_ramips-mt76x8/). Use the router’s original web UI to install the firmware image (on a slow USB stick this could take up to 45 minutes). Once done, you can then follow the wrong router instructions to install the full wrong router modification of PirateBox.

Below are some screenshots of connecting to the router using my phone. Note the screenshot showing a video playing with subtitles. Nice!

Thanks for reading and here’s to 2020.

Related PirateBox posts:

The browser rulez or another reason why PirateBox is boss

Cutting your PirateBox jib

Piratebox, a way to share files in class

Offline (doku) wiki writing

TESOL France 2014 – thoughts, poster, handout and links

Related links – PirateBox development images

Grassroots language technology: Adam Leskis, grammarbuffet.org

Language learning technology can be so much more than what commercial groups are offering right now. The place to look is to independent developers and teachers who are innovating in this area. Adam Leskis is one such person and here he discusses his views and projects.

1. Can you tell us a little of your background?

I started out in my first career as an English teacher, and it was clear to me that there were better ways we could both create and distribute digital materials for our students. As an example, during my last year of professional teaching (2015), the state of cutting-edge tech integration was taking a PowerPoint from class and uploading it to YouTube.


What struck me in particular was the way in which technology was being used primarily to reproduce traditional classroom methods of input rather than actually taking advantage of the advanced capabilities of the digital medium. I saw paper handouts being replaced by uploaded PDFs, classroom discussions replaced by online forums, and teacher-fronted lectures replaced by videos of teachers speaking.


I knew I wanted to at least try to do something about it, so I set off teaching myself how to use the tools to create things on the internet. I eventually got good enough to be hired to do web development full time, and that’s what I’ve been doing ever since.

2. In what ways do you feel technology can help with learning languages?

Obviously, given the very social nature of education and human language use, technology could never fully replace a teacher, and so this isn’t really what I’m setting out to do. Where I see technology being able to make an enormous impact, though, is in its ability to automate and scale a lot of the things on the periphery that language learning involves.


As an example, vocabulary is a very important component of being able to use and understand language. Thankfully, we now have the insights from corpus-based methods to help us identify which vocabulary items deserve primary focus, and it’s a fairly straightforward task to create materials including these.


However, what this means in practice is that either students need to pay for expensive course books containing materials created with a corpus-informed approach to vocabulary, or the teachers and students themselves need to spend time creating these materials. Course books tend to be very expensive, and even those which come with online materials aren’t updated very frequently. Teachers and students creating their own materials are left to scour the internet for items to analyze and filter for appropriate vocabulary, and then beyond that they need to construct materials that target the particular skill areas they would like to use the vocabulary for (e.g. writing, listening) and the authentic contexts they are interested in, which is a very time-consuming manual process.


Technology has the ability to address both of these concerns (lack of updates and requirements of time). As one example, I created a very simple web app that pulls in content from the writing prompts sub-reddit (https://www.reddit.com/r/WritingPrompts/) and uses it to help students work on identifying appropriate articles (a/an/the) to accompany nouns and noun phrases. The content is accessed in real time when the student is using the application, and given the fast turnover in this particular sub-reddit, this means that using it once a day would incorporate completely different content, essentially forming a completely new set of activities.
One of the other advantages of this approach is the automated feedback available to the user. So in essence, it’s a completely automated system that uses authentic materials (created largely by native speakers for native-speaker consumption) to instantly generate and assess activities focused on one specific learning objective.


The approach does still have its shortcomings, in that this particular system just finds all the articles and replaces them with a selection drop-down, so it’s only able to give feedback on whether the user’s selection is the same as the original article. Also, since this is a very informal genre, the language used might not be suitable for all ages of users.
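As a rough illustration of the idea (a sketch under my own assumptions, not Adam's actual code), pulling fresh prompts and gapping the articles might look like this in Python:

import re
import requests

# Fetch the newest posts from r/WritingPrompts via reddit's public JSON feed
resp = requests.get(
    "https://www.reddit.com/r/WritingPrompts/new.json?limit=5",
    headers={"User-Agent": "article-practice-demo"})
posts = [p["data"]["title"] for p in resp.json()["data"]["children"]]

for text in posts:
    # Blank out each article; keep the originals as the answer key
    answers = re.findall(r"\b(a|an|the)\b", text, flags=re.IGNORECASE)
    gapped = re.sub(r"\b(a|an|the)\b", "____", text, flags=re.IGNORECASE)
    print(gapped, "| answers:", answers)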


3. What are your current projects?


I wish I had more time to work on these, since I currently only have early mornings and commuting time on the train to use for side projects, but there are a few things I’m working on that I’m really excited about.


Now that I have one simple grid-based game up and running (https://www.grammarbuffet.org/rhyme-game/), I’m thinking about how I can re-use that same user interface to target other skills. If, instead of needing to tap on the words that rhyme, we could just have the users say them, that would be a much more authentic way to assess whether the user is able to “do something” with their knowledge of rhymes. There is an HTML5 Speech API that I’ve been meaning to play around with, so that could be a potential way to create an alternate version based on actual speaking skills rather than just reading skills.


Another permutation of the grid-based game template would be integrating word stress instead of rhymes. I’m currently trying to get a good dataset containing word stress information for all the words in the Academic Word List (Coxhead, 2000), which I suppose is a bit dated now as a corpus-based vocabulary list, but it was my first introduction to the power of a corpus approach, and so I’ve always wanted to use it to generate materials on the web. The first version of this will probably also just involve seeing the word and using stress knowledge to tap it, rather than speaking, but I’m also imagining how we could use the capabilities of mobile devices to allow the user to shake or just move their phone up and down to give their answers on word stress. Once that’s up and running, it’s very simple to incorporate more modern corpus-based vocabulary lists (e.g. the Academic Spoken Word List, 2017). Moreover, since this is all open source, anyone could adapt it for their particular vocabulary needs and deploy a custom web app via tech like Netlify.
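One plausible way to build such a stress dataset (a hedged sketch, not necessarily Adam's pipeline) is to look each word up in the CMU Pronouncing Dictionary via NLTK, where vowel phones carry stress digits:

import nltk
nltk.download("cmudict", quiet=True)
from nltk.corpus import cmudict

prondict = cmudict.dict()
awl_sample = ["analyze", "concept", "data", "research"]  # placeholder AWL words

for word in awl_sample:
    for phones in prondict.get(word, []):
        # 1 = primary stress, 2 = secondary, 0 = unstressed
        pattern = "".join(p[-1] for p in phones if p[-1].isdigit())
        print(word, "stress pattern:", pattern)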


Beyond these simple games, I’m also starting to work on a way to take authentic texts (possibly from a more academic genre on reddit like /r/science, or the text of articles on arXiv) to create cloze-test types of materials using the AWL. The user would need to supply the words rather than select them, which is a much more authentic assessment of their ability to understand and actually use these words in written English.


4. I really like the idea of offline access. How can people interested in this learn more?


The technology that enables this is currently referred to as Progressive Web Apps (PWAs), and relies on the technology of Service Workers. Essentially, because website development relies on JavaScript, we’re able to put JavaScript processes between the user’s browser and the network to intercept network requests and just return things that have already been downloaded. So for applications where all the data is included in the initial page load, this means that the entire website will work offline.


It’s a very relevant concept for our users who have either very unreliable network access or relatively expensive network costs. If we’re discussing applications that users engage with every single day, the network access becomes non-trivial, especially if it’s using the old website model of a full page reload on every change in the view, rather than a modern single-page app built with Angular or React. So absolutely, I would say it matters whether modern learning materials are using the latest technology to enable all of these enhancements to traditional webpages.

Much of this movement towards “offline-first” is informed by the JAMstack, which is itself a movement towards static sites that are deployable without any significant backend resources. This speaks to one of the goals of the micromaterials movement, which is the separation of getting the data from actually doing something with it in the web application. One early attempt at setting up a backend API to be consumed is https://micromaterials.org, which just returns sentences from the WritingPrompts subreddit. It’s admittedly very crude (and even written in Python 2, yuck!), but it shows what could eventually be a model for data services that could feed into front-end micromaterials apps.
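To make the idea concrete, a minimal backend of that kind could look something like this Python/Flask sketch (an illustration under stated assumptions, not the actual micromaterials.org code):

import re
import requests
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/sentences")
def sentences():
    # Pull recent post titles from r/WritingPrompts and split them into sentences
    resp = requests.get(
        "https://www.reddit.com/r/WritingPrompts/new.json?limit=10",
        headers={"User-Agent": "micromaterials-demo"})
    titles = [p["data"]["title"] for p in resp.json()["data"]["children"]]
    sents = [s for t in titles for s in re.split(r"(?<=[.!?])\s+", t) if s]
    return jsonify(sents)  # front-end micromaterials apps consume this JSON

if __name__ == "__main__":
    app.run()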


5. Ideas/Plans for the future?


These disadvantages are a lot more obvious if this remains one of only a few such applications, but imagine if there were hundreds or even thousands of these, forming something much more like an ecosystem. And then extrapolate that further to imagine thousands of backend server-side APIs for each conceivable genre of English, enabling a multitude of frontend applications to consume the data and create materials for different learners. As soon as you have one server-side service providing data on AWL words, that allows any number of web applications to consume and transform that data into activities.


The plan all along was not for me to create all of these applications, but to inspire others to begin creating similar types of micromaterials. It hasn’t yet caught on, and clearly, expecting teachers to take up this kind of development is not sustainable. I’m hoping that other developers see the value in these and join the movement.


In a sense, the server-side APIs are a bigger prerequisite to getting this whole thing off the ground, so I’m very happy to work with any backend developers on what we need going forward, but I’m also going to continue developing things myself until we have a big enough community to take over.


I think whether all of these micromaterials exist under the umbrella of one single sign-on with tracking and auditing is beyond the scope of where we’re currently at, though I’m imagining a world where users could initiate their journey into the service, take a simple test involving all four of the main skills (reading, writing, speaking, and listening), and then be recommended a slew of micromaterials to help them out. 


For some users that might focus more on the reading and writing components, whereas for others it might focus more on the speaking and listening ones. The barrier to this currently being available is not at all significant and just involves getting development time invested in creating the materials. If I had them all created right now, I would be able to deploy them today with modern tooling like Netlify.


The problem is more one of availability and time, and I’m more than happy to work with other developers and teachers to bring this closer to a reality for our students.

Thanks for reading, and many thanks to Adam for sharing his time on this blog; you can follow Adam on his blog [https://micromaterialsblog.wordpress.com/] and on Twitter @BaronVonLeskis.

Please do read the other posts in the Grassroots language technology series if you have not done so already.

Grassroots language technology: Fred Lieutaud, FLGames, Planet Alert

I recently used the soccer game from FLGames, an open-source set of language games designed by Fred Lieutaud. It worked really well in a revision class; I often forget how competitive my first-year engineering students are.

Without further ado, here is Fred talking about technology and language learning. Many thanks to Fred, and if you are interested in talking about this area, do get in touch.

1. Can you share some of your background?
I’m a French teacher of English working in a middle-school in the North East of France. I’ve been teaching in the same school for 18 years now. I am interested in computers, but mostly in knowledge sharing through open-source licenses and I like creating things, so I started developing my own tools to teach.

2. You mentioned your current project, Planet Alert, can you explain that a little?

Planet Alert is the answer I’ve found so far to the problem of students’ motivation. One of my goals is to have the kids take pleasure in coming to class and learning English. I have a feeling that this is possible through the use of ‘games’ (hence my FLGames – sources on GitHub).

Planet Alert is thus a sort of ‘game’ providing class-management tools, trying to keep in mind as much as possible that technology should serve the classroom and help students improve their skills. If it doesn’t fulfill these goals, it shouldn’t be used in class.

In Planet Alert, students have their own avatar, and they need to take care of it to help the team (i.e. the class) succeed in the ‘game’. The scenario is not that interesting in itself: humans wanted to conquer Mars, but the Martians got there first: they have invaded the Earth and emptied human brains. To resist and free the planet, humans have to re-learn a language (English!).

The game is strongly connected to the classroom in many ways. Lots of ‘real’ actions have an effect on the avatars: participation, group work, individual exercises, helping other kids. Each positive action increases a player’s experience (XP) and also their gold coins (GC). Each negative action causes health-point loss and might also cause GC loss. Thanks to the GC, a player can free places throughout the world (famous monuments – the goal is to have them develop geographical skills), free people, buy equipment (to earn even more), buy protections (to lose less than normal), buy potions (no homework for 1 lesson, changing seat in class for 1 lesson, assisting the teacher), or donate to another player (to help them buy a health potion, for example). Once bought, players get the element to stick in their copybook, and scores are updated for real in the classroom and on the website.

I try to encourage team work with special elements (group items) such as the Memory Helmet or the Book of Knowledge: the first is a helmet giving access to online exercises (created inside Planet Alert), the second gives access to lessons that can be copied into the copybook (to validate an extra homework, which gets credited with extra XP and extra GC). This also gives the kids a possibility to work outside the classroom and revise vocabulary or go a little further than what has been done in class. Students without easy internet access can also do extra work in their copybook. When it is shown in class, they get credited with a positive action.

At the beginning of the lesson, I often check the ‘Main Office’ page so we have the recent news and discuss things (someone needs help, monuments). An exercise in class becomes a ‘Group Mission’, a test is a ‘Monster Attack’, and so on. Most things are related to the ‘game’.

Some roles exist: ‘Ambassadors’ for players having 10 positive actions in a row, ‘Captains’ for players having the best karma in each group. This is useful in class, for example to start an activity: Captains first!

Anyway, I guess you get the picture. It’s hard to be concise since Planet Alert offers many possibilities. It is really a way to manage a class differently. Teachers can also generate reports over selected periods and see who has done extra work, who has forgotten their material, and who has participated. This is a great help for parents’ meetings.

Well, I could go on for hours about everything that is behind this website. But from my own (much biased!) point of view, the results are encouraging. If you want to have a look, the official website is https://planetalert.tuxfamily.org.

3. How do you decide on whether to use technology or not in class?
From what I’ve answered in the preceding question, you can imagine that using computers in the classroom is often a necessity for me, although my focus is not on using the tool for its own sake. I want to use it to share with the class. It has to prove its added value: either in helping communication, or in helping students learn. Planet Alert is an example of common sharing, and the FLGames are another example of helping memorization (Soccer for increasing speed, Grammar Gamble for improving written skills, Car Race for encouraging group work and cooperation). I believe technology in class should always be a means to promote real interaction. It should trigger some sort of desire to work, to speak, to get involved.

4. What kinds of tools (apart from your own) have you found most useful?
As you can see, I mostly use my own tools. But I also use OpenBoard to manage all my documents on my interactive whiteboard. I have used exclusively open-source software for many years now, and that is something very important to me. With Planet Alert, I try to introduce students to open-source licenses: they have already drawn some of the monsters used in the game and agreed to share them on the Open Clipart Library :). Other important aspects are the possibility to customize the tools and the ability to do so quickly (I like working with simple .txt files as a data source).

5. Anything else you would like to comment on about technology in language learning?
I have a feeling it would be hard to do without technology when teaching, but this is a personal opinion. It is fundamental to understand that teaching relies much more on the teacher than on technology! Some teachers are not ‘techies’ but they still do extraordinary work. I think a teacher has to find his or her own way of teaching. And all sorts of teaching may work!

 

The Prime Machine – a new concordancer in town

One of the impulses behind The Prime Machine was to help students distinguish similar or synonymous words. Recently a student of mine asked about the difference between “occasion” and “opportunity”. I used the compare function on the BYU COCA to help the student induce some meaning from the listed collocations. It kinda, sorta, helped.

The features offered by The Prime Machine promise much better help for this kind of question. For example, in the screenshot below the (Neighbourhood) Label function shows the kinds of semantic tags associated with the words “occasion” and “opportunity”. Having this info certainly helps reduce the time spent figuring out the differences between the words.

Neighbourhood Labels for the comparison of occasion and opportunity

One of the other sweet new features brought to the concordancer table is a card display system, as seen in the first screenshot below. Another is information based on Michael Hoey’s lexical priming theory, as shown in the second screenshot below.

Card display for comparison of words occasion and opportunity

Paragraph position of the words occasion and opportunity

The developer of the new concordancer, Stephen Jeaco, kindly answered some questions.

1. Can you speak a little about your background?

Well, I’m British but I’ve lived in China for 18 years now. My first degree was in English Literature, then I did my MA in Applied Linguistics/TESOL, and my PhD was under the supervision of Michael Hoey at the University of Liverpool.

I took up programming as a hobby in my teens.  If I hadn’t got the grades to read English at York, I would have gone on to study Computer Science somewhere.  In those days the main thing was to choose a degree programme that you felt you would enjoy.  Over the years, though, I’ve kept a technical interest and produced a program here or there for MA projects and things like that.

I’ve worked at XJTLU for 12 years now.  I was the founding director of the English Language Centre, and set up and ran that for 6 years.  After rotating out of role, I moved into what is now called the Department of English where I lecture in linguistics to our undergraduate English majors and to our MA TESOL students.

2. What needs is The Prime Machine setting out to fill?

I started working on The Prime Machine in 2010, at the beginning of my part-time PhD.  At that time, I was interested in corpus linguistics but I found it hard to pass that enthusiasm on to my colleagues and students.  We had some excellent software and some good web tools, but internet access to sites outside China wasn’t always very reliable, and getting started with using corpora for language learning usually meant having to learn quite a lot about what to look for, how to look for it, and also how to understand what the data on-screen could mean.

Having taught EAP for about 10 years at that time, I felt that my Chinese learners of English needed a way to help them see some of the patterns of English which can be found through exploring examples, and in particular I wanted to help them see differences between synonyms and become familiar with how collocation information could help them improve their writing.

I’d read some of Michael Hoey’s work while doing my MA, and in his role of Pro Vice Chancellor for Internationalization I met him at our university in China.  His theory of lexical priming provided both a rationale for how patterns familiar in corpus linguistics relate to acquisition and it also gave me some specific aspects to focus on in terms of thinking about what to encourage students to notice in corpus lines. 

The main aim of The Prime Machine was to provide an easy start to corpus linguistic analysis – or rather an easy start to using corpus tools to explore examples. Central to the concept were two main ideas: (1) that students would need some additional help finding what to look for and knowing what to compare, and (2) that new or enhanced ways of displaying corpus lines and summary data could help draw their attention to different patterns. Personally, I really like the “Card” display, and while KWIC is always going to be effective for most things, when it comes to trying to work out where specific examples come from and what the wider context might be, I think the cards go a long way towards helping students in their first experiences of DDL.

Practically speaking, another thing I wanted to do was to start with a search screen where they could get very quick feedback on anything that couldn’t be found and whether other corpora on the system would have some results. 

3. What kind of feedback have you got from students and staff on the corpus tool?

I’ve had a lot of feedback and development suggestions from my students at my own institution. Up until a few weeks ago, The Prime Machine was only accessible to our own staff and students. The majority of users have been students studying linguistics modules, mostly those who are taking or have taken a module introducing corpus linguistics. However, for several years now I have also had students using it as a research tool for their Final Year Project – a year-long undergraduate dissertation project where typically each of us has 4 to 5 students for one-to-one supervision. They’ve done a range of projects with it, including trying to apply some of Michaela Mahlberg’s approaches to another author, exploring synonyms, and exploring the naturalness of student paraphrases or exam questions. People often think of Chinese students as being shy and wanting to avoid direct criticism of the teacher, but our students certainly develop the skills for expressing their thoughts and give me suggestions!

In my own linguistics module on corpus linguistics, I’ve found the new version of The Prime Machine to be a much easier way to get students started at looking at their own English writing or transcripts of their speech and getting them to consider whether evidence about different synonyms and expressions from corpora can help them improve their English production.  Personally, I use it as a stepping stone to introducing features of WordSmith Tools and other resources.

In terms of staff input, I’ve had a couple of more formal projects, getting feedback from colleagues on the ranking features and the Lines and Cards displays.  I’ve also had feedback by running sessions introducing the tool as part of a professional development day and a symposium.  Some of my colleagues have used it a bit with students, but I think while it required access from campus and before I had the website up, it was a bit too tricky even on site. 

On the other hand, I’ve given several conference papers introducing the software, and received some very useful comments and suggestions.

I need to balance my teaching workload, time spent working towards more concrete research outputs and family life, but if we can get over some of the connectivity issues and language teachers want to start using The Prime Machine with their students, I’m going to need as much feedback as possible.  I’d like to hope I could respond and build up or extend the tool, but at the same time there’s a need to try to keep things simple and suitable for beginners. 

4. You have some extra materials for students at your institution, could you describe these?

There’s nothing really very special about these. But having the two ways of accessing the server (off-site vs. on-site) means that if corpus resources come with access restrictions, or if a student wants to set up a larger DIY corpus for a research project, I’m able to limit access to these.

Other than additional corpora, there are a few simple wordlists which I use in my own teaching and some additional options for some of the research tools.

5. What developments are in the pipeline for future versions of The Prime Machine?

One of the main reasons I wanted The Prime Machine to be publicly available and available for free was so that others would be able to see some of the features I’ve written about or presented at conferences in action. In some ways, my focus has changed a bit towards smaller undergraduate projects for linguistics, but I still have interests and contacts in English language teaching. Given some of the complications of connecting from Europe to a server in China, unless someone finds it really interesting and wants to set up a mirror server or work more collaboratively, I don’t think I can hope to have a system as widely popular and reliable as the big names in online concordancing tools. But having interviews like this and getting the message out about the software through social media means that there is a lot more potential for suggestions and feature requests to help me develop in ways I’ve not thought of.

But left to my own perceptions and perhaps through interactions with my MA TESOL students, local high schools and our language centre, I’m interested in adding to the capabilities of the search screen to help students find collocations when the expression they have in mind is wildly different from anything stored in the corpus.  At the moment, it can do quite a good job of suggesting different word forms, giving some collocation suggestions and using other resources to suggest words with a similar meaning.  But sometimes students use words together in ways that (unless they want to use language very creatively) would stump most information retrieval systems.

Another aspect which I could develop would be the DIY text tools, which currently start to slow down quite rapidly when reading more than 80,000 words or so.  That would need a change of underlying data management, even without changing any of the features that the user sees.  I added those features in the last month or two before my current cohort of students were to start their projects, and again, feedback on those tools and some of the experimental features would be really useful.  On the other hand, I point my own students to tools like WordSmith Tools and AntConc when it comes to handling larger amounts of text!

The other thing, of course, is that I’m looking forward to getting hold of the BNC 2014 and adding another corpus or two.  Again, I can’t compete with the enormous corpora available elsewhere, but since most of the features I’m trying to help students notice differ across genre, register and style, I am quite keen on moderately sized corpora which have clearly defined sub-corpora or plenty of metadata.

One thing I would like to explore is porting The Prime Machine to Mac OS, and also possibly to mobile devices and tablets.  But as it stands, using The Prime Machine requires the kind of time commitment and concentration (and multiple searches and shuffling of results) that may not be so suitable for mobile phones.  I sometimes think it is more like the way we’d hunt for a specialist item on Taobao or Ebay when we’re not sure of a brand or even a product name, rather than the kind of Apps we tend to expect from our smart phones which provide instant ready-made answers.  Redesigning it for mobile use will need some thought.

Personally, I’m hoping to start one or two new projects, perhaps working with Chinese and English or looking more generally at Computer Assisted Language Teaching.  

Now that The Prime Machine is available, while of course it would be great if people use it and find it useful, more importantly beyond China I think I’d hope that it could inspire others to try creating new tools.  If someone says to the developer working on their new corpus web interface, “Do you think you could make a display that looks a bit like that?”, or “Can you pull in other data resources so those kinds of suggestions will pop up?”, I think they wouldn’t find it difficult, and we’d probably have more web tools which are a bit more user-friendly in terms of operation and more intuitive in terms of support for interpretation of the results. 

6. What other corpus tools do you recommend for teachers and students?

Well, I love seeing the enhancements and new features we get with new versions of popular corpus tools.  And at conferences, I’m always really impressed by some of the new things people are doing with web-based tools.   But one thing that I would say is that for the students I work with, I think knowing a bit more about the corpus is more useful than having something billions of words in size; being able to explore a good proportion of concordance lines for a mid-frequency item is great.  I think having a list of collocations or lines from millions of different sources to look at isn’t going to help language learners become familiar with the idea that concordance lines and corpus data can help them understand, explore and remember more about how to use words effectively. 

Nevertheless, I think those of us outside Europe should be quite jealous of the Europe-wide university access to Sketch Engine that’s just started for the next 5 years.  I also really like the way the BYU tool has developed.  I was thrilled to get hold of the MAT software for multidimensional analysis.  And I think I’ll always have my WordSmith Tools V4 on my home computer, and a link to our university network version of WordSmith Tools in my office and in the computer labs I use.

Thanks for reading. Do note that if you comment here I need to forward your comments to Stephen (as he is behind the Great Firewall of China), so there may be a delay in any feedback. Alternatively, contact Stephen yourself via the main The Prime Machine website.

Also note that the currently available version of The Prime Machine may not work at the moment; wait a few days for a fix to be applied by Stephen and try again then.

Grassroots language technology: Wiktor Jakubczyc, vocab.today

It’s been a while since the last post on teachers doing it for themselves, technology-wise. Do check those out if you have not, or if you need a reminder. I stumbled across the teacher/developer who kindly answered questions for this post, Wiktor Jakubczyc, when looking for a GitHub source on vocabulary profilers. And what a find his GitHub pages are.

I think there are good reasons for teaching and education to have a default “inertia” regarding “innovation” (which Wiktor laments in one of his responses), but I won’t discuss this here. Maybe readers will prod me on this in the comments? 😁 I would like to refer to a point I’ve made before (pdf): that there is a middle ground for teachers to explore regarding grassroots technology.

Anyway, enough of my rambling; here’s Wiktor, and there is a marvellous bonus at the end for all you CALL geeks:

1. Can you explain your background a little?

I’m an English teacher with over 10 years of experience and an IT freelancer. I’ve taught English all over Europe, in London, Moscow, Warsaw, Bratislava, Sevilla and Wrocław, my home town in Poland. Since I was a kid I’ve loved computers – and that was in the ’80s when an Atari couldn’t really do very much. I passionately want teachers to make the most of digital technologies.

2. What was the first tool you designed for learning languages?

The first tool I designed to help students learn English was a dictionary lookup program for Windows, way back in 2007. Back then, there were good dictionaries you could get for your computer, but I wanted to be able to look up a word in many dictionaries at once. That option simply didn’t exist, so I created The Ultimate Dictionary (http://creative.sourceforge.net). I got great feedback from my students, fellow teachers and friends – they still use it, and they love it! It’s a very rewarding feeling to create something of value for other people, and to be able to give it to them for free.

A few years later, I discovered that another developer, Konstantin Isakov, had the same idea and made an even better dictionary application – GoldenDict. I used his source code as the base for a redesign of my dictionary, now called Nomad Dictionary. Nomad Dictionary now has Windows, Android and MacOS editions, all available to download at http://dictionaries.sf.net.

My second project was a Half a Crossword creator. Half a crossword is a type of communicative activity for ESL classrooms which emphasizes speaking and vocabulary, two key skills in speaking a language. Students each get half a crossword, split evenly between the two of them, and have to ask each other for missing information and give definitions for the words they have in their crossword. It’s a fantastic way to revise and recycle vocabulary while practicing the much-needed skills of asking for and giving information. And students love it!

Again, no such tool existed, which is why I decided to create one. I first made a version of Half a Crossword for Windows (http://creative.sourceforge.net) because at the time Delphi was the only language I could program in. I found it immensely useful in my classes – it was a perfect activity to check how many words students knew before moving on to new material. I tried to get other teachers involved, to spread the word and encourage them to use it, but I found a lot of people were resistant. They loved the idea, but few actually decided to use it in their classrooms.

A few years later, thinking that maybe the problem was accessibility – you needed to download a program, install it, write a wordlist in Word and then save it… it was a bit complicated – I decided to create an online version written in JavaScript. I posted the code for Half a Crossword Online on GitHub (https://github.com/monolithpl/half-a-crossword). Despite the fact that it wasn’t advertised anywhere, quite a few people found out about it, and two people even contributed code! Teachers I talked to also found the online version easier to use, and came to use it with their classes.

3. What do you think of as a relevant tool?

That’s a very good question, which is to say a very hard question. I think a relevant tool has to be both personally important enough for the creator to design it (especially if it’s a hobby project) at the same time good enough so that other people later also find it useful to them. It’s rare for these two things to coincide.

Another difficulty lies in the fact that the world of teaching, broadly speaking, is averse to innovation. Very few teachers care to experiment with new methodologies, paradigms or teaching tools. There’s extreme inertia. So getting teachers to change their habits and try something new is very challenging, especially when it comes to technology.

Relevant tools, in my mind, would be those that embrace the DOGME/Teaching Unplugged methodology, the Lexical Approach, personalized teaching, the explosion of mobile computing, just to name a few – all the radical new ideas that have appeared in the last 10 years in language teaching. And they would have to be loved by students, teachers and administrators alike.

4. Do you create tools for languages other than English?

I would love to, someday. I simply don’t have the time to do that now. This is a hobby, after all. The language learning tools I create are useful to my students, my colleagues and myself in learning and teaching English, which is what we do everyday. So that is the priority for now.

I hope other people around the world will find the time and be inspired to create tools for their languages. Unfortunately, there is a huge gap between the English-speaking world and the rest of the people out there when it comes to technology: just compare the size of the English Wikipedia versus editions in other languages. The same is true for language data: there are far fewer corpora, frequency wordlists, audiovisual materials etc for languages other than English. There’s lots of catching up to do.

I also think that the world needs a world language, so that we can all start to understand things not just around us, in our local environment, but on a more global level. For that, we need English, so I can understand why most of the interesting developments in language teaching are designed for English students. It’s simply the largest market and user base.

5. What tools are you working on at the moment? What do you have planned for future developments?

Right now I’m working on projects related to wordlists. I have a new version of a Vocabulary Profiler (https://github.com/monolithpl/range.web) almost ready. It’s an app that visualizes word frequency in a text – or, in more practical terms, tells a teacher how difficult a text is and which words are going to be most challenging for their students. Developing it was an incredible learning experience, as I had to figure out how to compress large wordlists so that the app could work on mobile phones, and discovered trie algorithms, which are a super clever concept for packing words into a small space. I’d like to mention the groundbreaking work of Paul Nation on teaching and researching vocabulary, especially his Range program (https://www.victoria.ac.nz/lals/about/staff/paul-nation#vocab-programs), which I tried to recreate for the modern web.
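To give a flavour of the trie idea (a minimal Python sketch, not Wiktor's range.web code): shared prefixes are stored only once, so a large wordlist collapses into nested nodes that are quick to search.

def build_trie(words):
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})  # shared prefixes reuse nodes
        node["$"] = True  # end-of-word marker
    return root

def contains(trie, word):
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node

wordlist = build_trie(["the", "them", "theme", "they"])
print(contains(wordlist, "theme"), contains(wordlist, "thesis"))  # True False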

My most ambitious project to date is an extension of this work – an app called Fraze Finder (https://github.com/monolithpl/fraze-finder) that highlights collocations, chunks etc. in a text. It takes the concept of profiling vocabulary to the next level by analyzing multi-word elements, like phrasal verbs, which students most often struggle with. The idea is to help students and teachers notice collocations, to identify them and understand their importance in written and spoken language. The difficulty here is building a good library of these expressions and accurately finding them (with all their variations) in texts. I have lots of ideas for future projects, which I’ve tried to gather together on my personal website vocab.today (https://vocab.today/teacher). I hope one day to complete them all!

6. Are there any tools (not yours) that you yourself use for learning languages?

Over the years, I’ve tried and experimented with dozens of language learning solutions. Let me focus on three main areas:

Learning Management Systems (LMSs) – these are content-delivery platforms: basically, websites where teachers upload material for their classes and students do their homework, complete tests, review their progress and exchange messages with one another.

I gave Moodle a try, but it was just horrible to use for both teachers and students, and I think other people agreed with me, for it seems to be fading away into a well-deserved oblivion.

Later, I tried Edmodo, which was a lot easier to use, and obviously inspired by Facebook, which was just starting to be the big thing at the time. I ran into numerous limitations using it, and finally, out of sheer frustration, just gave up. It was very pretty on the surface, but you couldn’t do much with it. And students preferred to use Facebook for their day-to-day communication, so it was difficult to make them use something else.

So today, I create Facebook groups for my students and use Google Drive, Forms and Docs to share documents and tests. It’s still not a perfect solution, but it has the advantage of being familiar to everyone and easy to use. Unlike the many solutions I’ve used before, I think these are versatile enough to do the job and are actively being developed and improved.

Flashcards – There are hundreds of apps and websites that help students learn through flashcards. I’ve tried many of them with my students, including Anki (which is a great piece of software). However, I’ve found that Quizlet is the most easy to set up and easy to use. And there’s a huge library of flashcards made by talented teachers around the world available for anyone to use. It’s quite amazing, and it’s free.

Mobile Apps – I’ve also experimented with several dozen different learning tools for mobile phones. This is a very new market, as the iPhone only came out ten years ago. There is currently much hype around apps like Duolingo, Babbel or Memrise, but personally I found them to be quite boring. The activities are very repetitive, and apart from situations where I would be forced to use them (on a crowded train with nothing else to do), I can’t imagine myself ever using them long-term.

This is still a very experimental field, which is why I find it shocking that the three biggest apps offer just two types of activities: multiple-choice or fill-in-the-gap exercises. I would love to see more variety. There’s also the fact that, due to their novelty, the claims of effectiveness these apps advertise with are often greatly overstated – just see what happened to all the “brain training” apps like Lumosity, which now have to pay multi-million dollar fines for lying to their customers (https://arstechnica.com/science/2016/06/billion-dollar-brain-training-industry-a-sham-nothing-but-placebo-study-suggests/). There’s definitely room for improvement.

7. Any advice for people interested in learning to design such tools?

The most important thing is to have an idea on what to create: something that would be useful for you or your students that doesn’t yet exist, a faster and better way of doing something you do every day or a radical improvement on a tool or solution you currently use.

Programming skills are secondary, and you can always find people who can help you out with technical stuff on StackOverflow. I’ve met a few programmers who, after completing their studies, had no idea what they wanted to create. Knowing what you’d like to create is the key.

It’s much easier to get into hobby development than it was 5 or 10 years ago. GitHub makes it super easy to upload your code and create a website for your project – all for free! It’s also a great way to discover other projects, make use of ready-made components and participate in the open source community by commenting or finding bugs.

JavaScript is one of the easiest programming languages you can learn, and it’s everywhere – on PCs, Macs, iPhones and Androids. With just one language, you can design for almost any device out there – the developments on the technological front are simply amazing.

On the teaching side, I could recommend no better than Scott Thornbury’s excellent article How could SLA research inform EdTech? (https://eltjam.com/how-could-sla-research-inform-edtech) which describes the needs of language learners and offers a list of requirements that should be met in order to create a truly excellent, cutting-edge language learning tool. To my knowledge, no such tool exists. Not by a long shot. It’s a great opportunity for creative minds.

8. Anything you want to add?

Thank you for noticing my work and giving me an opportunity to speak about it. Up until now I’ve been working on my projects almost in secret. It would be amazing if this interview inspired creative young minds to design new tools for language teaching, especially in languages other than English. I hope teachers will discover new tools that will help them teach better with less effort.

Technology has so much to offer in the field of learning languages, and there’s so much innovation to come. I’m looking forward to the bold new ideas of the future. Follow my work at vocab.today or on github!

Many thanks to Wiktor for spending time answering these questions. And here is the bonus link – Wiktor is compiling classic CALL programs that you can run in your browser; how awesome is that?! I am sure Wiktor would be glad to take suggestions of classic gems.

CORE blimey – genre language

A #corpusmooc participant, answering a discussion question on what they would like to use corpora for, replied that they wanted a reference book that shows common structures in various genres, such as “letters of condolence, public service announcements, obituaries”.

The CORE (Corpus of Online Registers of English) corpus at BYU, along with the virtual corpora feature, offers a way to reach for this.

For example, the screenshot below shows the keywords of verbs & adjectives in the Reviews genre:

Before I briefly show how to make a virtual corpus, do note that the standard interface allows you to do a lot of things with the various registers. The CORE interface shows you examples of this. For example, the following shows the distribution of the present perfect across the genres:

Create virtual corpora

To create a virtual corpus first go to the CORE start page:

Then click on Texts/Virtual and get this screen:

Next press Create corpus to get this screen:

We want the Reviews genre, so choose it from the drop-down box:

Then press Submit to get the following screen:

Here you can either accept these texts or, if you want to build only a film review corpus, manually look through the links and filter for film reviews only. Give your corpus a name, or add it to an already existing corpus. Here we give it the name “review”:

After submitting, you will be taken to the following screen, which shows your whole virtual corpora collection; we can see the corpus we just created at number 5:

Now you can list keywords.

Do note that the virtual corpora feature is available in most of the BYU collection, so if genre is not your thing, maybe the other choices of corpora might be useful.

Thanks for reading and do let me know if anything appears unclear.