Bill Louw – I intend to revive logical positivism

My last two posts (Locating collocation and Thin word lists and fat concordances) have used the ideas of Bill Louw, who kindly agreed to talk about his work. (Note: if you are reading this on a mobile device, you may need to refresh a few times to get all the audio to load.)

The title of this post indicates his overall goal: to revive logical positivism 1 (Schlick refers to Moritz Schlick, one of the founders of logical positivism):

Revive logical positivism

He describes how he is doing this by merging Firthian ideas with logical positivism via the shared idea of context of situation (semantic prosody is a type of contextual meaning):

Hand over to science

Louw claims that Rudolf Carnap, another founder of logical positivism, was prevented from continuing his work on induction and probability when he moved to the USA. Apparently this is evident from letters between Carnap and the American philosopher Willard V. O. Quine. The significance of induction was highlighted by Bertrand Russell, who stated that we can’t have science without induction. A very common representation of induction is the “All swans are white” example, or more generally “All A’s are B’s”; however, Moritz Schlick saw induction differently:

Schlick on induction

Louw goes on to add how Schlick describes the relation between thinking and reality:

Schlick on thinking

The above clip is important for understanding how Louw critiques the idea of collostruction. Collostruction is a way to measure collocation as it relates to grammar, and Louw points out the weakness of such an approach in terms of the “given”, i.e. reality/experience (Gries refers to Stefan Th. Gries, inventor of collostruction):

Collostruction and the given

Another way Louw illustrates his project to revive logical positivism is how he derives the idea of subtext from Bertrand Russell’s idea of a perfectly logical natural language:

Subtext 1

He then describes how Firthian collocation needs to be brought in to augment subtext if languages like Chinese are to be studied:

Subtext 2

For some reason, until I started reading Louw I did not quite get the idea of progressive delexicalisation – that words have many meanings that differ from their literal meanings. Previously I had only thought of delexicalisation with respect to verbs such as ‘make’ and ‘do’. Furthermore, many words we may think have mostly literal meanings in fact have mostly delexical meanings. Louw & Milojkovic (2016: 6) give the example of ‘ripple’, where only one form in ten occurred with ‘water’ and ‘surface’ in the Birmingham University corpus.

Louw describes how John Sinclair described this as the blue-jeans principle:

Sinclair’s blue jeans

In the early 90’s Louw tested Sinclair’s idea that every word has at least two meanings:

Lexical-Delexical

Louw then recalls how, at the start of the 80’s, he encountered the idea of a computer writing a dictionary:

Computer writing

Louw gives an example of how the computer can help using US presidents Trump & Biden:

Computer reassurance

Louw is keen to distinguish collocation from colligation:

Deceptive colligation

Louw admits his self-obsession with the idea of bringing together Firth and the Vienna school:

Firth & Vienna

Louw’s conviction in his project reflects the certainty of the logical positivists; and although that stream of thought is no longer the force it once was, Louw’s drive recalls Richard Rorty (without condoning the sexist language), as quoted in Goldsmith & Laks (2019: 443):

“The sort of optimistic faith which Russell and Carnap shared with Kant – that philosophy, its essence and right method discovered at last, had finally been placed upon the secure path of science – is not something to be mocked or deplored. Such optimism is possible only for men of high imagination and daring, the heroes of their times”

Thanks for reading & listening and many thanks to Bill Louw for taking time to chat with me.

Notes

  1. Wikipedia Logical Positivism https://en.wikipedia.org/wiki/Logical_positivism

References

Goldsmith, J. A., & Laks, B. (2019). Battle in the mind fields. University of Chicago Press.

Louw, B., & Milojkovic, M. (2016). Corpus stylistics as contextual prosodic theory and subtext (Vol. 23). John Benjamins Publishing Company.

Locating collocation

The Wikipedia entry on collocation says:
“…a collocation is a series of words or terms that co-occur more often than would be expected by chance”. 1
This is the description of collocation that SketchEngine, a leader in corpus tools, links to in the syllabus for their new online course. 2

Note the statistical aspect in the definition “more often than would be expected by chance”.
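That statistical aspect can be made concrete with a small sketch. Pointwise mutual information (one common collocation statistic among several) compares how often two words actually co-occur with how often they would co-occur by chance; the corpus figures below are invented purely for illustration.

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information: log2(observed co-occurrence / chance co-occurrence).
    A score above 0 means the pair co-occurs more often than expected by chance."""
    observed = count_xy / total
    expected = (count_x / total) * (count_y / total)
    return math.log2(observed / expected)

# Invented figures: in a 1,000,000-word corpus, "strong" occurs 1,000 times,
# "tea" 500 times, and the pair "strong tea" 80 times.
score = pmi(80, 1000, 500, 1_000_000)  # well above 0, i.e. more often than chance
```

Corpus tools typically offer a family of such measures (MI, t-score, logDice and so on), each trading off frequency against exclusivity in a different way.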

The wiki entry then reads “There are about six main types of collocations: adjective + noun, noun + noun (such as collective nouns), verb + noun, adverb + adjective, verbs + prepositional phrase (phrasal verbs), and verb + adverb.”

Note the emphasis on the grammar aspect of collocation.

Bill Louw would place this wiki definition (alongside Goran Kjellmer’s definition of collocation – ‘sequence of words that occurs more than once in identical form…and which is grammatically well structured’) at the bottom of the diagram below:

(Louw & Milojkovic 2016: 53)

The diagram shows two dimensions. The vertical dimension is how restrictive a view of collocation is, with the most restrictive at the bottom and the least at the top. The horizontal dimension shows how much of the language a view of collocation covers; the top bulb of the diagram is larger than the bottom bulb.

Louw & Milojkovic (2016) argue that the link of collocation to context of situation is of great importance in applications of corpora in literature studies i.e. corpus stylistics.

Context of situation was illustrated by Firth in the following way:

“In his article ‘Personality and language in context’ Firth offers us what he calls a typical Cockney event in ‘one brief sentence’.
‘Ahng gunna gi’ wun fer Ber’. (I’m going to get one for Bert)
What is the minimum number of participants? Three? Four? Where might it happen? In a pub? Where is Bert? Outside? Or playing darts? What are the relevant objects? What is the effect of the sentence? ‘Obvious!’ you say. So is the convenience of the schematic construct called ‘context of situation’. It makes sure of the sociological component.” (Firth 1957: 182 as quoted in Louw & Milojkovic, 2016:61, emphasis added)

Awareness of the importance of context of situation is reflected in the following small Twitter poll where a majority of the 24 respondents opted for “meanings have words” over “words have meanings”:

Twitter poll

Although Louw concedes that a view of collocation such as ngrams can reveal contexts of situation, opportunities to do so will be much rarer than if collocation is located near the top of the diagram – “abstracted at the level of syntax”, as Firth put it.

Context of situation is also of great importance in language teaching and learning. For example, task-based teaching can be said to lay great weight on context of situation.

As Louw & Milojkovic (2016: 26) put it:

“The closer collocation’s classifications are to context of situation, the more successful and enduring will be the approach of the scholars who placed them there. The more the term is constrained by the notion of language ‘levels’ and the linearity and other constraints of syntax, the less such classifications and the theories perched upon them are likely to endure. The reason for this is, as we shall see, that collocation takes us directly to situational meaning and acts as what Sinclair refers to as the ‘control mechanism’ for meaning”

Thanks for reading.

Notes

  1. Wikipedia Collocation https://en.wikipedia.org/wiki/Collocation
  2. Boot Camp online https://www.sketchengine.eu/bootcamp/boot-camp-online/#toggle-id-2

References

Louw, B., & Milojkovic, M. (2016). Corpus stylistics as contextual prosodic theory and subtext (Vol. 23). John Benjamins Publishing Company.

Thin word lists and fat concordances

One aspect of the proposed changes to the GCSE modern foreign language (MFL) syllabus in the UK is the use of corpus-derived word lists 1. Word frequencies follow a power law. A well-known power law in economics is the Pareto principle – “Pareto showed that approximately 80% of the land in Italy was owned by 20% of the population” 2. Similarly, a large percentage of any piece of text comes from a relatively small number of words – the top 100 words in English account for about 50% of any text. The MFL review wants to use wordlists of the most frequent 2000 words, which would cover about 80% of any text.
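Those coverage figures can be roughly reproduced under an idealised Zipf distribution, where the frequency of the word at rank r is proportional to 1/r. This is only a back-of-the-envelope sketch: the 50,000-word vocabulary below is an assumed figure, not one from the MFL documents.

```python
def zipf_coverage(top_k, vocab_size):
    """Fraction of running text covered by the top_k most frequent words,
    assuming word frequency at rank r is proportional to 1/r (Zipf's law)."""
    weights = [1 / r for r in range(1, vocab_size + 1)]
    return sum(weights[:top_k]) / sum(weights)

cov_100 = zipf_coverage(100, 50_000)     # roughly 0.45: close to half of any text
cov_2000 = zipf_coverage(2_000, 50_000)  # roughly 0.72: the bulk of any text
```

The steep diminishing returns – each extra thousand words buying less coverage than the last – is exactly the property that makes frequency-driven wordlists attractive to syllabus designers.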

Currently the MFL syllabus is topic-based, so one issue here is that most words useful for any particular topic will be limited to that topic. To put it another way, although a word may be frequent within a topic, it won’t have range, i.e. appear across other topics. The NCELP (National Centre for Excellence for Language Pedagogy), in Vocabulary lists: Rationales and uses, writes: “For example, many of the words for pets or hobbies will be low frequency words which are not useful beyond those particular topics.” 3

There have been many critics of this wordlist-driven proposal who have pointed out various weaknesses – see AQA Exam board 4, ASCL (Association of School and College Leaders) 5, Transform MFL 6, Linguistics in MFL Project 7.

I want to take a different tack and argue that the wordlist-driven approach is a half-hearted version of what could be a full-blooded corpus approach to vocabulary content.

Corpus stylist Bill Louw writes that he “has become suspicious of decontextualised frequency lists” (Louw & Milojkovic, 2016: 32). He calls such lists thin lists because they tend to cover things rather than events (Louw 2010). Events are states of affairs – what JR Firth, one of the originators of the notion of meaning by collocation, called contexts of situation. Looking at collocates of things in concordance lines allows us to “chunk the context of situation and culture into facts” (Louw 2010).

A concordance line brings together and displays instances of use of a particular word from the widely disparate contexts in which it occurs. To cover events one would need to examine collocates in concordances hence the term fat concordances.
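A minimal KWIC (keyword-in-context) concordancer of the kind corpus tools provide can be sketched in a few lines; this is a toy illustration, not any tool Louw or Sinclair used.

```python
def concordance(tokens, node, width=4):
    """Return keyword-in-context lines: every occurrence of `node`
    with up to `width` tokens of co-text on either side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines

tokens = "the meeting will take place on Friday and talks take place weekly".split()
for line in concordance(tokens, "take"):
    print(line)
```

Stacking such lines vertically, with the node word aligned in a central column, is what lets collocates like “place” leap out at the reader – the “fatness” that a bare frequency list throws away.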

The most frequent words are often bleached of their literal meanings. Compare the word “take” on its own, where most people would think of the meanings of “the act of receiving, picking up or even stealing” (Louw & Milojkovic, 2016: 5), with a collocation such as “take place”, where the meaning is distant from the literal meaning of “take” 8. When the NCELP say “Very high frequency words often have multiple meanings.” they are describing the notion of delexicalisation.

To demonstrate context of situation and context of culture, reproduced below is corpus linguist John Sinclair’s PhraseBite pamphlet which is reproduced in Louw (2008):

When she was- – – – – Phrasebite© John Sinclair, 2006.

  1. The first grammatical collocate of when is she
  2. The first grammatical collocate of when she is was
  3. The vocabulary collocates of when she was are hair-raising. On the first page:
    diagnosed, pregnant, divorced, raped, assaulted, attacked
    The diagnoses are not good, the pregnancies are all problematic.
  4. Select one that looks neutral: approached
  5. Look at the concordance, first page.
  6. Nos 1, 4, 5, 8,10 are of unpleasant physical attacks
  7. Nos 2, 3, 6, 7, 9 are of excellent opportunities
  8. How can you tell the difference?
  9. the nasties are all of people out and about, while the nice ones are of people working somewhere.
  10. Get wider cotext and look at verb tenses in front of citation.
  11. In all the nasties the verb is past progressive, setting a foreground for the approach.
  12. In the nice ones, the verb is non-progressive, either simple past or past-in-past.

Data for para 4 above.
(1) walking in Burnfield Road , Mansewood , when she was approached by a man who grabbed her bag
(2) teamed up with her mother in business when she was approached by Neiman Marcus , the department store
(3) resolved itself after a few months , when she was approached by Breege Keenan , a nun who
(4) Bridge Road close to the Causeway Hospital when she was approached by three men who attacked her
(5) Drive , off Saughton Mains Street , when she was approached by a man . He began talking the original
(6) film of The Stepford Wives when she was approached by producer Scott Rudin to star as
(7) bony. ‘ ‘ Kidd was just 15 when she was approached to be a model . Posing on
(8) near her home with an 11-year-old friend when she was approached by the fiend . The man
(9) finished a storming set of jazz standards when she was approached by SIR SEAN CONNERY . And she
(10) on Douglas Street in Cork city centre when she was approached by the pervert . The man persuaded

As Louw (2008) puts it:

“The power of this publication, coming as it did so close to Sinclair’s death, is to be found in the detail of his method. By beginning with a single word, she, from the whole of the Bank of English, Sinclair simply requests the most frequent collocate from the Bank of English (approximately 500 million words of running text). The computer provides it: when. The results are then merged: when+she. A new search is initiated for the most frequent collocate of this two-word phrase. The computer provides it: was. The concordances are scrutinized and cultural insights are gathered.”
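Sinclair’s step-by-step procedure – take a node word, find its most frequent adjacent collocate, merge, repeat – can be sketched on a toy token list. This is nothing like the 500-million-word Bank of English, and for simplicity it only looks rightwards (Sinclair’s first step, she → when, looked at the preceding word), but the mechanics are the same.

```python
from collections import Counter

def most_frequent_follower(tokens, phrase):
    """Return the most frequent token immediately following `phrase`
    in the token stream, or None if the phrase never occurs."""
    n = len(phrase)
    followers = Counter(
        tokens[i + n]
        for i in range(len(tokens) - n)
        if tokens[i:i + n] == phrase
    )
    return followers.most_common(1)[0][0] if followers else None

tokens = ("when she was approached by a man . "
          "when she was diagnosed with flu . "
          "when she went home .").split()

phrase = ["when", "she"]
phrase.append(most_frequent_follower(tokens, phrase))  # grows to ["when", "she", "was"]
```

Each pass extends the phrase by one word; the concordance of the resulting phrase is then scrutinised for the cultural patterning Sinclair describes.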

The ASCL quotes applied linguist Vivian Cook:

“While word frequency has some relevance to teaching, other factors are also important, such as the ease with which the meaning of an item can be demonstrated (’blue’ is easier to explain than ‘local’) and its appropriateness for what pupils want to say (‘plane’ is more useful than ‘system’ if you want to travel)”

Blue is easier to explain than local because most collocates of blue carry its literal colour meaning, e.g. “blue eyes”. Yet consider this from a children’s corpus:

“There, I feel better. I’ve been needing a good cry for some time, and
now I shall be all right. Never mind it, Polly, I’m nervous and tired;
I’ve danced too much lately, and dyspepsia makes me blue;” and Fanny
wiped her eyes and laughed. (An Old-fashioned Girl, by Louisa May Alcott)

So while it is true that blue is often associated with colour, it also associates with mental states where the colour meaning is delexicalised, or washed out.

To conclude, the MFL proposals to use corpus-derived word lists to drive content are not taking full advantage of corpora. They promote thin wordlists when they could also be promoting fat concordances.

Thanks for reading.

Notes

  1. MFL consultation – https://consult.education.gov.uk/ebacc-and-arts-and-humanities-team/gcse-mfl-subject-content-review/supporting_documents/GCSE%20MFL%20subject%20content%20consultation.pdf
  2. Pareto – https://en.wikipedia.org/wiki/Pareto_principle
  3. NCELP – https://resources.ncelp.org/concern/resources/t722h880z?locale=en)
  4. AQA – https://filestore.aqa.org.uk/content/our-standards/AQA-GCSE-MFL-POLICY-BRIEFING-APRIL-2021.PDF
  5. ASCL – https://www.ascl.org.uk/ASCL/media/ASCL/Our%20view/Consultation%20responses/2021/Draft-response-Consultation-on-the-Revised-Subject-Content-for-GCSE-Modern-Foreign-Languages.pdf
  6. Transform MFl – https://transformmfl.wordpress.com/2021/02/15/should-we-learn-words-in-frequency-order/
  7. Linguistics in MFL Project – http://www.meits.org/opinion-articles/article/the-dfe-ofqual-consultation-on-revised-gcse-qualifications-in-modern-foreign-languages-a-view-from-linguistics
  8. Take place – https://eflnotes.wordpress.com/2013/05/06/what-to-teach-from-corpora-output-frequency-and-transparency/

References

Louw, B. (2008). Consolidating empirical method in data-assisted stylistics. Directions in Empirical Literary Studies: In Honor of Willie Van Peer, 5, 243.

Louw, B. (2010). Collocation as instrumentation for meaning: a scientific fact. In Literary education and digital learning: methods and technologies for humanities studies (pp. 79-101). IGI Global.

Louw, B., & Milojkovic, M. (2016). Corpus stylistics as contextual prosodic theory and subtext (Vol. 23). John Benjamins Publishing Company.

Picture labelling with Hot Potatoes

This post adds to what has already been written on how to do picture-labelling exercises with Hot Potatoes. It assumes you know how to make a JCloze exercise.

The following video shows a sample of my current favourite kind of exercise to make with Hot Spuds:

The maker of the software has called this Smart positioning or How to overlay drop down lists on a background picture.

You can copy and paste the following HTML code into a JCloze exercise; it creates a picture labelling of 6 terms.

<table style="border-style: solid; border-width: 0px;  width: 640px; "><tbody>
<tr>
<td style="height: 80px; text-align:right; ">Label 1</td>
<td style="width: 480px; height:430px;" rowspan="3" ><img src="name-of-image" alt="name-of-image" title="image-title" width="416" height="352" style="display: block; margin-left: auto; margin-right: auto; text-align: center;"/></td>
<td style="height: 80px; text-align:left; ">Label 2</td>
</tr>
<tr>
<td style="height: 80px; text-align:right; ">Label 3</td>

<td style="height: 80px; text-align:left; ">Label 4</td>
</tr>
<tr>
<td style="height: 80px; text-align:right; ">Label 5</td>

<td style="height: 80px; text-align:left; ">Label 6</td>
</tr>
</tbody></table>

If you want to label 8 terms just add another row and increase rowspan to 4. You will need to adjust image dimensions appropriately depending on your picture.

The following 2 videos detail how I go about cropping the image, using screenshotting and picture editing to add arrows on OSX; you can find similar tools for your system:

You may well have to go back to your picture-editing software to modify your image until you are satisfied that the arrows match up with the drop-down boxes. You can also try changing the height: 80px for the cell that is not aligned, but I just fiddle with the image-editing program.

The following code makes the table responsive, i.e. the table can be scrolled horizontally when displayed on phones:

<div style="overflow-x:auto;">
<!-- put the table code here -->
</div>

Hope this post is of help as information on creating such an exercise is hard to find now on the web. Happy to take questions.

Create your own interactive transcript

Interactive transcripts are where text appears next to a video or audio and the words being spoken are highlighted in the text as the video or audio plays.

This video-post is about making your own. I assume you 1) have your own website, 2) have already transcribed your media and 3) use OSX Mac or Linux.

The programs in order of use are – Gentle forced aligner (https://github.com/lowerquality/gentle), Hyperaudio convertor (https://hyperaud.io/converter/) and Hyperaudio Lite (https://github.com/hyperaudio/hyperaudio-lite).

Note that if you don’t use OSX/Linux then, in order to get an appropriate file to feed into the Hyperaudio convertor, you can use one of the online transcription services that have free minutes such as Maestra (https://maestrasuite.com/).

Or there is an online demo of the Gentle forced aligner at https://lowerquality.com/gentle/, though I’m not sure what the file size limit is for that.

I apologise for the noise that appears later in the video! Thanks for watching and don’t hesitate to ask me any questions.

Funky images

In my last post, one of the comments (by nmwhiteport) was skeptical about the notion of core meaning of words, as I used it to describe the verbs make and do in Collocations need not be arbitrary. One issue here is how to define core (the definitions I used may be debatable); the other is, even if people agree on definitions of core meaning, whether it is more effective than learning words by memorisation.

Taking the first issue, Verspoor & Lowie (2003) give one definition of core (taken from a dictionary) as :

“The core meaning is the one that represents the most literal sense that the word has in modern usage. This is not necessarily the same as the oldest meaning, because word meanings change over time. Nor is it necessarily the most frequent meaning, because figurative senses are sometimes the most frequent. It is the meaning accepted by native speakers as the one that is most established as literal and central.”

Verspoor & Lowie, 2003: 555

Note that Tyler & Evans (2003) give a more rigorous approach in identifying what they refer to as primary sense.

Using the Verspoor & Lowie (2003) definition one can say the literal meaning of make is to create something new from nothing and that of do is to execute an activity. A diagram could be presented to illustrate this over time:

(Tsai, 2014: 94)

Perhaps nmwhiteport’s puzzle loving student would be less likely to produce ‘make a crossword’ having seen the above diagrams and noted how make involves “a nothing to a something” compared to do which has a “something to a samething”?

One clue to the second issue, the efficacy of core meaning, is seen in Verspoor and Lowie (2003), who found that students who were given a core meaning were better able to interpret extended meanings than students who were given translated meanings of a more peripheral sense. This difference held when students were tested 2 weeks later.

Similarly, where lexical items overlap, as described in the last post with the example of high and tall: Beréndi, Csábi & Kövecses (2008) provided central senses of hold and keep to one group of students (the key idea of hand in hold and of control in keep) and asked another group of students to translate various hold and keep sentences from English into Hungarian. The first group, who got the core meanings, did better than the second group in both immediate and delayed post-tests.

A core meaning approach has been used with prepositions (Tyler & Evans, 2004), phrasal verbs (Condon & Kelly, 2002, as cited in Tyler, 2012) and article use (Thu, H. N., & Huong, N. T., 2005).

One interesting thing to note is that the addition of images in cognitive linguistics studies seems to be very helpful for learning performance. Hence I have started a database of images that could be useful in language teaching, mainly for English, though other languages can be added. So please do let me know of any such images, or share the link with people who may be interested.

References

Beréndi, M., Csábi, S., & Kövecses, Z. (2008). Using conceptual metaphors and metonymies in vocabulary teaching. In F. Boers & S. Lindstromberg (Eds.), Cognitive linguistic approaches to teaching vocabulary and phraseology (pp. 65–100). Berlin: Mouton de Gruyter.

Thu, H. N., & Huong, N. T. (2005). Vietnamese learners mastering English articles (Published doctoral dissertation).

Tsai, M. H. (2014). Usage-based cognitive semantics in L2 collocation: A microgenetic analysis of EFL students’ collocational knowledge (Unpublished doctoral dissertation).

Tyler, A. (2012). Cognitive linguistics and second language learning: Theoretical basics and experimental evidence. Routledge.

Tyler, A., & Evans, V. (2003). The semantics of English prepositions: Spatial scenes, embodied meaning and cognition. Cambridge, UK: Cambridge University Press.

Verspoor, M. H., & Lowie, W. (2003). Making sense of polysemous words. Language Learning, 53, 547–586.

Why the pineapple?

This post can be considered a follow on from the post Collocations need not be arbitrary.

One response that proponents of the lexical approach in language teaching could make to the issue of looking at meanings and collocations is simply to define collocation as one level of meaning. John Firth, as cited by Joseph (2003), put it thus:

“The statement of meaning by collocation and various collocabilities does not involve the definition of word meaning by means of further sentences in shifted terms. Meaning by collocation is an abstraction at the syntagmatic level and is not directly concerned with the conceptual or idea approach to the meaning of words. One of the meanings of night is its collocability with dark, and of dark, of course, collocation with night.”

Joseph, 2003: 130

Defining collocations as one level of meaning is reasonable but it does not provide an explanation that may be pedagogically useful. Cognitive linguistics claims to provide such a use.

Let’s take the question of the difference between choosing highest mountain and tallest mountain that arose in a class recently. One explanation is based on the distribution of what collocates with tall – that is living things (tall man, tall tree) and man made objects (tall building, tall pole). Tall tends not to collocate with natural objects such as mountains.

That is where a Firthian (and by consequence a lexical) approach stops. A cognitive analysis by Dirven and Taylor (1988) showed that general cognition (in the form of concepts) can explain further.

Highest mountain is preferred because the concept HIGH includes both a meaning of vertical position (positional meaning) and one of vertical length (extensional meaning), whereas the concept TALL only includes the meaning of vertical length. So although you can find tallest mountain, people often think of being at the top of a mountain, hence the vertical position is emphasised rather than the vertical length (see figure below):

Figure 1. after Dirven & Taylor, 1988: 386

Thanks for reading. And do have a read of a less favourable view of cognitive linguistics at a recent Geoff Jordan blog Anybody seen a pineapple?

Update

Marc Jones writes about cueing as a way to learn chunks Pinneapples?

References

Dirven, R., & Taylor, J. R. (1988). The conceptualisation of vertical space in English: The case of tall. In Topics in cognitive linguistics, B. Rudzka-Ostyn (ed), 379. John Benjamins.

Joseph, J. (2003). Rethinking linguistic creativity. In Rethinking Linguistics, H. Davis & T.J. Taylor (eds), 121–150. London: Routledge.

PirateBox is dead! Long live PirateBox!

The main developer of PirateBox (Matthias Strubel) recently announced the shutting down of the PirateBox forums. Fortunately PirateBox is still being developed for the wrong router. This is great news. I believe the wrong router uses a GL.iNet Mango GL-MT300N-V2, which connects at 300Mbps (3 times as fast as the TP-Link MR3020 router) and so means video sharing is very fast now.

Note: please do order a wrong router (and support development of PirateBox) if you are not comfortable with digging into router specifics.

One of the advantages of the wrong router mod of PirateBox is the use of HTML pages to serve files. Even though this was possible with the original PirateBox, several other steps had to be taken to disable features that were not needed (e.g. I rarely used the upload or chat facility). And with HTML5 one can now share videos with subtitles (in .vtt form), something that is very useful when sharing videos in a language-learning class.

In order to get the wrong router version working one needs to flash a PirateBox image first so that the auto-install function can work (for the Mango router the image to use is found here http://development.piratebox.de/target_thewrong_ramips-mt76x8/). Use the router’s original web UI to install the firmware image (on a slow USB stick this could take up to 45mins). Once done you can then follow the wrong router instructions to install the full wrong router modification of PirateBox.

Below are some screen shots of connecting to the router using my phone. Note the screen shot showing a video playing with subtitles. Nice!

Thanks for reading and here’s to 2020.

Related PirateBox posts:

The browser rulez or another reason why PirateBox is boss

Cutting your PirateBox jib

Piratebox, a way to share files in class

Offline (doku) wiki writing

TESOL France 2014 – thoughts, poster, handout and links

Related links – PirateBox development images

Collocations need not be arbitrary

“On the whole, delexicalized verbs are a good way of introducing the concept of collocation to learners of any L1 background. I usually start with make/do and show how one goes with homework while the other goes with mistake (I did my homework; I made a lot of mistakes). Why is it this way and not the other way around? Because words have collocations – they prefer the company of certain other words.” (Selivan, 2018: 28, emphasis added)

The quote above, from a book published in 2018, reflects a pervasive view in the literature that collocations are arbitrary, that is, there is no particular reason why words “prefer the company of certain other words”, they just do.

Liu (2010) identifies this view of collocation-as-arbitrary as widespread amongst scholars; he also demonstrates that it is a common assumption in published teaching materials. In the books, studies and websites on teaching collocations he surveyed, he observed that collocation exercises were mainly noticing and memorising fixed units – in other words, form-focused exercises.

Examples of such exercises are:

“identifying or marking collocations in a passage or in collocation dictionaries; reading passages with collocations highlighted or marked; filling in the blanks with the right word in a collocation; choosing or matching correct collocates; translating collocations from L2 back into L1 or vice versa; and memorization-type activities like repetition and rehearsal” (Liu, 2010:21)

There were fewer exercises on linking collocation forms to their meanings.

When exercises overlook the motivated aspects of collocations, learners miss the chance to generalise what they learn (Wray, 2000). That is, collocations also need to be analysed if students are to make the most of them in new situations of use.

To take the examples of “make” and “do”: the core meaning of “make” is to create, a process that is purposeful and/or effortful; the core meaning of “do” is completion, the finishing of something, which focuses on the end result of an activity rather than on any effort in the process of that activity. Understanding these core meanings can throw light on the following use of “did a mistake”:

“But I did a mistake in talking about it, you know, the last time and recently”

The larger context of this is from a spoken news report:

weren’t there. Let me handle it. I said, ” Yes, ma’am. ” ROSEN: The rebuke of Mr. Clinton by his wife came after the former president revived the dormant issue of Mrs. Clinton’s own misstatements about her 1996 trip to Bosnia. You’ll recall Mrs. Clinton, in recent months, spoke of sniper fire jeopardizing her landing. But contemporaneous video and eyewitness account revealed there was no such threat, and the senator effectively if belatedly defused the story with an omission of error in late March. SEN-HILLARY-CLINTO: But I did a mistake in talking about it, you know, the last time and recently. ROSEN: But in Jasper, Indiana, Thursday, Mr. Clinton blamed the controversy on the biased news media. B-CLINTON: She took a terrible beating in the press for a few days because she was exhausted at 11:00 at night when she started talking about Bosnia. ROSEN: In fact, Mrs. Clinton related the false Bosnia story numerous times including in a prepared speech delivered freshly at mid morning. B-CLINTON: And then the president (COCA SPOK: FOX SPECIAL REPORT WITH BRIT HUME 6:00 PM EST, 2008, emphasis added)

We could speculate that in using “did a mistake” Hillary Clinton was implying that in her “exhausted” state the “misstatement” was the opposite of a purposeful lie. It was just one of many activities she did that day which happened to be an error.

This can also be seen in another example from COCA – “If I do a mistake, I’m cooked”.

The context is from a written publication this time, although the language in question is in reported form:

three minutes, sometimes the whole roll — eleven minutes. It has an advantage: It takes you to the real tempo of life. Most movies are shot rather quickly and in a way where you can manipulate your reality because of the amount of coverage ” — shooting a scene from many different angles so that the director can choose among them in the editing room. ” Here my manipulation is quite different. I have to build it in with the lighting, with the framing. It requires much more attention at this stage. If I do a mistake, I’m cooked, ” he says with a laugh. # Wings’ visual style may be old-fashioned at heart, but its sound is high-tech all the way. Besides the six channels of top-notch stereo sound broadcast through the theater speakers, Wings audiences will hear two channels of three-dimensional sound through a special headset called the Personal Sound Environment (PSE) distributed to each moviegoer. Developed by Imax affiliate Sonics Associates of Birmingham, Alabama, the PSE incorporates both IMAX 3-D glasses and tiny speakers mounted between (COCA MAG: Omni, 1994, emphasis added)

The person is talking about a number of steps in his work routine in shooting a movie. The use of “do” here signals that any disastrous mistake should not be blamed on the person, considering all the other things he has to juggle.

Note that I could only find 3 uses of “do a mistake”, of which 2 are shown here (I can’t offer any speculation on the third, as I suspect more context would need to be chased up than COCA provides).

This blog was inspired by a question from a student about why a text had “in many respects” rather than “in many aspects”. I went onto COCA to have a look but could not discern any useful explanation; I just told the student that “aspects” does not seem to prefer “in many” as much as “respects” does! Only later, when I thought about the root word they share, “spect” (meaning see), did an arguably useful explanation present itself: “in many respects” implies that the [re-seeings] have already been understood in some way, while with “in many aspects” the reader may not yet know what these [partial-seeings] may be. These meanings could match up with the observation that “in many respects” often comes at the end of a clause or sentence, while “in many aspects” may tend to come at the beginning of one.

Thanks for reading.

References:

Davies, M. (2008). Corpus of Contemporary American English online. Retrieved from https://www.english-corpora.org/coca/.

Liu, D. (2010). Going beyond patterns: Involving cognitive analysis in the learning of collocations. TESOL Quarterly, 44(1), 4-30.

Selivan, L. (2018). Lexical Grammar: Activities for Teaching Chunks and Exploring Patterns. Cambridge University Press.

Wray, A. (2000). Formulaic sequences in second language teaching: Principle and practice. Applied Linguistics, 21(4), 463-489.

Grassroots language technology: Adam Leskis, grammarbuffet.org

Language learning technology can be so much more than what commercial groups are offering right now. The place to look is to independent developers and teachers who are innovating in this area. Adam Leskis is one such person and here he discusses his views and projects.

1. Can you tell us a little about your background?

I started out in my first career as an English teacher, and it was clear to me that there were better ways we could both create and distribute digital materials for our students. As an example, during my last year of professional teaching (2015), the state of cutting-edge tech integration was taking a PowerPoint from class and uploading it to YouTube.


What struck me in particular was the way technology was being used primarily to reproduce traditional classroom methods of input rather than actually taking advantage of the advanced capabilities of the digital medium. I saw paper handouts replaced by uploaded PDFs, classroom discussions replaced by online forums, and teacher-fronted lectures replaced by videos of teachers speaking.


I knew I wanted to at least try to do something about it, so I set off teaching myself how to use the tools to create things on the internet. I eventually got good enough to be hired to do web development full time, and that’s what I’ve been doing ever since.

2. In what ways do you feel technology can help with learning languages?

Obviously, given the very social nature of education and human language use, technology could never fully replace a teacher, and so this isn’t really what I’m setting out to do. Where I see technology being able to make an enormous impact, though, is in its ability to automate and scale a lot of the things on the periphery that language learning involves.


As an example, vocabulary is a very important component of being able to use and understand language. Thankfully, we now have the insights from corpus-based methods to help us identify which vocabulary items deserve primary focus, and it’s a fairly straightforward task to create materials that include them.


However, what this means in practice is that either students pay for expensive course books containing materials created with a corpus-informed approach to vocabulary, or the teachers and students themselves spend time creating these materials. Course books tend to be very expensive, and even those which come with online materials aren’t updated very frequently. Teachers and students creating their own materials are left to scour the internet for texts, analyze and filter them for appropriate vocabulary, and then construct materials that target both the particular skill areas they would like to use the vocabulary for (eg, writing, listening) and the authentic contexts they are interested in. It’s a very time-consuming manual process.


Technology has the ability to address both of these concerns (lack of updates and requirements of time). As one example, I created a very simple web app that pulls in content from the writing prompts sub-reddit (https://www.reddit.com/r/WritingPrompts/) and uses it to help students work on identifying appropriate articles (a/an/the) to accompany nouns and noun phrases. The content is accessed in real time when the student is using the application, and given the fast turnover in this particular sub-reddit, this means that using it once a day would incorporate completely different content, essentially forming a completely new set of activities.
One of the other advantages of this approach is the automated feedback available to the user. So in essence, it’s a completely automated system that uses authentic materials (created largely by native speakers for native speaker consumption) to instantly generate and assess activities focused on one specific learning objective.
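The core of such an activity can be sketched in a few lines of JavaScript (a hypothetical simplification for illustration, not Adam’s actual code): scan a text for articles, replace each with a numbered gap, and keep the originals as an answer key so feedback can be automated.

```javascript
// Turn a text into an article-choice activity: each article (a/an/the)
// becomes a numbered gap, and the original word is kept as the answer
// key so the app can check the user's selections automatically.
function makeArticleActivity(text) {
  const answers = [];
  const gapped = text.replace(/\b(a|an|the)\b/gi, (match) => {
    answers.push(match.toLowerCase());
    return `[${answers.length}]`; // gap shown to the student as a drop-down
  });
  return { gapped, answers };
}

// Scoring is then a simple comparison of choices against the key.
function score(answers, choices) {
  return answers.filter((a, i) => a === (choices[i] || "").toLowerCase()).length;
}
```

For example, `makeArticleActivity("The cat saw a mouse")` yields the gapped text `"[1] cat saw [2] mouse"` with answer key `["the", "a"]`; in the real app the source text would be pulled live from the sub-reddit.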


The approach does still have its shortcomings, in that this particular system is just finding all the articles and replacing them with a selection drop-down, so it’s only able to give feedback on whether the user’s selection is the same as the original article. Also, since this is a very informal genre, the language used might not be suitable for all ages of users.


3. What are your current projects?


I wish I had more time to work on these, since I currently only have early mornings and commuting time on the train to use for side projects, but there are a few things I’m working on that I’m really excited about.


Now that I have one simple grid-based game up and running (https://www.grammarbuffet.org/rhyme-game/), I’m thinking about how I can re-use that same user interface to target other skills. If, instead of needing to tap on the words that rhyme, we could just have the users say them, that would be a much more authentic way to assess whether the user is able to “do something” with their knowledge of rhymes. There is a Web Speech API that I’ve been meaning to play around with, so that could be a potential way to create an alternate version based on actual speaking skills rather than just reading skills.


Another permutation of the grid-based game template would be integrating word stress instead of rhymes. I’m currently trying to get a good dataset containing word stress information for all the words in the Academic Word List (Coxhead, 2000), which I suppose is a bit dated now as a corpus-based vocabulary list, but it was my first introduction to the power of a corpus approach, and so I’ve always wanted to use it to generate materials on the web. The first version of this will probably also just involve seeing the word and using stress knowledge to tap it, rather than speaking, but I’m also imagining how we could use the capabilities of mobile devices to allow the user to shake or just move their phone up and down to give their answers on word stress. Once that’s up and running it’s very simple to incorporate more modern corpus-based vocabulary lists (eg, the Academic Spoken Word List, 2017). Moreover, since this is all open source, anyone could adapt it for their particular vocabulary needs and deploy a custom web app via tech like Netlify.
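Such a dataset could be as simple as a list of records pairing each word with its syllables and the index of the stressed one; a hypothetical sketch (the entries and field names are invented, not from any actual dataset):

```javascript
// A possible shape for a word-stress dataset: each entry records the
// syllables and the index of the stressed one - enough to drive a
// tap-the-stressed-syllable (or shake-the-phone) activity.
const stressData = [
  { word: "analyse", syllables: ["a", "na", "lyse"], stressed: 0 },
  { word: "economy", syllables: ["e", "con", "o", "my"], stressed: 1 },
];

// Check whether the syllable the user tapped is the stressed one.
function checkStress(entry, tappedIndex) {
  return tappedIndex === entry.stressed;
}
```

The same records could later back a speaking version: the recognized pronunciation would be compared against the stored stress pattern instead of a tap.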


Beyond these simple games, I’m also starting to work on a way to take authentic texts (possibly from a more academic genre on reddit like /r/science, or text of articles on arXiv) to create cloze-test materials using the AWL. The user would need to supply the words rather than select them, which is a much more authentic assessment of their ability to understand and actually use these words in written English.
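The generation step can be sketched like this (again a hypothetical simplification, with a tiny stand-in word set rather than the full AWL): blank out every word on the target list and keep the removed words so free-typed answers can be checked.

```javascript
// Generate an open cloze from an authentic text: blank out any word that
// appears on a target vocabulary list (a tiny stand-in for the AWL here),
// keeping the removed words so typed answers can be checked.
const targetWords = new Set(["analyse", "concept", "data", "research"]);

function makeCloze(text) {
  const removed = [];
  const clozed = text.replace(/[A-Za-z]+/g, (word) =>
    targetWords.has(word.toLowerCase())
      ? (removed.push(word), "_".repeat(word.length)) // gap sized to the word
      : word
  );
  return { clozed, removed };
}
```

For example, `makeCloze("The research used new data")` returns `"The ________ used new ____"` with `["research", "data"]` as the answer key; checking a typed answer is then a case-insensitive comparison against that key.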


4. I really like the idea of offline access, how can people interested in this learn more?


The technology that enables this is currently referred to as Progressive Web Apps (PWAs), and relies on the technology of Service Workers. Essentially, because web development relies on JavaScript, we’re able to put a JavaScript process between the user’s browser and the network to intercept network requests and just return things that have already been downloaded. So for applications where all the data is included in the initial page load, this means that the entire website will work offline.
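The intercept-and-answer-from-cache logic at the heart of this can be modelled as a plain function (a sketch using a `Map` in place of the browser’s Cache Storage, so the strategy itself is visible): serve from the cache when possible, otherwise hit the network and cache the result for next time.

```javascript
// Cache-first strategy behind an offline-capable PWA, modelled as a plain
// async function: answer from the cache when we can, fall back to the
// network (and store the result) when we can't.
async function cacheFirst(url, cache, network) {
  if (cache.has(url)) {
    return cache.get(url); // already downloaded: works with no connection
  }
  const response = await network(url); // e.g. fetch(url) in a real browser
  cache.set(url, response); // store for subsequent/offline requests
  return response;
}
```

In a real service worker this same logic runs inside a `fetch` event listener against the Cache Storage API rather than a `Map`, but the flow is identical.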


It’s a very relevant concept for users who have either very unreliable network access or relatively expensive network costs. If we’re discussing applications that users engage with every single day, the network access becomes non-trivial, especially if the site uses the old model of a full page reload on every change in the view rather than a modern single-page app written in a framework such as Angular or React. So absolutely, I would say it matters whether modern learning materials are using the latest technology to enable all of these enhancements to traditional webpages.

Much of this movement towards “offline-first” is informed by the JAMstack, which itself is a movement towards static sites that are deployable without any significant backend resources. This speaks to one of the goals of the micromaterials movement, which is the separation of getting that data from actually doing something with it in the web application. One early attempt in terms of setting up a backend API to be consumed is https://micromaterials.org, which just returns sentences from the WritingPrompts subreddit. It’s admittedly very crude (and even written in python 2, yuck!), but shows what could eventually be a model for data services that could feed into front-end micromaterials apps.


5. Ideas/Plans for the future?


These disadvantages are a lot more obvious if this remains one of only a few such applications, but imagine if there were hundreds or even thousands of them, forming something much more like an ecosystem. Then extrapolate that further and imagine thousands of backend server-side APIs, one for each conceivable genre of English, enabling a multitude of frontend applications to consume the data and create materials for different learners. As soon as you have one server-side service providing data on AWL words, any number of web applications can consume and transform that data into activities.


The plan all along was not for me to create all of these applications, but to inspire others to begin creating similar types of micromaterials. It hasn’t yet caught on, and clearly, expecting teachers to take up this kind of development is not sustainable. I’m hoping that other developers see the value in these and join the movement.


In a sense, the server-side APIs are the bigger prerequisite to getting this whole thing off the ground, so I’m very happy to work with any backend developers on what we need going forward, but I’m also going to continue developing things myself until we have a big enough community to take over.


I think whether all of these micromaterials exist under the umbrella of one single sign-on with tracking and auditing is beyond the scope of where we’re currently at, though I’m imagining a world where users could initiate their journey into the service, take a simple test involving all four of the main skills (reading, writing, speaking, and listening), and then be recommended a slew of micromaterials to help them out. 


For some users that might focus more on the reading and writing components, whereas for others it might focus more on the speaking and listening ones. The barrier to this currently being available is not at all significant and just involves getting development time invested in creating the materials. If I had them all created right now, I would be able to deploy them today with modern tooling like Netlify.


The problem is more one of availability and time, and I’m more than happy to work with other developers and teachers to bring this closer to a reality for our students.

Thanks for reading, and many thanks to Adam for sharing his time on this blog; you can follow Adam on his blog [https://micromaterialsblog.wordpress.com/] and on Twitter @BaronVonLeskis.

Please do read the other posts in the Grassroots language technology series if you have not done so already.