1990 was a good year for music – Happy Mondays, Stone Roses, Primal Scream, James, House of Love. 1990 was also good for what is, in my humble opinion, one of the best pedagogical grammars for article instruction – Peter Master’s paper Teaching the English Articles as a Binary System published in TESOL Quarterly.
It is a pedagogical grammar because it simplifies the four main characteristics of articles – definiteness [+/-definite], specificity [+/-specific], countability [+/-count] and number [+/-singular] – into two bigger concepts, namely classification and identification. So the zero article (0) or a/an is used to classify, and the is used to identify.
As discussed in a previous post, the two main features of articles are definiteness and specificity. So the four possible combinations are:
1a. [-definite][+specific] A tick entered my ear.
b. [-definite][-specific] A tick carries disease.
c. [+definite][+specific] The computer is down today.
d. [+definite][-specific] The computer is changing our lives.
Master’s binary scheme emphasizes 1b and 1c at the expense of 1a and 1d. That is, the [+identification] feature describes [+definite][+specific], and [-identification], or classification, describes [-definite][-specific].
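As an illustration only, the simplified scheme can be sketched as a small decision function. This is my own sketch, not Master’s notation, and it encodes just the binary simplification described above (specificity ignored, countability mattering only for classified nouns):

```python
def article_for(definite: bool, count: bool, singular: bool) -> str:
    """Pick an article under the binary scheme:
    [+definite] -> identification -> 'the' (countability irrelevant);
    [-definite] -> classification -> 'a/an' for singular count nouns,
    zero article ('0') otherwise."""
    if definite:                # identification
        return "the"
    if count and singular:      # classification, singular count noun
        return "a/an"
    return "0"                  # zero article (plural or noncount)

print(article_for(definite=True, count=True, singular=True))    # the
print(article_for(definite=False, count=True, singular=True))   # a/an
print(article_for(definite=False, count=False, singular=False)) # 0
```

Note how the function never consults countability once [+definite] is chosen, matching the point below that identified nouns take the whether countable or not.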
The effect of ignoring specificity in indefinite uses is to say that all uses of the zero article or a/an are essentially generic. Whether we mean a specific, actual tick as in 1a or a generic one as in 1b, we still classify that tick when using the article a. Both can be paraphrased as: something that can be classified as a tick entered my ear/carries disease.
The effect of ignoring specificity in definite uses is to say that all uses of the are essentially specific. Although the difference between 1c and 1d is significant, we can rely on the fact that generic the is relatively infrequent. Further, some argue that generic the is not very different from specific the: the identified quality of a generic noun like the computer is retained. We do not take computer as classifying one-of-a-group until we interpret the rest of the sentence. And when we understand the noun as requiring a generic interpretation, we seem to see that interpretation through the individual. So generic the can be considered “the identification of a class”.
Master goes on to give some advice on teaching classification. For instance, have students sort a pile of objects into categories – These are books/These are pencils/This is paper/This is a pen.
For identification have students identify members in the categories – This is the blue book/These are the red pencils/This is the A4 paper/This is the new pen.
In addition, teach them that proper nouns, possessive determiners (my, her), possessive ’s (the girl’s), demonstratives (this, that) and some other determiners (e.g. either/neither, each, every) —> identify; while the zero article, a/an, and determiners such as some, any, one —> classify.
Countability only needs to be considered for classified nouns as identified nouns require the whether they be countable or not.
Master then provides the following chart:
After the concepts of classification and identification are presented and practiced details of use can be shown as in the table below:
I won’t repeat what Master says as I have already done too much of that. Once you read Master’s paper the two figures can be used as a memory aid.
Master says that discourse effects of article use (e.g. given/theme and new/rheme) can be mapped onto his binary schema, i.e. given info is identification and new info is classification. And that for many noun phrase uses of articles, such as ranking adjectives, world-shared knowledge, descriptive vs partitive of-phrases, intentional vagueness, proper nouns and idiomatic phrases, there is no need to go beyond the sentence unless first/subsequent mention is involved.
Thanks for reading.
Master, P. (1990). Teaching the English articles as a binary system. TESOL Quarterly, 24(3), 461-478.
We wanted to explore what successful learners do when they speak and in particular learners from B1-C1 levels, which are, we feel, the most common and important levels. The CEFR gives “can do” statements at each level but these are often quite vague and thus open to interpretation. We wanted to discover what successful learners do in terms of their linguistic, strategic, discourse and pragmatic competence and how this differs from level to level.
We realised it would be impossible to use data from all the interactions a successful speaker might have so we used interactive speaking tests at each level. We wanted to encourage learners and teachers to look at what successful speakers do and use that, at least in part, as a model to aim for as in many cases the native speaker model is an unrealistic target.
2. What corpora were used?
The main corpus we used was the UCLan Speaking Test Corpus (USTC). This contained data only from students, from a range of nationalities, who had been successful (based on holistic test scoring) at each level, B1-C1. As points of comparison, we also recorded native speakers undertaking each test. We also made some comparisons to the LINDSEI (Louvain International Database of Spoken English Interlanguage) corpus and, to a lesser extent, the spoken section of the BYU-BNC corpus.
Test data does not really provide much evidence of pragmatic competence so we constructed a Speech Act Corpus of English (SPACE) using recordings of computer-animated production tasks by B2 level learners for requests and apologies in a variety of contexts. These were also rated holistically and we used only those which were rated as appropriate or very appropriate in each scenario. Native speakers also recorded responses and these were used as a point of comparison.
3. What were the most surprising findings?
In terms of the language learners used, it was a little surprising that as levels increased, learners did not always display a greater range of vocabulary. In fact, at all levels (and in the native speaker data) there was a heavy reliance on the top two thousand words. Instead, it is the flexibility with which learners can use these words which changes as the levels increase, so they begin to use them in more collocations and chunks and with different functions. There was also a tendency across levels to favour chunks which can be used for a variety of functions. For example, although we can presume that learners may have been taught a phrase such as ‘in my opinion’, this was infrequent; instead they favoured ‘I think’, which can be used to give opinions, to hedge, to buy time etc.
In terms of discourse, the data showed that we really need to pay attention to what McCarthy has called ‘turn grammar’. A big difference as the levels increased was the increasing ability of learners to co-construct conversations, developing ideas from and contributing to the turns of others. At B1 level, understandably, the focus was much more on the development of their own turns.
4. What findings would be most useful to language teachers?
Hopefully, in the lists of frequent words, keywords and chunks they have something which can inform their teaching at each of these levels. It would seem to be reasonable to use, as an example, the language of successful B2 level speakers to inform what we teach to B1 level speakers. Also, though tutors may present a variety of less frequent or ‘more difficult’ words and chunks to learners, successful speakers will ultimately employ lexis which is more common and more natural sounding in their speech, just as the native speakers in our data also did.
We hope the book will also give clearer guidance as to what the CEFR levels mean in terms of communicative competence and what learners can actually do at different levels. Finally, and related to the last point, we hope that teachers will see how successful speakers need to develop all aspects of communicative competence (linguistic, strategic, discourse and pragmatic competence) and that teaching should focus on each area rather than only one or two of these areas.
There has been some criticism, notably by Stefan Th. Gries and collaborators, that much learner corpus research restricts itself to too few factors when explaining a linguistic phenomenon. Gries calls for a multi-factor approach, whose power can be seen in a 2014 study conducted with Sandra C. Deshors on the uses of may, can and pouvoir by native English users and French learners of English. Using nearly 4,000 examples from three corpora, annotated with over 20 morphosyntactic and semantic features, they found, for example, that French learners of English treat pouvoir as closer to can than to may.
The analysis for Successful Spoken English was described as follows:
“We examined the data with a mixture of quantitative and qualitative data analysis, using measures such as log-likelihood to check significance of frequency counts but then manual examination of concordance line to analyse the function of language.”
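To make the log-likelihood measure mentioned in the quote concrete, here is a minimal sketch using Rayson and Garside’s well-known formulation for comparing an item’s frequency across two corpora. The corpus figures in the example are invented for illustration, not taken from the book:

```python
import math

def log_likelihood(a: int, b: int, c: int, d: int) -> float:
    """Log-likelihood (G2) for one item's frequency in two corpora:
    a, b = observed counts of the item in corpus 1 and corpus 2;
    c, d = total token counts of corpus 1 and corpus 2."""
    e1 = c * (a + b) / (c + d)  # expected count in corpus 1
    e2 = d * (a + b) / (c + d)  # expected count in corpus 2
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

# E.g. a chunk occurring 120 times in a 50,000-token learner corpus
# vs 300 times in a 500,000-token reference corpus (made-up figures);
# values above 3.84 indicate significance at p < 0.05.
print(log_likelihood(120, 300, 50_000, 500_000))
```

A significant score only says the frequency difference is unlikely to be chance; as the quote notes, manual examination of concordance lines is still needed to analyse the function of the language.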
Hopefully with the increasing use of multi-factor methods learner corpus analysis can yield even more interesting and useful results than current approaches allow.
Chris and his colleagues kindly answered some follow-up questions:
5. How did you measure/assign CEFR level for students?
Students were often already in classes where they had been given a proficiency test and placed in a level. We then gave them our speaking test and only took data from students who had been given a global pass score of 3.5 or 4 (on a scale of 0-5). The borderline pass mark was 2.5, so we only chose students who had clearly passed but were not at the very top of the level, and obviously then only those who gave us permission to do so. The speaking tests we used were based on Canale’s (1984) oral proficiency interview design and consisted of a warm-up phase, a paired interactive discussion task and a topic-specific conversation based on the discussion task. Each lasted between 10 and 15 minutes.
6. So most of the analysis was in relation to successful students who were measured holistically?
7. And could you explain what holistically means here?
Yes, we looked at successful learners at each CEFR level, according to the test marking criteria. They were graded for grammar, vocabulary, pronunciation, discourse management and interactive ability based on criteria such as the following (grade 3-3.5) for discourse management: ‘Contributions are normally relevant, coherent and of an appropriate length’. These scores were then amalgamated into a global score. These scales are holistic in that they try to assess what learners can do in terms of these competences to gain an overall picture of their spoken English rather than ticking off a list of items they can or cannot use.
8. Do I understand correctly that comparisons with native speaker corpora were used less than comparisons between successful and unsuccessful students?
No, we did not look at unsuccessful students at all. We were trying to compare successful students at B1-C1 levels and to draw some comparison to native speakers. We also compared our data to the LINDSEI spoken learner corpus to check the use of key words.
9. For the native speaker comparisons what kind of things were compared?
We compared each aspect of communicative competence – linguistic, strategic, discourse and pragmatic competences to some degree. The native speakers took exactly the same tests so we compared (as one example), the most frequent words they used.
This is a response to a John Sweller article in 2017 on applying cognitive load theory to language teaching.
Geary and the interface hypothesis
I want to first discuss cognitive developmental and evolutionary psychologist David Geary’s (2007) two types of knowledge, since Sweller invokes Geary to assert a critical division or discontinuity between child first language acquisition and adult second language acquisition.
Geary’s first type of knowledge (or abilities/domains/cognition – Geary uses these terms interchangeably) has evolved over human evolutionary time and is labelled primary knowledge. Such knowledge (your first language, for example) is said to be fast and implicit. Geary’s second type of knowledge develops for cultural reasons and is slower and explicit. Geary uses reading as an example of the secondary type of knowledge. I have dropped the label “biological” as I think it is unhelpful for the present discussion.
We could see a parallel here between Geary’s division and the conscious/unconscious or explicit/implicit division discussed in second language acquisition (SLA). The following quotes from Geary:
“I focus on primary abilities because these are the foundation for the construction of secondary abilities through formal education.” (Geary, 2007: 3)
“Academic learning involves the modification of primary abilities…” (Geary, 2007: 5)
“I assume that primary knowledge and abilities provide the foundation for academic learning.” (Geary, 2007: 6)
seem to indicate when applied to language that there is some sort of interface between conscious learning of language and its unconscious acquisition.
So does such an interface exist? If so how does it work? Absent answers to such questions we should accept the default position that there is no interface, that explicit conscious language knowledge is separate from implicit unconscious knowledge (John Truscott, 2015).
Discontinuities and the nature of language
Cognitive scientists such as Susan Carey (2009) class language as a core cognitive activity (core cognition differs from sensory-perceptual systems and theoretical conceptual knowledge) along with object, number, and agent cognition. And there is (largely) a continuity of such core cognitions from childhood to adulthood. Discontinuities happen with say object knowledge and physics knowledge – infants know that objects are solid yet when older the theory of physics tells them that objects are not really solid. Here the physics is “incommensurate” with object cognition and this contributes to the difficulty for students of studying physics at school. Physics is at the same time more expressively powerful than object cognition.
It is unclear from Geary what kind of discontinuity is being described or even if there is one (as the labels primary and secondary seem to point to). From what I can gather Geary seems to think that primary knowledge can help with secondary knowledge (seen as the interface position in SLA) and so the two may not be so conflictual after all. I may of course be mistaken in my reading here of Geary.
Geary’s lack of clarity about the kind of discontinuity he means may explain the logical leap that Sweller seems to have made, namely, that adult second language acquisition is secondary knowledge and incommensurate with the child’s first language acquisition. Let’s look at the passage where he indicates this:
“Learning a second language as an adult provides an example of secondary knowledge acquisition as do most of the topics covered in educational institutions. We invented education to deal with biologically secondary information. Learning to listen to and speak a second language as an adult requires conscious effort on the part of the learner and explicit instruction on the part of instructors. Little will be learned solely by immersion. Furthermore, since learning to read and write are biologically secondary because we have not evolved to acquire these skills, they also require conscious effort by learners and explicit teaching by instructors, irrespective of whether we are dealing with a native or second language.”
Sweller seems to be mixing up literacy skills with (adult) language acquisition, and further seems to switch between the two – compare “learning to listen to and speak a second language as an adult requires conscious effort” with “learning to read and write are biologically secondary”. He also assumes that because languages are taught in schools they are like other school subjects, i.e. that language is like developing conceptual knowledge in physics, maths, chemistry etc.
This assumption that language is like conceptual knowledge is very evident in this 1998 article by Graham Cooper and his use of a “foreign language” example to explain an aspect of cognitive load theory:
Most language teachers will find this view of language very peculiar. For example, there is the assumption that because a vocab item may be a single word it can be classed as having low element interactivity. This ignores the semantics of single words, for a start. More generally, as seen in the screenshot, there is an assumption that language is an object that can be transmitted to learners from the environment, much like concepts in a subject like maths.
I want to now comment on some more paragraphs in the Tesol Ontario article. Let’s start with the first paragraph:
“Most second language teaching recommendations place a considerable emphasis on “naturalistic” procedures such as immersion within a second language environment. Immersion means exposing learners to the second language in many of their daily activities, including other educational activities ostensibly unrelated to learning the second language.”
I guess by “naturalistic” procedures Sweller may be alluding to Krashen’s Natural Approach? If so, he has badly misunderstood what that means and is badly out of date with the debate. Badly misunderstood, since the Natural Approach does not entail immersion; badly out of date, by ignoring developments such as task-based learning, which arguably “includes other educational activities ostensibly unrelated to the second language”.
“Information-store principle. In order to function, we must store immeasurably large amounts of information in long-term memory. The difference between people who are more as opposed to less competent in any area including competence in a second language is heavily determined by the amount of knowledge held in long-term memory (Ericsson & Charness, 1994; Nandagopal & Ericsson, 2012).”
This may, with caveats, apply to vocabulary learning or pragmatics, but how applicable is it to other language systems such as syntax or phonology? Further, the studies quoted are based on novices and experts in non-language domains like chess.
“In second language learning, this means teachers should explicitly present the grammar and vocabulary of the second language rather than expecting learners to induce the information themselves (see Kirschner et al., 2006, for alternative formulations that emphasise implicit learning) as occurs when dealing with a biologically primary task such as learning a native language as a child.”
Sweller is characterizing child acquisition as “expecting learners to induce the information”. What is meant by induction here? Does he mean usage based notions of induction where statistical information in the environment is used by the child to learn a language? If so then usage folks say the same process also happens in adult language learning and further that process is not explicit in the sense used by Sweller.
“Requiring learners to go to a separate dictionary imposes an additional cognitive load. Learners should not be required to search for needed information.”
How does this claim compare with, say, the involvement load hypothesis of Batia Laufer and Jan Hulstijn (2001), where “search” is one of the cognitive components and more “search”, e.g. consulting a dictionary, is said to lead to better vocabulary retention? (As an aside: the involvement load hypothesis was influenced by levels-of-processing theory. A general critique of cognitive load theory is: why should more load lead to learning problems? Contrast this with levels of processing, which implies that deeper (more load?) processing would lead to better performance.)
“Another recommendation is to avoid redundancy. Unnecessary information frequently is processed with learners only finding after the event that they did not need to process the additional information in order to learn.”
Considering the reported benefits for novice language learners of elaborated input (not translations but “redundancy and clearer signaling of thematic structure in the form of examples, paraphrases and repetition of original information, and synonyms and definitions of low-frequency words” – Sun-Young Oh, 2001), what evidence is there that such elaborated input is not as beneficial for more expert language learners?
To conclude, note that the summary report from the Centre for Education Statistics and Evaluation (2017) which ELT Research Bites covered, describes several criticisms of cognitive load theory in general. My discussion attempted to critique the application of this theory to language acquisition. This critique is only very cursory but it is I think enough to raise serious doubts about the extent of Sweller’s awareness of SLA research and hence to take any applications very critically. This does not preclude future applications of cognitive load theory in language teaching and certainly, notwithstanding the general critiques, it is applicable in the domain of instructional design where it originated.
Thanks for reading.
Carey, S. (2009). The origin of concepts. Oxford University Press.
A #corpusmooc participant, answering a discussion question on what they would like to use corpora for, replied that they wanted a reference book that shows various common structures in various genres such as “letters of condolence, public service announcements, obituaries”.
The CORE (Corpus of Online Registers) corpus at BYU, along with its virtual corpora feature, offers a way to achieve this.
For example, the screenshot below shows the keywords of verbs & adjectives in the Reviews genre:
Before I briefly show how to make a virtual corpus, do note that the standard interface allows you to do a lot of things with the various registers. The CORE interface shows you examples of this. For example, the following shows the distribution of the present perfect across the genres:
Create virtual corpora
To create a virtual corpus first go to the CORE start page:
Then click on Texts/Virtual and get this screen:
Next press Create corpus to get this screen:
We want the Reviews Genre so choose it from the drop down box:
Then press Submit to get the following screen:
Here you can either accept these texts or, if you want to build a film-review-only corpus, manually look through the links and filter for film reviews. Give your corpus a name or add it to an already existing corpus. Here we give it the name “review”:
Then, after submitting, you will be taken to the following screen, which shows your whole virtual corpora collection; we can see the corpus we just created at number 5:
Now you can list keywords.
Do note that the virtual corpora feature is available in most of the BYU collection so if genre is not your thing maybe the other choices of corpora might be useful.
Thanks for reading and do let me know if anything appears unclear.
There is a new online test, the CAT-WPLT (Computerized Adaptive Testing version of the Word Part Levels Test), to assess students’ word part knowledge, i.e. prefixes, suffixes and stems (though the test only uses affixes, for receptive use). The (diagnostic) test is composed of three parts – form, meaning and use. The form part presents 1 real affix and 4 distractor affixes for the test taker to choose from. The meaning part presents 1 correct meaning and 3 distractor meanings, and the use part presents 4 parts of speech, one of which is to be matched correctly to the affix.
The online test takes about 10-15mins to complete and results in a nice feedback screen showing how the test taker did on the form, meaning and use of the affixes. There are comparison advanced, intermediate and beginner profiles.
So say you have a profile of a student who shows weakness in form and meaning. What now? Mizumoto, Sasao, & Webb (2017) suggest giving learners their pdf list of 118 affixes (assuming you don’t need to use the test again). So if your learner is at level 1 for recognizing the form of an affix, the affixes listed as level 2 can be focused on.
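As a rough sketch of that suggestion, the “focus on the next level up” idea could look like the following. The affix entries and level assignments below are invented placeholders, not the real 118-affix list from the paper:

```python
# Hypothetical sample of an affix-to-level mapping; the real list of
# 118 affixes with levels is in Mizumoto, Sasao & Webb's PDF.
AFFIX_LEVELS = {
    "un-": 1, "-er": 1,
    "re-": 2, "-less": 2,
    "-ify": 3, "mal-": 3,
}

def next_affixes(current_level: int) -> list[str]:
    """Given a learner's level on, say, the form section,
    return the affixes one level up to focus on next."""
    return sorted(a for a, lvl in AFFIX_LEVELS.items()
                  if lvl == current_level + 1)

print(next_affixes(1))  # ['-less', 're-']
```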
Another possibility is a memory technique called the word part technique.
Word part technique
Very simply it is using an already known word which contains the same word stem/root as the new word to be remembered.
More specifically, the system Wei and Nation (2013) describe lists very frequent stems, i.e. stems which appear in words in the most frequent 2,000 words of the BNC. These are then used to learn stems appearing in the remaining 8,000 mid-frequency words in the BNC wordlist. For example, a high frequency word like visit has the root -vis-, which appears in mid-frequency words such as visible, envisage, revise.
Once a form connection is seen between a known high frequency word and a mid-frequency word a meaning connection needs to be made i.e. explaining the form connection. So to explain the word visible we can say visible is something that you can see. Here the explanation uses the meaning of -vis- i.e. see.
(high freq. word) visit -> go to see someone
(stem) vis -> see
(mid-freq. word) visible -> something that you can see
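The three-step chain above could be sketched as a small data structure. The structure and function here are my own illustration of the technique, using the -vis- example from Wei & Nation (2013):

```python
# Step 1 data: link a stem to a known high-frequency word and to the
# mid-frequency words it unlocks.
STEMS = {
    "vis": {
        "meaning": "see",
        "known_word": "visit (go to see someone)",
        "new_words": ["visible", "envisage", "revise"],
    },
}

def explain(stem: str, word: str) -> str:
    """Step 3: explain the form connection via the stem's meaning."""
    info = STEMS[stem]
    return f"{word} contains -{stem}- ('{info['meaning']}'), as in {info['known_word']}"

print(explain("vis", "visible"))
# visible contains -vis- ('see'), as in visit (go to see someone)
```

The hard part the technique leaves to the learner is step 1 and 2 – spotting the shared stem in the first place – which no lookup table can do for them.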
According to Wei & Nation (2013), the most difficult step is explaining the connection. Though I think the most difficult is the first step – seeing the connection, i.e. spotting the stem/root word. Wei & Nation (2013) encouragingly state that making the connection and explaining it can develop with practice.
Click here to see top 25 word stems taken from Wei & Nation (2013)
They go on to recommend that once students have worked with this technique with the teacher they can go on to use it themselves as a strategy.
The technique’s efficacy is on a par with the keyword technique and learners’ own methods or self-strategies (Wei, 2015). The word part technique has the added benefits that come with the nature of etymology and the history of words.
Thanks for reading.
Mizumoto, A., Sasao, Y., & Webb, S. A. (2017). Developing and evaluating a computerized adaptive testing version of the Word Part Levels Test. Language Testing, 0265532217725776.
Wei, Z., & Nation, P. (2013). The word part technique: A very useful vocabulary teaching technique. Modern English Teacher, 22, 12–16.
Wei, Z. (2015). Does teaching mnemonics for vocabulary learning make a difference? Putting the keyword method and the word part technique to the test. Language Teaching Research, 19(1), 43-69.
First up is the news that there are more than 700 members. Nice.
An important date for your diaries is 25 September 2017, when another round of #corpusmooc launches. This time new sections are promised, and the most notable new addition is a new version of LancsBox. Check out the following two cute vids being used to promote #corpusmooc 2017:
“We need to be careful though not to oversell the technology and be clear about what it can and can’t do. There is no silver bullet. This is especially the case when it comes to skills vs knowledge; a lot of the applications that could come from this sort of technology will help improve knowledge of English, and may contribute to accuracy” [https://eltjam.com/machine-learning-summer-school-day-5/]
The above two quotes are from a nice series of posts by ELTJam on a machine learning workshop. The first point from the first quote is indeed important to recognize. Bill VanPatten (2010) has argued that knowledge and skill are different. But what is meant by knowledge and what is meant by skill? For a nice summary of the VanPatten paper, see the video linked below.
Knowledge is mental representation which in turn is the abstract, implicit and underlying linguistic system in a speaker’s head. Abstract does not mean the rules in a pedagogical grammar rather it refers to a collection of abstract properties which can result in rule-like behaviors. Implicit means that the content of mental representation is not accessible to the learner consciously or with awareness. Underlying refers to the view that a linguistic system underlies all surface forms of language.
The actual content of mental representation includes all formal features of syntax, phonology, lexicon-morphology and semantics. And a mental representation grows due to input being acted on by systems in the learner’s mind/brain.
Skill is the speed and accuracy with which people can do certain behaviours. For language skill this refers to reading, listening, writing, speaking, conversational interaction, turn taking. To be sure being skilled means that the person has a developed mental representation of the language. However having a developed mental representation does not entail being skilled. How skill develops depends on the tasks that people are doing. A person learns to swim by swimming. A person learns to write essays by writing essays.
It follows that the Write&Improve (W&I) tool (as the flagship example of machine learning based tool for language learning) can be seen as targeting how to be skillful in writing Cambridge English Exam texts. The claim that machine learning, and by implication the feedback by W&I, is changing the knowledge of the learner’s English does not accord with VanPatten’s description of knowledge as mental representation. His description implies that no explicit information, in the form of feedback in the case of the writing tool, can lead to changes in the mental representation of the language of writing. He states that research into writing is unclear as to whether feedback impacts writing development.
My point in this post is to briefly clarify the distinction between knowledge and skills (do read the VanPatten paper) and to suggest that the best machine learning based tools can offer are opportunities for students to practice certain skills.
W&I has never claimed that its tool has impact on language knowledge. See Diane Nicholls comment below.