Articles and collocational effects

While I was doing some marking, I came across what Master (2007), in discussing article choice, has called “overgeneralisations from similar patterns”, or effects of collocational phrases.

The following patterns were found in the same essay (on cyber warfare):
1. Those types of attacks are occurring everyday, and are often due to the lack of awareness of the victims.
2. The USA has declared unofficially that they have been under several cyber attacks from China, but with the lack of evidence, they can’t press charges against them.

In 1, article use seems fine, but in 2 there seems at first glance to be an error, i.e. the student should have chosen “a lack of evidence”.

However, Master (2007) argues that:

article selection may have been the product of overgeneralization from an already learned collocational phrase rather than from the misapplication of a rule.

In our example we can see that in 1 the student has correctly used “due to the lack of awareness of the victims” whilst in 2 we could argue that the student is implying the thought “with the lack of evidence (that they have)”.

In marking feedback, you might reason that since the student has postmodified “the lack of awareness” with “of the victims” in sentence 1, we could give them the benefit of the doubt in sentence 2: the definite article in “the lack of evidence” may imply the postmodification “that they have”.

Master (2007) found, based on timed essays written by 20 low-advanced proficiency students, that:

more than a quarter of the article “errors” were actually viable choices that should have been honored.

He classed his data into noun phrase structure, modification structure and discourse structure. The noun phrase structures included count/noncount, generic/specific and idioms. The modification structures included pre-modifying ranking adjectives, postmodified/nonmodified, unlimited/limited quantity, partitive/descriptive of-phrases and intentional vagueness. The discourse structure was first and subsequent mention.

As an example of a count/non-count distinction:

An example of a pedagogical intervention is:

Master (2007) gives two reasons to justify this approach of looking at how students choose articles. First, based on Yoon & Bailey (1988), as cited by Master (2007), “teachers as editors often correct article usage in ways unintended by the original author”. Second, based on Sheen (2007), as cited by Master (2007), metacognitive feedback in addition to corrective feedback can affect student article use more than corrective feedback alone.

Furthermore, teaching article use in context, for example through lexical bundles, which are another form of collocation, has been shown to be effective (Shin & Kim 2017). It must be noted that finding what could be called a paired example (as in my student’s two sentences above) may be rare, so giving students more leeway could be harder to justify.

Thanks for reading and if you want to explore more on article use have a read of some previous scribbles on this: Classified and Identified – A pedagogical grammar for article use and A, an, the, definiteness and specificity.

References:

Master, P. (2007). Article errors and article choices. The CATESOL Journal, 19(1), 107-131. Retrieved from (pdf) [http://www.catesoljournal.org/wp-content/uploads/2014/07/CJ19_master.pdf]

 

Shin, Y. K., & Kim, Y. (2017). Using lexical bundles to teach articles to L2 English learners of different proficiencies. System, 69, 79-91.

Finding relative frequencies of tenses in the spoken BNC2014 corpus

Ginseng English @ginsenglish issued a poll on Twitter asking:

This is a good exercise to do on the new spoken BNC2014 corpus. See instructions to get access to the corpus.

You need to get your head around the part-of-speech (POS) tags. The BNC2014 uses the CLAWS 6 tagset. For the past tense we can use the past tense of lexical verbs and the past tense of DO. Using the past tenses of BE and HAVE would also pull in their uses as auxiliary verbs, which we don’t want. Figuring out how to filter those out could be a neat future exercise. Another time! On to this post.

Simple past:

[pos="VVD|VDD"]

pos = part of speech

VVD = past tense of lexical(main) verbs

VDD = past tense of DO

| = acts like an OR operator

So the above looks for parts of speech tagged as either past tense of lexical verbs or past tense of DO.
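If you build several of these single-token searches, a small helper can keep the syntax consistent. The following is only a sketch: the function name `cql_token` is made up for illustration and is not part of any corpus tool. It assumes the query interface expects straight double quotes, which is worth checking since quotes copied from a web page are often curly.

```python
def cql_token(tags, attribute="pos"):
    """Build a one-token query that matches any of the given tags.

    Hypothetical helper for illustration; it only assembles a string.
    """
    # "|" acts like an OR between the alternative tags.
    alternatives = "|".join(tags)
    # Straight double quotes are used on purpose (assumption: curly quotes
    # pasted from a web page are not read as quotes by the query parser).
    return '[{}="{}"]'.format(attribute, alternatives)


# Simple past: past tense of lexical verbs (VVD) or past tense of DO (VDD).
simple_past = cql_token(["VVD", "VDD"])
print(simple_past)
```

The printed string can then be pasted into the search box.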

Simple present

The search term for present simple is also relatively simple, to wit:

[pos="VVZ"]

VVZ = -s form of lexical verb (e.g. gives, works)

Note that the above captures only third person singular forms. How can we also catch first and second person forms?

Present perfect

[pos="VH0|VHZ"] [pos="R.*|MD|XX" & pos!="RL"]{0,4} [pos="AT.*|APPGE"]? [pos="JJ.*|N.*"]? [pos="PPH1|PP.*S.*|PPY|NP.*|D.*|NN.*"]{0,2} [pos="R.*|MD|XX"]{0,4} [pos="V.*N"]

The present perfect search may seem daunting, but don’t worry: the structure is fairly simple. The first term [pos="VH0|VHZ"] looks for present tense forms of HAVE (have, has) and the last term [pos="V.*N"] looks for past participles (VVN for lexical verbs, plus the participles of BE, DO and HAVE).

The other terms look for optional adverbs and noun phrases that may come in between, namely:

“adverbs (e.g. quite, recently), negatives (not, n’t) or multiword adverbials (e.g. of course, in general); and noun phrases: pronouns or simple NPs consisting of optional premodifiers (such as determiners, adjectives) and nouns. These typically occur in the inverted word order of interrogative utterances (Has he arrived? Have the children eaten yet?)” – Hundt & Smith (2009).
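To see how the optional slots behave, the pattern can be imitated offline on a few hand-tagged tokens. What follows is a rough sketch in Python, not how the corpus software works: the slot list mirrors the seven terms of the search above, the names (`PRESENT_PERFECT_SLOTS`, `compile_slots`, `find_sequences`) are invented for illustration, and the tags on the example sentences were assigned by hand, so treat them as assumptions.

```python
import re

# One entry per term of the search: (tag regex, excluded tag or None, min, max).
# ".*" from the search is written as "\S*" so a wildcard cannot run past one tag.
PRESENT_PERFECT_SLOTS = [
    (r"VH0|VHZ", None, 1, 1),       # have / has
    (r"R\S*|MD|XX", "RL", 0, 4),    # second term of the search; the tag RL is excluded
    (r"AT\S*|APPGE", None, 0, 1),
    (r"JJ\S*|N\S*", None, 0, 1),
    (r"PPH1|PP\S*S\S*|PPY|NP\S*|D\S*|NN\S*", None, 0, 2),
    (r"R\S*|MD|XX", None, 0, 4),
    (r"V\S*N", None, 1, 1),         # past participle
]


def compile_slots(slots):
    """Turn the slot list into one regex over a space-separated tag string."""
    parts = [r"(?:^|(?<= ))"]  # a match must start at the beginning of a tag
    for tag_regex, excluded, low, high in slots:
        guard = f"(?!{excluded} )" if excluded else ""
        parts.append(f"(?:{guard}(?:{tag_regex}) ){{{low},{high}}}")
    return re.compile("".join(parts))


def find_sequences(tagged_tokens, slots=PRESENT_PERFECT_SLOTS):
    """Return the word sequences whose tags fit the slot pattern."""
    tag_string = "".join(tag + " " for _, tag in tagged_tokens)
    hits = []
    for match in compile_slots(slots).finditer(tag_string):
        # Each tag is followed by one space, so counting spaces gives token positions.
        first = tag_string[:match.start()].count(" ")
        last = tag_string[:match.end()].count(" ")
        hits.append(" ".join(word for word, _ in tagged_tokens[first:last]))
    return hits


# Hand-tagged examples (the tags are assumptions made for this illustration).
question = [("Has", "VHZ"), ("he", "PPHS1"), ("arrived", "VVN")]
possession = [("I", "PPIS1"), ("have", "VH0"), ("a", "AT1"), ("car", "NN1")]
print(find_sequences(question))
print(find_sequences(possession))
```

The second example has a form of HAVE but no past participle, so it should return nothing. This is the kind of case the final term is there to filter out.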

Present progressive

[pos="VBD.*|VBM|VBR|VBZ"] [pos="R.*|MD|XX" & pos!="RL"]{0,4} [pos="AT.*|APPGE"]? [pos="JJ.*|N.*"]? [pos="PPH1|PP.*S.*|PPY|NP.*|D.*|NN.*"]{0,2} [pos="R.*|MD|XX"]{0,4} [pos="VVG"]

The structure is similar to the present perfect search. The first term [pos="VBD.*|VBM|VBR|VBZ"] looks for past and present forms of BE and the last term [pos="VVG"] for the -ing participle of lexical verbs. The terms in between are for optional adverbs, negatives and noun phrases.

Note that all these searches are approximate – manual checking will be needed for more accuracy.

So can you predict the order of these forms? Let me know in the comments the results of using these search terms in frequency per million.
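Frequency per million is the number of hits divided by the size of the corpus, multiplied by one million. Here is a minimal sketch of that arithmetic. The function names and the numbers in the usage lines are placeholders made up for illustration, not results from the corpus.

```python
def per_million(hits, corpus_size):
    """Normalise a raw hit count to a frequency per million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits * 1_000_000 / corpus_size


def rank_forms(hit_counts, corpus_size):
    """Return (form, per-million frequency) pairs, most frequent first."""
    rates = {form: per_million(hits, corpus_size) for form, hits in hit_counts.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Placeholder counts and corpus size, purely to show the calculation.
made_up_hits = {"form A": 300, "form B": 1200}
made_up_corpus_size = 2_000_000
for form, rate in rank_forms(made_up_hits, made_up_corpus_size):
    print(form, rate)
```

Replace the placeholder counts with the hit totals your searches return and the corpus size reported by the interface.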

Thanks for reading.

Other search terms in spoken BNC2014 corpus.

Update:

Ginseng English blogs about frequencies of forms found in one study. Do note that as there are six inflectional categories in English (infinitive, first and second person present, third person singular present, progressive, past tense, and past participle), the opportunities to use the simple present form are greater because two of those categories are present.

References:

Hundt, M., & Smith, N. (2009). The present perfect in British and American English: Has there been any change, recently? ICAME Journal, 33(1), 45-64. (pdf) Available from http://clu.uni.no/icame/ij33/ij33-45-64.pdf

Classified and Identified – A pedagogical grammar for article use

1990 was a good year for music  – Happy Mondays, Stone Roses, Primal Scream, James, House of Love. 1990 was also good for what is, in my humble opinion, one of the best pedagogical grammars for article instruction – Peter Master’s paper Teaching the English Articles as a Binary System published in TESOL Quarterly.

It is a pedagogical grammar because it simplifies the four main characteristics of articles, definiteness [+/-definite], specificity [+/-specific], countability [+/-count] and number [+/-singular], into two bigger concepts, namely classification and identification. So 0 (no article) and a/an are used to classify and the is used to identify.

As discussed in a previous post the two main features of articles are definiteness and specificity. So the four possible combinations are:
1a. [-definite][+specific] A tick entered my ear.
1b. [-definite][-specific] A tick carries disease.
1c. [+definite][+specific] The computer is down today.
1d. [+definite][-specific] The computer is changing our lives.

Master’s binary scheme emphasizes 1b and 1c at the expense of 1a and 1d. That is, the +identification feature describes [+definite][+specific] and -identification, or classification, describes [-definite][-specific].

The effect of ignoring specificity in indefinite uses is to say that all uses of no article or a/an are essentially generic. Whether we mean a specific, actual tick as in 1a or a generic one as in 1b, we still classify that tick when using the article a. Both can be paraphrased as: something that can be classified as a tick entered my ear/carries disease.

The effect of ignoring specificity in definite uses is to say that all uses of the are essentially specific. Although the difference between 1c and 1d is significant, we can rely on the fact that generic the is relatively infrequent. Further, some argue that generic the is not very different from specific the. The identified quality of a generic noun like the computer is held onto. We do not classify one-of-a-group for computer until we interpret the rest of the sentence. And when we understand the noun as requiring a generic interpretation, we seem to see such interpretation through the individual. So generic the is considered as “the identification of a class”.

Master goes on to give some advice on teaching classification. For instance, have students sort a pile of objects into categories – These are books/These are pencils/This is paper/This is a pen.

For identification have students identify members in the categories – This is the blue book/These are the red pencils/This is the A4 paper/This is the new pen.

In addition, teach them that proper nouns, possessive determiners (my, her), possessive ’s (the girl’s), demonstratives (this, that) and some other determiners (e.g. either/neither, each, every) → identify; while no article, a/an, and determiners such as some/any one → classify.
Countability only needs to be considered for classified nouns, as identified nouns require the whether they be countable or not.
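The decision order described above (identify first, and only consider countability and number when classifying) can be written out as a tiny function. This is just my sketch of the scheme as summarised in this post, not something taken from Master’s paper. The function name and the "0" symbol for no article are choices made for illustration.

```python
def choose_article(identified, countable=True, singular=True):
    """Pick an article following the classify/identify order described above."""
    # Identified nouns take "the" whether they are countable or not.
    if identified:
        return "the"
    # Classified nouns: only singular count nouns take a/an.
    if countable and singular:
        return "a/an"
    # Classified plural or noncount nouns take no article.
    return "0"


# The sorting examples from the classroom activity.
print(choose_article(identified=False, countable=True, singular=True))    # This is __ pen
print(choose_article(identified=False, countable=True, singular=False))   # These are __ books
print(choose_article(identified=False, countable=False))                  # This is __ paper
print(choose_article(identified=True, countable=False))                   # This is __ A4 paper
```

Note that countability is never consulted once a noun is identified, which is the simplification the binary scheme is after.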

Master then provides the following chart:

After the concepts of classification and identification are presented and practiced details of use can be shown as in the table below:

From Master, 2002

I won’t repeat what Master says as I have already done too much of that. Once you read Master’s paper the two figures can be used as a memory aid.

Master says that discourse effects of article use (e.g. given/theme and new/rheme) can be mapped onto his binary schema, i.e. given info is identification and new info is classification. He adds that for many noun phrase uses of articles, such as ranking adjectives, world shared knowledge, descriptive vs partitive of-phrases, intentional vagueness, proper nouns and idiomatic phrases, there is no need to go beyond the sentence unless first/subsequent mention is involved.

Thanks for reading.

References:

Master, P. (1990). Teaching the English articles as a binary system. Tesol Quarterly, 24(3), 461-478.
Master, P. (2002). Information structure and English article pedagogy. System, 30(3), 331-348.

A, an, the, definiteness and specificity

This is my attempt at recombobulating my thoughts on article use. Information is mainly drawn from Ionin, Ko & Wexler (2004) and Thornbury (2009). All errors mine.

The following are the (informal) definitions used by Ionin, Ko & Wexler (2004) for definiteness and specificity:

[+definite] the speaker and hearer presuppose the existence of a unique individual

[+specific] the speaker intends to refer to a unique individual and considers this individual to possess some noteworthy property

Ionin, Ko & Wexler, 2004, p.5

The paper argues that English, as a two-article system (a/an, the), favours the definite-indefinite categorization: the is definite and a is indefinite, and articles are not marked for specificity. Other languages, like Samoan, favour the specific-nonspecific categorization: they use le with specifics and se with non-specifics, and do not mark articles for definiteness.

Side note: apparently in spoken English this can be used to specify nouns (i.e. referential use of this vs demonstrative use) hence we can consider also that English is a three-article system!

The theory is that learners fluctuate between categorizing nouns on definiteness and categorizing nouns on specificity until they eventually settle on definiteness as their proficiency grows.

Both systems of definiteness and specificity predict that learners will use one article the for definite specific and one article a for indefinite non-specific.
However these systems differ on what article will be used with specific indefinites and non-specific definites.

That is the definiteness system will group specific definites with non-specific definites i.e. predict use of the article the; and will group specific indefinites with non-specific indefinites i.e. predict the use of the article a. See Table 1:


Table 1, Ionin, Ko & Wexler, 2004, p.13

By contrast the specificity system will do the opposite – it will group definite specifics with indefinite specifics i.e. predict the use of the article the; and it will group definite non-specifics with indefinite non-specifics i.e. predict the use of the article a. See Table 2:


Table 2, Ionin, Ko & Wexler, 2004, p.13

This means that the theory will predict overuse of the article the in specific indefinites and overuse of the article a in non-specific definites. See Table 3:


Table 3, adapted from Ionin, Ko & Wexler, 2004, p.19
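The reasoning behind these predictions can be spelled out step by step. In this sketch each system is a function from the two features to an article, and the overuse cases are the feature combinations where the two systems disagree. The function names are my own and the code only restates the description in this post; it is not taken from the paper.

```python
from itertools import product


def definiteness_system(definite, specific):
    """Article choice when nouns are grouped by definiteness (the English setting)."""
    return "the" if definite else "a"


def specificity_system(definite, specific):
    """Article choice when nouns are grouped by specificity."""
    return "the" if specific else "a"


def predicted_overuse():
    """Map each (definite, specific) context where the systems disagree
    to the article a learner using the specificity system would overuse."""
    conflicts = {}
    for definite, specific in product([True, False], repeat=2):
        target = definiteness_system(definite, specific)
        competing = specificity_system(definite, specific)
        if target != competing:
            conflicts[(definite, specific)] = competing
    return conflicts


for (definite, specific), article in predicted_overuse().items():
    print(f"definite={definite}, specific={specific}: overuse of {article}")
```

Only two of the four contexts come out as conflicts: specific indefinites (overuse of the) and non-specific definites (overuse of a).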

So what does this mean for teaching articles? Not sure but knowing that learners will tend to overuse the with indefinites and overuse a/an with definites due to the conflict with the specificity system is enlightening. Further I found the definitions in the paper very useful as I was confused about how specificity was different from definiteness.

I’ll put here a revised table (Table 4) from Scott Thornbury’s blog on articles that does not have the confusing (for me) label general and is colour coded for overuse as in Table 3 above.


Table 4, adapted from Thornbury, 2009

Finally for your students do check Glenys Hanson’s exercises and flowchart.

Thanks for reading.

References:

Ionin, T., Ko, H., & Wexler, K. (2004). Article semantics in L2 acquisition: The role of specificity. Language Acquisition, 12(1), 3-69.

Thornbury, S. (2009). A is for Articles (1) – An AZ of ELT – WordPress.com. Retrieved April 2, 2016, from https://scottthornbury.wordpress.com/2009/12/12/a-is-for-articles-1/.

Impassive Pullum on Passives

There’s a regular module I do at one school on writing about processes coming up soon. So a focus here is on use of passive clauses in such contexts. For years I was happily ignorant, induced by inaccurate instruction from books, about this grammar area. So it was a blessing to read and watch noted linguist Geoffrey Pullum pull apart such advice.

As an exercise to help me remember his counsel I knocked up three infographics; some work better than others. The information for these graphics comes from Fear and Loathing of the English Passive (html), the 6 part video series Pullum on Passives, and On the myths that passives are wordy (pdf).

Types of Passives

[Infographic: Types of Passives]

Real rules for Passives

[Infographic: Real rules for Passives]

Allegations against Passives

[Infographic: Allegations against Passives]

Note that Pullum is not really impassive, more impassioned, but that makes the title of this post less groovy : )

Hope these are of use to you, thanks for reading.

Corpus Linguistics for Grammar – Christian Jones & Daniel Waller interview

Following on from James Thomas’s Discovering English with SketchEngine and Ivor Timmis’s Corpus Linguistics for ELT: Research & Practice, I am delighted to add an interview with Christian Jones and Daniel Waller, authors of Corpus Linguistics for Grammar: A guide for research.

An added bonus is the set of open access articles listed at the end of the interview. I am very grateful to Christian and Daniel for taking the time to answer my questions.

1. Can you relate some of your background(s)?

We’ve both been involved in ELT for over twenty years and we both worked as teachers and trainers abroad for around a decade; Chris in Japan, Thailand and the UK and Daniel in Turkey. We are now both senior lecturers at the University of Central Lancashire (UCLan, Preston, UK),  where we’ve been involved in a number of programmes including MA and BA TESOL as well as EAP courses.

We both supervise research students and undertake research. Chris’s research is in the areas of spoken language, corpus-informed language teaching and lexis while Daniel focuses on written language, language testing (and the use of corpora in this area) and discourse. We’ve published a number of research papers in these areas and have listed some of these below. We’ve indicated which ones are open-access.

2. The focus in your book is on grammar. Could you give us a quick (or not so quick) description of how you define grammar in your book?

We could start by saying what grammar isn’t. It isn’t a set of prescriptive rules or the opinion of a self-appointed expert, which is what the popular press tend to bang on about when they consider grammar! Such approaches are inadequate in the definition of grammar and are frequently contradictory and unhelpful (we discuss some of these shortcomings in the book). Grammar is defined in our book as being: (a) descriptive rather than prescriptive, (b) the analysis of form and function, (c) linked at different levels, (d) different in spoken and written contexts, (e) a system which operates in contexts to make meaning, (f) difficult to separate from vocabulary, and (g) open to choice.

The use of corpora has revolutionised the ways in which we are now able to explore language and grammar and provides opportunities to explore different modes of text (spoken or written) and different types of text. Any description of grammar must take these into account and part of what we wanted to do was to give readers the tools to carry out their own research into language. When someone is looking at a corpus of a particular type of text, they need to keep in mind the communicative purpose of the text and how the grammar is used to achieve this.

For example, a written text might have a number of complex sentences containing both main and subordinate clauses. It may do so in order to develop an argument but it can also be more complex because the expectation is that a reader has time to process the text, even though it is dense, unlike in spoken language. If we look at a corpus we can discover if there is a general tendency to use a particular pattern such as complex sentences across a number of texts and how it functions within these texts.

3. What corpora do you use in the book?

We have only used open-access corpora in the book, including BYU-BNC, COCA, GloWbE and the Hong Kong Corpus of Spoken English. The reason for using open-access corpora was to enable readers to carry out their own examinations of grammar. We really want the book to be a tool for research.

4. Do you have any opinions on the public availability of corpora and whether wider access is something to push for?

Short answer: yes. Longer answer: We would say it’s essential for the development of good language teaching courses, materials and assessments as well as democratising the area of language research. To be fair to many of the big corpora, some like the BNC have allowed limited access for a long time.

5. The book is aimed at research so what can Language Teachers get out of it?

By using the book teachers can undertake small-scale investigations into a piece of language they are about to teach even if it is as simple as finding out which of two forms is the more frequent. We’ve all had situations in our teaching where we’ve come across a particular piece of language and wondered if a form is as frequent as it is made to appear in a text-book, or had a student come up and say ‘can I say X in this text’ and struggled with the answer. Corpora can help us with such questions. We hope the book might make teachers think again about what grammar is and what it is for.

For example, when we consider three forms of marry (marry, marries and married) we find that married is the most common form in both the BYU-BNC newspaper corpus and the COCA spoken corpus. But in the written corpus, the most common pattern is in non-defining relative clauses (Mark, who is married with two children, has been working for two years…). In the spoken corpus, the most common pattern is going to get married e.g. When are they going to get married?

We think that this shows that separating vocabulary and grammar is not always helpful because if a word is presented without its common grammatical patterns then students are left trying to fit the word into a structure and in fact words are patterned in particular ways. In the case of teachers, there is no reason why an initially small piece of research couldn’t become larger and ultimately a publication, so we hope the book will inspire teachers to become interested in investigating language.

6. Anything else you would like to add?

One of the things that got us interested in writing the book was the need for a book pitched at undergraduate students in their final year of their programme and those starting an MA, CELTA or DELTA programme who may not have had much exposure to corpus linguistics previously. We wanted to provide tools and examples to help these readers carry out their own investigations.

Sample Publications

Jones, C., & Waller, D. (2015). Corpus Linguistics for Grammar: A guide for Research. London: Routledge.

Jones, C. (2015). In defence of teaching and acquiring formulaic sequences. ELT Journal, 69(3), pp. 319–322.

Golebiewska, P., & Jones, C. (2014). The Teaching and Learning of Lexical Chunks: A Comparison of Observe Hypothesise Experiment and Presentation Practice Production. Journal of Linguistics and Language Teaching, 5(1), pp. 99–115. OPEN ACCESS

Jones, C., & Carter, R. (2014). Teaching spoken discourse markers explicitly: A comparison of III and PPP. International Journal of English Studies, 14 (1), pp.37–54. OPEN ACCESS

Jones, C., & Halenko, N.(2014). What makes a successful spoken request? Using corpus tools to analyse learner language in a UK EAP context. Journal of Applied Language Studies, 8(2), pp. 23–41. OPEN ACCESS

Jones, C., & Horak, T. (2014). Leave it out! The use of soap operas as models of spoken discourse in the ELT classroom. The Journal of Language Teaching and Learning, 4(1), pp.1–14. OPEN ACCESS

Jones, C, Waller, D., & Golebiewska, P. (2013). Defining successful spoken language at B2 Level: Findings from a corpus of learner test data. European Journal of Applied Linguistics and TEFL, 2(2), pp.29–45.

Waller, D., & Jones, C. (2012). Equipping TESOL trainees to teach through discourse. UCLan Journal of Pedagogic Research, 3, pp. 5–11. OPEN ACCESS