Using the BYU-Wikipedia corpus to answer genre-related questions

A link was posted recently on Twitter to an IELTS site looking at writing processes and describing graphs.
The following caught my eye:

…natural processes are often described using the active voice, whereas man-made or manufacturing processes are usually described using the passive.
(http://iamielts.com/2016/02/descriptive-report-process-descriptions-and-proofreading/)

The claim seems to go back to 2011 online (http://ielts-simon.com/ielts-help-and-english-pr/2011/02/ielts-writing-task-1-describe-a-process-1.html).

This is an interesting claim. It has been shown that passives are more common in abstract, technical and formal writing (Biber, 1988 as cited by McEnery & Xiao, 2005). Here the claim is about specific written texts on natural processes and man-made processes.

We can simplify this by asking: are more passives used when writing about man-made processes than when writing about natural processes? Since a clause that is passive is not active, a higher rate of passives implies a lower rate of actives, so answering this question lets us come to a conclusion by deduction.

The BYU-Wikipedia corpus can be used to get approximations of natural process writing and man-made process writing. The keywords I used (for the title word) were ecology and manufacturing. Filtering out unwanted texts took longer than expected, especially for the manufacturing corpus. In the end I had an ecology corpus of 77 articles and 153,621 words and a manufacturing corpus of 116 articles and 98,195 words.

The search term I used to look for passives was are|were [v?n*]. This gave me a total of 293 passives for ecology and 304 passives for manufacturing. According to the Lancaster LL calculator this showed a significant overuse of passives in manufacturing compared to ecology. Relative to corpus size, that is about 1.9 passives per 1,000 words in ecology against about 3.1 in manufacturing, so passives are roughly 1.6 times as common in manufacturing, which corresponds to a log ratio of about 0.7 (if I understand this statistic correctly). Now this does not mean much, as a lot of the texts in the Wikipedia corpora won’t be specifically about processes, but still it is interesting.
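For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes the usual two-corpus log-likelihood formula (observed against expected frequencies) and takes log ratio to be the binary log of the ratio of relative frequencies; I have not checked that this is exactly what the Lancaster calculator does internally, and the function names are my own.

```python
import math

def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood (G2), assuming the standard
    observed-vs-expected formulation."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected counts if passives were spread evenly across both corpora
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

def log_ratio(freq_a, size_a, freq_b, size_b):
    """Binary log of the ratio of relative frequencies (assumed definition)."""
    return math.log2((freq_a / size_a) / (freq_b / size_b))

# Counts reported in this post: (passive hits, corpus size in words)
manufacturing = (304, 98_195)
ecology = (293, 153_621)

ll = log_likelihood(*manufacturing, *ecology)
lr = log_ratio(*manufacturing, *ecology)
times = 2 ** lr  # convert the log ratio back into "times as common"

print(f"LL = {ll:.2f}, log ratio = {lr:.2f}, times as common = {times:.2f}")
```

On these counts the log-likelihood comes out at around 35, well above the 15.13 usually quoted as the cut-off for p < 0.0001, and the log ratio is about 0.7, which means the per-word rate in manufacturing is nearer 1.6 times than 2 times the rate in ecology.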

More interesting are the types of verbs used in passives in ecology and manufacturing. The top ten in each case:

Ecology:

ARE FOUND
ARE CONSIDERED
ARE KNOWN
ARE CALLED
ARE COMPOSED
ARE ADAPTED
ARE USED
ARE DOMINATED
ARE INFLUENCED
ARE DEFINED

Manufacturing:

ARE USED
ARE MADE
ARE KNOWN
ARE PRODUCED
ARE CREATED
WERE MADE
ARE DESIGNED
ARE CALLED
ARE PERFORMED
ARE PLACED

Thanks for reading.

References:

Biber, D. (1988) Variation Across Speech and Writing (Cambridge: Cambridge University Press).

McEnery, A. M. and Xiao, R. Z. (2005) Passive constructions in English and Chinese: A corpus-based contrastive study. Proceedings from the Corpus Linguistics Conference Series, 1 (1). ISSN 1747-9398. Retrieved from http://eprints.lancs.ac.uk/63/1/CL2005_(22)_%2D_passive_paper_%2D_McEnery_and_Xiao.pdf

Impassive Pullum on Passives

There’s a regular module I do at one school on writing about processes coming up soon, so the focus here is on the use of passive clauses in such contexts. For years I was happily ignorant about this area of grammar, misled by inaccurate instruction from books. So it was a blessing to read and watch noted linguist Geoffrey Pullum pull apart such advice.
[Image: pullum-hunt]

To help me remember his counsel I knocked up three infographics; some work better than others. The information for these graphics comes from Fear and Loathing of the English Passive (html), the six-part video series Pullum on Passives, and On the myths that passives are wordy (pdf).

Types of Passives

[Infographic: types of passives]

Real rules for Passives

[Infographic: real rules for passives]

Allegations against Passives

[Infographic: allegations against passives]

Note that Pullum is not really impassive, more impassioned, but that would make the title of this post less groovy : )

Hope these are of use to you. Thanks for reading.