Teach a man to find correlations, he posts them for a lifetime

Aphorism showing its age aside, this post is designed for both men and women who use Canvas and are curious about statistics that may be hiding in their classes’ grades.

I have my own data to share about this semester’s classes, but first, here is a tool that you can use to do the same:

Stat sheet for grades 1.1

And an explanation of how to use it:

On to what I found.

I had 4 classes this semester – 2 Oral Language classes and 2 Written Language classes, both in the 2nd to last term of my university’s IEP. My university’s IEP works a bit unusually – my 4 classes were just 2 groups of people meeting for 4.5 hours a day 4 days a week, about half of which was “Oral Language” and half of which was “Written Language”. The first group of people were my students for the first “term” (=half of a semester), and the second group were mine for the second term. All told, I still had 4 gradebooks on Canvas to export and fiddle with. Between the 4 of them, I found these interesting statistical tidbits:

Scores of 0 are more predictive of final grades than full scores are

One would expect the number of 0s on assignments to negatively correlate with final grades, and the number of full scores to do the opposite. That is, thankfully, true. However, they correlate at different rates – across all my classes, on average, 0s are more strongly (negatively) correlated with final grades than full scores are (positively) correlated. The reason for this is that full scores were more evenly distributed among all students than 0 scores, which were concentrated among a few students. The one class for which this was not true was the one in which I changed my late work policy and started giving 1/2 credit for certain late assignments.

This would not be a cause for any particular change except for 2 reasons: 1) as shown by the last class, many of the 0s that students were getting were from late work rather than unsubmitted work, and 2) we have a fairly strict policy about grading by SLOs (student learning outcomes, one of the first abbreviations I had to learn upon my return to the USA after years in Japan), and nowhere in our SLOs does it say that students should learn the sometimes-merciless grading policies that one may encounter at university.

Therefore, I should really make the “late work gets partial credit” policy permanent. I should also probably give fewer full scores.
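For anyone who would rather check the 0s-versus-full-scores comparison outside of a spreadsheet, here is a minimal sketch in Python. It is not how the stat sheet works internally. The gradebook values are invented for illustration, and the names (`gradebook`, `FULL`, `pearson`) are assumptions of this sketch, not anything from Canvas.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: student -> (scores on five 10-point assignments, final grade %)
gradebook = {
    "A": ([10, 10, 9, 10, 8], 94),
    "B": ([10, 8, 10, 7, 9], 88),
    "C": ([10, 0, 7, 10, 6], 71),
    "D": ([0, 0, 10, 5, 0], 42),
    "E": ([9, 10, 10, 8, 10], 92),
}
FULL = 10  # assumed maximum score per assignment

# Count each student's 0s and full scores, then line them up with final grades.
zeros = [sum(1 for s in scores if s == 0) for scores, _ in gradebook.values()]
fulls = [sum(1 for s in scores if s == FULL) for scores, _ in gradebook.values()]
finals = [final for _, final in gradebook.values()]

r_zeros = pearson(zeros, finals)  # expected to be negative
r_fulls = pearson(fulls, finals)  # expected to be positive

print(f"0s vs final grade:          r = {r_zeros:.2f}")
print(f"full scores vs final grade: r = {r_fulls:.2f}")
```

Comparing the absolute values of the two numbers shows which count is the stronger predictor. In this made-up class the 0s are concentrated in one student, which is the same pattern described above.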

5% 0s is a line in the sand

I enjoy running t-tests to see what values in what grade categories produce statistically significant differences (p=0.01) in my students’ final grades. One t-test I ran (on the “other stats” sheet in the file linked above) was seeing if students who missed 5% of assignments were different in statistically significant ways from those who didn’t. It turns out that they are, in all 4 of my classes this semester. On the other hand, those who missed 2% of assignments weren’t. Perhaps I should give an opportunity to make up homework on about 2% of assignments (as I already do for classwork).

I’m hoping that my future classes have grades that reflect the average quality of their work, which in turn reflects their ability to do academic work in English, rather than their tendency to check due dates and read rubrics thoroughly on Canvas. These are important skills, but I don’t want to make them a bottleneck through which every grade must pass.
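The 5% comparison can also be sketched in Python. This is only an illustration under stated assumptions: the post does not say which kind of t-test the sheet runs, so the sketch uses Welch's unequal-variance version, and the student data, `THRESHOLD`, and `welch_t` are all hypothetical. The standard library has no t-distribution, so the sketch stops at the t statistic and degrees of freedom. Turning those into a p-value takes a t table or something like `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    # Squared standard error contributed by each group (sample variance / n)
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical students: (fraction of assignments scored 0, final grade %)
students = [
    (0.00, 93), (0.01, 90), (0.00, 87), (0.03, 84), (0.02, 88),
    (0.06, 74), (0.10, 61), (0.05, 70), (0.08, 66), (0.12, 55),
]
THRESHOLD = 0.05  # the "line in the sand" being tested

# Split final grades by whether the student missed at least THRESHOLD of the work.
missed = [grade for frac, grade in students if frac >= THRESHOLD]
others = [grade for frac, grade in students if frac < THRESHOLD]

t, df = welch_t(others, missed)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Changing `THRESHOLD` to 0.02 and rerunning is the code equivalent of the 2% comparison.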

RDs need a bump, FDs need a nerf

Across 4 essays in both Written Language classes, the average correlation of rough draft scores with final grades was 0.70. The average correlation of final draft scores with final grades was 0.76. Since final drafts are worth at least twice as many points as rough drafts, this is rather surprising – even more so because for 3 of the 4 essays, the rough drafts’ correlations are actually higher than the final drafts’ (the last had a very low correlation for the rough drafts).
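A rough sketch of how those averages could be computed in Python: correlate each essay's rough draft and final draft scores with final grades, then average the correlations per draft type. The scores below are invented, only two essays are shown to keep it short, and the names (`essays`, `finals`, `pearson`) are assumptions of this sketch.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson r computed as the mean product of z-scores."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return mean([(x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)])


# Hypothetical final grades (%) for five students, in a fixed order
finals = [94, 88, 71, 62, 80]

# Hypothetical draft scores (%) for the same five students, in the same order
essays = {
    "Essay 1": {"rough": [90, 85, 70, 62, 78], "final": [92, 90, 80, 75, 84]},
    "Essay 2": {"rough": [88, 80, 72, 65, 75], "final": [95, 85, 82, 70, 88]},
}

# One correlation per essay per draft type, then the average across essays
rough_rs = [pearson(e["rough"], finals) for e in essays.values()]
final_rs = [pearson(e["final"], finals) for e in essays.values()]

avg_rough = mean(rough_rs)
avg_final = mean(final_rs)

print(f"rough drafts: average r = {avg_rough:.2f}")
print(f"final drafts: average r = {avg_final:.2f}")
```

Keeping the per-essay lists (`rough_rs`, `final_rs`) around makes it easy to spot a single essay that drags an average down, as the last rough draft did here.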

I’ve been making changes to my writing process over the last few semesters, and it seems I need to make a few more. I think part of the comparatively low correlations that final drafts have is due to my grading practices – I think I take it easier on final drafts precisely because they’re worth so many points. My average scores for final drafts are higher than for rough drafts, and the standard deviations are lower – roughly 62%-95% with an average of 78% for rough drafts and 65%-95% with an average of 80% for final drafts. It’s not a huge difference, but looking back at the scores now they don’t seem to reflect the range in quality of the essays. Part of the high correlations for the rough drafts is also due to the skills that are involved in producing a first draft – planning, reading, responding to a prompt, and a bit of grammar – that are assessed in a lot of other assignments as well. Final drafts, meanwhile, assess (in addition to the same things that first drafts assess, but less directly) responding to criticism and editing, which don’t figure largely in many other assignments. Seeing as first drafts track more of the skills that I care about, and I seem to grade them with less of a high-stakes mentality, I should probably weight them more. On the other hand, since final drafts assess a somewhat narrow range of skills, I should weight them less, or even separate my grades for final drafts into smaller sub-assignments. These could include the COCA assignments I currently use, but also a written response to criticism and proof of visiting tutors, instead of trying to indirectly read those things into the final draft.

I need to keep in mind too that I’m not necessarily serving my students well if I introduce them to a writing process that none of their psychology, history, or any other professors will use – I hear that most papers turned in for any class other than English are just the final drafts, already assumed to be revised and polished to a sheen. Maybe having one paper like this per term is also justifiable just in terms of preparing students for being taught by PhDs who know more than anyone else in the world about the behavior of certain species of field mice under certain conditions but have never studied pedagogy.

Look forward to more like this same time next semester, and let me know if you find the sheets useful for your own classes.

Review: Interchange (13th ed.)

The Interchange (formerly New Interchange) series is a mainstay of ELT worldwide, used in contexts ranging from “cottage industry” (Nagatomo, 2013) private language academies to institutions of higher education. The series has undergone significant changes with its 13th edition that warrant fresh review.

[Image: a former edition]

To begin with, significant revisions have been made to the content and layout of every chapter, in the words of the publisher, to “bring our content and delivery into alignment with the norms of the 2050s”. To this end, many chapters have been struck entirely or completely rewritten. The section on ethnic foods from book 2, unit 4, for example, is not only gone (a welcome change) but replaced with a pre-activity on the meaning of “tradition” that is more postmodern than many will find in their own Zones of Proximal Development.

None of this would be significant, however, without the accompanying revisions to grammar presentation and newfound focus on project-based learning. The new edition of Interchange changes its fundamental teaching strategy so much as to be unrecognizable compared to earlier versions, both in method and in geopolitical consequences.

Indeed, the methods are so innovative and the learning so efficient that within one semester students display 4-skills competence indistinguishable from native English speakers – at least the somewhat more stilted types of native speakers that populate English textbooks. The new student-centered activities sections have also made learners egocentric, hedonistic devotees of an urbane, bourgeois lifestyle often completely at odds with those of their surrounding cultures. In fact, students seem so transformed by their exposure to this textbook that their former linguistic and cultural identities completely disappear. Students leave classes having conversations about “their hobbies” or “their weekend plans”, apparently never reverting to their former Spanish, Chinese, or Qatari selves, becoming strangers to their families and neighbors. Putting aside the decimation of local communities, the new presentation of language items is much improved. At least in municipalities where the new Interchange books have been used, few among the educated classes speak any language other than Standard American English.

As one can imagine, the local and national governments of these areas have taken steps to discourage (to put it delicately) the use of the new, unprecedentedly effective Interchange books. Indonesia has taken an early lead in this regard, suspending visas for foreign English teachers and confiscating all (even previous editions of) Interchange. Police are being trained to conduct interrogations in Standard American English (“Excuse me, / I was wondering if / you would mind / telling me your political affiliations”). There are stories of language store owners being detained by paramilitary groups, although not with an official government mandate as yet (Liong, in publication). As if in anticipation of these events, Cambridge University Press made the electronic edition of Interchange purchasable with a variety of virtual currencies and viewable from within a browser window on any phone. The spread of Interchange 13th ed. and its devastating research-based methodology has therefore been impossible to stem.

An explosive rise in vigilantism has been another effect of the pedagogic success of Interchange. With the mitigation of distinct linguistic and cultural identities, societies have seen rising racism and other quasi-biological ideologies of difference that seek to reify formerly “obvious” national and ethnic borders. Informal communities of practice, usually composed of young men (although posses might be a better word), roam city blocks like the home of this publication in Tokyo, seeking to enforce ethnic unity on a purely physical basis – length of nasal bridge, eye color, hair texture, attached earlobe, etc. – and interrogating those who deviate in impeccable textbook English (“Oh, my! Just have a look at his nose, will you? It seems quite wide for a Han, doesn’t it?”). Pre-existing ideologies of racial difference, present if marginal in many societies, have been used as historical justification for what amounts to racial terror. Because language use, especially in the age of Interchange 13th ed., does not reliably correlate with racial characteristics, this phenomenon has not directly victimized English speakers, but rather visible minorities of any language background. As such, it is better seen as a side effect of the extreme English fluency brought about by Interchange than as a countervailing force.

Foreign English teachers like this writer find themselves trapped between governments’ anti-English programs on one hand and paramilitary groups’ informal efforts at racial homogenization on the other. One hopes that this review, and further revisions to the Interchange series, help to reverse current deleterious trends in geopolitics even at the expense of the rapid and effortless English mastery present in its current edition.


Liong, W. (2058). Governmental and non-governmental revanchist efforts in linguistically flattened societies. The Language Teacher, 82(3), 45-59.

Nagatomo, D. H. (2013). The advantages and disadvantages faced by housewife English teachers in the cottage industry Eikaiwa business. The Language Teacher, 37(1), 3-7.