A Taxonomy of Jargon

I’ve noticed a consistent difficulty among my ESL students: comprehending words that are particular to a certain academic field, analytic lens, or article/book, especially as distinct from homonymous words in the dictionary. My classes often read Duhigg’s The Power of Habit as their main text, which features unique definitions of habit and many other words. For example, Duhigg defines cue, routine, and reward thus:

This process within our brains is a three-step loop. First, there is a cue, a trigger that tells your brain to go into automatic mode and which habit to use. Then there is the routine, which can be physical or mental or emotional. Finally, there is a reward, which helps your brain figure out if this particular loop is worth remembering for the future... (19)

and later specifies further that a reward “can range from food or drugs that cause physical sensations, to emotional payoffs, such as the feelings of pride that accompany praise or self-congratulation” (Duhigg 25).

Clearly, a reward to Duhigg is something fairly intuitive and immediate, like the taste of a delicious food or relief from an itch, as he later illustrates with examples of rats and monkeys in behaviorist, stimulus-response-type experiments. Yet I consistently find in my students’ papers that they define reward much as their dictionaries do: something like a biweekly paycheck or a college degree, often abstract and far off. This resetting of the definition of the academic jargon we’ve been learning back to its lay version happens with great regularity.

The issue seems to be that students will default to the dictionary definitions of those words when dictionary definitions are available, even if we’ve been talking about the newly learned definitions for weeks. That is, although we’ve been trying to hang a new concept on an old hook, students reaching for the old hook reliably come up with the old concept instead.

This got me thinking about how the jargon (Merriam-Webster: “the technical terminology or characteristic idiom of a special activity or group”) that students encounter throughout their academic careers varies, and how the differences between types of jargon can make mastering them as words and as concepts easier or harder. And though the word “jargon” can have a bit of a negative connotation, here I’m not at all interested in castigating academics for using the terminology particular to their field (even to the point of alienating non-experts) or for coining new and potentially confusing terms, just in identifying some characteristics that can make academic jargon more or less transparent for English learners.

What follows is a preliminary attempt to categorize types of jargon according to overlap with other words and concepts.

Pure jargon (new words)

Perhaps the easiest jargon to identify is that which is clearly a new word, a term completely unique to its field, and though rare, one that probably occurs in the dictionary and exists in the students’ L1 with almost the same definition. Some examples of this type of jargon might be:

  • gluon, a type of subatomic particle
  • aphasia, a language disorder
  • semaphore, a way of organizing multiple processes in a computer
  • molality, something having to do with chemistry
  • palantir, a magical stone used for seeing

The most common issue with words like these in my experience is that students may translate them into the L1, recognize the translation, and then feel as if because they recognized (as opposed to understood) the translation, they therefore know the word. Obviously, someone who hasn’t studied chemistry in any language (like me) won’t really know what molality is.

But in general, these words’ properties as words aren’t what cause confusion, and what difficulties students have in grasping them are likely to be difficulties in grasping the concepts themselves.

Compound jargon

A step up in opacity is novel compound words, words whose components are known but when used in combination refer to a new concept. Some examples might be:

  • the Honeymoon Stage, one of Kalervo Oberg’s 4 stages of culture shock
  • the New Deal, a group of government programs during the Great Depression
  • the Great Depression, since I brought it up
  • nature-identity, one of Gee’s 4 identities (see references)
  • call-out (or cancel) culture, a straw man of conservatives on the Internet
  • blue book, either the publication containing a used car’s estimated value or the value itself

The superficial familiarity of everyday words like “great” and “depression” can yield a false sense of understanding of the referent of the term “Great Depression”. In my experience, though, most problems with these terms come from incorrect parsing of their grammar: many students seem to read “Great Depression” while ignoring its capital letters, interpreting it simply as an adjective followed by a noun: any depression which is large or severe.

Interpreting compound nouns, or adjective-noun pairs meant as proper nouns, can dovetail with understanding the role of lexical chunks. I have no evidence of this, but ability to comprehend compound jargon may correlate with ability to parse language as chunks rather than strictly as words and grammar.

Homonymous jargon

This class of jargon, sharing spelling and pronunciation with a lay term, is what I was talking about in the introduction, and to me, the type of jargon most likely to cause confusion. I have broken down this group into a few sub-categories:

Homonymous and conceptually similar

The most difficult jargon to distinguish from its vernacular equivalent is jargon which shares a form with a non-jargon word and refers to almost the same thing, but is defined more specifically or to fit within a particular framework. Some examples might be:

  • Cue-Routine-Reward, the three parts of the habit loop as defined by Duhigg
  • Mindset, either growth or fixed as defined by Dweck
  • Grit, perseverance in pursuit of a goal as defined by Duckworth
  • Health, an integer subject to increase with sleep or decrease with physical damage as defined by the Final Fantasy series

Again, the errors seem to stem most commonly from substituting in the lay version of a word’s definition when the technical one was called for.

Homonymous but conceptually different

Some homonymous jargon extends the meaning of a lay term to the point that the connection may not be clear to outsiders. Consider terms like “sweeten” in production, which means adding effects like a laugh track to make a final product more palatable, much like sugar does to tea.

As Mitch Hedberg put it: “We’re gonna have to sweeten some of these ...”

Other jargon which is not particularly close in meaning to its lay equivalent might be:

  • Whale, a high-stakes gambler
  • Remainder, to dispose of unsold books (also tricky for morphological reasons; the lay term “remainder” is a noun while the jargon is a verb)
  • Sleeve, the body into which a digitally stored consciousness is inserted (see also “Shell”)

I have never encountered an instance of a student accidentally reverting to the lay definition of a term like this in writing, perhaps because the definitions are so different as to preclude confusion. No one is going to write about a whale visiting a casino and suggest that he may have been disappointed to find the buffet out of krill.

Homonymous and “technically correct”

Within the type of jargon that is a homonym for its lay counterpart are many words whose definitions are distinct but which are taken as the “true” definitions of those words. That is, the technical definition is thought to be what people “really mean” when they use the word in other, non-technical contexts. Some examples might be:

  • Depression (I have a hypothesis that part of what makes psychology so difficult is that so many of its terms are homonyms of everyday words like “self” and “positive”)
  • DNA, a stand-in for “heritage” in popular discourse but not in biology
  • Myth, a story with particular cultural power, interpreted in popular discourse as “a falsehood”
  • Million, liable to be corrected even when clearly meant as a synonym for “a lot” and not exactly 1,000,000 of something

To illustrate the difference between this type of jargon and the other homonymous jargon above, consider that someone who uses “DNA” in a sentence like “I love BBQ. It’s in my DNA” may be “corrected” and forced to rephrase, while someone who uses “whale” to refer to an aquatic mammal will never be reproached for sloppy, non-technical language use, nor will someone who uses “grit” to refer to general hardworkingness be shamed for not using Duckworth’s specific definition.

What to do

Some consciousness-raising work on just how common academic jargon is in university classes and the flexibility of words’ meanings is probably a good idea.

Part of this really should be a thorough introduction to the idea that dictionaries (bilingual or monolingual) and translation are not reliable ways to understand course content, with many illustrations of why: for example, showing a list of the possible translations of “grit” and inviting students to compare any of them to the specific definition that the course uses.

Perhaps jargon can be interpreted not as a stumbling block to success but as an opportunity to raise consciousness as to the relationships of words to the concepts that they refer to.

Works Cited

Duhigg, Charles. The Power of Habit: Why We Do What We Do and How to Change. Random House, 2013.

Gee, James Paul. “Chapter 3: Identity as an Analytic Lens for Research in Education.” Review of Research in Education 25.1 (2000): 99–125.

Discussion Circles

Apologies to whoever I stole this idea from – I don’t remember who I should be crediting with it. It has, however, become a staple of my classes.

Previously, class discussions that I’ve worked into lessons have had problems. If the whole class tried to have a discussion together, a few very vocal students dominated the arena while others either tried in vain to compete or happily ceded the floor and retreated into themselves. If discussion groups were smaller, it was harder for non-participants to avoid notice, but discussions still depended on the willingness of a few people to keep a conversation going to prevent them from dissolving into a group of people sitting together, each checking his or her phone. Even groups that stayed on task would default to talkative students talking more and quieter students nodding along.

Discussion circles are a way of facilitating equally participatory conversation among students who naturally vary in their willingness to speak as themselves and voice opinions on either academic or familiar topics. They do this by:

  • Removing some of the burden on the students of representing themselves, because they are playing assigned roles rather than simply voicing their own thoughts,
  • Supplying pragmatically appropriate language, and
  • Encouraging participants, in various ways, to listen carefully to and respect each other’s contributions.

I use 3 versions of Discussion Circles sheets, each of which has 4 roles that participants need to play:

  • Discussion Leader
    • Chooses questions to ask and asks them
    • Begins and ends the meeting
  • Harmonizer
    • Thanks other members for participation
    • Asks for clarification
    • Rephrases others’ opinions
    • Encourages other members to participate
  • Reporter
    • Takes notes on the members’ contributions
    • Asks members to repeat or rephrase
  • Devil’s Advocate
    • Disagrees with other members’ contributions (constructively!)

These tasks are in addition to actually answering the questions that the Discussion Leader asks.

Each of these roles has a worksheet to fill out with sections for before, during, and after the discussion. These are turned in to the teacher afterward. The teacher, incidentally, is not involved in the discussions except to provide a list of questions and assign roles at the beginning.

The version of the worksheet that I use for at least the first 3 times that I do this activity is about 2 pages long per member. The “Before” and “After” sections are fairly involved and take about 10 minutes to do each. (The discussion itself can take anywhere between 20 minutes to an hour.)

You can get a copy of it here: Discussion circles online (called “online” because it is in a format that is easily distributable on Google Classroom; you can also print it).

After they are used to the expectations of each role, I use a shortened version of the sheet. This one has a shorter “Before” section and no “After” section.

Find a copy here: Discussion circles lite online

Towards the end of the semester, I use a Turbo version of the sheet in which the participants switch roles with every question.

Get it here: Discussion circles turbo

In my Center for Excellence in Teaching and Learning group that meets on Fridays (basically a community of practice for new professors), I tried a revised Turbo version that had the job Quoter replacing Reporter.

Get it here: Discussion circles turbo 2

I give two grades for this assignment every time it is used: One grade for participation in the meeting and one for completing the worksheet. Now that we’re all online, the participation grade comes from a predetermined member recording their Zoom meeting and sharing the video with me.

Obviously, for the last few weeks of our spring 2020 semester, I’ve been distributing these online and having students share one sheet for the whole group rather than printing and handing out the sheets for an in-class discussion. I find that the distribution of responsibility in Discussion Circles, where everyone has to participate in order to complete their own sheet, suits the slightly impersonal nature of online synchronous discussions fine. Students often remark that they take more easily to some roles than others, but I try to make sure everyone plays every role at least once, so that even if they don’t “naturally” like to disagree with others, they will all be able to do so respectfully when it becomes important.

I find that Discussion Circles are a helpful scaffold for a lot of skills practice that we hope to see in class discussions, to the point that I rarely have a class discussion without them anymore. I hope you get some value from them too.

Thresholds in missed assignments predicting final grades

First, a note on a quick research project that yielded nothing interesting: I checked whether Canvas page views immediately before going remote, immediately after going remote, and the change in page views during that period were correlated with final grades, and they basically weren’t. I was wondering whether students who tended to check Canvas a lot (during the remote instruction period and before it) tended to do better in the class overall, and I didn’t find any evidence of that.

Also, and this will be mentioned again later, in checking the average scores for assignments over the past semester, I noticed that assignments that had to be done in a group were more likely to be completed than those that weren’t. This is interesting to me because setting up a Zoom meeting and talking to classmates, sometimes in other countries, would seem to be harder, not easier, than completing a worksheet by oneself. However, just taking two types of assignments from my Written Language class as examples, Reading Circles, which had to be done as a group via Zoom (or in person before we went remote in March), had a mean score of about 93% for the term, while Classwork, which included many assignments that were completed solo, had an average of 89%. In the Oral Language class, Discussion Circles (a sort of role-playing exercise with questions on an assigned topic) had an average score of 99%, and Classwork 90%. It seems that Zoom meetings and the rare chance at synchronous interaction that they represent facilitate work, despite the pain of setting them up.

In other news, I have just completed my first academic year at the university IEP that I started at full-time last fall. As a celebration we got Thai takeout from one of the three good Thai restaurants in town (there are, mysteriously, no good Indian restaurants for 40 miles in any direction), and I immediately started blogging, vlogging, and tinkering with Google Sheets to fill the void left by work.

I’ve been slowly adding functionality to the Google Sheets that I use to do my end-of-course number crunching, mostly by figuring out new ways to use the FILTER function along with TTEST to see if there are statistically significant differences in my students’ final grades when they are separated into two populations according to some parameter. I put together a master Sheet for the year that included all of my classes between last August and now.
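For anyone who would rather script this than chain Sheets formulas, the FILTER-plus-TTEST pattern translates roughly like this in Python with scipy. To be clear, the grades and the splitting parameter below are invented for illustration; my actual analysis lives in the Sheet.

```python
# A rough Python analog of pairing FILTER with TTEST in Google Sheets:
# filter final grades into two populations by some parameter, then run
# a two-tailed, unequal-variance t-test (like TTEST(r1, r2, 2, 3)).
# All numbers below are made up for illustration.
from scipy.stats import ttest_ind

final_grades = [92.5, 88.0, 61.0, 95.0, 79.5, 55.0, 90.0, 84.0]
pct_missed   = [0.02, 0.05, 0.15, 0.00, 0.08, 0.22, 0.03, 0.11]

# FILTER analogs: grades of students at/above vs. below the cutoff
cutoff = 0.10
high_missers = [g for g, m in zip(final_grades, pct_missed) if m >= cutoff]
low_missers  = [g for g, m in zip(final_grades, pct_missed) if m < cutoff]

# Welch's t-test: does the split produce significantly different grades?
t_stat, p_value = ttest_ind(high_missers, low_missers, equal_var=False)
print(f"p = {p_value:.3f}")
```

The equal_var=False argument matches the “type 3” (heteroscedastic) option in the Sheets TTEST function, which is the safer default when the two populations have different sizes and spreads.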

One possible factor that I had noticed anecdotally throughout the year was that students seemed more likely to fail or do poorly because of assignments not turned in at all than because of assignments done poorly. There was no shortage of work that was half-finished or that ignored instructions, but the really low grades for the course usually belonged to students with work that was never turned in.

So I set up a t-test on my Google Sheet to separate my students into two populations by the % of assignments that received a grade of 0 and look for a statistically significant difference in their final grades. Naturally, one expects students who have more 0s to do worse, but I still wondered where the dividing lines were – did getting 0s on more than 5% of assignments produce statistically significantly different populations? Did 10% do the trick? Is there a more graceful way of expressing this idea than “statistically significantly different”?

The relevant cells in my Google Sheet look like this:

As you can maybe figure out from the above, missing 10% of assignments (regardless of the points that those assignments were worth) produced a statistically significant difference in final grades: those who missed 10% or more of assignments had a final course grade of 66.9% (or D) on average while those who missed less than 10% had an average course grade of 90.8% (or A-).

On the other hand, getting full scores (which in my class means you followed all the directions and didn’t commit any obvious mistakes like failing to capitalize words at the starts of sentences) on more than 50% of assignments also produced a statistically significant difference in final grades: those who got full scores on 50% or more of assignments had a final course grade of 93.2% (or A) on average while those who got full scores on less than 50% had an average course grade of 78.4% (or C+). This isn’t the difference between passing and failing, but the ratio of full scores does produce two populations, one of which fails on average and one of which passes – see below.

Other significant dividing lines were:

  • Missing 3% of assignments
    • If you missed more than 3%, your average grade was 83.3% (B)
    • If you missed 3% or less, your average grade was 92.1% (A-)
  • Missing 5% of assignments
    • If you missed more than 5%, your average grade was 78.6% (C+)
    • If you missed 5% or less, your average grade was 91.7% (A-)
  • Getting full scores on 35% of assignments
    • If you got a full score on more than 35%, your average grade was 90.0% (A-)
    • If you got a full score on 35% or less, your average grade was 68.0% (D+)
  • Getting full scores on 70% of assignments
    • If you got a full score on more than 70%, your average grade was 96.6% (A)
    • If you got a full score on 70% or less, your average grade was 85.0% (B)
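The cutoff hunt itself is easy to sketch in Python, if you’d rather not build it cell by cell. Again, the (fraction missed, final grade) pairs below are invented; the real scan happens in my Google Sheet.

```python
# Scan candidate cutoffs for "% of assignments missed" and check which
# ones split students into statistically distinguishable populations.
# The (fraction of assignments with a 0, final grade) data is made up.
from scipy.stats import ttest_ind

students = [
    (0.00, 95.0), (0.02, 92.5), (0.03, 91.0), (0.05, 88.0),
    (0.08, 84.0), (0.11, 79.5), (0.15, 66.0), (0.25, 58.0),
]

p_values = {}
for cutoff in (0.03, 0.05, 0.10):
    over = [g for m, g in students if m > cutoff]
    under = [g for m, g in students if m <= cutoff]
    _, p = ttest_ind(over, under, equal_var=False)
    p_values[cutoff] = p
    flag = "significant" if p < 0.05 else "not significant"
    print(f"missing >{cutoff:.0%}: p = {p:.3f} ({flag})")
```

One caveat worth keeping in mind: testing several cutoffs against the same grades inflates the chance that at least one comes up significant by luck, so the dividing lines above are better read as suggestive than as hard thresholds.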

As you can see, I am not a prescriptivist on the use of the word “less”.

As you can also see, there are some red lines that pertain to the number of assignments that students can miss before they fall into a statistical danger zone: 10% of assignments missed, or only 35% of assignments with full scores. A student who fails to meet these thresholds is statistically likely to fail.

Statistics like these don’t carry obvious prescriptions about what to do next, but I worry a bit that the number of missed assignments will go up as classes are moved permanently online and assignments lose the additional bit of salience that comes from being on a physical piece of paper that is handed to you by a physical person. I also, for mostly bureaucratic reasons, worry that my grades seem to reflect less “achieving learning outcomes” and more “remembering to check Canvas” – although I’m sure this discrepancy is nearly universal in college classes.

I am considering giving fewer assignments per week that are more involved – fewer “read this article and complete this worksheet” and more “read this article, make a zoom discussion, and share the video and a reflection afterward”. We will see if that produces grades that reflect the quality of work rather than the mere existence of it.

Academic ESL and interlanguage: Partially totally effective or totally partially effective (or effective for other purposes)?

Three hypotheses for the observed effectiveness of academic ESL for preparing students for academic work in English:

  1. Academic ESL is perfectly effective at developing interlanguage, but academic ESL classes finish before the end of interlanguage development because students cease being ESL students and matriculate into regular degree programs. Students would still benefit from academic ESL after this point, but rarely have time due to their undergraduate or graduate class schedules. Some stunting occurs in students’ interlanguage because of the premature end of their ESL courses.
  2. Academic ESL is partially effective at developing interlanguage, and academic ESL classes finish at the end of their period of effectiveness. Students would not benefit from more academic ESL after this point because interlanguage development cannot occur through further academic ESL classes. Students are more likely to have stunted interlanguage development because of excessive time spent in ESL than because of a premature start to their degree programs.
  3. Academic ESL is partially effective at developing interlanguage but mainly effective at introducing compensatory strategies for students to use to make up for their lower language skills. Some of these strategies are specific to language learners and others are of use to any college student, but former ESL students in degree programs succeed by using them more than other students. Interlanguage development is less predictive of academic success than application of compensatory strategies.

Earlier this semester, we requested some data from our campus researcher, and he just got back to us. I won’t say what exactly he told us, but it pertained to average GPAs among different populations of undergrads, and it was good news for the apparent effectiveness of our IEP.

That said, we don’t know why our IEP appears to be effective. It is possible that we are getting better at our jobs. It is also possible that we are just recruiting better students. It’s possible that our students are far better than average, but we’re doing a worse-than-average job preparing them for college, resulting in performance that converges on the mean. Assuming that the work we do in class is at least part of the reason, it might help us to better focus our efforts in order to improve even more if we knew what part of what we do in class helps our students the most.

(For most of my career, I was used to the idea that interlanguage development started when students joined my class and stopped when they quit. In EFL, you can’t count much on outside factors to keep the interlanguage development ball rolling – students aren’t part of formal or informal organizations that facilitate regular English use and their identities accommodate English as a hobby at most. I tried as the owner of an eikaiwa to get students to start pastimes that included English, only to realize that as an eikaiwa teacher, I was the pastime. In short, I was used to thinking of English class as a self-contained unit; anything I wanted my students to do with English we had to do together.

I realized partway through my first year teaching community college ESL in California that we were by design only giving our students a partial education. We wanted to send them off into English 100 with maybe a bit of a head start and without a lot of baggage, but we expected English 100 to continue the work of interlanguage development. I’m sure some of us thought that ESL would still benefit our students, but they had to get on with their credit-bearing classes eventually, and some of us probably thought that ESL was inherently limited in what it could accomplish. There are also those who think that the one and only way a student will come to understand adjective clauses is if the teacher explains adjective clauses, and who have never heard of interlanguage.)

Anyway, this would make a good long-term study project: find a decent sample of former academic ESL students in their undergrad years, give them the TOEFL or IELTS (which they wouldn’t have taken in a few semesters at least), survey them on their “compensatory strategies” (defining those would be a lot of work), and measure those against their undergraduate GPAs.

By the way, I’ve started recording some old blog posts as vlogs, seeing how different people tend to read ELT blogs and watch ELT-related content on YouTube. Feel free to stop by and leave a comment about how I don’t look like you expected.

Dr. Brute Facts doesn’t care about your feelings

Of all of the world’s doctors, and particularly at St. Jude’s Hospital, Dr. Judd Shapiro is a physician with a rare gift. While most doctors consider the physical body a vessel for a living, thinking person first and foremost, and the apperception and treatment of the physical body a worthy enterprise because of the comfort that it might bring such a person, Shapiro considers the person to be a mere collection of physical phenomena, of value only as the sum total of the stuff of which it is made. To be even more precise, “it”, the physical body, barely exists to him; it demands no more recognition of oneness or continuity than does the air in a balloon or the water that makes up a particular section of river. This is not just a matter of point of view, but more of the acuity of his senses: he sees the interactions of particles and waves in front of him much as you or I see a fly landing on a window, and thinks primarily on that scale, the scale that is most intuitive and meaningful to him. The scope of his thought can be widened to include constructs we might call “life”, including human life, and on some occasions he can be persuaded to adopt our conventions of grouping matter into “cells”, “organs”, and “people”. In this way he has a grasp, in both the general and particular senses, of the true inner workings of the human body, and of any body that happens to be in his examination room. He knows both the sequences of DNA (that is, he can recite the base pairs) that commonly predispose one to abnormal hemoglobin production and the precise location and corpuscular hemoglobin concentration of any given red blood cell in the bloodshot, overworked eye of any of the nurses. The difficulty of working with him lies not in his intellect but in where he is inclined to focus it, in convincing him to think of the physical phenomena that he is attuned to as part of a “body” or a “patient” at all.
To him, arrangements of matter in the coordinated pattern we call a “human” are but a convenience, a mental shorthand, for those without direct access to the atomic motion that underlies it all. When his opinion is solicited, he has the habit of emerging from a trance-like state to declare, “facts, not feelings” – which, as far as we can tell, is an injunction to let go of our parochial attachment to the bodies and minds of the patients under our care and focus on simply what is objectively there. To him, a “body” is a random selection of the possible arrangements of matter, only intrinsically appealing to humans because we appreciate things of roughly our own size, and in particular things whose cells contain DNA recombinable with our own. He seems to see no point in putting his answers to us, when we can get them, in body-scale terms, instead phrasing them in the brute factual language of nature. They often come in descriptions of the states of electrons somewhere nearby, which would be useless to us even if it were clear that these electrons were in the patient’s body and relevant to his complaint. If we manage to elicit a comment on a particular condition, we find that just as often as not he has taken the side of the disease over the patient. At one point, a fellow doctor devoted weeks of study to discerning the meaning of one off-hand comment Dr. Shapiro had made about a T being where a C should be (we were lucky that in this case he favored us with an explanation in terms of molecules rather than quanta), thinking that it might hold a clue as to the particular mutation in liver cells that led to a patient’s hepatoblastoma, only to find that Dr. Shapiro’s remark was about the liver’s healthy cells – Dr. Shapiro seemed to see no reason that the patient’s whole liver should not be cancer. Apparently, cancer displays all of the hallmarks of what we call “living”, with the added bonus of not disturbing his trances with irrelevant questions.

I wonder if it would not be a terrible burden on his gift of perception to favor the intuitions of others as to the relative value of different arrangements of matter, seeing that he accepts the incomparably more arbitrary construct that is “employment” as a “doctor” and the remuneration it brings.

Do the timing and number of edits on a draft predict improvement?

Since I started using Google Classroom for writing classes a few years back, I’ve noticed a pattern in the emails Google sends you whenever a student clears a comment you left. A few times, I’ve been able to tell when a student was still working on a paper past the deadline or if they got enough sleep the night before (emails at 3:20 AM are a bad sign). Most often though, you just find that a lot of students are making edits the morning that a paper is due, as your first email check of the morning features 30+ emails all saying “Bob resolved a comment in Final Essay”.

There exists a tool called Draftback (introduced to me, as with many edtech tools, by Brent Warner), a browser extension for Chrome, that lets you replay the history, letter by letter, of any Google Doc that you have edit access on. Its most obvious utility is as a tool for detecting academic dishonesty that plagiarism checkers like Turnitin miss (like copy/pasted translations, which show up in the editing history as whole sentences or paragraphs appearing all at once as opposed to letter by letter). It also has the benefit of showing you the exact times that edits were made in a document, which you can use to track how quickly students started responding to feedback, how many revisions they made (grouped helpfully into sessions of edits made less than 5 minutes apart), and whether these revisions were all made in the 10 minutes the student said he was just running to the library to print it. Draftback is the kind of tool that you hope not to need most of the time, but is hard to imagine life without when you need it.

This video gives a good introduction to Draftback.

With the pattern in my email inbox fresh in my mind (a term just having ended here), I thought I’d use Draftback to see whether this flurry of last-minute editing had some bearing on grades. To be specific, I used Draftback to help me answer these questions:

  • Do numbers of edits correlate with scores on final drafts (FD) on papers?
  • Does the timing of edits correlate with FD scores?
  • Do either of these correlate with any other numbers of interest?

This required quite a bit of work. First, I copied and pasted rough draft (RD) and FD scores for each one of my students’ essays for the past 3 terms, totalling 6 essays, into a big Google Sheet, adding one more column for change in grade from the RD to the FD, computed relative to the RD score (for example, 56% on the RD and 92.5% on the FD yields a change of (92.5 − 56) / 56 = 65.18%). Then, I generated a replay of the history of each essay separately. Because each essay is typed into the same Google Doc, this gives me the entire history of the essay, from outline to final product. After each replay was generated (they take a few minutes each), I hit the “document graphs and statistics” button in the top right to see times and numbers of edits in easier-to-read form. I manually added up and typed the timing and number of the edits into the Google Sheet above. Last, I thought of some values culled from that data I might like to see correlated with other values. Extra last, I performed a few t-tests to see if the patterns I was seeing were meaningful.
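That change-in-grade column can be sketched in a few lines of Python (the function name is mine, not anything in the spreadsheet):

```python
def relative_change(rd, fd):
    """Change in grade from rough draft (RD) to final draft (FD),
    expressed as a percentage of the RD score."""
    return (fd - rd) / rd * 100

# The example from the text: 56% on the RD, 92.5% on the FD
print(round(relative_change(56, 92.5), 2))  # 65.18
```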

(The luxury of a paragraph about how annoying the data was to compile is part of the reason I put these on my blog instead of writing them up for journals.)

Example “document graphs and statistics” page. From this, I would have copied 1468 edits for the due date (assuming the due date was Monday the 30th), 79 edits 4 days before the due date, and 1911 edits for 5 days before the due date, with 0 edits for every other day.

The values that I thought might say something interesting were:

  • % of edits (out of all edits) that occurred on a class day
    • I’m curious whether students who edit on days when they don’t actually see my face do better – i.e., if students who edit on the weekends write better. Eliminating class days also helpfully eliminates lab days, the two class days a week when all students are basically forced to make edits. Incidentally, our classes meet Mon-Thu and final drafts are always due on the first day of the week. The average across all the essays was 63%, with a standard deviation of 38%.
  • % of edits that occurred on the due date
    • Specifically, before 1 PM – all my final drafts are due at the beginning of class, and all my classes have started at 1 PM this year. My assumption is that a high % of edits on the due date is a sign of poor work habits. The average was 21% with a standard deviation of 31%.
  • total # of edits
    • One would hope that the essay gets better with each edit. This number ranged from near 0 to more than 6000, with both an average and standard deviation of about 1700. Obviously, if you calculate this number yourself, it will depend on the length of the essay – mine were all between 3 and 5 pages.
  • maximum # of edits per day
    • I’m interested in whether a high number of edits per day predicts final grades more than a high number of edits total. That is, I want to know if cram-editing benefits more than slow-and-steady editing. The average and standard deviation for this were both about 1200.
  • # of days with at least 1 edit
    • Same as the above – I want to know if students who edit more often do better than ones who edit in marathon sessions on 1 or 2 days. The average was 3.25 days with a standard deviation of about 1 day.
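All five of these values can be computed mechanically from the per-day edit counts on Draftback’s “document graphs and statistics” page. Here is a minimal sketch using the counts from the example figure above; the specific dates and the Mon–Thu class-day rule are my assumptions for illustration, and the sketch ignores the before-1-PM cutoff on the due date:

```python
from datetime import date

# Hypothetical per-day edit counts copied from Draftback
# (date -> number of edits); all other days had 0 edits.
edits = {
    date(2019, 12, 25): 1911,  # 5 days before the due date (a Wednesday)
    date(2019, 12, 26): 79,    # 4 days before (a Thursday)
    date(2019, 12, 30): 1468,  # the due date (a Monday)
}
due_date = date(2019, 12, 30)

def is_class_day(d):
    return d.weekday() < 4  # classes meet Mon-Thu; Monday is weekday 0

total = sum(edits.values())
stats = {
    "% edits on a class day": 100 * sum(n for d, n in edits.items() if is_class_day(d)) / total,
    "% edits on the due date": 100 * edits.get(due_date, 0) / total,
    "total edits": total,
    "max edits per day": max(edits.values()),
    "days with >= 1 edit": sum(1 for n in edits.values() if n > 0),
}
```

For this particular essay, every editing day happens to be a class day, so the class-day percentage comes out to 100%; the due-date percentage is 1468 / 3458 ≈ 42%.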

All of the above were computed from the due date of the last RD to the due date of the FD, up to a maximum of 1 week (my classes last for 6 weeks, and there is very little time between drafts – read more about the writing process in my classes here). When I was done, after several hours of just copying numbers and then making giant correlation tables, I had hints of what to look into more deeply:

2 essays from each student, each taken separately.

As you can see in cells C9-H14 (or duplicated in I3-N8), students didn’t necessarily use the same revision strategies from essay to essay. A student who had a ton of edits on one day for essay 1 might have fewer edits spread out over more days for essay 2, as evidenced by the not-terribly-strong correlations in the statistics between essay 1 and essay 2. To take one example, “days with > 0 edits” on essay 1 was correlated with “days with > 0 edits” on essay 2 at just 0.21 (cell M7). Some of these differences were still statistically significant at p=0.05 (a good enough p for a blog, imo):

  • Students who did > 2000 total edits on essay 1 had an average of 3428 total edits on essay 2. Students who did <= 2000 total edits on essay 1 had an average of 1650 total edits on essay 2.
  • Students who did > 50% of their edits for essay 1 on the due date did an average of 45% of their edits for essay 2 on the due date. Students who did <= 50% of edits on essay 1 on the due date did an average of 17% of their edits for essay 2 on the due date.
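The correlations in these tables are presumably ordinary Pearson coefficients (what Google Sheets’ CORREL function computes). For the curious, the computation can be sketched as follows; the two input lists are made-up stand-ins for a pair of columns like “days with > 0 edits” on essay 1 and essay 2:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up stand-ins for two columns of the correlation table
essay1_days = [2, 4, 3, 5, 1, 4]
essay2_days = [3, 3, 5, 4, 2, 2]
r = pearson(essay1_days, essay2_days)  # a weak positive correlation
```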

Anyway, because it seemed prudent to consider the strategies used on each essay rather than the strategies used by each student, I made a second spreadsheet where the individual essays rather than the students (who each wrote 2 essays) are the subject of comparison, resulting in this much-easier-to-read correlations table:

Here I treat each essay as a unique data point rather than 2 products of the same student.

Columns I and J (or rows 9 and 10) are probably the most interesting to other writing teachers: those hold the correlations between statistics derived from Draftback data and I) final draft scores and J) change in score between the rough draft and final draft. In plain English, the correlations here suggest:

  • As expected, % of edits on class days and % of edits on the due date are negatively correlated with the final grade for the essay. That is, people who did a lot of their edits in class or right before turning in the essay seemed to do worse (but not by much – neither produces statistically significant differences in FD grades or in improvement between RD and FD).
  • Total # of edits and max edits per day are both positively correlated with final grades (and with each other). Editing more tends to produce better essays.
  • Everything that is true for the final scores is also true for the change in scores between RD and FD. The fact that RD scores were even more negatively correlated with % of edits on class days and % of edits on the due date than FD scores were means that the changes appear positively correlated, but I take it to mean that those strategies are associated with an improvement from very bad RD scores to mildly bad FD scores.

To give a bit more detail, these were some statistically significant differences (p=0.05):

  • Students who did > 2000 total edits had an average grade of 86.8% on the FD. Students who did <= 2000 total edits had an average grade of 78.7% on the FD.
  • Students who did > 3000 total edits had an average grade improvement of 17.8% between the two drafts. Students who did <= 3000 total edits had an average grade improvement of 4.9%.
  • Students who did edits on > 3 days had an average grade of 84.8% on the FD. Students who did edits on <= 3 days had an average grade of 78.9%.
  • Students who did edits on > 5 days (that is, almost every day) had an average grade improvement of 33.6% between the two drafts. Students who did edits on <= 5 days had an average grade improvement of 5.8%.
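Each comparison above has the same shape: split the essays on a threshold for one statistic, then t-test the outcomes of the two groups. A minimal sketch of that procedure, using Welch’s t statistic (the post doesn’t say which t-test variant was used, and the data below is invented, not the real grades):

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples
    with possibly unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def split_on(threshold, pairs):
    """Split (predictor, outcome) pairs into outcomes where the
    predictor is above vs. at-or-below the threshold."""
    above = [y for x, y in pairs if x > threshold]
    below = [y for x, y in pairs if x <= threshold]
    return above, below

# Invented (total edits, FD score) pairs
essays = [(2500, 90), (3100, 85), (2200, 86), (1500, 80), (900, 78), (1800, 79)]
above, below = split_on(2000, essays)
t = welch_t(above, below)
```

In practice you would then look up (or compute) the p-value for the resulting t; for the invented data here, the > 2000-edits group averages 87% and the rest 79%.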

The data suggests a fairly uncontroversial model of a good writing student – one who edits often, both in terms of sheer numbers of changes and in terms of frequency of editing sessions. In fact, “model student” rather than “model essay” may be what the data is really pointing at – the amount and timing of the work that went into a particular essay seems sometimes to show more about the student’s other work than it does about the quality of that essay.

For example, it’s not clear why data derived from the time period between RD and FD would be correlated with RD scores (in fact, you would expect some of the correlations to be negative, as high RD scores might tell a student that there is less need for editing). But perhaps the fact that the same data points that are correlated with FD scores are correlated in the same ways with RD scores and final course grades indicates that the data shows something durable about the students who display them (my caveat earlier notwithstanding). It is plausible that the poor work habits evidenced by editing a major paper a few hours before turning it in might affect students’ other grades more than that paper itself.

In fact, this seems to be the major lesson of this little research project. One t-test on % edits on due date was statistically significant – one that compared students’ final course grades. To be precise, students who did > 20% of their total edits on the due date had average course grades of 84.5%. Those who did <= 20% of their total edits on the due date had average course grades of 88.8%.

Just to pursue a hint where it appeared, I went back into my stat sheets for each class for the last year and copied the # of assignments with grade 0 (found on the “other stats” sheet) for each student into my big Google Sheet. Indeed, there was a statistically significant difference: students who made > 20% of their edits on the day an essay was due got a score of 0 on 5% of assignments across the term, while students who made <= 20% of their edits on the day an essay was due got a score of 0 on 3.2% of assignments across the term.

Like many characteristics of “good students”, from growth mindset to integrative motivation, whether a pattern of behavior correlates with success and whether it is teachable are two almost unrelated questions. It doesn’t necessarily follow from this research that I should require evidence of editing every day or that I should move due dates forward or back. It does suggest that successful students are successful in many ways, and that editing essays often is one of those ways.

I might just want to tell my students that I really love the Google Docs “cleared comment” emails that I get on Monday morning and I wish I got them all weekend, too.

The Academic Support Catch-22

I’ve noticed a pattern among formerly-known-as-remedial “academic support” classes that may work against their intended purpose.

The pattern is a result of the assumption that the subtext of planning and preparation in most assignments in college needs to be made text. That is, the assumptions of what needs to happen for a college student to be successful need to be made explicit and accounted for. For example, here is a depiction of creative writing that I think gives a pretty accurate picture of the work that has to be done vs. what ends up on the page:

writing iceberg

Academic support often seems to work by bringing all of those hidden parts of the writing process out into the open and making them graded assignments themselves. An assignment that in another class might look like this:

Write a research paper on a topic covered in this class. (100 pts)

might turn into a weeks-long writing unit like this:

  • Brainstorming discussion notes (classwork)
  • Research goal discussion: 5 pts
  • Mind map: 2 pts
  • Library scavenger hunt (classwork)
  • Works Cited and Plagiarism worksheet: 5 pts
  • Outline w/ annotated Works Cited page: 10 pts
  • Outline pair feedback (classwork)
  • Introduction in-class writing (not graded)
  • Rough draft 1: 10 pts
  • RD1 peer feedback (classwork)
  • RD1 tutoring visit reflection discussion: 5 pts
  • RD2: 20 pts
  • RD2 professor feedback reflection Flipgrid: 5 pts
  • RD2 office hours appointment: 2 pts
  • FD: 70 pts
  • FD writing process reflection discussion: 5 pts
  • Optional FD re-submission for makeup points
  • Optional FD re-submission for makeup points reflection
  • Optional FD re-submission for makeup points reflection2

Ok, the last two are jokes, but otherwise this writing process, where every step is explained, given its own rubric, shared, and reflected upon, is quite normal for a writing class that is coded “for English learners” or “academic support”, or that just has a professor trying a more workshoppy approach.

This can be invaluable unless it sets too strong a precedent in students’ minds for explicit requirements of the writing process. Some students, particularly in ESL, may have no idea at all what the writing process is supposed to entail or how to use resources like libraries, tutoring, etc. It’s better that at least one class during a college student’s first year puts this all on the record, but it might be counterproductive if too many do. It shouldn’t be lost on us that each step made explicit in the “academic support” writing process makes it resemble a typical college writing assignment less and less. If students expect these steps always to be explicitly outlined, they may neglect them or delay them on assignments where they are not.

The contrast between two types of assignments in my classes crystallizes these concerns for me. The first type resembles the detailed, all-steps-accounted-for workflow above. I have 2 papers in a term whose writing processes basically fill all of the 2 or 3 weeks ahead of their final due dates with discussions, peer review, presentations, and pre-writing. The second type is an “all-term” assignment given the first week of class and due the last week, usually worth a significant amount of points but doable in a few hours with the right preparation. Examples of this type of assignment are “go to an on-campus event and take detailed notes” or “email a professor in the department you plan to major in and ask 3 questions”. Students tend to do the first type of assignment with the appropriate level of dedication, preparing them well for the big essays that come at the end of the two- or three-week unit. At the same time, they tend to leave the second type of assignment until the weekend before the last week of class, days before it is due, and often run into problems like not having campus events to go to on Presidents’ Day weekend (this post is a topical one). This tells me that, in my classes at least, the precedent of having all the “underwater part of the iceberg” work outlined in detail for some assignments results in the underwater part being ignored for others.

Another factor may be that, for the first type of assignment, students are all doing the same thing at the same time and know that avoiding embarrassment during a week’s worth of discussions and presentations depends on their doing their work. For the second, on the other hand, students may all go to different events, email different professors, etc. all at different times and never have to show their work to their classmates. Again though, it is not unusual for major assignments in other classes to be solitary affairs. The many reasons that students seem to neglect solitary assignments with implicit requirements on time and preparation only highlight the problems that that neglect causes.

I don’t really have a solution for the skewing of expectations that academic support seems to produce – I just verbally warn students that most of the steps in our writing process will need to be taken of their own volition in their History, Psychology or Accounting classes. Maybe I need to give points for reflecting on that warning.