Testing some skills would seem more straightforward than testing others. If you want to see whether people can ride a bike, put them on bikes and see if they can get from point A to point B with a reasonably low rate of broken bones and concussions. If you want to see whether people know who the Axis and the Allies were in World War 2, ask them to name them. If you want to know if someone can speak a language, have them speak it in the presence of testers or record them speaking it for later evaluation.
Well, most of my readers will know that the last one was thrown in as a tripwire, because no language teacher believes testing speaking is that easy. First, it is quite difficult to recreate for spoken language the equivalent of an obstacle course to ride one’s bike over as a test – most people are rather choosy about whom they engage in minutes-long conversations with, for one, and the preconditions for the interaction generally aren’t “provide evidence for strangers that you can put words in the right order”. Also, there are a number of smaller skills involved in bike riding which can be directly or indirectly observed by putting someone on a bike, but what if one of those skills were intuiting the intentions of other bike riders based on combinations of thousands of hand signals and bells of subtly varying frequencies? And of course, the test needs to be completable in a few minutes and, for the sake of fairness, the same for every participant.
For the sake of argument, imagine what a perfect testing machine would look like. Ideally, it’d be able to cut through all the situational variables that can affect test performance and simply tell whether a given concept or skill is instantiated in a reasonably target-like way in any mind it tests (I’m eliding the huge question of what “target-like” knowledge would look like). What I picture is something like a read-only Matrix brain socket, capable of checking the end result of learning (something of a neuroscience miracle, given that instantiation is probably vastly different in different brains, and more complicated the more complete the learning). Now add back in every barrier that stands between this mindreading test machine and the conventional tests that exist now. Besides the obvious one of requiring the test-taker to actively retrieve information, there are all the non-subject-related but highly influential factors like sleep, anxiety, allergies, handwriting, and the other people around you taking the test, making noise or maybe just intimidating you by looking smart. Add in the fact that many tests are done by reading and writing, which push all that knowledge through the bottleneck of a technology that is common but unintuitive for our species. Many language tests, even popular ones that purport to be about “international communication”, are administered in crowded lecture halls by means of a cheap Casio CD player to rows of students looking downward at a sheet of paper. A perfectly accurate test is to the Matrix what TOEIC is to a Dungeons & Dragons manual.
Language in particular is a skill that test-writers can only access by digging downward through many layers of shifting, misdirecting cognitive sediment. Through the points of entry provided by our eyes and ears and those of our test-takers, we need to see whether a representation of a complex system of words and rules in one brain is similar enough to a representation of the same system in other brains to meet the standards of progress expected for a semester’s work. This would be difficult enough if our speech always gave an accurate assessment of what our thoughts were, but our mouths are but the very exit of the funnel into which a whole lot of neuronal activity is poured, and often spilled.
When you think about it, these issues never completely go away for any indirect measure of skills, knowledge, or attitudes. Still, a multiple-choice history test isn’t as vulnerable to a frequent bugaboo of language tests: that the suite of skills you’ve developed in communicating in another language just happens not to include one of the words on the card you’ve been handed and ordered to talk about. I believe, though, that many history teachers skip consideration of these issues in favor of enjoining students to be prepared. Language teachers have no such luxury; to be even barely competent at another language is to have applied knowledge (and/or implicit knowledge) in a variety of domains and the ability to improvise with it. It’s as if every history test were a debate where the topic is expected to migrate randomly from 2016 to post-WW1 Catalonia.
Since I went way overboard with the last of these entries, why not another digression? I also happen to think that a lot of the test-based sorting that goes on between the ages of 12 and 18, ostensibly on something called “academic ability” (generally understood to be a biologically-based capacity for computation and memory), is really sorting for being able to be interested in what adults want you to be interested in for those years. Like Ralph Nader used to say, Americans know plenty of things, even fairly dense statistics; it’s just that those statistics are generally slugging percentages rather than p-values. It’s not that smart people know probate law and stupid people know when it’s fixin’ to rain; it’s that “smart” people almost instinctively align the things they know with things that earn social capital among other “smart” people. Being able to do this during one’s teenage years is a talent, but we shouldn’t mistake it for simply having more brain power. People’s talents are not always apparent when education systems say they should be, and in any case tests measure a hundred other things (when I was young, being Scantron-friendly was a big one) before how intelligent someone is comes up.