What the F: What Swearing Reveals About Our Language, Our Brains, and Ourselves - Benjamin K. Bergen (2016)

Chapter 5. The Day the Pope Dropped the C-Bomb

By any account, Pope Francis has made interesting choices. He has forgone the traditional, opulent Papal Apartments, electing to reside in a small, modest bedroom in a Vatican guesthouse instead. He wears a silver ring instead of the traditional gold. And he has made a practice of washing feet each Easter—not the feet of priests as his predecessors did but those of patients at a home for the elderly and disabled, non-Catholics, and women. In aggregate, these many small acts of modesty have helped him build up a public image as the pope of the vulgar people.

Still, no one expected him to be quite this vulgar. On March 2, 2014, while delivering his weekly Vatican address, he slipped in a word that caught the world by surprise. He was speaking in Italian, and this is what he said: in questo cazzo. This translates literally as “in this dick,” but since the offending word cazzo is used in Italian roughly as fuck or fucking are in English, in colloquial terms, he said something roughly equivalent to “in this fucking …” I’m no papal scholar, but I’m willing to go out on a limb and proffer that this is an uncommon turn of phrase for a pope, even one fresh off an appearance on the cover of Rolling Stone. The media ran with it—the story was featured on the Huffington Post,1 NPR,2 and the Daily Mail,3 just to name a few.

This particular incident is so surprising and juicy in part because it runs afoul of how we expect the pope to express himself. Uttering a profane word like cazzo places him in an ideological double bind. If the curse word was accidental, then he’s just as linguistically fallible as the next guy, which isn’t necessarily the ideal public image for the professed terrestrial representative of God. Conversely, he might still be infallible, yet have intended to say cazzo. Again, likely not the image he means to project. All signs point to the former explanation—that this was a case of mistaken articulation. The clearest evidence is what he said next. Immediately after in questo cazzo, he corrected his phrasing to in questo caso, meaning “in this case,” which seems more like something you’d hear from the Holy See. With a slip of the tongue, the pope revealed one more way that he’s like the people. Or did he?

Everyone’s tongue slips, including the tongues of people who are not the pope. Researchers who try to quantify this sort of thing report that people generate speech errors at an average rate of one or two errors every thousand words, or one error per ten minutes of speech.4 But not all of these errors are equal. Some, especially errors that produce profanity, are particularly revealing. These are a big deal for the science of speech production—how people plan speech, select words to say, and articulate sounds. People produce profane and innocuous errors at different rates, which turns out to be one of the best ways to understand why we flub our words, how we’re able to avoid errors, and how the brain manages all this. And as we’ll see, in flubbing caso, the pope might have shown more of his hand than he intended.

# $ % !

Producing language is one of the most complicated things you do, and yet you hardly ever notice you’re doing it. When speaking fluently, you talk at a rate of about 120 to 180 words per minute.5 That’s a lot to churn out—two to three words per second. And despite the fact that words are separated by spaces in writing, you don’t typically pause after each word (unless you’re William Shatner).a This is in part because people in many speech communities feel pressure to keep talking once they’ve started. If, as a speaker, you don’t manage to make sound come out of your mouth more or less continuously, someone else might believe that you have no more to say and jump at the opportunity to take the floor. Or they might become concerned that something is wrong and check to make sure you’re not asphyxiating on your food or drifting to sleep. So you keep talking. And this taxes the system. In order to produce connected speech, you have to decide what the next word will be and start planning to say it before you’re even done with the current word. You have to look ahead. This, in part, explains why speech is populated with ums and likes and other filler words that allow you to keep making sound, even if you don’t know exactly what you want to say. And it also generates speech errors.

a. And if you, dear reader, actually are William Shatner, I take it all back! I’m a huge fan! I especially loved your Esperanto work in the cult-classic 1965 film Incubus! I have so many questions! Call me!

Among the different kinds of speech errors you make—swapping, dropping, and even adding sounds—certain errors most clearly stem from the preplanning of words that are a little farther down the assembly line. A simple but common type of error happens when a person anticipates a later sound and accidentally pronounces it too early. For instance, you might intend to say shark pit but instead accidentally mispronounce it as park pit. An error like this could only occur if the speaker were already planning later words (in this case, pit) while still articulating the current one (shark), because the p from the second word ends up at the beginning of the first word. It’s also common to accidentally swap two sounds; you might know this as a spoonerism, but psycholinguists call it an “exchange error.” A typical example would be intending to say shark pit but accidentally producing park shit. This again would arise because you were planning the next word while articulating the current one.
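The two error types just described can be made concrete with a small sketch. It is only an illustration: it works on spelling rather than sound, and the helper names (`split_onset`, `anticipation`, `exchange`) are invented here, not taken from the psycholinguistic literature.

```python
VOWELS = set("aeiou")

def split_onset(word):
    """Split a word into its leading consonant letters (the onset) and the rest.

    This is a spelling-based approximation; real speech errors operate on sounds.
    """
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i], word[i:]
    return word, ""

def anticipation(first, second):
    """The second word's onset arrives too early and replaces the first word's onset."""
    _, rest1 = split_onset(first)
    onset2, _ = split_onset(second)
    return f"{onset2}{rest1} {second}"

def exchange(first, second):
    """The two onsets trade places (a spoonerism)."""
    onset1, rest1 = split_onset(first)
    onset2, rest2 = split_onset(second)
    return f"{onset2}{rest1} {onset1}{rest2}"

print(anticipation("shark", "pit"))  # park pit
print(exchange("shark", "pit"))      # park shit
```

Either function needs the onset of the second word before the first word is finished, which is the point of the paragraph above: neither slip can happen unless the next word is already being planned.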

Because planning ahead causes exchange and anticipatory errors, we can actually harness them to reveal precisely how far ahead speakers plan. Some exchanges and anticipations are much more distant than just the adjacent word. We know this because linguist Victoria Fromkin of the University of California, Los Angeles, spent a large part of her career compiling a massive database of actual, observed speech errors. Among these errors were cases like a Tanadian from Toronto (instead of Canadian) and Baris is the most beautiful city (instead of Paris). These cases demonstrate how distant words can be and still exert influence on each other. Canadian and Toronto are two words and five syllables apart, while Paris and beautiful are four words and five syllables apart.6 The distribution of speech errors suggests that when you make errors, you’re already planning one to five words ahead of the word you’re currently articulating.7 And you’re likely planning ahead even when you don’t make errors.

So is it possible the pope’s error was due to a mere planning inefficiency? Any single instance of a speech error could have many causes, and the only way we can tell that planning plays a role is in aggregate across many errors where the patterns reveal themselves statistically. But if we closely examine exactly what the pope said and what he was planning to say later, we can determine at the very least whether planning ahead is a plausible explanation for his caso to cazzo slip. So here’s the full text of the sentence containing the error, as released by the Vatican in the official transcript of his prepared remarks. WARNING: The following may contain Italian.

Se ognuno di noi non accumula ricchezze soltanto per sé ma le mette al servizio degli altri, in questo caso la Provvidenza di Dio si rende visibile in questo gesto di solidarietà.8

In this critical sentence, the pope is making an appeal for charity, as you can tell from the English translation: “If each one of us does not amass riches only for oneself, but half for the service of others, in this case the providence of God will become visible through this gesture of solidarity.”

Let’s look at the words that followed caso for possible anticipation candidates. There’s la, the feminine version of the definite article “the,” and then Provvidenza (“providence”). Notice that Provvidenza has a z right before the last vowel, just as caso has an s right before its last vowel. Could Francis have been planning the long, complicated word Provvidenza, while still articulating caso, such that he replaced the s of caso with an anticipated z? It’s possible. The two sounds are quite similar,b and they’re in the same place only two words apart, in words that are also of the same part of speech (which can increase the likelihood of errors).9 But this close reading of the text doesn’t lead to anything conclusive. At best, it tells us we should hold on to preplanning as a reasonable suspect.

b. But they’re probably not the sounds you’re thinking of. In Italian, the letter s in caso is pronounced how an English speaker would pronounce z, and the Italian z in Provvidenza is pronounced like an English ts. The double zz of cazzo is pronounced as an elongated ts. But this doesn’t substantively change the argument I’m making; it just makes everything more complicated if you don’t speak Italian, which makes it seem like the type of thing a considerate author would quarantine in a footnote.

There are, of course, other potential causes of the pope’s error. The popular conception of speech errors holds that they’re due more to meaning than mechanics. Sigmund Freud famously argued that when you make a speech error, it can reveal the inner workings of your unconscious mind.10 He wrote, “Almost invariably I discover a disturbing influence from something outside of the intended speech … a single unconscious thought, which comes to light through the special blunder.”

Certainly Freudian slips abound. When Tiger Woods had to withdraw from a tournament with a sore neck in 2010, only a few months after a dozen women had accused him of infidelity, a reporter on the Golf Channel announced that Woods was suffering from a bulging dick.11 Freud might interpret this error as deriving from something the speaker was actually thinking about, which managed to sneak its way into her words. Another well-known slip is credited to Condoleezza Rice, then national security advisor under President George W. Bush. Rice was famously committed to her job (as of the time of this writing, she has never married or had children), so it caught the White House press corps off guard when, at a Washington, DC, dinner party, she was reportedly heard to say, “As I was telling my husb- … As I was telling President Bush.”12 The word husband doesn’t sound much at all like President Bush, so the best explanation is that rather than sound similarity between the words, some aspect of meaning—what she intended to say or what she was thinking about—drove this error. But we will have to leave determining the precise semantic motivation in this case to those more familiar with the principles of psychoanalysis.c

c. This reminds me of a joke about Freudian slips: A patient tells his doctor, “Doctor, last night I made a Freudian slip. I was having dinner with my mother, and I wanted to say, ‘Could you please pass the butter.’ But instead I said, ‘You manipulative bitch, you completely ruined my life.’”

Could there have been similar Freudian motivation for the pope’s articulatory malfunction? Maybe. Seventy-five years after Freud’s death, the principles of Freudian psychoanalysis have largely fallen out of favor, at least among most researchers interested in cognition and behavior. But remnants of his theories persist, particularly when they aid in predicting behavior. And taboo language shines in speech error experiments based on Freudian premises.

Suppose, as Freud would have it, that whether you’re the national security advisor, the pope, or just you, things you’re thinking about but not intending to say out loud influence the speech errors you make. How could you tell? You might come up with an experimental paradigm like the one used in a set of studies conducted by psychologist Michael Motley and his colleagues in the late 1970s and early 1980s. They had people come into their lab and read pairs of words out loud, pairs like back mud, bad mouth, and so on. In using this paradigm, they found that people make errors at a certain base rate. And they wanted to know whether that rate would go up when people were thinking about something that they were actively trying not to talk about. So here’s the methodology they devised.13 It’s pretty clever.

Participants were all young, self-identified heterosexual men. Half were greeted by Motley himself, dressed professionally, and the other half were met by a provocatively dressed, young, female research assistant. As Motley describes it, “She was wearing a translucent, nearly transparent, off-the-shoulders top with a super-short yellow skirt.” Motley continues, “And we had her sitting on a stool where her knees were at eye-level with the guys.”14 It goes without saying what the young, heterosexual, male participants were expected to be thinking about.

In the experiment itself, participants had to read word pairs at a rate of one pair per second—a pace similar to normal speech. And as the critical component of the experiment, there were two types of word pairs. The first included pairs like mad bug. As you can see, making an anticipatory or exchange error in pronouncing this pair would produce something totally innocuous, like bad mug. But the second type would produce sex-related errors—let gaid makes get laid and share boulders gives you bare shoulders.

When they counted the number of exchange errors the participants made, Motley and colleagues found that the half of participants who had been sitting with the scantily dressed research assistant did indeed make more errors overall. But the increase was driven entirely by the sex-related word pairs: participants made more errors on let gaid but not on mad bug. They made the same number of innocuous errors regardless of who was sitting next to them.

We might never resolve whether this means, as Freud would explain it, that our cognitive unconscious is straining to have its say through daily speech errors or simply that when you’re thinking about sex, you’re more likely to say sex-related words. But it is clear that things you’re thinking can make their way to the surface, sometimes overcoming your will to suppress them. This tells us that selecting words to articulate is more complicated than merely picking the right words for the meanings you want to express. It involves selecting and suppressing thoughts as well—because under certain conditions, those thoughts bubble up in the form of speech errors.

As a mere fallible human—and possibly a heterosexual male one to boot—even the pope is not immune to this sort of Freudian influence. In inadvertently uttering cazzo (“dick”), could the pope have been talking about the virtues of charity but thinking about his vow of chastity? It’s possible.

We’ve established that the pope might have erred due to the sequence of sounds he was planning to utter, and we’ve entertained the Freudian possibility that things underneath the vestments were on his mind. But he was also speaking under challenging conditions. This wasn’t an oration he delivered alone in the shower. Public speaking can be distracting—feedback from the amplification system and things happening in the crowd draw your attention from the task at hand. And it can be stressful too. Both of these factors increase the rate of speech errors.15 I have occasion to observe this frequently in daily life. I live in San Diego, and despite the reputation America’s Finest City enjoys for outstanding news broadcasting,d the local announcers are a bit uneven. One particular afternoon host on the public radio station has a habit of stumbling over words, whether describing the upcoming segment on All Things Considered or announcing the names of companies providing local underwriting. And there have been some doozies, as I’m sure you can imagine, when the local businesses in question have names like Chism Brothers Painting and Bastyr University.e The pressures of public address must surely be as challenging for popes as for anyone.

d. For a historical perspective, see the following documentary: Apatow, J., and McKay, A. (2004).

Paradoxically, we hold people to a higher standard in exactly those conditions that are most likely to induce errors. Even knowing as I do that speech errors are an inevitable part of speech production, especially when a person is experiencing stress (as radio broadcasters probably do), I often find myself yelling at the radio, “Come on, step up your game! This is public radio! You don’t think I donated just for the tote bag, do you?” And then I remember that I didn’t donate this year, and I feel remorseful, but then I rationalize not donating with the thought that if the broadcasters didn’t bungle their delivery all the time, maybe they’d deserve my money. The human mind is a silly place.

What’s more, the pope was speaking in a foreign language, which makes fluent speech harder. You make errors in a foreign language simply because you don’t know the gender of a word or because you have a tenuous grasp of some detail of the language’s grammar. (Do indirect object pronouns come before or after the verb? And for that matter, what’s an indirect object pronoun again?) These lack-of-knowledge-based errors aside, your base rate of normal slips of the tongue also goes up—by 1400 percent.16 So perhaps we shouldn’t be surprised that delivering an address in Italian, instead of his native Argentinian Spanish, might ratchet up the frequency of a pope’s errors.

It seems reasonable that a fallible pope ought to be subject to the same pressures and linguistic traps as anyone else, and in this particular case they may have conspired to generate the linguistic C-bomb we witnessed. Indeed, with all these pressures at play, it’s surprising that what comes out of your mouth—or the pope’s—isn’t just a stream of mistakes. Strangely, for the most part, it isn’t. While we all make speech errors, the majority of words we produce really are precisely what we intend. And as we’ll see, profanity provides the most revealing clues to how we accomplish this.

# $ % !

Some psycholinguists have hypothesized that the only way you could possibly be as good at speaking as you are is by somehow monitoring your planned speech. You might, in this view, have an internal editor in your head that pays attention to the words you’re planning to say, the order you’re planning to say them in, and exactly how you’re planning to pronounce them. It’s like quality control at the end of the assembly line, right before the words get packed up and leave the factory. When your internal editor notices something about to go awry, it stops the conveyer belt and sends the offending word back for repair. Of course, some errors get through, so we know the editor can’t be perfect, but the idea is that perhaps internal self-correction keeps your errors down to the acceptable level they’re at.

How would we go about detecting such an internal editor? It’s tricky because an editor would leave little trace. If there is indeed an editor, and if it’s mostly successful, then there will be a few but not many errors to observe. Likewise, if there isn’t an editor, there will be a few errors. The problem is figuring out how many errors a person would have made without an editor if we don’t know whether there’s an editor involved in the first place.

Let me flesh this out with our factory analogy. Suppose you want to know if a factory that cans diet soda has a quality control department. You can start by observing errors—every once in a while, someone finds a cockroach in a sealed can of diet soda. Let’s say it’s once out of every hundred million cans opened. Now, were thousands of other roaches in cans of soda caught by a vigilant quality control department? Or is the fabrication process itself so hygienic that an interloping roach finds itself trapped in a saccharine sarcophagus only one time in one hundred million? How would you tease apart these alternatives? Seems like a dead end (and not just for the cockroach).

But here’s a possible way forward. To continue with the analogy, suppose you know that a cockroach finding its way into certain types of soda would be particularly devastating for the factory’s reputation. For instance, suppose that the same factory packages the very same liquid not just as a brand-name soda but also, labeled differently, as an in-house, generic supermarket label. The only difference is the cans. And let’s presume that the company has a greater incentive to ensure that the brand-name version is roach-free because it’s a bigger source of revenue, and a single photo of just one roach in the brand-name soda will break the Internet and gut the company’s bottom line. By contrast, the company might reason that people sort of half expect to find insects in in-house supermarket-label soda. Maybe that’s even why they buy it. Here’s the point. If there’s no quality control department, then you’d expect to find cockroaches in the brand-name and generic sodas with about the same frequency; they’re produced and canned in the same factory using the same process. But if you found far fewer canned roaches in the brand-name versus the generic soda, that would tell you that someone’s making sure that when it matters more, mistakes don’t make it out the door and onto the truck.

Several psycholinguists have used exactly the same logic with speech errors. To do this, you need to find certain speech errors that would have graver consequences than others. What’s the linguistic equivalent of a cockroach in a brand-name can of soda? Well, the speech errors with the direst results are probably those that generate profanity. So the question becomes, when you put people in a position to say the wrong words, do they make the same number of errors, regardless of whether the error would produce profanity? If so, then there’s no evidence of an internal editor. But if people make fewer profane mistakes than nonprofane mistakes, then that implies that people are internally suppressing the errors before they hit the tongue. They’re self-monitoring language.
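The comparison at the heart of this logic can be sketched in a few lines. Everything numeric below is invented purely for illustration and is not data from any study. The function name `editing_evidence` and the choice of a pooled two-proportion z statistic are likewise assumptions about how one might score the comparison, not the method the researchers used.

```python
import math

def editing_evidence(taboo_errors, taboo_trials, neutral_errors, neutral_trials):
    """Compare slip rates on taboo-potential pairs and neutral pairs.

    Returns both rates and a pooled two-proportion z statistic; a positive z
    means the neutral pairs drew more slips than the taboo-potential pairs.
    """
    taboo_rate = taboo_errors / taboo_trials
    neutral_rate = neutral_errors / neutral_trials
    pooled = (taboo_errors + neutral_errors) / (taboo_trials + neutral_trials)
    se = math.sqrt(pooled * (1 - pooled) * (1 / taboo_trials + 1 / neutral_trials))
    z = (neutral_rate - taboo_rate) / se if se > 0 else 0.0
    return {"taboo_rate": taboo_rate, "neutral_rate": neutral_rate, "z": z}

# Hypothetical counts, made up for this sketch only.
result = editing_evidence(taboo_errors=10, taboo_trials=200,
                          neutral_errors=30, neutral_trials=200)
print(result)
```

If there were no internal editor, the two rates should be about equal and z should hover near zero. A clearly lower taboo rate is the linguistic counterpart of finding fewer roaches in the brand-name cans.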

If you want to use this logic, you have to devise a way to induce speech errors in the lab. The first group to do this came up with a clever design.17 You’ll recall Michael Motley, the researcher with the provocatively clad research assistant in the Freudian slip study. He and his colleagues had people read carefully designed word pairs one at a time. Some of the pairs had the potential to become obscene if the participant made an exchange error. For example, tool kits seems totally innocuous, until you recognize how it would sound as a spoonerism. You can construct lots of potentially obscene lures like this: bunt call, hit shed, duck fate, heap chore, fast luck, and so on. The key question in Motley’s studies wasn’t whether you can get people to mistakenly mispronounce these. You can. Following the cockroach logic, the question was whether people would make fewer errors on profane-potential pairs like these than on pairs that do not threaten obscenity. So there was a second type of pair, like tool kicks. These words are identical in most every way to tool kits—they start with the same sounds, they’re the same length when pronounced, they’re the same parts of speech, and so on. The difference is that the errors you might make, producing things like cool ticks, aren’t in the least taboo. Would there be more errors on these inoffensive pairs than on the offensive ones?

One methodological note, in case you’re interested in trying this at home: Just reading pairs of words out loud doesn’t yield many errors. So you need to boost the error signal. One thing you can do (and the researchers did) is stack word pairs one after another leading up to the critical one in order to set people up for failure. For instance, if you want people to make an exchange error on tool kits—swapping the initial t and k—you’d stack pairs in front of it as below. Try reading these aloud:

kind tiger

calm time

cold tea

tool kits

Setting people up with the swapped consonants before critical pairs like tool kits or tin cable makes participants much more likely to produce errors. This provides more opportunities for mistakes, which makes potential differences in the frequency of taboo and nontaboo errors easier to measure.

So, all other things being equal, do people make fewer errors on pairs like tool kits, where the result would be offensive, than on ones like tin cable, where it wouldn’t be?

Two studies did this originally, in 1981 and 1982. The chart you see on the next page shows the number of errors people produced on average. There were many more neutral errors than taboo ones in both studies (though the difference was bigger in the second study). It follows that people in these studies were successfully avoiding errors specifically when the results would be obscene. Fewer roaches in the brand-name soda. People were self-monitoring.

But this is only the beginning of the evidence. If you’re clever enough, as the researchers working on this are, you can come up with some other things you’d expect to see if people were doing internal quality control. Here’s one. Suppose you’re on quality control at the factory and you find a can with an unwelcome passenger inside. Sending it back to be fixed or replaced should add time to the process. So even when the ultimate product shows no sign of error, the time it takes to produce it could be a hallmark of monitoring. Mapping this over to speech, if the lower error rates with taboo word pairs are due to editing, then that should show up on how long it takes people to produce the pairs correctly—it should take people longer to say pairs correctly when they’ve planned an error but subsequently taken the time to catch and correct it.

People make fewer speech errors when the result would be taboo

But people are constantly not making errors. We know that people make fewer errors producing taboo words, but that doesn’t mean that every time they successfully avoid an error they’re making a correction. Some proportion of the time, they’re probably just getting it right from the outset. We need a way to diagnose whether people are activating an internal plan to say something that they eventually don’t produce. This is nearly impossible to do.

Except with taboo words. That’s because even thinking about taboo words has special effects on the human body. When people say taboo words, their pores open up within seconds, and they sweat. And this is measurable.18 Sweat conducts electricity, and the more sweat there is on your skin, the more conductive the surface of the skin will be. So, you can pass a very low level of electrical current across people’s skin, say, on a finger, and measure how conductive the skin is. When people start sweating, the conductance increases. This is the basic logic behind what’s known as the galvanic skin response (GSR). You might be familiar with this tool because it’s one of the components of a traditional lie-detector test—skin conductance also changes as a function of anxiety, which may be driven by lying in some people.19 Critically, although many words make people sweat, including emotional words like murder and hate, the most profuse and most reliable sweating comes from hearing taboo words.20

Suppose you were able to measure skin conductance as people performed the word-pair reading task. If merely planning a taboo word—even one that ultimately gets internally corrected and is therefore never pronounced—makes people sweat, then this should show up as an increase in skin conductance.

So here’s the idea. You have people produce word pairs. Some, like tool kits, can induce profane errors, which, most of the time, people successfully avoid making. But looking just at the times when people didn’t make an error, you can split these correct pairs into two groups. The first includes those where the participants’ skin conductance spiked—suggesting that they had internally activated a taboo word, even though they didn’t make an error. And the other group includes instances where there was no error and also no spike—suggesting that no profane word was ever even considered. And you look to see if the sweaty trials also take longer. Together, an increase in skin conductance and a longer time to produce the word pair would provide compelling evidence—albeit still circumstantial—that people were internally planning, but ultimately taking the time to correct, taboo words.
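The analysis described here amounts to a filter followed by a split. The sketch below assumes a hypothetical `Trial` record with three fields. The field names and the sample latencies are made up for illustration and do not come from the study.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Trial:
    made_error: bool    # did the speaker actually produce a slip?
    gsr_spike: bool     # did skin conductance rise on this trial?
    latency_ms: float   # time from the prompt to the start of speech

def mean_latency_by_gsr(trials):
    """Keep only error-free trials, then average latency for spike and no-spike trials."""
    correct = [t for t in trials if not t.made_error]
    high = [t.latency_ms for t in correct if t.gsr_spike]
    low = [t.latency_ms for t in correct if not t.gsr_spike]
    return (mean(high) if high else None, mean(low) if low else None)

# Invented example trials, not real measurements.
trials = [
    Trial(made_error=False, gsr_spike=True, latency_ms=720.0),
    Trial(made_error=False, gsr_spike=True, latency_ms=680.0),
    Trial(made_error=False, gsr_spike=False, latency_ms=600.0),
    Trial(made_error=False, gsr_spike=False, latency_ms=620.0),
    Trial(made_error=True, gsr_spike=True, latency_ms=650.0),  # excluded: a slip occurred
]
high, low = mean_latency_by_gsr(trials)
print(high, low)
```

The editing account predicts that the first average (correct trials with a spike) comes out larger than the second, because a planned taboo slip was caught and repaired before speech began.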

And that’s what was found. The chart on the next page shows how long it took people to successfully avoid taboo errors when they had large and small GSRs.21 As you can see, when they were sweating (the high-GSR group), they also waited significantly longer to start talking (the left bar is taller).

Here’s another, lower-tech way to detect whether people are doing active editing before they speak. It has to do with the different types of errors people make. We’ve already talked about anticipatory errors (hit shed becomes shit shed) and exchange errors (hit shed becomes shit head). But people also make something known as a perseveratory error, which takes a sound pronounced earlier and repeats it later (hit shed becomes hit head). The taboo-error word pairs we’ve been talking about, like hit shed and heap chore, typically have the potential to produce an obscene word on either the first word or the second word but not both. As a result, for any given pair, an exchange error will always produce an obscene word (hit shed becomes shit head), and so will either an anticipatory error or a perseveratory error, but not both. In the case of hit shed, an anticipatory error produces the obscene shit shed, but a perseveratory error gives you hit head, which isn’t so bad. With pairs in which the taboo word would be in second position, like heap chore, only the perseveratory error (heap whore) and not the anticipatory one (cheap chore) would be obscene. So if people are editing, then when the taboo words would be first, people should avoid the taboo word by making more perseveratory errors and fewer anticipatory ones. And when the taboo word would be second, this should reverse: people should make more anticipatory than perseveratory errors. That’s what you see in the next chart.
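To see why exactly one of the two single-sound slips is safe for each pair, it helps to enumerate all three slips mechanically. The sketch below uses a tiny hand-written table of rough sound codes, which is an assumption made for this example and not a real pronouncing dictionary. The names `SOUNDS`, `slips`, and `predicted_safe_slip` are hypothetical.

```python
# Rough (onset, rime) sound codes written by hand for this illustration only.
SOUNDS = {
    "hit": ("h", "it"), "shed": ("sh", "ed"),
    "shit": ("sh", "it"), "head": ("h", "ed"),
    "heap": ("h", "eep"), "chore": ("ch", "or"),
    "cheap": ("ch", "eep"), "whore": ("h", "or"),
}
SPELLING = {sound: word for word, sound in SOUNDS.items()}
TABOO = {"shit", "whore"}

def slips(first, second):
    """Return the word pair each error type would produce."""
    (o1, r1), (o2, r2) = SOUNDS[first], SOUNDS[second]
    return {
        "anticipatory": (SPELLING[(o2, r1)], second),   # later onset comes early
        "perseveratory": (first, SPELLING[(o1, r2)]),   # earlier onset repeats
        "exchange": (SPELLING[(o2, r1)], SPELLING[(o1, r2)]),
    }

def taboo_slips(first, second):
    """Map each error type to whether its output contains a taboo word."""
    return {kind: any(w in TABOO for w in words)
            for kind, words in slips(first, second).items()}

def predicted_safe_slip(first, second):
    """The single-sound slip an internal editor should let through more often."""
    flags = taboo_slips(first, second)
    return next(k for k in ("anticipatory", "perseveratory") if not flags[k])

for pair in [("hit", "shed"), ("heap", "chore")]:
    print(pair, slips(*pair), predicted_safe_slip(*pair))
```

For hit shed the safe slip is the perseveratory one, and for heap chore it is the anticipatory one, which is the reversal the editing account predicts.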

People who successfully avoid making taboo errors take longer to speak when they’re sweating more (high GSR)

What’s more, you might notice that the numbers are far greater when the second word of the pair is taboo than when the first is—the light bar on the right towers over the others. Why would this be? Why would people make more anticipatory errors when the second word would be taboo? This too seems to implicate internal editing. If editing takes time, then presumably you’d be more likely to catch and stop an error when it’s on the second word of a pair than when it’s on the first. And if the errors you’re most vigilant about are taboo ones, then it follows that you’d be most likely to catch and avoid those errors when they’re planned for the second word.

When people do make errors, they avoid generating a taboo word

This work, all conducted in the late 1970s and early 1980s, was influential at the time because taboo language provided a privileged way to detect internal editing processes. More recently, researchers have been interested in tracking down the brain basis for internal quality control of language. And they’ve turned to the same foundational paradigm, with some major technological additions.

The first new twist has people perform basically the same word-pair reading task while tethered to an electroencephalogram (EEG) machine that measures their brain waves.22 In a nutshell, here’s how EEG works. Electrodes (lots of them: as many as 256 but more often 32 or 64) are applied harmlessly to the scalp. The electrodes measure fluctuations in the electrical field, and the specific electrodes used in EEG experiments are sensitive enough to measure microvolts, that is, millionths of a volt, on the order of one-millionth the voltage of your AA battery. A lot of things affect the electrical field measured by electrodes placed on the scalp, including passing airplanes, elevators, and even the muscles firing when a participant blinks. But it turns out that highly sensitive electrodes can detect something far more important to cognitive scientists: the activity of neurons. When a neuron fires, at the chemical level a bunch of ions flow into or out of it. And those ions carry electrical charge (those pluses and minuses attached to Ca²⁺ or Cl⁻). So when a nerve cell fires, the flow of ions affects the electrical field around it. And when thousands or millions of neurons oriented similarly and located close to each other fire at once, the electrical field change is strong enough for those sensitive scalp electrodes to measure.

Over many decades of research, neuroscientists have observed people performing hundreds of tasks while measuring their brain waves using EEG. And they’ve observed that certain types of behavior produce predictable changes in the measured electrical field. For instance, about four hundred milliseconds after you see a word, the electrical field centered over the top of your scalp deflects negatively. This is believed to index the process of interpreting the meaning of the word and integrating it into your ongoing understanding of the language you’re reading or hearing.23 Other components of the electrical signal relate to other specific behaviors and cognitive processes.

And so, when you wire people up to an EEG and have them read error-inducing word pairs, their brains produce different electrical signals, depending on what type of error they’re avoiding.24 A temptingly taboo pair like bunt call induces a stronger negative-going deflection over the center of the scalp about six hundred milliseconds after the prompt to speak, as compared with what happens when the same brain sees a neutral pair, like bunt hall. And this is true even when the person doesn’t actually commit a verbal error. This tells us that the brain is doing different things when successfully avoiding a taboo error versus a neutral one. It doesn’t reveal exactly what those different processes are (we only know for sure that neurons are firing differently), but it does tell us when that difference occurs: at six hundred milliseconds after the prompt to speak, people’s brain activity diverges, depending on the type of error that would be produced.

But we want to know not only whether something different is happening in people’s brains and when but also what is happening. And because the human brain exhibits localization of function (as we saw in the last chapter, circuits located in different places execute different computations), knowing the location of the brain differences revealed by EEG could help us figure out what they mean. Unfortunately, it’s notoriously challenging to extract locational information from EEG; changes to the electrical field measured at a particular electrode aren’t necessarily due to the activity of neurons located directly below that electrode in the nearest piece of tissue. (The issue is complex, but it turns out that the direction the neurons are pointing matters, among other complicating factors.)

The real issue is that this is a type of “inverse problem.” Here’s the basic idea. Even if you know what the output of a complex system is, tracking down its causes with certainty can turn out to be impossible. This is because although the output is determined by the system, the system is complex enough that many different system behaviors could produce the same output. For more, see Baillet, S. (2014).
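A toy version of the inverse problem fits in a few lines. In the sketch below, two sensors each pick up a weighted sum of three sources; the weights are invented for illustration and are not a real model of how scalp electrodes respond to neurons.

```python
# Toy forward model: each sensor reading is a weighted sum of source activity.
# The weights are made up for illustration; they are not a real EEG model.
WEIGHTS = [
    [1, 1, 0],  # sensor A picks up sources 1 and 2
    [0, 1, 1],  # sensor B picks up sources 2 and 3
]


def sensor_readings(sources):
    """Compute what each sensor would measure for a given source pattern."""
    return [sum(w * s for w, s in zip(row, sources)) for row in WEIGHTS]


outer_pair_active = [1, 0, 1]   # sources 1 and 3 fire
middle_only_active = [0, 1, 0]  # only source 2 fires

print(sensor_readings(outer_pair_active))   # [1, 1]
print(sensor_readings(middle_only_active))  # [1, 1]
```

Two quite different patterns of activity yield identical readings, so the readings alone cannot say which pattern occurred. With only a few sensors and far more possible sources, that ambiguity is the rule rather than the exception.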

But other techniques can tell us something about location. One is functional magnetic resonance imaging (fMRI), which measures fluctuations in the magnetic field from outside the body. When neurons fire, they use energy, and the more they fire, the more oxygenated blood flows to them, providing more energy (in the form of ATP, which you might remember from high school biology). Rushes of oxygenated blood can be measured by their magnetic signature, and so this can serve as a somewhat delayed and messy proxy for where neurons are firing in the brain. When you get more of this blood-flow signal to a particular region in the brain for one task than another, chances are the neurons in that region are doing more during the first task than during the second.

Applied to the same word-pair reading task we’ve been tracking, fMRI starts to fill in the picture of what the brain is up to while it’s editing planned words. When you compare the brain’s blood-flow signal during the taboo-eliciting pairs (tool kits) and neutral pairs (tin cable), you find that they’re significantly different in one place, shown on the next page.25

The little blob in the lower part of the frontal lobe of the right hemisphere is in a region called the right inferior frontal gyrus, which is implicated in inhibitory control—your ability to stop or prevent yourself from doing something. Suppose you’re waiting at a traffic light, for instance. It turns green, so you prepare to step on the gas, but then just as quickly, it turns red (perhaps because an emergency vehicle or train has overridden the usual light sequence). You need to quickly interrupt your plan to move forward. This appears to be the specialty of the right inferior frontal gyrus. It sends a hold-the-presses signal to stop action before it starts.26

Is it just me, or does that sound a lot like what an internal editor would be doing when confronting a taboo word that’s about to come out of your mouth?

# $ % !

The right inferior frontal gyrus (a brain region involved in inhibitory control) experiences increased blood flow when people avoid taboo errors as compared with nontaboo errors.

The evidence from taboo speech errors and what happens when you avoid them implicates an internal process of self-monitoring. You are constantly censoring your words even before you articulate them in order to avoid slipups like the pope’s. And it looks like, as far as the brain is concerned, suppressing an error—in particular an erroneous taboo word—is a lot like suppressing an action in response to an external stop signal.

But this is only the beginning. There are other ways to tell that people call in an all-systems-halt signal when they feel they’re about to inadvertently say something obscene. We know this because psychologists have been trying to trick people into profane slipups in a variety of ways for decades.

One way is a well-known phenomenon called the Stroop effect. Basically, if you have people look at words and say what color font they’re printed in, they do pretty well. Show them a word written in blue, and they can say it’s blue. Unless, that is, the word printed in blue ink happens to be the word red. Then it gets a lot harder—people are slower and make more mistakes. That’s the normal Stroop effect, and it’s interesting in its own right because it reveals that you can’t help but process the meanings of the words that you read, even when willing yourself to pay attention only to the color of the ink. You process meanings automatically and you’re tempted to produce them. To avoid errors, you slow down.

Strangely, taboo words induce a Stroop effect as well: they interfere with people’s ability to say what color a word is printed in. It’s hard to demonstrate this on the printed page when you only have black ink (what century is it again?), but here’s a quick-and-dirty approximation. We’ll replace color with typography. Your job is to go through the list of words below, in order, and say whether each one is printed in italics, bold, or underline. Do it as quickly and accurately as possible. OK, go.

If all worked according to plan, you should have noticed that this task was harder for some words than others. The words that denote a particular font style, like the words italics and bold, should have been tough when they didn’t match how they were printed. They should have taken longer, and you might even have made mistakes on them. That’s the normal Stroop effect. But the taboo words should have taken longer as well: this is the taboo Stroop effect. You can see data from the first taboo Stroop experiment conducted on the next page, as described in a 1995 paper.27 As you can see, the normal Stroop task causes a delay of about 150 milliseconds—compare the middle bar with the leftmost “incongruent” bar. The taboo Stroop (the middle bar versus the rightmost bar) is nearly as large.

The normal Stroop effect causes people to name colors about 150 milliseconds slower, as shown by the difference between the control and incongruent conditions. The taboo Stroop is of similar magnitude, as demonstrated by the difference between the control and taboo conditions.
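The size of a Stroop effect is just a difference between average response times. Here is a minimal sketch of that arithmetic. The reaction times are hypothetical numbers invented for illustration, chosen so the incongruent slowdown comes out at about 150 milliseconds; they are not data from the 1995 study, and the function name is likewise made up.

```python
from statistics import mean

# Hypothetical reaction times in milliseconds, invented for illustration only.
# They are NOT data from the 1995 study discussed in the text.
reaction_times_ms = {
    "control": [600, 620, 640],
    "incongruent": [750, 770, 790],
    "taboo": [730, 750, 770],
}


def interference(rts, condition, baseline="control"):
    """Mean slowdown of `condition` relative to `baseline`, in milliseconds."""
    return mean(rts[condition]) - mean(rts[baseline])


print(interference(reaction_times_ms, "incongruent"))
print(interference(reaction_times_ms, "taboo"))
```

The same subtraction, taboo minus control, is what the taboo Stroop effect refers to.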

What causes the taboo Stroop? Certainly part of the story has to be the same as for the standard Stroop: it’s hard to ignore the meaning of words that you focus on. Otherwise, you could selectively attend to the style of printing, and there would be no Stroop effect for us to talk about in the first place. Taboo words of course are particularly hard to ignore. But why do taboo words cause a delay in speaking, as compared with control words? Suppose, as we conjectured earlier, that when you perceive that you might mistakenly produce a taboo word, your internal monitor hits the brakes. This would lead to longer reaction times when the information that you’re supposed to ignore (but might erroneously produce) is taboo. So the taboo Stroop effect could be explained once again by internal self-monitoring.

To be fair, there are other possible explanations. Perhaps the most convincing one sidesteps the issue of production and inhibition entirely. We know that seeing a taboo word evokes an emotional response. As I mentioned earlier, when you measure skin conductance as people simply listen to words, even when they don’t need to speak at all, GSR is larger for taboo words than for neutral ones.28 So an alternative explanation of the taboo Stroop effect is simply that this emotional response sucks up mental resources that you’d otherwise need to name the color of the word.29 In effect, the emotional jolt you get from profanity overwhelms you to the point where other tasks you’re trying to perform concurrently get put on the back burner and therefore take longer.

There’s some corroborating evidence for this view. Experiencing strong emotions leads people to instantaneously encode a memory of what they’re experiencing when that emotion hits, generating a so-called flashbulb memory, like the birth of a child, seeing the space shuttle explode, and so on. So if the taboo Stroop stems from a strong emotional response to seeing a taboo word in the middle of a psychology experiment, then your brain should encode an image of the event—in this case the word—that’s stronger than memories typically encoded for less emotional experiences, like neutral words in the same experiment.

In fact, when you spring a pop quiz on people who have participated in one of these taboo Stroop experiments, they remember the taboo words they saw far better than they do neutral ones. Not only do they remember which words they saw, but they’re also better able to remember what color the taboo words were printed in and even where they appeared on the screen.30

The upshot is that the taboo Stroop effect could provide further evidence that people are self-monitoring so they don’t accidentally say the wrong word, but it’s also consistent with this alternative, emotion-driven explanation.

There’s another corroborating effect, similar to Stroop, which comes from a paradigm known as picture-word interference.31 The basic idea is that you have people name pictures of familiar artifacts and organisms: hammers, tigers, and so on. This isn’t hard. But the task gets slightly harder when you print words over the pictures. The words don’t really interfere with your ability to perceive the object, but they can make it harder to name it, depending on how the word and picture are related. The details are tricky, but, generally, if you show people a word that sounds like the name of the picture (for instance, if you print “dock” over a picture of a dog), then people name the picture faster; however, if the word is related in meaning (like “cat” over a dog), people take longer.

A caveat in case you’re a psycholinguist or plan to become one: the precise timing with which the picture and word appear on the screen affects the size of these effects.

In picture-word interference, a person names a picture as quickly as possible while attempting to ignore the written word. Taboo words interfere significantly with picture naming as compared with neutral words.

Just as there’s a taboo version of the Stroop effect, there’s a taboo version of this picture-word interference effect. If you print an unrelated, neutral word over a picture (above, left), people have no trouble naming the picture (it’s a dog). But put a taboo word there (above, right), and people slow down, by about forty milliseconds on average.32

Again, an internal editor could be performing quality control here. Alternatively, this generalized slowing down could stem from the emotional reaction people have to the printed taboo word. Taboo words are special in several ways, which means that there are different ways that they could have the same effects on people trying to produce speech.

# $ % !

Why did the pope blunder into profanity? For the same reasons the rest of us occasionally do. There’s time pressure on a speaker to pronounce the present word while planning the ones to follow. Layer on top of that the stress of speaking in public, the challenge of negotiating a foreign language, and the Freudian attraction of thoughts you think you ought to suppress, and the pope’s ability to say anything fluently is a small miracle (but not in the technical sense that would qualify him for canonization). Mistakes like the pope’s reveal the pressures at work in every moment of language use. They underline what a remarkable feat we accomplish in navigating the gauntlet of potential gaffes at every turn of phrase.

And although we tend to notice speech errors when they generate taboo words—those errors really grind our ears—these are far less frequent than innocuous ones. We saw why. A person’s internal monitor isn’t as strongly compelled to censor unintended words when they’re inoffensive. And so the particular error that the pope made (replacing caso with cazzo) is actually a bit surprising, not just because it reveals him to be fallible but because you’d think his monitor would kick in when it detected a potential profanity.

So why did the pope’s self-monitoring fail in this particular case, spectacularly producing the Italian C-word? Possibly the pressures of public speaking in a foreign language overwhelmed his ability to self-monitor, and that profanity slipped by unchecked. But let me offer an alternate account. It’s also possible that he slipped into a profane error because his self-monitoring system didn’t know that he was about to make a profane blunder. In other words, the pope might just have revealed himself not to know the C-word in Italian. He may have produced a word whose meaning he didn’t know and then corrected himself from the to-him-innocuous slipup. Ironically, by flubbing a profane word, he might have shown himself to be something less than the pope of the people that his public image suggests.