What Makes a Four-Letter Word - What the F: What Swearing Reveals About Our Language, Our Brains, and Ourselves - Benjamin K. Bergen

What the F: What Swearing Reveals About Our Language, Our Brains, and Ourselves - Benjamin K. Bergen (2016)

Chapter 2. What Makes a Four-Letter Word?

Across the globe, profanity tends to emerge from particular domains of meaning—I refer you to the Holy, Fucking, Shit, Nigger Principle. But for every profane holy, fucking, and shit, there’s a technical and anodyne liturgical, copulation, or excretion. For every cock and cunt, there’s a childlike wee-wee and cha-cha. Many words describing sexual organs, excretory functions, and so on fail to rise to the heights (or, if you prefer, sink to the depths) of profanity. These words are articulated without fear of offending, whether in the classroom or the courtroom or the examination room. They aren’t profane, despite referring to taboo concepts. This means that something beyond what a word denotes—what it refers to—must cement it as profanity.

What is that thing?

Why is cunt a dirty word when coochie-snorcher isn’t?

The most obvious possibility is that some aspect of how profane words are written or sound makes them vulgar. Let’s begin with the eight-hundred-pound gorilla. Many English profane words famously have four letters—not just cunt but fuck, shit, piss, cock, tits, and many others. No matter how you count, a lot of the profane words in English are spelled with four letters. Take just the words from the four lists in the last chapter. These lists aren’t exhaustive. But what’s nice about them is that they weren’t assembled with any particular interest in what the words sound like or how they’re spelled. Admittedly, the people who had to come up with lists of profane words might have been unconsciously swayed by the four-letter word notion, but at least that wasn’t their stated objective. So in that way, they offer as unbiased a sample as we’re likely to find. Those four lists in aggregate give us a total of eighty-four distinct words (I’ve removed multiword expressions like get fucked and Jesus fucking Christ, which include other words already in the list). Of the eighty-four words, twenty-nine are spelled with four letters. By this count, then, just over a third of profane words are four-letter words. This number may be artificially deflated, since many of the longer words (like asshole, motherfucker, and wanker) have shorter four-letter words embedded inside them. But it’s a good start.

The first thing to notice from this is that having four letters isn’t a necessary prerequisite for profanity. Certainly, we already knew this: words like ass and motherfucker don’t have four letters, and most of the words on the list have some number of letters other than four. Nor is having four letters sufficient, since many four-letter words are not at all profane, like four or word. So we have to reconsider the question we’re asking. The real issue seems to be whether having four letters makes a word more likely to be profane, all other things being equal. That’s still an interesting question. Here’s a way to ask it.

Given that many (but not all) profane words in English are spelled with four letters, we can try to find out whether the pattern is stronger than you’d expect, given how words in the language are spelled generally. That is, suppose you grabbed a random set of eighty-four English words. What are the odds that twenty-nine of them would have four letters? You can see a histogram below, showing how many profane words from our list have each number of letters; the profane words are in the dark bars. As you can see, there’s a sharp spike at four, representing those twenty-nine four-letter profane words. But is twenty-nine a lot? You can tell by comparing the lengths of profane words in dark bars with English words in general, shown in the light bars. (To calculate these values, I counted the English words with each number of letters and normalized these counts to an eighty-four-word language to make them directly comparable to the profane numbers.)[a] As you can see, English has a lot of words with four, five, six, or seven letters. And in general English looks like a smoother version of the profane distribution. But what really sticks out is how many more profane four-letter words there are than expected from English in general. The 29 profane four-letter words in our list are significantly more than you’d expect if profane words were like English words in general, in which case we’d expect only 12.6 profane four-letter words out of 84.[b]

[a] To calculate the numbers for English in general, I used the lemmatized word list that Adam Kilgarriff generated from the British National Corpus (available from his web page).

[b] A chi-squared test of lengths three through twelve reveals that the two samples are significantly different. For the statistically minded, χ²(3) = 38.61, p < 0.0001.
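If you’d like to put a rough number on “is twenty-nine a lot?”, here is a minimal Python sketch. It is not the chi-squared test reported in the note; it swaps in a simpler one-cell check that treats each of the eighty-four words as an independent draw, which is an assumption made purely for illustration. The four-letter share of 0.15 is not stated in the text; it is derived from the figure of 12.6 expected words out of 84. The function names are made up for this example.

```python
from math import comb


def expected_count(share: float, sample_size: int) -> float:
    """Expected number of words of a given length in a sample of `sample_size` words."""
    return share * sample_size


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_PROFANE = 84       # distinct profane words on the combined list
OBSERVED_FOUR = 29   # of which this many have four letters

# ASSUMPTION: share of four-letter words in English generally,
# back-derived from the 12.6-out-of-84 expectation quoted above.
FOUR_LETTER_SHARE = 12.6 / 84

print("expected four-letter words:", expected_count(FOUR_LETTER_SHARE, N_PROFANE))
print("P(29 or more by chance):", binomial_upper_tail(OBSERVED_FOUR, N_PROFANE, FOUR_LETTER_SHARE))
```

Under these simplifying assumptions the tail probability comes out very small, which points the same way as the test in the note. The full comparison looks at every word length at once rather than at the four-letter cell alone.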

Profane words (dark bars) are more likely to be three, four, five, or eight letters long than are English words in general (light bars).

Perhaps more surprising is how many profane three- and five-letter words there are. There are relatively few three-letter words in English overall, and profane words are almost twice as likely to have three letters as you’d expect, all things being equal. We’ll come back to this in a moment, because it’s important. Less important but also notable is the little bump in eight-letter profane words, compared with the language in general. This is due to words composed of two four-letter words, like ballsack, bullshit, buttfuck, and shithead. Four-letter words appear to bend how English words look even when they’re merely parts of other words. But for our present purposes, it’s enough to note that profanity in English is strikingly more likely to have four letters than other words are. The take-away is that there’s some truth to the popular notion about four-letter words.

So this raises the obvious question: why? Why are profane words more likely than other words to have four letters?

If you were a linguist, and maybe you are, the first thing to occur to you would be that the special length of profane words might be due to their frequency of use. In general, the most common words in a language tend to be shorter (in English, these include the, be, of, and, a, and so on), and as words get less frequent, they also get longer (the one-thousandth most frequent word in English is useful, the five-thousandth is gravity, and so on). The explanation for why this is the case is fascinating (having to do with efficiency of information transmission), but for our purposes it could also possibly account for the aberrant lengths of profane words. Maybe profane words are shorter than words in general because they tend to be among the most common.

In fact, if you compare profane words with the most frequent words in English, shown below, you can see that they match up a lot better. But there’s still a little bump for profanity at four and eight letters, and the two groups of words are still statistically different.[c] So this can’t be the whole explanation, but it might be part of one.

[c] I ran a two-by-ten chi-squared test for lengths one to ten comparing profane words with the 626 most frequent words from the British National Corpus: χ²(9) = 19.17, p < 0.05.
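For the statistically curious, a two-by-ten chi-squared test of this kind can be sketched in a few lines of Python. This is a generic Pearson chi-squared computation, not the author’s actual analysis script, and the counts below are invented placeholders, since the chapter does not give the per-length counts. The function name `chi_squared_2xk` is likewise made up for this example.

```python
def chi_squared_2xk(row_a, row_b):
    """Pearson chi-squared statistic and degrees of freedom for a 2-by-k table."""
    if len(row_a) != len(row_b):
        raise ValueError("both rows need the same number of columns")
    total_a, total_b = sum(row_a), sum(row_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each row needs at least one observation")
    grand = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        column = a + b
        if column == 0:
            raise ValueError("drop empty columns before testing")
        # Expected counts if word length were independent of group membership.
        expected_a = total_a * column / grand
        expected_b = total_b * column / grand
        stat += (a - expected_a) ** 2 / expected_a + (b - expected_b) ** 2 / expected_b
    dof = (2 - 1) * (len(row_a) - 1)
    return stat, dof


# HYPOTHETICAL placeholder counts for word lengths 1 through 10.
# These are NOT the chapter's data; they only show the shape of the input.
toy_profane = [1, 2, 6, 12, 8, 5, 4, 4, 2, 1]
toy_frequent = [4, 20, 60, 90, 80, 60, 40, 25, 15, 6]

stat, dof = chi_squared_2xk(toy_profane, toy_frequent)
print(f"chi-squared({dof}) = {stat:.2f}")
```

With ten length categories and two groups, the degrees of freedom come out to (2 − 1) × (10 − 1) = 9, which matches the χ²(9) reported in the note.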

Profane words are also more likely to be four or eight letters long than the most frequent 10 percent of English words.

The catch is that it’s hard to know whether profanity really is as frequent as the top 10 percent of English words. The difficulty lies in the fact that sources we usually use to measure word frequency are all written, mostly in a formal language register—things like newspaper archives and great literature. Profanity is vanishingly rare there. But the informal and spoken environments that form its natural habitat and in which it thrives leave no record. So we can’t measure how common it is in those places. Here’s the best I can do. I searched in a place where people do use language relatively casually and that does leave a permanent, searchable record: the website Reddit. Reddit is an interactive news, entertainment, and commentary platform fractured into various topic-related communities. People can post links or comments and often interact with informal language. They also tend to be younger than the population at large and more male. I took the eighty-four profane words in question and computed how frequent they were on all of Reddit, on average, over the course of two years, from August 31, 2013, to August 31, 2015. Profane words were quite frequent—not quite as frequent as those in the top 10 percent, but close.

So the upshot is that frequency might explain part of why profane words tend to be four letters long in English. But it doesn’t tell the whole story. Perhaps there’s something else going on—perhaps something about having this number of letters causes a word to seem particularly taboo. Indeed, in some places in the world, people avoid the number four systematically—you can think of it as the number thirteen of Southeast Asia. More on that later, but the association between four letters and profanity appears to largely be an English-specific phenomenon. Although we can’t do comparable analyses for other languages (because lists of profane words in other languages haven’t been systematically tested), a quick tour around the swearwords of the world reveals that the four-letter rule doesn’t apply in many other languages. Often the most profane words in non-English languages are a different length. For instance, the strongest French words, putain (“whore”) and foutre (“fuck”), have six letters, and there is almost no Mexican Spanish four-letter profanity—strong words are longer, like chingar (“fuck”), concha (“cunt”), and pinche (“fucking”). In some other languages, profane words aren’t spelled with four letters because there’s no spelling at all—in places where tetraphobia (fear of four) is pervasive, the local languages often aren’t spelled with an alphabet. Chinese, for instance, uses logographic characters instead. And more generally, spelling is only relevant to the half of the world’s languages that have a written form.1 So if spelling is responsible for the four-letter phenomenon, then it would have to be for English-specific reasons.

And when you sit with the idea for a bit, other considerations might cause you to second-guess whether having four letters could really make words profane. After all, people have been speaking English for a thousand years, and for most of that time many of those people couldn’t read or write. But they could swear. Children can swear before they can read or write (more on that later!). And even within English, some pretty profane words happen not to have four letters but are pretty close, like ass or bitch. So perhaps we’re detecting obliquely, through spelling, another, deeper cause at play. Maybe profane words tend to be four letters long because four-letter words tend to be pronounced a particular way. Maybe shit, cunt, fuck, and the like don’t look profane so much as sound profane.

# $ % !

This might seem outlandish, but hear me out. Take a moment to think about profane four-letter words, like cunt, fuck, and shit. Doesn’t something about them just sound dirty? Don’t they sound vulgar? Don’t they sound aggressive?

If you agree, you’re not the first person to intuit that the words of your language somehow sound appropriate for what they mean. This was noted at least as early as the nineteenth century by German linguist Georg von der Gabelentz,2 who observed that German speakers consider the French silly for calling a horse cheval rather than the clearly more suitable German word Pferd.[d] Truly, though, cheval? Ridiculous! Obviously it’s a Pferd. Even if you know intellectually that there are different names for horses in other languages, in your heart of hearts, you may still feel like horse in your native language fits the animal best and that equivalent words in other languages are slightly less apt. This is sometimes called the “sound-symbolic feeling”: the sounds of words in your language feel like they suit their meanings.

[d] In case you’re wondering, yes, it’s really pronounced with a p followed by an f!

Taboo words often elicit particularly strong sound-symbolic feelings. When you say them—fuck, shit, bitch—when you roll them around in your mouth, they have a certain mouth-feel. And gut-feel. They feel like they sound obscene. One manifestation of this feeling is that it’s hard to imagine them meaning anything else. How could fuck signify anything other than what it does? (For instance, a word that sounds very similar in French, phoque, preposterously means “seal.”) And so we’re baffled when people who are not native speakers of our language accidentally produce profane words. Spanish speakers often confound English sheet and shit or beach and bitch because Spanish doesn’t encode a distinction between the ee and i sounds. Even knowing this, the sound-symbolic feeling makes it almost inconceivable to a monolingual English speaker that you could think that sheet would be pronounced shit. Shit feels dirty. Sheet shouldn’t.

So could something about how profane words sound make them profane? Does this sound-symbolic feeling index something more than a subjective feeling? Do shit and fuck sound objectively more vulgar than poo-poo and copulate?

One of the most common reasons words sound appropriate to their meanings is that the things they refer to sound like something, and the word’s pronunciation reflects that sound. This familiar phenomenon is known as onomatopoeia or sound symbolism. Words for sounds or actions that produce sounds often (but not always) imitate those sounds themselves. For example, even if you didn’t know what they meant, with a little context, you might be able to make an educated guess about the meanings of cock-a-doodle-doo and swish.

Could profanity be sound symbolic? There are some good candidates; consider words like barf or piss. Of course, the word barf isn’t a perfect imitation of what vomiting sounds like; nor is piss an exact replication of the sound of micturition. Still, there’s enough of a resemblance between the words and their referents to create a semblance of sound symbolism.

But how can we tell? This is a hard problem because we don’t have a great way to measure sound symbolism. One brute-force approach would be to just ask people to report how sound symbolic a word seems to them, say, on a scale from one to seven. Researchers do this a lot. But this really only tells us what words English speakers subjectively think are sound symbolic—it’s an index of their sound-symbolic feeling. We, however, are looking for an external, objective measure of whether the words would sound like what they mean even if you weren’t already a speaker of the language.

So a slightly more nuanced way to measure symbolism is to take a list of words from one particular language, say, English, and present them to people who don’t speak that language, say, monolingual Japanese speakers. And then you ask them to do something like guess the meanings of the English words. You have to do a lot to set up an experiment like this right. You have to use participants who really haven’t been exposed to any English and English words that haven’t been borrowed into Japanese. You have to be sure that there aren’t any similar Japanese words just by chance. But if you get it all right, then in principle the English words guessed more easily by people who speak only Japanese (or any other non-English language) are more likely to be sound symbolic.

But to my knowledge, no one has ever done this systematically with taboo words. So we don’t know. And in any case, it’s unlikely to work. For one thing, the implementational details would make it hard to pull off. For instance, it’s getting harder and harder to find people around the globe who aren’t familiar with some English, especially profanity. So you’d probably encounter the most success if you used profanity from Finnish or Basque or some language with a lower profile than English. But the deeper issue is that it’s unlikely in principle that sound symbolism of the cock-a-doodle-doo type is in play for a lot of profane language. Sound symbolism is most common and most effective with words that describe either sounds or things that systematically make stereotypical sounds. Barf has potential for sound symbolism because it describes an action that makes a canonical and recognizable sound. Same with piss. But there are few other viable profane candidates: maybe crap, queef, and a couple others. Most profane words are ill-suited to sound symbolism because the things they refer to don’t systematically sound like anything. Bitch isn’t a good candidate, at least not in the taboo use, because there’s no sound associated with a malicious or unpleasant person that can be imitated. And the same is true for language about sacred concepts (what does God’s prophet sound like?) and sexual organs (what does a penis sound like?).

But the real death knell for profanity being generally sound symbolic comes when you compare profane words with similar meanings. If words sound like what they mean, then words with similar meanings should also have similar sounds. For instance, there’s reason to believe that moan, groan, and whine are sound symbolic not just because they individually sound like the sounds they denote but because they have both similar meanings and similar sounds. Likewise, if fuck somehow sounds like what it means, then other words with similar meanings should sound similar. But they don’t; a good comparison list might include verbs like bang, bone, dick, shag, screw, and so on. Consider how these words sound and how they’re spelled. Most don’t share any sounds at all.

And you get the same insight when you compare words across languages. Look at words that are the translation equivalents of fuck in other languages. French has baiser, Spanish chingar, Mandarin cào, Russian yebát, and so on. Even at first glance, and even including in our little sample only languages that are very closely related and that maintain close cultural contact with one another, despite their similar meanings, these words sound nothing alike. None of the sounds in fuck are in any of these other words (the c in the transliteration of cào is pronounced more like English ts than k). The words are different lengths, they contain different sounds, and they’re written differently. And the same is true of shit and bitch and any profanity you want to try out. These words, across languages, behave more like horse, in that the various words don’t share a resemblance, than neigh, where they do.

The point is that no matter how apt fuck feels to express the concept it does, when you turn to the next language, people have the same feeling about their words—French baiser or Spanish chingar—which use totally different sounds. By this measure, the sounds used to express the meanings of these words appear arbitrary. That is, it appears that nothing about the sounds in the word fuck makes them particularly apt to express the meaning of the word fuck. And nothing makes the sounds of fuck a better fit for its meaning than the sounds of cào. Over the course of the history of English, French, Spanish, Chinese, and thousands of languages on earth, words have evolved that do similar social work—that fit a similar communicative niche—but are pronounced in very different ways.

The upshot is that while some profanity might sound the way it does because of sound symbolism, this is unlikely to be true for the majority of taboo words. (At least not in the words of spoken languages. But in the next chapter we’ll look at gestures and at the signs of signed languages, where the story is revealingly different.) Perhaps the sound-symbolic feeling we get with profanity is really just the result of a lifetime of using a word with a particular sound to mean a particular thing. If seeing a horse (or smelling or hearing or feeling it) often goes with hearing or thinking or saying the word horse, then why wouldn’t you develop a strong association between the sound and the meaning, especially if that’s the only language you know? And likewise for profanity.

So it seems that sound symbolism isn’t what makes profane words sound dirty.

# $ % !

If not sound symbolism, perhaps some other aspect of how profane words sound makes them seem dirty. Let’s loop back around to where we started. English exhibits a higher proportion of four-letter words among profanity than among words in general. As you’ll recall, there are also more profane three-letter words. So let’s dig into these words. What’s special about how these three- and four-letter words sound?

The three-letter words included in the list are ass, cum, fag, gay, god, Jew, and tit. And the four-letter words are anal, anus, arse, clit, cock, crap, cunt, dick, dumb, dyke, fuck, gook, homo, jerk, jism, jugs, kike, Paki, piss, scum, shag, shit, slag, slut, spic, suck, turd, twat, and wank.

Do you notice any general trend in how these words are pronounced?

Here’s one idea. I haven’t seen it discussed anywhere in the literature on profanity before, but it occurs to me that if you look closely, the three- and four-letter words tend to have two properties. First, regardless of how many letters they’re spelled with, they tend to be pronounced with just one syllable. In case you need a refresher, a syllable is a rhythmic beat of language, during which the mouth opens and closes. When you pronounce bitch and shit normally, they’re only one syllable long.[e] Just a few words on the list have more than one syllable: anal, anus, homo, Paki, and, arguably, jism.

[e] That said, you can opt to make them into two-syllable words, as in Sheeyit, what a gigantic beeyotch! And if you’re not into the whole brevity thing, you can even turn it up to three syllables with biz-nee-atch and shiz-nee-at.

Now, this can’t possibly be the whole story, because there are thousands of one-syllable words in English (11,752 to be precise, with the vague notion of precision appropriate for counting words in a language),[f] and most of them aren’t taboo. The profane words are but a speck in a sea of monosyllables. And if we’re just looking at three- and four-letter words, it’s no surprise that they’ll tend to be pronounced with one syllable or two.

[f] The count of all monosyllabic words appearing in the MRC Psycholinguistic Database at least once is 11,752: Wilson, M. D. (1988).

But these words don’t just tend to be monosyllabic. They tend to be built in a particular way. English allows many different types of syllable. Every syllable has a vowel at its core.[g] For some syllables, the vowel is both the beginning and the end (the alpha and the omega, as it were), as in words like a, I, and uh. (Don’t be confused by spelling: there’s no h in the pronunciation of uh.[h]) But most syllables also have consonants in them, before or after the vowel. So with this in mind, we can return to English profanity. If you briefly revisit the words in the lists above, you may notice something remarkable about their syllables. I’ll wait for you to discover it yourself.

[g] Or something vowel-like. A word like hurdle has two syllables, but neither has an easily recognizable vowel. Yet both the ur and the le can anchor syllables.

[h] Unless you’re Butthead.

Got it? I’ll give you a hint. There are four exceptions. They’re gay, Jew, homo, and Paki.

Here it is. Every other word on those lists ends with one or more consonants. That is, they all have “closed syllables” rather than syllables sporting bare vowels. (A decent mnemonic is that your mouth closes at the end of a closed syllable.) As you can see, many profane words even double down on their final consonants. Words like cunt and wank actually have two consonant sounds at the end. Interestingly, consonants seem pretty important in general—all but a few (like ass or arse) begin with at least one consonant, and many begin with two, like crap, prick, slut, and twat. But really the strong generalization here appears to be that syllables of profane words tend to be closed.

Could these two tendencies—a trend toward having just one syllable and another toward that one syllable being closed—be part of what makes profane words sound profane?

We can start to answer this by splitting our data in a different way, based not on how many letters a word is spelled with but on how many syllables it has and whether those syllables are closed. When we do that, we find that not just the three- and four-letter words are closed monosyllables; so are seven of the sixteen five-letter words, like balls, bitch, prick, and whore, but not Jesus or pussy. In all, thirty-eight of the eighty-four words on the list are one syllable long, and thirty-six of these (or 95 percent) are closed. Only two profane words on the list, Jew and gay, are “open” monosyllables (the w and y aren’t pronounced as separate consonants; they’re part of the respective vowels they follow). How does this ratio compare to the words of English more generally? I took the top 10 percent most frequent monosyllabic words from the MRC Psycholinguistic Database, which has both frequency information and phonetic transcriptions for English words. It turns out that whereas 95 percent of our profane monosyllabic words are closed syllables, that number drops down to 81 percent when you look at nonprofane words, which is significantly lower.[i]

[i] Fisher’s exact test: p < 0.05.
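To see what a Fisher’s exact test does with a closed-versus-open comparison like this one, here is a minimal sketch in Python. The profane row (36 closed, 2 open) comes from the text. The comparison row is hypothetical: the chapter gives only the 81 percent figure, not the number of frequent monosyllables, so the 810-of-1,000 split below is an assumption for illustration, and the p-value it yields should not be read as the author’s result. The function name is made up for this example.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2-by-2 table [[a, b], [c, d]].

    Sums the probabilities of every table with the same margins that is
    no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    # Small tolerance so ties with the observed table are counted.
    return sum(p for p in map(prob, range(lowest, highest + 1)) if p <= observed * (1 + 1e-9))


# From the text: 36 of the 38 profane monosyllables are closed.
profane_closed, profane_open = 36, 2
# HYPOTHETICAL comparison sample: 81 percent closed, with an assumed size of 1,000.
frequent_closed, frequent_open = 810, 190

p_value = fisher_exact_two_sided(profane_closed, profane_open, frequent_closed, frequent_open)
print(f"two-sided p = {p_value:.4f}")
```

The exact test is a natural choice here because one cell is tiny (only two open profane monosyllables), which is the situation where chi-squared approximations become unreliable.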

You might now be scrambling to find exceptions: profane monosyllabic words in English that are open. Our list of eighty-four words definitely doesn’t cover all profane words in the language; we were using it, as you recall, because it was constructed without any explicit prior expectations about the sounds or spellings of profane words. And you can probably find some profane open monosyllables. Like, potentially, ho, lay, poo, and spoo. These are good candidates. Maybe you can come up with one or two more. But for each one, there are a dozen closed monosyllable candidates that we left out of our initial list. They include, in alphabetical order, boob, bung, butt, chink, cooch, coon, damn, dong, douche, dump, felch, FOB, gyp, hebe, hell, jap, jeez, jizz, knob, mick, MILF, mong, muff, nads, nards, nip, poon, poop, pube, pud, puke, puss, queef, quim, schlong, slant, slope, smeg, snatch, spank, spooge, spunk, taint, tard, THOT, toss, twink, vag, wang, and wop. And I’m only getting started. Run the numbers again with these new open and closed monosyllabic words, and you still have upward of nine out of ten profane monosyllables that are closed.

This pattern is statistically real, but we really want to know whether it’s psychologically real too. Do English speakers think that closed monosyllables sound more profane than open monosyllables? There are different ways to figure this out. Here’s one type of circumstantial evidence. When English speakers invent new, fictional swearwords, do they tend to be closed? For instance, when English-speaking fantasy and science fiction writers invent new profanity in imaginary languages, what do those words sound like? Battlestar Galactica has frak (“fuck”). Farscape has frell (also “fuck”). Mork & Mindy had shazbot (a generic expletive). Dothraki, the invented language in HBO’s Game of Thrones, has govak (“fucker”) and graddakh (“shit”). Not all are monosyllabic, but they all end with closed syllables. In fact, it’s very hard to find fictional profanity ending with open syllables. The one glaring counterexample I’ve been able to dig up comes from the movie Star Wars: Episode 1, in which poodoo means “bantha fodder” and is used as a weak expletive. Just by way of speculation, the open syllable might have been selected because the target audience of the movie appears to have been quite young (it was rated PG), and so a more profane-sounding fictional profanity could have felt too strong.

We can also indirectly assess the psychological reality of profane closed syllables by looking at real words that are not taboo by dint of their meaning but happen to have closed syllables. Do people think of these words as obscene despite their innocent meanings? In fact there’s a phenomenon known as word aversion, in which some people have particularly strong reactions to particular words, even though the words have totally anodyne (or inoffensive) meanings. The English word that appears to crawl most insidiously under people’s skin is moist. I can’t tell you how often, upon discovering that I’m interested in profanity, people declare their everlasting hatred for this word. I suspect that the fact that moist is a closed monosyllabic word has something to do with it (along with aspects of its meaning). But to date I know of only one piece of empirical research on word aversions,3 and it focuses exclusively on moist, so if there are indeed other words that people find to be the linguistic equivalent of nails on a chalkboard, it’s impossible to know what those words sound like.

But alien languages and word aversions really only supply very indirect evidence about profanity. The best way to tell whether people feel that closed monosyllables are more profane than open monosyllables would be to conduct a study with invented English words, ones that differ only in what kind of syllable they have. You could ask people how profane those words would be if they were real English words. Would people feel that cheem is more vulgar than chee? Is smoob more profane than smoo? That way, you could control for all other differences between the closed and open monosyllables and measure whether having a final consonant alone is enough to push the profanity needle.

So I ran this study. I generated a bunch of potential monosyllabic words of English that happen not to be real English words, like chee and smoo, and I paired up each open monosyllable with a closed monosyllable that was identical except for the last sound. So skoo went with skoom, and stee was paired with steesh, and so on for twenty pairs of words that were the same on all the relevant dimensions and different only in the type of syllable.[j] I also manipulated how many consonants there were at the beginning of the word, known as the “onset,” just to see if this also made a difference in how profane the words sounded to people. So of the twenty pairs of words I created, ten began with just one consonant, like dee and deeve, and the other ten pairs began with two consonants, like smee and smeef, always an s followed by some other consonant, because that happens to be a way English likes to put multiple consonants at the beginning of a syllable. And then I asked sixty native speakers of English, “How profane does the following made-up English word sound?” on a four-point scale from “Very Profane” to “Not at all Profane.” You can see what they thought in the accompanying graph. (Words that start with just one consonant are shown under “C onset,” and those with an s and another consonant are “sC onset.”)

[j] Open and closed monosyllabic words weren’t significantly different in neighborhood density, mean positional phoneme probability, or mean biphone probability, none of which you would ever have heard of if you weren’t a psycholinguist but all of which you would be very concerned about if you were.

People rate made-up words as more profane when they have more consonants, either at the beginning of the syllable or at the end.

Pretty clearly, when everything else is held constant, native English speakers think that closed syllables sound more profane than open syllables (the dark bars are higher than the light ones). Also of interest, and slightly more surprising, there appears to be a weaker though significant effect where having more consonants at the beginning of the word also makes a word seem more profane (the pair of bars on the right is higher than the pair on the left).

So not only does English profanity tend to be pronounced with closed monosyllables, but English speakers moreover think that closed monosyllables sound more profane than open ones. In terms of how languages work in general, this isn’t entirely unprecedented. Sometimes within a language, you will find clusters of words with similar meanings that happen to have similar forms. These arise not because their forms reflect their meanings through sound symbolism but for another reason. Consider words in English that have meanings related to light or vision. Many of them happen to start with gl. I’ll give you a few: glisten, glitter, gleam, glow, glare, glint. And there are many more, from glaucoma to glower. Now, it’s impossible for sound symbolism to be at play here because light and vision don’t sound like anything at all, and even if they did, there’s no reason to think it would be gl. Instead, we’ve uncovered a little dense spot in the English lexicon where words with similar meanings have similar forms for no better reason than that they do.

The story of how these sets of similar words come about goes something like this. In general, sound symbolism notwithstanding, words arbitrarily pair together forms and meanings. But because the words of any language are governed in part by chance, there will happen to be some places in the lexicon of a language where a couple words that have similar meanings happen also to have similar forms. People who learn and use this language may notice these little clusters, or they may not (for example, you may or may not have noticed English gl-words before), but over time the clusters will act as a form of attractor for new words. Old words that are misheard, mislearned, or misremembered will be slightly more likely to gravitate toward the form and meaning of a cluster, which appears to have happened in the history of the gl-words in English. And new words that people invent will also be attracted to the clusters such that they’re slightly more likely than chance to have meanings and forms aligned with the growing pattern. This, too, has happened in the history of English: see examples like glitzy (in 1966) and glost (a glaze used in pottery, in 1875).4 It’s also a factor in product naming—imagine which glass-cleaning spray you’d prefer to buy: Brisserex or Glisserex. Over centuries, maybe even millennia, these clusters are reinforced in a kind of rich-get-richer process until you have English, where a healthy 39 percent of words starting with gl relate to light or vision.5

And perhaps this is what happened with English profanity. Perhaps through historical accident there came to be a core set of profane English words that happen to be pronounced with a closed monosyllable. They exerted a gravitational tug on words around them—existing words came to be pronounced similarly, and newly coined words were more likely to follow the same pattern. We can see this in our newest profanity, where acronyms like MILF, THOT, and FOB tend to be closed monosyllables. And we can see it in the profane abbreviations that people have created over the years, like gyp, hebe, and smeg.

# $ % !

We began by asking if something about words like fuck and cunt, aside from their meaning, makes them profane. By following the four-letter road, we discovered a hidden pattern in how profane words sound in English. At their core are closed monosyllables. This isn’t just a descriptive fact about the words that are currently profane in English; it also affects what English speakers think about new words, whether inventing a science fiction language or participating in a behavioral experiment.

To reiterate, though, there are many exceptions to this closed monosyllable pattern. Not only are there a few profane words with open monosyllables, like gay, spoo, and so on, but there are also many profane words that have more than one syllable, like asshole, motherfucker, cocksucker, and company. But this shouldn’t be too surprising to the well-weathered linguist. Languages exhibit few exceptionless rules. We all know that English makes past tense forms of verbs by adding -ed. Except that it often doesn’t, in so-called irregular verbs like spend, go, and drink. English nouns place stress on the first syllable and verbs on the second (compare a record and to record, a permit and to permit). But then sometimes they don’t—copy and double are pronounced with first-syllable stress as both verbs and nouns. So it’s no surprise that we can’t find a hard-and-fast rule about how English profane words sound. As with these other generalizations about language, there’s a tendency. Just as English profanity tends to be drawn from certain semantic domains, so it tends to sound a certain way.

This trend and the fact that it has exceptions might explain differences among words with similar meanings. Words like poo, pee, gay, Jew, and spoo are all arguably profane words. But if the closed monosyllable pattern is real inside the heads of English speakers, then all other things being equal, words like these should seem less profane than words with similar meanings that are pronounced with closed syllables. Indeed, when you contrast them with closed versions, they might seem to have less oomph. Which is more profane: pee or piss? Compare spoo with spooge. Jew with hebe. Gay with fag. Does it seem to you like the closed-syllable words are somehow more profane? If so, how well they fit with the closed monosyllable pattern might be responsible. And it might also predict how well they maintain their oomph over time and how widely they’re used. As a closed monosyllable, spooge ought to end up more widely disseminated as a profane word than would an open monosyllable like spoo.

And, of course, the polysyllabic profane words in English still have to be considered. In a way these words are exceptions to the closed syllable trend, and in another way they aren’t. More than half of the polysyllabic words on our profane list (twenty-seven of forty-six) begin with a profane closed monosyllable, like the cock in cocksucker and wank in wanker. And even more of these same words (thirty in total) end with a closed monosyllable, like bastard and faggot. The numbers become a little muddier when we try to count these composed words—we could consider dozens that include shit, fuck, dick, or cum in them, and we’d have to make arbitrary choices about what to count. But even without going there, we see clearly that English profanity is built in part from closed syllables, whether by themselves or as part of longer words.

If this closed monosyllable pattern is real, where does it come from? I offered an analogy with gl-words earlier, suggesting that there doesn’t have to be an intrinsic motivation in terms of what the word means and why a particular sound would be well suited to it. For a cluster to take off, it need only be sufficiently frequent. Perhaps, at some point in the history of English, the ratio of open to closed monosyllables shifted locally in the subclass of profanity. And that little tilt in the lexicon snowballed.

In keeping with this story, the closed monosyllable principle isn’t a cross-linguistic universal. Some languages don’t allow for anything like the range of closed syllables we have in English. For instance, syllables in the Hawaiian language can never end with a consonant—they’re always open. So there’s no possible Hawaiian version of the English closed syllable pattern. And many of the most profane words you might now be familiar with from other languages are open syllables or polysyllabic: French putain (“whore”), Spanish chingar (“fuck”), Russian yebát’ (“fuck”), and so on. But as I’ve mentioned, we don’t have reliable studies of profanity for most languages. As a consequence, it’s hard to know whether the English pattern shows up in other languages as well.

I want to raise the possibility of another explanation for why profane words in English sound the way they do. It’s possible that some of those sounds are particularly well suited to the functions that profanity serves. To be clear, I’m not talking about sound symbolism. It’s not that the words might sound like what they mean. The idea instead is that they might sound the way they do because that way of sounding is effective for the way you want to use the words.

This could work in several different ways, in principle. One way is based on the difference in childishness of open and closed monosyllables in English. It just so happens that as they’re learning a language, children are first able to pronounce open syllables. That’s why a child typically says ma and mama before mom; a child substitutes ba for ball and da for that.6 (We’ll have a lot more to say about this in Chapter 8, when we explore where children’s little potty mouths come from.) As the child’s motor system matures, she then develops the capacity to coordinate consonants not just at the beginnings but also at the ends of syllables. So on the basis of those developmental facts, here’s a just-so story. Maybe open syllables sound more childlike because they are in fact easier for children to pronounce. Perhaps people unconsciously associate words that are harder to pronounce with the adults whose motor systems can in fact articulate them. So closed syllables—and words with lots of consonants at both ends—sound like words adults but not small children say.

If this story is true, then we’d expect profanity to show a preference for not only syllable types that are harder for children (closed ones) but also sounds that are harder for them. We’d expect to see sounds like th, which is hard for kids, rather than p, which is easier. And we’d expect profanity to eschew the repetition of syllables (known as reduplication) that’s typical of infant and toddler speech: mama, baba, and so on. Something like poo-poo would be the epitome of a childlike and therefore nonprofane word.

That’s one possible foundation for the cluster of profane words we see in English. Here’s another, equally speculative explanation. Perhaps short, closed words are more useful than others for swearing. There’s an argument to be made that monosyllabicity is useful for expletives—when you slam your finger in a car door, you don’t exactly have a lot of time to express what you’re feeling. Short words are simpler and more direct. That’s the monosyllabic part. Now to the consonants. Perhaps having a consonant at the end works particularly well for words, like profanity, that are deemed inappropriate in some settings. An open syllable just keeps going, whereas having a consonant at the end seals the word in silence. This is especially visible in epithets or slurs, derogatory labels for groups of individuals, which overwhelmingly follow the pattern (think of hebe, chink, gook, jap, WOP, and so on). These are precisely the type of word you might want to be able to cut short and mumble into your beard. A closed syllable permits that.

We can actually test this seemingly far-fetched idea by looking at precisely what types of consonants bring up the rear of English closed monosyllables. The key is that not all consonants are created equal. Some consonants bring a word to an immediate halt—notably consonants known as “stops” or “plosives,” like the sounds behind p, t, k, b, d, and g. Other consonants allow sound to continue being emitted—you can prolong a nasal n or m, a fricative s or f, or an approximant l or r. English monosyllabic words in general show a healthy preference for stop consonants over others in their final position—just under half of them, as the accompanying graph shows, end with a brief, percussive sound, like p, t, or k. But split profane words in the same way and the bias toward stop consonants is significantly more exaggerated. There are far more profane words like spic and twat and far fewer like piss and cum than we’d expect by chance. This is far from conclusive evidence, but it does lend a little credence to the “shut your mouth” explanation for profanity’s tendency to end with a consonant, and not just any consonant.
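The sorting of final consonants described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes final sounds as single letters standing in for phonemes, it covers just the consonants the paragraph names, and the names classify_final_sound and stop_share are hypothetical. The sample input is the final sounds of the four words mentioned in the paragraph, not the data behind the graph.

```python
from collections import Counter

# Consonant classes exactly as listed in the paragraph.
STOPS = set("ptkbdg")
CONTINUANTS = {
    "n": "nasal",
    "m": "nasal",
    "s": "fricative",
    "f": "fricative",
    "l": "approximant",
    "r": "approximant",
}


def classify_final_sound(sound):
    """Label a final sound as a stop, a named continuant class, or "other"."""
    if sound in STOPS:
        return "stop"
    return CONTINUANTS.get(sound, "other")


def stop_share(final_sounds):
    """Fraction of the given final sounds that are stops."""
    counts = Counter(classify_final_sound(s) for s in final_sounds)
    total = sum(counts.values())
    return counts["stop"] / total if total else 0.0


if __name__ == "__main__":
    # Final sounds of the four example words in the paragraph: k, t, s, m.
    # Illustrative only; these are not the counts from the study.
    examples = ["k", "t", "s", "m"]
    print([classify_final_sound(s) for s in examples])
    print(stop_share(examples))
```

Applied to a full word list transcribed by sound, the same tally would give the two proportions being compared: the stop share for monosyllables in general and the stop share for profane ones.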

Specifically, a Fisher’s exact test reveals that profane words ending with stop consonants are significantly more frequent than would be expected from the lexicon in general, p < 0.01.
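For readers curious about the test named here, the following is a minimal Python sketch of a one-sided Fisher's exact test on a two-by-two table, written from the standard hypergeometric definition. The text does not say whether the reported test was one-sided or two-sided, so the one-sided form is an assumption, and the function name fisher_exact_greater is hypothetical. The underlying word counts are not given here, so none are filled in; the sanity check uses the classic textbook tea-tasting table instead.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Returns the probability, with row and column totals held fixed, of a
    top-left cell at least as large as a. In the setting of the text the rows
    would be profane vs. other monosyllables and the columns stop-final vs.
    not stop-final (an assumed layout).
    """
    row1 = a + b
    row2 = c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)
    largest = min(row1, col1)
    # Sum hypergeometric terms for every table at least as extreme as observed.
    numerator = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, largest + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Classic tea-tasting table, used only as a sanity check (17/70).
    print(fisher_exact_greater(3, 1, 1, 3))
```

A small p-value from such a test would say that a stop-final share as high as the one observed among profane words is unlikely if profane words were drawn at random from the general pool of monosyllables.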

Profane English monosyllables are significantly more likely to end with a stop consonant, like t or k, than other English words.

It’s possible that one or a combination of these pressures has chiseled the cluster of profanity that we now see in English. But it could alternatively just be a matter of historical accident, like the case of gl. Without systematic studies across languages, we may have to settle for merely observing the pattern of profanity pronunciation in English, in combination with the kind of idle speculation that the last several paragraphs have illustrated.

But one avenue of human communication—a way in which we communicate profanity—has, unlike words, very clear motivation. Beyond words, we also use our bodies to communicate—articulating with our arms and hands, orienting our torsos, and shifting our eyes. We do so both in the everyday gestures that accompany or replace speech and also, among people who are deaf or hard of hearing, through the signs of signed languages. And in the hands, as opposed to the mouth, it’s much clearer why the signals we send—including the obscene ones—have the forms that they do.