Let's acknowledge this at the top: It's a thin slice.
To gaze across the great swath of written English over the past few centuries — that teeming, jostling, elbow-throwing riot of characters and places and stories and ideas — only to isolate, with dispassionate precision, some stray, infinitesimal data point such as which author uses cliches like "missing the forest for the trees" the most, would be like ...
Well. You get it. More like missing the forest for the raspberry seed stuck to the underside of the 395th leaf on the 139th branch of the 223,825th tree.
But that's what statistician Ben Blatt's new book, Nabokov's Favorite Word is Mauve, sets out to do, thin slice by thin slice.
He loaded thousands of books — classics and contemporary best-sellers — into various databases and let his hard drive churn through them, seeking to determine, for example, if our favorite authors follow conventional writing advice about using cliches, adverbs and exclamation points (they mostly do); if men and women write differently (yep); if an algorithm can identify a writer from his or her prose style (it can); and which authors use the shortest first sentences (Toni Morrison, Margaret Atwood, Mark Twain) versus those who use the longest (Salman Rushdie, Michael Chabon, Edith Wharton).
I can hear thousands of monocles dropping into thousands of cups of Earl Grey from here. "But what of literature?" you sputter. "What does any of that technical folderol have to do," — here you start wiping your monocle on your nosegay — "with ART?"
Not much, is the answer. Blatt's book isn't terribly interested in the art of writing. What it's fascinated by — and is fascinating about — is the craft of writing.
Technique. Word choice. Sentence structure. Reading level. There's something cheeky in the way Blatt throws genre best-sellers into his statistical blender alongside literary lions and hits puree, looking for patterns of style shared by, say, James Joyce and James Patterson.
A Balm For Bookish Know-it-Alls
To say that you likely won't find much that's truly surprising in Nabokov's Favorite Word is Mauve isn't a critique. In fact, it's kind of the point. Reading it, you experience the feeling, again and again, of having some vague, squishy notion you've always sort of held about a given author getting ruthlessly distilled into a stark, cold, numerical fact.
Which is, if you're the kind of person who likes to get proven right (hi!), a hell of a lot of fun.
Now: It's a book of statistics, and statistics rest on distinct sets of assumptions that must get made before any number can start getting well and truly crunched. So if you're curious about Blatt's methodology, boy are you in luck. Every chapter begins with Blatt chattily sharing with the reader — as chattily as a book this eager to walk us through the formula used to calculate Flesch-Kincaid Grade Levels can be — every aspect of his thinking. How he defines "Great Books." What constitutes a long sentence. Which chapter-endings qualify as cliffhangers, and which merely ... abrupt.
He drags you into the weeds with him, but he's a personable writer, and he's brought along a picnic lunch, so you don't mind the bugs.
Herewith, some of my favorite of Blatt's findings in Nabokov's Favorite Word is Mauve:
MEN WRITE LIKE THIS, BUT WOMEN WRITE LIKE THIS
It turns out that — sit down for this next bit — authors who are women write equally about men and women, but men write overwhelmingly about men.
I know. I'm shaken, over here.
For every appearance of the word "she" in classics by male authors, Blatt found three uses of the word "he." In classics by women, the ratio was pretty much one-to-one.
Also: Male authors of classic literature are three times as likely to write that a female character "interrupted" as they are to write it of a male character. In contemporary popular and literary fiction, the ratio is smaller, but it's still there.
Blatt looked for the specific words that authors use much more frequently than the rate at which those words generally occur in the rest of written English (i.e., compared to a huge sample of literary works — some 385 million words in total — written in English between 1810 and 2009, assembled by linguists at Brigham Young University).
His criteria: A favorite word -
- Must occur in at least half of the author's books
- Must be used at a rate of at least once per 100,000 words
- Must not be so obscure that it's used less than once per million in the BYU sample of written English
- Is not a proper noun
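For the weeds-curious, those four criteria can be sketched in a few lines of Python. This is only a rough illustration, not Blatt's actual code: the tokenizer is crude, and the names `favorite_words`, `baseline_rate` and `proper_nouns` are hypothetical stand-ins (the baseline would come from something like the BYU sample, and the proper-noun list is assumed to be supplied by hand).

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a deliberately crude tokenizer (assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def favorite_words(books, baseline_rate, proper_nouns=frozenset()):
    """Rank an author's candidate favorite words.

    books: list of strings, one per book by the author.
    baseline_rate: dict mapping word -> share of all words in a reference
        sample (hypothetical input; 1e-6 means once per million words).
    proper_nouns: lowercase words to exclude (hypothetical, hand-supplied).
    Returns (word, ratio) pairs, where ratio is the author's rate divided
    by the baseline rate, highest ratio first.
    """
    per_book = [Counter(tokenize(book)) for book in books]
    total = sum(per_book, Counter())
    n_words = sum(total.values())
    results = []
    for word, count in total.items():
        books_with_word = sum(1 for counts in per_book if word in counts)
        author_rate = count / n_words
        base = baseline_rate.get(word, 0.0)
        if books_with_word * 2 < len(books):
            continue  # must occur in at least half of the author's books
        if author_rate < 1 / 100_000:
            continue  # must be used at least once per 100,000 words
        if base < 1 / 1_000_000:
            continue  # too obscure in the reference sample
        if word in proper_nouns:
            continue  # proper nouns are excluded
        results.append((word, author_rate / base))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Toy data, invented purely for illustration.
toy_books = ["mauve sky mauve", "the mauve sea", "the sky"]
toy_baseline = {"mauve": 2e-6, "the": 0.05, "sky": 1e-4, "sea": 1e-4}
ranked = favorite_words(toy_books, toy_baseline)
```

With the toy data, "mauve" lands at the top of `ranked`, and "sea" drops out because it turns up in only one of the three books.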
Here's some that jumped out at me.
Jane Austen: civility, fancying, imprudence (Story checks out, right?)
Dan Brown: grail, masonic, pyramid (I am sagely nodding, over here.)
Truman Capote: clutter, zoo, geranium
John Cheever: infirmary, venereal, erotic (Boy howdy, that's a whole Cheever short story, right there.)
Agatha Christie: inquest, alibi, frightful
F. Scott Fitzgerald: facetious, muddled, sanitarium
Ian Fleming: lavatory, trouser, spangled ("Pardon me, Blofeld; must dash to the lavatory, got something spangled on me trouser.")
Ernest Hemingway: concierge, astern, cognac (Yuuup.)
Toni Morrison: messed, navel, slop
Vladimir Nabokov: mauve, banal, pun (As Blatt points out, Nabokov had synesthesia, a condition that caused him to associate various colors with the sound and shape of letters and words. "Mauve" was his favorite: He used the word at a rate that's 44 times higher than the rate at which it occurs in the BYU sample of written English.)
Jodi Picoult: courtroom, diaper, diner
Ayn Rand: transcontinental, comrade, proletarian
J.K. Rowling: wand, wizard, potion (Well, duh.)
Amy Tan: gourd, peanut, noodles
Mark Twain: hearted, shucks, satan
Edith Wharton: nearness, daresay, compunction (Man I love me some Edith Friggin' Wharton.)
Virginia Woolf: flushing, blotting, mantelpiece (Chandler Bing: "Could they BE more Virginia Woolf?")
You know: nearly, suddenly, sloppily, etc. Writing teachers tell you to avoid them, that they sap the energy from a sentence. Strong, clear writing is fueled by verbs and nouns, they say, not by adjectives and adverbs.
Turns out, the adverb thing holds up: When Blatt combined several lists of the "Great Books" of the 20th century, he came up with 37 which were generally considered great.
When Blatt compared these to the same authors' other novels, the "Great Books" used significantly fewer adverbs. Of these authors' books that kept to a strict adverb rate (fewer than 50 per 10,000 words), 67% were considered "Great," whereas only 16% of their adverb-loaded books (containing more than 150 per 10,000 words) were ever considered "Great."
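If you want to run your own prose through the wringer, the adverb rate is easy to approximate. The sketch below assumes that "adverb" means any word ending in "-ly," minus a short, hand-picked exclusion list; both of those are my assumptions, not a description of Blatt's exact method. The 50 and 150 cutoffs are the ones quoted above.

```python
import re

# Illustrative, far-from-complete list of "-ly" words that are not adverbs.
NOT_ADVERBS = {"family", "only", "early", "ugly", "holy", "fly", "july", "reply"}

def adverb_rate(text, per=10_000):
    """Approximate adverbs per `per` words, counting "-ly" words as adverbs."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not words:
        return 0.0
    adverbs = sum(1 for w in words if w.endswith("ly") and w not in NOT_ADVERBS)
    return adverbs * per / len(words)

def adverb_bucket(rate):
    """Sort a per-10,000-word rate into the two bands quoted in the article."""
    if rate < 50:
        return "strict"
    if rate > 150:
        return "adverb-loaded"
    return "in between"

sample = "She ran quickly and suddenly stopped near the family home"
sample_rate = adverb_rate(sample)
```

A ten-word sentence is a silly sample, of course; rates like these only mean something over a whole book.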
Well I mean: I hate 'em, at least. My husband uses them like they're powdered sugar and his emails are lemon bars. But I hate 'em.
You know who doesn't hate 'em? Besides my husband, I mean? James Joyce. Dude loved them.
Blatt took a sample of 50 authors of classics and contemporary best-sellers, totaling 580 books. The authors who used the most exclamation points per 100,000 words were:
5. J.R.R. Tolkien (767)
4. E.B. White (782. Gasp; nobody tell Mr. Strunk.)
3. Sinclair Lewis (844. I guess it CAN happen here.)
2. Tom Wolfe (929)
1. James Joyce (1,105)
Elmore Leonard — bless him — used the fewest: Just 49 per 100,000 words.
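The per-100,000-words figure is just a normalized count, which makes it easy to sketch. The version below assumes whitespace-separated word counting and pools all of an author's books before dividing; the function names and the toy corpus are hypothetical, and Blatt's own counting rules may differ.

```python
def exclamations_per_100k(text):
    """Exclamation points per 100,000 whitespace-separated words."""
    n_words = len(text.split())
    if n_words == 0:
        return 0.0
    return text.count("!") * 100_000 / n_words

def rank_authors(corpus):
    """Rank authors by pooled exclamation rate, highest first.

    corpus: dict mapping author name -> list of book texts (hypothetical).
    """
    rates = {
        author: exclamations_per_100k(" ".join(books))
        for author, books in corpus.items()
    }
    return sorted(rates.items(), key=lambda pair: pair[1], reverse=True)

# Toy corpus, invented purely for illustration.
toy_corpus = {
    "Shouty Author": ["Stop! Go!", "Now"],
    "Calm Author": ["Calm words only here"],
}
ranking = rank_authors(toy_corpus)
```

Pooling the books first means a long novel counts for more than a novella, which seems like the fairer reading of "per 100,000 words."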
IT'S RAINING CATS AND DOGS AND CLICHES
When it comes to use of cliches, there's another gender split.
In Blatt's list of 50 classic and best-selling authors, those who use cliches most frequently? All men.
5. Chuck Palahniuk (129 per 100,000 words)
4. Salman Rushdie (131)
3. Kurt Vonnegut (140. All those "And so it goes"es in Slaughterhouse-Five really hurt him here, I bet.)
2. Tom Wolfe (143)
1. James Patterson (160)
(In fairness to Patterson, Blatt includes cliches found in dialogue, and Patterson's characters aren't exactly going around coining new phrases with a Joycean fervor.)
The authors who used the fewest cliches? All women.
5. Veronica Roth (69)
4. Willa Cather (67)
3. Virginia Woolf (62)
2. Edith Wharton (62)
1. Jane Austen (A paltry 45 per 100,000 words, about 1/3 of the rate at which James "More Cliches Than You Can Shake A Stick At" Patterson busts them out.)
Now, again: It's a thin slice, looking at literature in this knowingly reductive way. It doesn't tell you everything, and of course it doesn't give you a true sense of the feeling you get when you read these authors for yourself.
But what it often succeeds in capturing, with astonishing clarity, is your feeling about these authors.
Case in point: The author who is most likely to mention the weather in the opening sentence?
She does it in — precisely — 46 percent of her books.
DAVID GREENE, HOST:
Bear with me because we're going to spend a minute or so talking about statistical analysis, you know, the kind used to analyze data and predict who might win March Madness or an election. But what if we used data to look at literature? NPR's Glen Weldon tells us about a new book called "Nabokov's Favorite Color Is Mauve" (ph) and the patterns it reveals.
GLEN WELDON, BYLINE: A statistician - his name is Ben Blatt - loaded thousands of novels into a huge database and crunched the numbers. One thing those numbers show is that male novelists write overwhelmingly about men. Women write about male and female characters roughly equally. Another thing - cliches.
UNIDENTIFIED MAN: Now or never.
UNIDENTIFIED WOMAN: With all my heart.
UNIDENTIFIED MAN: Nick of time.
WELDON: Writers are supposed to avoid cliches like the plague. Who uses the fewest? Virginia Woolf, Edith Wharton and Jane Austen, all women. The top three cliche abusers? Men - Kurt Vonnegut, Tom Wolfe, and coming in at number one, James Patterson, whose go-to cliche, believe it or not, is...
UNIDENTIFIED MAN: Believe it or not.
WELDON: Let's talk adverbs.
UNIDENTIFIED MAN: Happily.
UNIDENTIFIED WOMAN: Nearly.
UNIDENTIFIED MAN: Suddenly.
WELDON: Your teachers warned you not to use too many. And it turns out their advice holds up. Of the novels that dominate the lists of great books, 2 out of 3 contain few adverbs. Now, this stuff might not surprise you exactly, but it's fun to see your vague notions about writing turned into stark numerical facts. Here's one. Every author has a word that they use much more than others. Here's Jane Austen's.
UNIDENTIFIED WOMAN: Civility.
WELDON: Makes sense, right? How about Herman Melville?
UNIDENTIFIED MAN: Whale.
WELDON: You probably saw that coming. Here's Agatha Christie's three favorite words.
UNIDENTIFIED WOMAN: Inquest. Frightful. Alibi.
WELDON: And as for Tom Wolfe, his favorite word is [expletive]. Ben Blatt's book gets its title from Vladimir Nabokov's favorite word - mauve. He used it at a rate 44 times higher than it's found in other people's writing. That's a lot of mauve. I'd say something about purple prose here but that would be a cliche. I'm Glen Weldon. Transcript provided by NPR, Copyright NPR.