Originally published: Sep 30 2014 - 7:00pm, Inside Science News Service
By: Joel N. Shurkin, Contributor
(Inside Science) -- More than 100 years ago, the playwright Oscar Wilde had one of his British characters say that England and America “have everything in common nowadays except, of course, language.” It turns out, according to linguists, he was almost right. But lately, the two languages are getting closer.
Languages change over time -- some faster than others. Some reflect changes in the world around them, according to a new paper published by The Royal Society in London. There are universal and historical factors at work, and languages change at varying rates, the scientists found.
The researchers used the Google Books Ngram corpus to monitor word and phrase usage in the past five centuries in eight languages. They drew from 8 million books – roughly 6 percent of all the books ever published, according to Google's own estimates. The books were scanned into a database by Google.
While linguists have long known that rates of change vary, this study, built on the gigantic Google database, is by far the largest of its kind.
The researchers were an international group that ironically had its own language difficulties.
The lead author was Søren Wichmann, a Dane working at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany. His coauthors were Valery Solovyev, a linguist at Kazan Federal University in the Republic of Tatarstan in Russia, and astrophysicist Vladimir Bochkarev, also at Kazan, who was interested in languages. The work was done at the Kazan linguistics lab.
Research was hampered by the fact that Wichmann did not speak Russian, and Bochkarev didn’t speak English.
Wichmann’s wife translated part of the time. Otherwise they used Google’s translator, which was not always useful.
For this study, they delved into written languages, which are more conservative in their expressions, rather than tackle spoken languages for which there is no good record. They looked specifically at how frequently words were used. Each word form counted as one word; for instance "park" and "parked" were counted as two different words.
The process they used is called "glottochronology" by linguists.
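The counting rule described above, in which every word form is tallied on its own, can be illustrated with a minimal sketch. It is not the researchers' code: the function name `word_form_frequencies`, the simple tokenizer and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_form_frequencies(text):
    """Count each distinct word form separately, with no stemming.

    Hypothetical helper: lowercases the text and treats runs of
    letters and apostrophes as words.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Made-up sample text, not data from the study.
sample = "We park here. She parked there. They park nearby."
freqs = word_form_frequencies(sample)

# "park" and "parked" are kept as two different entries.
print(freqs["park"], freqs["parked"])
```

Because nothing is stemmed, "park" and "parked" end up as separate entries, just as the article describes.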
Language Shaped by Culture
“One word which was earlier specialized might take on a broader meaning and can replace the word that had a broader meaning before,” Wichmann said.
Sometimes it is just a matter of fashion; sometimes it is outside events. For instance, the early English word for “dog” was “hound.” Now “hound” is a specific kind of dog. The same thing may be happening in reverse to the word “vodka,” which in some places is replacing “liquor.”
“Any major change in society will change the frequency of words,” Wichmann said.
Mostly, the researchers found, languages change at a similar rate, and that change is usually measured over spans of about half a century unless something intervenes, like a war. When wars come, Wichmann said, vocabulary changes more rapidly as new words like “Nazis” enter the language and people start thinking about things they did not contemplate before hostilities.
During the Victorian era, the height of the British Empire and a very stable time in Britain, the language was fairly steady. With the tumult and chaos of the 20th century, vocabulary changes came more rapidly.
For the first half of the 19th century, the Queen’s English and American English were the same, except that British English lagged about 20 years behind: new words entered the American English lexicon but appeared in Britain only about 20 years later. From about 1850 on, British English and American English drifted apart.
Then, the influence of the mass media began to bring the two languages together starting in 1950. Now, the two languages are far more similar than they were before, Wichmann said.
Challenges in Learning Languages
Ever wonder why some languages are harder for adults to learn than others? The researchers point out that languages contain what linguists call a “kernel lexicon,” meaning a list of words that constitute 75 percent of the written language. If you know those words, you can make out much of the literature. These also are the words least likely to change even as the language morphs.
The kernel lexicon for English is fewer than 2,400 words. If you know them, you can read 75 percent of the text. The kernel lexicon for Russian is about 24,000 words. So, even though the whole of the English language has about 600,000 words and Russian has only about a sixth of that, without those crucial kernel words, most Russian writing would be largely incomprehensible.
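One way to picture a kernel lexicon is as the shortest list of most-frequent words that together account for 75 percent of all the words in a text. The sketch below is only an illustration under that assumption; the function name `kernel_lexicon`, the greedy most-frequent-first selection and the toy sentence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def kernel_lexicon(tokens, coverage=0.75):
    """Return the most frequent words that together cover `coverage`
    of all word occurrences (hypothetical helper for illustration)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    kernel = []
    covered = 0
    # Add words from most to least frequent until the target is reached.
    for word, count in counts.most_common():
        if covered >= coverage * total:
            break
        kernel.append(word)
        covered += count
    return kernel


# Toy text, not data from the study.
tokens = "the cat saw the dog and the dog saw the bird".split()
kernel = kernel_lexicon(tokens)
print(len(kernel), "of", len(set(tokens)), "distinct words cover 75 percent")
```

The more evenly a language spreads its usage across many word forms, the longer this list has to be before it reaches the 75 percent mark, which is the contrast the article draws between English and Russian.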
"The fact a given word might be used a lot in one period doesn't necessarily mean the word is new," said Brian Joseph, distinguished university professor of linguistics at the Ohio State University in Columbus. For instance, one word now trending in English is "cupcake."
Sometimes words combine, like "labradoodles," he said.
Definitions change too. Some words meant one thing to Shakespeare but mean something else to us, said David Lightfoot, a professor of linguistics at Georgetown University in Washington, D.C. "Scientist" is in the current lexicon, but before the 19th century, scientists were called "natural philosophers."
Sometimes the change in wording tells us more than we think it would. In recent years, the use of the word “divorce” has become more frequent than “marry,” Wichmann said.
Perhaps more telling, “information” is replacing “wisdom.”
Joel Shurkin is a freelance writer based in Baltimore. He is the author of nine books on science and the history of science, and has taught science journalism at Stanford University, UC Santa Cruz and the University of Alaska Fairbanks. He tweets at @shurkin.