

Giessen, Germany — Germany spends 220 million euros (US$300 million) a year on the Goethe Institute, the agency promoting the German language, but Germany’s academic grants also promote research into Latin, Chinese and many other tongues.
The English of Sri Lanka, once reviled as the language of colonialism, is next up for study.
Linguists at the University of Giessen in central Germany are running the project. Another group of linguists at the same university is doing a matching survey of the English of Ghana.
In U.S. English, "taking a call" can only mean receiving a phone call initiated by someone else.
When Sri Lankan people speaking English say they will "take a call," it might equally well mean dialing a phone or answering a phone call.
The survey chief, Joybrato Mukherjee, 36, who is both a professor of English and head of Giessen University, says, "It is possible to see why, because there are similar constructions in both Sinhala and Tamil," the languages of Sri Lanka’s main ethnic groups.
Each variety of English has its own distinctive ways with words, as well as a typical accent.
English has a controversial history in Sri Lanka, where authorities tried in the 1950s and 1960s to limit its use, he explained.
"English was reintroduced in the 1980s as the so-called link language," he said. It was needed so the feuding Sinhalese and Tamil communities could interact through "a third, neutral language."
As part of a worldwide project, Mukherjee studies the varieties of English using computers. For each English variety, linguists create a 1-million-word database, or "corpus," containing a broad sample of the living language.
"A corpus is a large collection of English texts, spoken and written. The corpus is designed in such a way that it is representative of the variety of English that you are studying," explained Mukherjee.
"What we are doing is collecting the first corpus of 1 million words of Sri Lankan English. Sri Lankan English has been under-researched. There are hardly any empirical studies. This will be the first large database."
A corpus has to be "so big and so balanced across various genres that if you analyzed this corpus, you could treat that corpus as a statistical sample in its entirety. That’s the basic reasoning behind corpus linguistics," he said.
Each corpus is gathered the same way, so that one corpus can be compared with another by computer. The software can pick out distinctive words and syntax.
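As a rough illustration of how software might pick out distinctive words when two corpora are compared, the Python sketch below ranks words by log-likelihood, a measure commonly used in corpus linguistics. It is not the ICE project's own tooling; the function names and the choice of measure are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood score for a word seen a times in corpus A and b times in corpus B."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2 * score


def distinctive_words(text_a, text_b, top=5):
    """Return the words most characteristic of text_a when compared with text_b."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        return []
    scored = []
    for word in set(counts_a) | set(counts_b):
        a, b = counts_a[word], counts_b[word]
        # Keep only words that are relatively more frequent in corpus A.
        if a / total_a > b / total_b:
            scored.append((log_likelihood(a, b, total_a, total_b), word))
    # Highest score first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top]]
```

Calling `distinctive_words(corpus_a_text, corpus_b_text)` on two comparably sampled texts returns the words that are over-represented in the first. A real study would run this over the full 1-million-word corpora and would also need to handle syntax, which plain word counts cannot capture.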
Mukherjee’s project is part of the International Corpus of English (ICE), which began in 1990 and is gradually growing into 25 or more matching electronic corpora (the plural of corpus).
ICE only studies users who speak educated English as their native or main language. Its website says samples are collected from speakers age 18 or older who have been educated through the medium of English to at least the end of secondary schooling.
Giessen will be hosting a workshop in early April for five assistants from Sri Lanka, plus five from Ghana. They will learn how to transcribe the spoken data which they will record while traveling round Sri Lanka.
Travel fares and salaries for the people involved are costly, but Mukherjee has won German government academic grants to push the project forward. He has "two or three" German research assistants as well as students helping out as part of degree studies.
"The written component is almost done. This year we will apply for major funding for the spoken component," Mukherjee said.
Transcribing the recordings and marking up all the texts is likely to be "tedious work," the professor said ruefully.
"You need many members in a team so various people can proofread," he said.
The German professor, who has ancestral links with Bengal, said he had not touched on South Asian English in his postgraduate studies.
"I had already been a professor here for two years when a colleague, Chris Tribble, who had already begun the Sri Lanka English corpus, asked me in 2005 if I would like to take it over," he explained.
The survey of the English of Ghana is being directed by Magnus Huber, the other professor of English at Giessen University.
ICE’s Hong Kong-based chief, Gerald Nelson, is now investigating cheaper ways of creating a "lite" corpus by crawling the internet. He has compiled more limited corpora of Ugandan and Sierra Leonean English using nothing but the World Wide Web. (Chinapost)