Drawing a line in the noodles (a lesson in bad translation [Chinese to English])

Your Chinese lesson for the day:

The following photograph was found on the internet by Charles Mok and was shared by Rebecca MacKinnon (of the Berkman Center) on Facebook:

Just make sure that you don’t slip on the pasta! Seriously, though, what is a traveler supposed to do when instructed to “wait outside rice-flour noodle”?

This is what the Chinese sign really says:

请在一米线外等候
Qǐng zài yī mǐ xiàn wài děnghòu
“Please wait behind the one meter line”

Unlike several recent Chinglish signs that we’ve examined (e.g. “Difficult to find the translation”, “Google me with a fire spoon”), the English in this case was not taken directly from Google Translate, which gives “Waiting outside in a noodle”. But Babel Fish gives “Please wait outside rice-flour noodle.” Bingo!!

So how did Google Translate and Babel Fish bring noodles into the picture?

The answer is simple: both Google Translate and Babel Fish failed to parse the sentence correctly. They treated yī mǐ xiàn 一米线 as if it were [yī [mǐ xiàn]] “one rice-noodle” instead of [[yī mǐ] xiàn] “one-meter line”, where mǐ 米 is an abbreviated transcriptional loan for English “meter”, rather than a morpheme meaning “rice”.
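To make the two competing bracketings concrete, here is a minimal sketch that enumerates every way of splitting 一米线 into dictionary words. The lexicon is a toy invented for illustration (the entries and glosses are assumptions, not data from Google Translate or Babel Fish), and real MT systems do not necessarily segment this way.

```python
# Toy lexicon: every entry is an illustrative assumption, not real MT data.
LEXICON = {
    "一": ["one"],
    "米": ["rice", "meter"],
    "线": ["line"],
    "米线": ["rice noodle"],
    "一米": ["one meter"],
}

def segmentations(text):
    """Return every way to split `text` into words found in LEXICON."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in LEXICON:
            # Keep this prefix and recursively segment the remainder.
            for rest in segmentations(text[end:]):
                results.append([word] + rest)
    return results

for seg in segmentations("一米线"):
    print(" | ".join(seg))
```

Both 一 | 米线 (the noodle reading) and 一米 | 线 (the one-meter-line reading) come out as valid splits, so the lexicon alone cannot decide between them; something else has to rank the candidates.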

In overall statistical terms, mǐ 米 = “rice” is no doubt much more common than mǐ 米 = “meter”. And it doesn’t really help us to note that in Modern Standard Mandarin (MSM), mǐ 米 is a bound morpheme and only means “rice” in specific combinations such as cāomǐ 糙米 (“brown rice”), mǐfàn 米饭 (“[cooked] rice”), dàomǐ 稻米 (“[paddy] rice”), because in this case, the sign does have the sequence mǐxiàn 米线, which really can mean “rice noodles”.

How does Baidu Fanyi handle Qǐng zài yī mǐ xiàn wài děnghòu 请在一米线外等候? The output is perfect:
“Please wait behind the 1 metre line.”

It’s possible that this translation is perfect because the exact same pair of Chinese and English phrases was found in the training corpus, perhaps more than once. Still, Baidu’s treatment of the problematic part of this sentence (taken in different chunks) is:

在一米线外等候 “in line waiting outside” (no coverage of the 一米 in this or the next two iterations)

在一米线外 “in a line outside”

一米线外 “line outside”

米线外 “rice noodle and”

米线 “rice noodle”

At least in this sample, so long as the numeral 一 appears before the 米, Baidu Fanyi does not make the mistake of translating the sequence 米线 in this collocation as “rice noodle”. This is probably a safe heuristic, at least in translating signs, but it’s not clear how to ensure that this result emerges from a standard statistical MT algorithm. Perhaps a better method would be to adjust the language-model priors to take note of the fact that rice-flour noodles are less likely than one-meter lines in the context of airport crowd-control signage.
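As a rough illustration of the two ideas in the preceding paragraph, the sketch below combines the numeral cue with a domain-dependent prior. Everything in it is hypothetical: the function name `gloss_mixian`, the probability values, the numeral list, and the boost factor are invented for the example, and it is not a description of how Baidu Fanyi or any statistical MT system actually works.

```python
# All numbers are made-up illustrative priors, not measured frequencies.
GENERAL_PRIOR = {"rice noodle": 0.8, "meter line": 0.2}
SIGNAGE_PRIOR = {"rice noodle": 0.05, "meter line": 0.95}

# A small, non-exhaustive set of numerals that can precede a measure word.
NUMERALS = set("一二三四五六七八九十两半")

def gloss_mixian(text, prior):
    """Pick a reading for 米线 in `text` from a numeral cue plus a domain prior."""
    i = text.find("米线")
    if i == -1:
        return None  # the ambiguous sequence does not occur
    scores = dict(prior)
    # Heuristic: a numeral right before 米 suggests the "meter" reading.
    if i > 0 and text[i - 1] in NUMERALS:
        scores["meter line"] *= 5.0  # hypothetical boost factor
    return max(scores, key=scores.get)

print(gloss_mixian("请在一米线外等候", SIGNAGE_PRIOR))
print(gloss_mixian("米线", GENERAL_PRIOR))
```

With these invented numbers the numeral cue alone is enough to tip the general-domain prior toward the "meter line" reading, while a bare 米线 in general text still comes out as "rice noodle". How to get comparable behavior out of a trained model, rather than a hand-written rule, is exactly the open question.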

Or, more radically, you could pay someone who knows both Chinese and English to translate your signs.

[A tip of the hat to Joel Dietz]

August 14, 2011 @ 11:03 am
· Filed by Victor Mair under Lost in Translation

