It’s well known that Asian languages are not easy to classify. They come with complex writing systems, sometimes have no spaces between words, and in some cases use little or no punctuation. While Latin-based languages share a degree of consistency, it’s hard to say the same about Asian languages. Translators know this better than anyone, and even more so translators who work with computer-assisted translation tools, or CAT tools as they are known in the industry. There are several challenges with CAT tools that one needs to be aware of. So, what are these challenges, and what types of CAT tool errors can you expect to see if you’re a professional translator? We take a look at some examples.
One of the first CAT tool issues for Asian languages relates to the use of numerals in Japanese. Numbers in Japanese can be written with either Arabic numerals or Japanese characters. If the linguist enters the Japanese characters in place of the Arabic numerals, the CAT tool’s quality check will record an error, because it searches the translation for the Arabic numeral from the source and does not find it. This type of error is known as a “false positive”: an error is reported even though nothing is actually wrong.
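To make the mechanism concrete, here is a minimal sketch of how such a number check might behave. It is not taken from any real CAT tool: the function names, the sample segment pair and the single-digit kanji table are all assumptions made for illustration.

```python
import re

# Hypothetical table covering single kanji digits only. Compound values
# such as 二十一 would need a proper numeral parser, and kanji like 一 also
# occur inside ordinary words, so this is an illustration, not a fix.
KANJI_DIGITS = {"〇": "0", "一": "1", "二": "2", "三": "3", "四": "4",
                "五": "5", "六": "6", "七": "7", "八": "8", "九": "9"}


def ascii_numbers(text):
    """Return every run of ASCII digits found in the text."""
    return re.findall(r"[0-9]+", text)


def missing_numbers(source, target):
    """Naive check: numbers from the source that are absent from the target."""
    target_numbers = ascii_numbers(target)
    return [n for n in ascii_numbers(source) if n not in target_numbers]


def normalize_kanji_digits(text):
    """Replace single kanji digits with their Arabic equivalents."""
    return "".join(KANJI_DIGITS.get(ch, ch) for ch in text)


source = "Chapter 3"
target = "第三章"  # a valid translation that writes the number in kanji

# The naive check reports a missing number (the false positive).
print(missing_numbers(source, target))
# Comparing against a normalized copy of the target removes the report.
print(missing_numbers(source, normalize_kanji_digits(target)))
```

The point of the sketch is only that a check comparing Arabic digits literally cannot see a number written in Japanese characters, so the translator has to review and dismiss the warning by hand.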
A similar problem, in the opposite direction, arises with numbers in Chinese. For example, if the source contains the date 21 October 2021, the Chinese translation will be written as 2021年10月21日, and the CAT tool will report that the translation contains extra numbers, since the month is now written as the number 10.
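The date example can be reproduced with a few lines of code. This is a sketch under stated assumptions: `extra_numbers` and `expand_month_names` are hypothetical helpers written for this article, not functions from any CAT tool.

```python
import re

# English month names mapped to their numbers (illustrative helper table).
MONTHS = {"January": "1", "February": "2", "March": "3", "April": "4",
          "May": "5", "June": "6", "July": "7", "August": "8",
          "September": "9", "October": "10", "November": "11",
          "December": "12"}
MONTH_PATTERN = re.compile(r"\b(" + "|".join(MONTHS) + r")\b")


def extra_numbers(source, target):
    """Naive check: numbers in the target that the source does not contain."""
    source_numbers = re.findall(r"[0-9]+", source)
    return [n for n in re.findall(r"[0-9]+", target)
            if n not in source_numbers]


def expand_month_names(text):
    """Rewrite English month names as numbers before comparing."""
    return MONTH_PATTERN.sub(lambda m: MONTHS[m.group(1)], text)


source = "21 October 2021"
target = "2021年10月21日"

# The month appears as a word in the source and as "10" in the target,
# so the naive check reports "10" as an extra number.
print(extra_numbers(source, target))
# Expanding the month name first makes the two sides agree.
print(extra_numbers(expand_month_names(source), target))
```

A check that knows about month names would avoid this particular warning, but whether a given tool does so is something to verify in its own QA settings.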
There is also an issue with fonts. For example, when working in a language such as Thai without a font that supports the script installed, the CAT tool will simply display the Thai text as a row of blocks.
In addition to this, some CAT tools have the option of limiting the number of characters per segment, which can be a real challenge for translators.
It’s also worth knowing that Thai generally does not use punctuation to mark the end of a sentence. A simple space indicates where a sentence ends, and because words are not separated by spaces either, a sentence in the tool will simply appear as one very long unbroken string.
Finally, memoQ is a CAT tool that doesn’t offer the option of translating Thai, while Trados does offer it but does not handle the language as accurately as one would like. Transit is possibly one of the better options on the market for Asian languages, with accuracy in the region of 90%.
Returning to Chinese
There are also challenges with formatting in Chinese. For example, in English it’s common to leave a space after a period. Chinese has no such convention, so the CAT tool will record a “formatting issues” error.
Staying with periods: in English, a period is written as (.), while in Chinese it is written as (。). In such cases, the CAT tool will again issue an error, saying that “The punctuation at the end of the sentence is incorrect.”
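The end-of-sentence warning is easy to reproduce. The sketch below compares a literal check with one that accepts the Chinese counterpart of each English mark. The equivalence table, the function names and the sample segment pair are assumptions for illustration only.

```python
# Assumed equivalence table: English end mark -> accepted Chinese marks.
EQUIVALENT_END_MARKS = {".": {"。"}, "?": {"？"}, "!": {"！"}}


def last_mark(text):
    """Return the final character of the trimmed text ('' if it is empty)."""
    return text.strip()[-1:]


def naive_end_check(source, target):
    """Pass only if both segments end with the identical character."""
    return last_mark(source) == last_mark(target)


def locale_aware_end_check(source, target):
    """Also pass if the target ends with an accepted Chinese equivalent."""
    src, tgt = last_mark(source), last_mark(target)
    return src == tgt or tgt in EQUIVALENT_END_MARKS.get(src, set())


source = "The file was saved."
target = "文件已保存。"

print(naive_end_check(source, target))         # the literal check fails
print(locale_aware_end_check(source, target))  # the aware check passes
```

Until a check is configured along these lines, a correctly punctuated Chinese sentence will keep triggering the warning and the translator will need to dismiss it manually.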
Next up is the issue of parentheses. In English, the most common parentheses are ( ). In Chinese, full-width parentheses are used instead: （）. Not only do they look slightly different, but each one also occupies a full character cell, so they carry more space around them. Here is an example of the Chinese form followed by the English form:
（Chinese）, (English)
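The two kinds of parentheses are different characters as far as software is concerned, which is why a check that looks for the English ones can miss the Chinese ones. Here is a minimal sketch, assuming a simple bracket-count check; the function names and the sample segment pair are made up for illustration. It relies on Unicode NFKC normalization, which folds full-width forms to their ordinary counterparts.

```python
import unicodedata


def count_parens(text):
    """Count ASCII opening and closing parentheses."""
    return text.count("("), text.count(")")


def naive_paren_check(source, target):
    """Pass only if both segments contain the same ASCII parentheses."""
    return count_parens(source) == count_parens(target)


def width_insensitive_paren_check(source, target):
    """Fold full-width forms to ASCII with NFKC before counting."""
    folded_source = unicodedata.normalize("NFKC", source)
    folded_target = unicodedata.normalize("NFKC", target)
    return count_parens(folded_source) == count_parens(folded_target)


source = "Press Save (Ctrl+S)"
target = "按保存（Ctrl+S）"

# Different code points: U+0028 for "(" and U+FF08 for "（".
print(hex(ord("(")), hex(ord("（")))
print(naive_paren_check(source, target))              # flags a mismatch
print(width_insensitive_paren_check(source, target))  # accepts the pair
```

Normalization is used here only for the comparison. The translation itself should keep the full-width parentheses that Chinese typography expects.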
Back to Japanese
Another challenge with CAT tools is that they mark the repetition of certain characters as an error. Take this Japanese sentence: ここがそののち、敵に見つかる. The repetition of のの will often be flagged in a CAT tool even though it is not a mistake. Once again, this is a false positive, reported as a “repetitive characters” error.
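A repeated-character check of this kind can be sketched with a single regular expression. The code below is an illustration, not the logic of any particular CAT tool, and the allowlist is a hypothetical workaround. Note that on this sentence a naive check would flag not only のの but also ここ, which is an ordinary word.

```python
import re

# Matches any character immediately followed by itself.
REPEAT = re.compile(r"(.)\1")


def repeated_characters(text):
    """Return every doubled character found in the text."""
    return [m.group(0) for m in REPEAT.finditer(text)]


def filtered_repeats(text, allowed):
    """Drop doubled characters that a linguist has marked as legitimate."""
    return [pair for pair in repeated_characters(text) if pair not in allowed]


sentence = "ここがそののち、敵に見つかる"

# The naive check reports two perfectly valid sequences.
print(repeated_characters(sentence))

# Hypothetical allowlist maintained by the translator or reviewer.
ALLOWED_PAIRS = {"ここ", "のの"}
print(filtered_repeats(sentence, ALLOWED_PAIRS))
```

An allowlist is a blunt instrument, since a doubled character can be right in one sentence and a typo in another, so the final call still rests with the linguist.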
Although designed to make our lives as translators much easier, CAT tools are imperfect, and it often takes a real human being to pick up on such CAT errors and reconcile them with the text. From punctuation to fonts, or even missing language segments, CAT tools have their limitations. Although they are extremely helpful for large-scale projects, the experience of a qualified professional will always be valued.