There was some mighty big news made today -- mighty big if you're a globalization geek -- the fifth iteration of Unicode was officially launched.

Says the press release: "The Unicode Consortium announces the release of a significant update of its widely-used Unicode Character Database (UCD). The new version, Version 5.0, defines more than 99,000 characters for the languages of the world, and provides the detailed properties needed for computer software implementations. This latest level of the UCD contains all the information needed to update software to support the characters and algorithms that are the foundation for all modern computer programs -- including the latest data for Unicode security mechanisms, collation, and locales."
A print version of the standard is forthcoming. I have version 3.0, which weighs in at more than a thousand pages; I can only imagine how big the 5.0 book will be. Actually, if you want to get a true feel for the significance of Unicode, you really need to get the book. I got such a kick out of browsing through all those characters from all those languages that I don't speak. It puts little ol' English in perspective. It's an impressive achievement.
At this point it seems the improvements to Unicode are more about wiring and plumbing than simple character additions. Fewer than 2,000 characters were added this time around. But those characters do represent five new scripts: Balinese, N'Ko, Phags-pa, Phoenician, and Sumero-Akkadian Cuneiform.
I'm still in awe of Unicode and the people who developed it. Thanks to Unicode we can post multiple scripts on one Web page (whether or not they all display properly is another issue). Thanks to Unicode, a global company can purchase one content management system and, assuming it supports Unicode, allow all of the offices to contribute content, in nearly any language.
One application; many languages.
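For the programmers in the audience, here is a rough sketch of what "one application; many languages" looks like in practice. It assumes Python 3 and UTF-8 as the encoding; the sample greetings and the helper names (`round_trip`, `describe`) are made up for illustration and are not taken from any particular content management system.

```python
import unicodedata

# Hypothetical sample content: greetings in several scripts, all ordinary strings.
greetings = {
    "English": "Hello",
    "Japanese": "こんにちは",
    "Russian": "Привет",
    "Arabic": "مرحبا",
    "Chinese": "你好",
}


def round_trip(text: str) -> str:
    """Encode text as UTF-8 bytes and decode it back, as a storage layer might."""
    return text.encode("utf-8").decode("utf-8")


def describe(char: str) -> str:
    """Return the Unicode character name, or a placeholder if none is defined."""
    return unicodedata.name(char, "UNNAMED")


# One string, many scripts: this is what a single multilingual page boils down to.
page = " | ".join(greetings.values())
encoded = page.encode("utf-8")

if __name__ == "__main__":
    print(page)
    print(len(page), "characters stored as", len(encoded), "bytes")
    for language, text in greetings.items():
        print(language, "starts with", describe(text[0]))
```

Whether every script actually renders still depends on the fonts installed on the reader's machine, which is the display issue mentioned above.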
When I got into this field in 1999, creating a Japanese-language Web page required purchasing the Japanese edition of Windows, for starters. Those were the dark ages indeed. Thanks to Unicode, so many of the technical hurdles are gone, allowing people to simply communicate.
You can read all the details of 5.0 here.
July 17, 2006
Posted by John Yunker
Lionbridge issued a press release today on the growth of its new Web-based translation memory (TM) application, Logoport. Here are some notable stats:
- The volume of words managed by Logoport has more than doubled since its launch two months ago.
- Exactly how many words is that? Lionbridge says Logoport is nearing one billion words.
- Logoport supports more than 1,000 unique daily users and more than 40 million database queries per hour.
- One client is seeing an additional 5% to 10% reuse rate thanks to Logoport because it is now leveraging more than 150 separate TMs.
Category: Globalization Vendors
July 11, 2006
Posted by John Yunker
According to Reuters, China is doing its part to help the world learn to speak its language.
It just launched the Chinese-instruction Web site: Linese.com.

It's somewhat ironic that the site comes across as poorly translated; bridging the English-Chinese gap, in either direction, is no easy task.
I've talked to a number of people lately who tell me that they or their children (or both) are now learning Chinese. Reuters puts the estimate at 30 million students globally.
When I was in college, learning Japanese was all the rage. Now it is Chinese. Something tells me that this particular language, as far as the US is concerned, is much more than a passing fad.
Category: China