John Yunker is founder of Byte Level Research and author of the widely acclaimed book, Beyond Borders: Web Globalization Strategies and editor of Global By Design.
He has covered the emerging field of Web globalization for half a decade and has published a wide range of reports dedicated to best practices in Web localization and internationalization.
About this blog
Going Global focuses on the risks and rewards of expanding into new geographic and cultural markets, from Web globalization to international marketing to global usability.
Quechua is the language of the Incan Empire and is spoken by roughly 10 million people throughout South America, the majority of whom live in Peru and Bolivia.
Recent developments suggest that this “minority” language is not going gently into that good night.
Google currently supports Quechua with a localized search engine.
And it is not the only software company to support this language.
I read this morning, via Michael Kaplan’s blog, that Microsoft now supports Quechua in Windows and its Office software. It will be announcing this language support with the newly elected president of Bolivia, Evo Morales, who is an Indian. I should note that this level of support only applies to menus and commands; I don’t expect to see a knowledgebase translated anytime soon. Still, a little support is much much better than no support.
In addition, The Economist features an article on Quechua, noting that a recently elected member of Peru’s Congress is now speaking Quechuan instead of Spanish.
This high-level support for the language will help ensure that multinational companies provide support as well, a positive sign for one of the world’s oldest surviving languages.
Google Desktop began as a simple application that would let you search your computer's hard drive as quickly as it searches the Internet. And, best of all, it was free.
It still is free, but now it supports lots of little desktop applications, known as gadgets, and 28 languages, including Finnish, Turkish, and Romanian (excerpted here).
You can reach more than 80% of the world's Internet users with just 10 languages. So Google is clearly making good progress in expanding the reach of this application.
However, for those of us on Macs (like me) Google Desktop is still out of reach.
I got an email last week from Chris Wood, one of the authors of "Bilingual Software Standards and Guidelines in Wales." I realize that this may not sound like the most exciting read, but I recommend downloading a copy (hey, it's free!).
I'm reading it now and have found that it includes good advice for anyone involved in Web or software globalization. Most of the concepts carry over to any bilingual application.
And there were some interesting little nuggets about Welsh that I wasn't aware of. For example, here is the appropriate way to alpha sort the following three words:
- label
- lori
- llefrith
The third word looks like it should be bumped up a notch, but the "ll" is actually a single letter: a digraph. Digraph letters that occur in the Welsh alphabet include: ch, dd, ff, ng, ll, ph, rh, th. When sorting, the "ll" falls after "l."
It's often assumed that languages that use the same basic letters all sort the same way. Not so. You'll find these little quirks in a host of Latin-based languages.
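Here is a minimal sketch of that sort in Python. The `welsh_key` helper is my own illustration, not something from the guidelines document. It assumes every occurrence of a digraph pair is the digraph letter, which is not always true in real Welsh text, so treat it as a demonstration of the idea rather than a production collation routine.

```python
# Welsh alphabet in dictionary order; the digraphs count as single letters.
WELSH_ALPHABET = [
    "a", "b", "c", "ch", "d", "dd", "e", "f", "ff", "g", "ng", "h", "i",
    "j", "l", "ll", "m", "n", "o", "p", "ph", "r", "rh", "s", "t", "th",
    "u", "w", "y",
]
RANK = {letter: index for index, letter in enumerate(WELSH_ALPHABET)}


def welsh_key(word):
    """Turn a word into a list of letter ranks, reading digraphs greedily.

    Assumption: any two characters that spell a digraph are treated as
    that digraph. Letters outside the Welsh alphabet sort after it.
    """
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in RANK:
            key.append(RANK[pair])
            i += 2
        else:
            key.append(RANK.get(text[i], len(WELSH_ALPHABET) + ord(text[i])))
            i += 1
    return key


words = ["llefrith", "lori", "label"]
print(sorted(words))                 # plain code-point order
print(sorted(words, key=welsh_key))  # Welsh alphabetical order
```

The plain sort puts "llefrith" between the other two words; the Welsh key moves it to the end, because "ll" ranks after "l".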
If you've got a half hour to spare, it's worth a watch. It gets a bit techie at times but there are some great nuggets of wisdom for anyone involved in software or Web globalization.
If you don't have the time, here are a few items that jumped out at me...
-> Vista is being localized into roughly 100 languages (some partially) -- this is, as I understand it, about twice the number of languages that were supported by Windows XP. By the way, this blows away the number of languages supported by the Mac.
-> Microsoft is "opening it up" and "getting out of the way" -- which means that they know that they won't be able to localize Windows into a thousand languages anytime soon, so they are working to create the tools to allow folks around the world to customize Windows to their languages and cultures. I'm glad to see Microsoft doing this -- Michael introduced a nifty keyboard tool that you can use to create your own keyboard layouts. Very nice.
-> Vista will support roughly 200 locales. This is a big increase from XP. A locale includes such elements as language, date format, currency format, etc.
-> "You can't know everything" -- is Michael's advice to other would-be internationalization engineers. So true. This is one thing I really love about this field -- there are just too many languages and cultural nuances for anyone to know it all. It means that we're always learning something new and that teamwork is essential to success.
-> Get to know Unicode. Unicode came up several times during the interview. Microsoft was an early promoter of Unicode and Unicode truly has revolutionized global software development. The last remaining non-Unicode area on the Internet is the DNS -- which engineers are grappling with as we speak.
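To make the locale item above a bit more concrete, here is a rough sketch of a locale as a bundle of language, date format, and currency format. The `Locale` class, its field names, and the two sample entries are hypothetical and hand-written for illustration; they are not how Windows stores its locale data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Locale:
    """Hypothetical locale record: a language tag plus formatting rules."""
    tag: str               # e.g. "en-US"
    date_pattern: str      # strftime pattern for short dates
    decimal_sep: str       # character between whole and fractional part
    group_sep: str         # character between groups of three digits
    currency_symbol: str
    currency_pattern: str  # where the symbol goes relative to the amount


def format_date(locale, value):
    """Render a date using the locale's short date pattern."""
    return value.strftime(locale.date_pattern)


def format_currency(locale, amount):
    """Render a non-negative amount with the locale's separators and symbol."""
    whole, fraction = f"{amount:,.2f}".split(".")
    number = whole.replace(",", locale.group_sep) + locale.decimal_sep + fraction
    return locale.currency_pattern.format(symbol=locale.currency_symbol, amount=number)


# Two illustrative locales (values are assumptions, not taken from any OS).
EN_US = Locale("en-US", "%m/%d/%Y", ".", ",", "$", "{symbol}{amount}")
DE_DE = Locale("de-DE", "%d.%m.%Y", ",", ".", "€", "{amount} {symbol}")

for loc in (EN_US, DE_DE):
    print(loc.tag, format_date(loc, date(2006, 11, 30)), format_currency(loc, 1234.5))
```

The same date and the same amount come out differently for each locale, which is why the locale count matters separately from the language count.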
Lionbridge recently announced the completion of software localization for the new Microsoft Visual Studio 2005. This project entailed the localization of the user interface and documentation, which included an estimated 15 million words across 8 languages. This project sure kept a lot of translators busy.
This is also the type of project that only a few localization firms can handle. It requires a mix of engineers, translators, and project managers skilled in software localization. Also, the effective use of translation memory tools is essential.
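For readers who haven't worked with translation memory, here is a toy sketch of the core idea: store previously translated segments and reuse them when the same or a very similar sentence shows up again. The `TranslationMemory` class and the sample segments are invented for illustration, and the fuzzy matching uses Python's general-purpose `difflib` rather than whatever commercial tools actually use.

```python
import difflib


class TranslationMemory:
    """Toy translation memory: source segment -> stored translation."""

    def __init__(self):
        self._segments = {}

    def add(self, source, target):
        """Remember an approved translation for a source segment."""
        self._segments[source] = target

    def lookup(self, source, threshold=0.8):
        """Return (translation, score) for an exact or fuzzy match, else None."""
        if source in self._segments:
            return self._segments[source], 1.0
        candidates = difflib.get_close_matches(
            source, list(self._segments), n=1, cutoff=threshold
        )
        if not candidates:
            return None
        best = candidates[0]
        score = difflib.SequenceMatcher(None, source, best).ratio()
        return self._segments[best], score


tm = TranslationMemory()
tm.add("Click the Save button.", "Klicken Sie auf die Schaltfläche Speichern.")
tm.add("Open the File menu.", "Öffnen Sie das Menü Datei.")

print(tm.lookup("Click the Save button."))   # exact match
print(tm.lookup("Click the Save button"))    # fuzzy match, translator should review
print(tm.lookup("Quarterly revenue fell"))   # no match, needs fresh translation
```

On a 15-million-word project, even a modest share of exact and fuzzy matches adds up to a lot of text that doesn't have to be translated from scratch.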
This project began as a Bowne Global Solutions project and -- due to the acquisition -- was completed as a Lionbridge project.
The standards organization OASIS recently approved Darwin Information Typing Architecture (DITA) version 1.0 as an OASIS Standard -- a "status that signifies the highest level of ratification."
So what exactly is DITA?
According to the press release, "DITA consists of a set of design principles for creating "information-typed" modules at a topic level. DITA enables organizations to deliver content as closely as possible to the point-of-use, making it ideal for applications such as integrated help systems, web sites, and how-to instruction pages. DITA's topic-oriented content can be used to exploit new features or delivery channels as they become available."
Still not clear?
I'm afraid this is one of those standards that only an information architect could love. Fortunately for me, I did have the benefit of an Idiom presentation on DITA recently. The presentation illustrated how the standard will aid in managing content across languages as well as across departments and media (Web, print, mobile).
And there is a real need among enterprises for an XML standard that allows them to "chunk" content in a way that allows for such wide-scale reuse and translation. I'll know more when I see some real-world success stories, of which there are none as of yet. But I'm sure the folks at Idiom and Arbortext are writing up their case studies as we speak.
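To give a feel for what "chunking" content at the topic level means, here is a simplified sketch. The XML below borrows the basic topic, title, body, and p element names, but it is a stripped-down illustration that has not been validated against the DITA DTD, and the `load_topic`, `to_html`, and `to_text` helpers are hypothetical names of my own.

```python
import html
import xml.etree.ElementTree as ET

# A single self-contained topic: the unit of reuse and of translation.
TOPIC_XML = """<topic id="install">
  <title>Installing the widget</title>
  <body>
    <p>Unpack the archive.</p>
    <p>Run the installer.</p>
  </body>
</topic>"""


def load_topic(xml_text):
    """Parse one topic into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "paragraphs": [p.text for p in root.iter("p")],
    }


def to_html(topic):
    """Render the topic for the Web."""
    parts = [f"<h1>{html.escape(topic['title'])}</h1>"]
    parts += [f"<p>{html.escape(p)}</p>" for p in topic["paragraphs"]]
    return "".join(parts)


def to_text(topic):
    """Render the same topic as plain text, e.g. for a small screen."""
    return "\n".join([topic["title"]] + topic["paragraphs"])


topic = load_topic(TOPIC_XML)
print(to_html(topic))
print(to_text(topic))
```

The point is that the topic is written (and translated) once, then pushed to as many output formats as needed.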
So where will we be seeing DITA commercially?
Here are the principal vendor supporters: Idiom, Arbortext, BMC, IBM, Intel, Nokia, Oracle, and Sun. The ones to watch are not just these folks but also the folks not on the list, particularly Documentum, Interwoven, and Vignette. It will be interesting to see if other CMS vendors jump on the DITA bandwagon.
If you want to learn more, and have a few hours to spare, here are the tech specs on DITA.
The track promises that "developers will be provided with the tools needed to develop World-Ready solutions that support multiple writing systems that are easy to localize. IT professionals will learn important skills in deploying Windows XP and Office 2003 Multilingual User Interface version to support global businesses from New York to Beijing."
Session details can be found here. Here are a few titles that jumped out at me:
- Deploying Office 2003 in a Multilingual Environment
- Custom Cultures and International Data
- Planning a Global Release: Many countries, Many Languages, One Process
I also recommend sitting in on Michael Kaplan's sessions on designing databases for the world. If you're running a SQL Server database and you're not sure how to handle all those different character sets, this is the place to be.
I know over the years I have harped on Microsoft's reluctance to invest fully in Web and software globalization (how come Google has 100+ language interfaces and Microsoft is stuck in the 40s?). Oh, I know it's all about ROI and piracy and so on and so forth. But Microsoft of all companies can afford a few loss leaders; for what the company has lost on the Xbox in one year it could localize MS Office into 125 languages.
But I digress.
Despite our differences, I really do recommend this event (and they didn't pay me to say that). Microsoft has some truly top notch people working in their internationalization group and the company as a whole has done a great deal to advance the use of Unicode as the world's default character set.
Unfortunately, I'm not going to be there, but if you are, send me an update - or a PPT.
Sun says that it will localize its StarOffice software suite into five additional languages over the next year, including Russian, Polish and Dutch. The software is currently available in 11 languages.
StarOffice costs a great deal less than Microsoft's Office suite, which makes it particularly appealing in an emerging market like Russia. While Microsoft's Office suite is available in more languages than StarOffice, the gap is narrowing.
Richard Koman for SiliconValley Watcher writes about the potential of Gmail in developing markets.
The greatest reason why I think Gmail has a good shot at popular usage is Google's expertise at localizing the platform for developing markets. I don't expect Microsoft to localize Outlook for Bihari, Tonga or Swahili anytime soon, but Google is well on its way toward supporting these languages on its main platform.
It remains to be seen whether Google will put the resources into fully localizing Gmail, but the potential is certainly there.
Localization industry organization LISA has posted an interesting Q&A with Ori Redler, co-founder of software vendor RedleX, which has developed a word processing application for the Mac that supports such "minority" languages as Welsh, Farsi and Hebrew. Not only does the software take Unicode support to the next level, it does so at an affordable price, at just $39 per license.
Why have the larger software developers overlooked (or ignored) minority languages thus far? Here's what Ori has to say:
"Do other developers show disrespect by not issuing localized versions of their products? Not necessarily. Many of them, I feel, simply lack the awareness or tend to ignore markets outside English-speaking countries as 'irrelevant.' I can understand the financial sense of this decision, at least with respect for some languages. But I think this is also a misunderstanding of the situation. The fact that Czech, Greek or Swedish users don't make a fuss about getting their localized versions doesn't mean that they don't need them. They do. They're just being civil about it, or, worse, are simply accustomed to being ignored."
RedleX uses an open source localization model, asking volunteers to contribute to the effort. Meanwhile, a company like Microsoft says that it can't afford to provide its full-featured software to emerging markets at reduced prices; instead, it offers a stripped-down version of its software, known as "Microsoft Lite."
Yet there is nothing "lite" about RedleX's software. Instead of penalizing an emerging market with stripped-down software, it offers full-featured software to all markets at a reasonable price.
Says Ori: "From our point of view as software makers -- and I think this would apply to all types of makers -- the best way we can approach the Digital Divide is by 'ignoring' it. That is, we're selling to countries considered across this Divide not because they're across it, but because we think Mellel is something they can use and benefit by and help us pay the bills while we are at it. When making a deal with a dealer in an across country, or with a student or school, we lower the price significantly. We're not doing this as a 'favor' to anyone; we do it because it makes good business sense. We'd rather sell a million copies of Mellel to India for $2 a copy than sell ten for $40 a copy. This, I think, is the most practical way to treat this divide and other 'divides.'"
The Wall Street Journal documents Microsoft's efforts to profit from developing markets while battling piracy.
As I've mentioned before, Microsoft's response over the past decade to a piracy-prone market has been to simply ignore that market. That is, until Linux came along. Now Microsoft is selling a scaled-down, less expensive operating system referred to as Microsoft Lite. According to the article, Microsoft has begun rolling out Microsoft Lite in Thailand, and plans to launch it in Malaysia, Indonesia, India and Russia in early 2005.
According to the article "... the software offers fewer features for a lower price and is designed to appeal to first-time computer users. Microsoft isn't selling the software separately from PCs, nor disclosing how much it is charging computer makers for including it with their models. The company expects the PCs with the slimmed-down Windows to be priced as low as $300, hundreds of dollars less than low-end PCs equipped with Windows sell for in the U.S."
If Microsoft had invested the energy in truly understanding how developing markets operate, it would have released Microsoft Lite five years ago, well before Linux ever became a threat. The question now becomes -- is Microsoft Lite full-featured enough to compete with Linux? I still believe that $300 is too expensive to be successful in most developing markets.
However, to Microsoft's credit, it does have a mobile OS that could end up being the real success story in these markets. It won't provide the type of per-unit revenue that Microsoft is accustomed to, but volume could easily make up the difference.
Red Hat Linux is now available in Bangla (also known as Bengali). Bangla is the official language of Bangladesh, a country with more than 140 million people. Why Bangla, you ask?
Because a growing number of programmers in India and Bangladesh want software in their native tongue. And, more important, because Microsoft has so far largely ignored this market.
Javed Tapia, Director of Red Hat India, said, "India's domestic software industry resembles the TV industry around nine years ago when the programming was only in Hindi or English. Similarly, today computers are predominantly used only in English." More than 90 million Indians speak Bengali. He added, "Given that only a small percentage of our population communicates in English, it is imperative that software is available in Bengali and other local languages. The Red Hat Bangla desktop will definitely play a significant role in ensuring that benefits of the IT revolution are realized by millions of Bengalis."
The Bangla Linux desktop has the potential to change how education and e-government work. In education, teaching school children will be easier through computer user interfaces that are in Bengali. In e-Government, the use of Bangla Linux will enable users to access and/or create information in their own language. Citizens can access Government services in Bengali. Localization also expands business opportunities of Independent Software Vendors developing applications for education, e-Governance, Rural Banking, Community Information Centers (CICs) etc.
This is a major milestone in Red Hat's long-term strategy for India. In addition to Bangla, Red Hat is working on localization of other Indian languages including Hindi, Gujarati, Punjabi and Tamil. All these will be available as a part of Red Hat Enterprise Linux version 4 in February 2005.
I'm critical of Microsoft because the company has largely ignored countries like Bangladesh for years. Microsoft has been so consumed with software piracy that it figures any localization investment is a waste of money. And since they don't provide software at a price that most consumers in these poorer markets can afford, they create a self-fulfilling prophecy. But then along comes Linux, a boom in outsourced software development, and suddenly Microsoft is on the outside looking in.
According to this article Microsoft is now working on a Bangla OS, to be released in a year.
While Microsoft focuses on its "strategic markets" the rest of the world is making do with open source software. And they're doing quite well, thank you very much.
Swahili is the most spoken of the Bantu languages, and conservative estimates indicate that it is the first language of more than 70 million people, chiefly in Kenya, Tanzania, Congo (Kinshasa), and Uganda.
I do not expect a Microsoft Office Swahili anytime soon.
Microsoft has the funds to localize its office suite into every human language and still have a few billion in change. But it chooses to focus only on those markets where it can make a big profit. It has no interest in "break even" markets.
Microsoft offers 47 languages the last I checked, a number that has increased only marginally over the past few years. Meanwhile, OpenOffice offers more than 30 localized versions with another 30 or so in the works.
I've said it before and I'll say it again: Every culture that Microsoft ignores today is a culture that it will lose tomorrow.
The iTunes Canada store is now up and running. That makes it a total of 14 local iTunes stores now available. Here is a screen shot of the iTunes global gateway:
To give you an idea of how quickly Apple has been expanding globally, here is a screen shot of the iTunes global gateway back in September:
Yep, just four stores were live in September. Talk about rapid globalization.
A Japan iTunes store is in the works, but that country won't be as simple as Canada. Character set challenges are never easy, particularly when it comes to text input, output and search engines. Nevertheless, Apple is going at a blistering pace and I won't be surprised if I see 30 stores by the end of 2005.
Just when I thought I'd seen every type of search engine, along comes Babelplex. Babelplex takes a search string, translates it into another language and searches on both languages at once.
Its name comes from the Web-based machine translation software, Babel Fish.
This could be a handy tool for testing Google adwords in other markets. Apart from that, I'm not sure how I'd use it just yet, at least not until I improve on my Spanish.
I can't imagine Google will look too kindly at its design:
According to News.com, Apple is launching iTunes for Japan in March 2005.
As I reported a few weeks ago, iTunes is already localized for 12 European markets. Japan, however, will not be quite so simple due to the inevitable character set challenges.
Also, here are my initial thoughts of how effectively Apple is localizing each store.
Game localization is much more than a niche industry these days. Today, video games may be console-based, Internet-based, and even phone-based. It's a fascinating, emerging industry. In the interview, Heather provides a number of insights:
Game localization is a growth industry. Heather provides tips for those who want to join.
Game localization presents unique challenges. Find out what Heather learned when localizing a WWII flight-simulation game for Germany.
Find out what game developer does the best job of game localization.
Find out how long it takes to localize a video game for a new market.
The W3C Internationalization Working Group continues to add valuable resources for those who need hands-on information regarding Web globalization. Need to learn more about internationalized domain names, bidirectional text display, or multilingual style sheets? This is a good place to start. It can be dense at times and challenging to navigate, but there's a lot of solid information there.
I found this Web page particularly interesting. It contains demonstration Web pages to illustrate key issues and techniques. For example, the screen shot below is from a page in which you can test your Web browser's ability to display and respond to internationalized domain names.
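If you want to see what an internationalized domain name looks like under the hood, here is a short Python sketch. It relies on Python's built-in `idna` codec, which follows the older IDNA 2003 rules, so results for some edge-case characters may differ from what current browsers do. The `to_ascii` and `to_unicode` helper names are my own.

```python
def to_ascii(hostname):
    """Convert a Unicode hostname to the ASCII form that DNS actually carries."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname):
    """Convert an ASCII ("xn--") hostname back to its Unicode display form."""
    return hostname.encode("ascii").decode("idna")


ascii_name = to_ascii("bücher.example")
print(ascii_name)              # xn--bcher-kva.example
print(to_unicode(ascii_name))  # bücher.example
```

The browser shows the Unicode form, but what goes over the wire is still plain ASCII with an "xn--" prefix.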
Linux developer Red Hat says it will add five new languages to its next generation of enterprise software - all Indian languages. Equally important, Red Hat plans to offer customer phone support in those languages.
Although the company didn't say which languages it plans to support (India has 15 national languages), this is a positive sign. It brings the total language count that Red Hat supports to 15.
According to ComputerWeekly, Red Hat "sees a great opportunity in India for Linux desktop deployments in education, e-governance, and small and medium-sized enterprise."
Microsoft, to my knowledge, does not offer enterprise software in any Indian languages. As I've written in the past, should Microsoft fall from its mighty perch, lack of localized software will be one of the reasons why.
Here's a meaty article from CNET News on the latest XML developments. According to the co-creator of XML, Tim Bray, XML owes at least some of its success to its native support for Unicode. Here's a quote:
XML has succeeded, co-creator Bray said, because it has solved several of the more vexing challenges for electronic data exchange, including growing need to deal with diverse languages and character sets.
"One of the big problems is internationalization," Bray said. "One of the reasons XML took off is because it solved a lot of those issues with Unicode, which was fairly new at that point."
When XML hit the scene, HTML still advocated the Latin 1 character set and the Domain Name System was mired (and still is mired) in a subset of ASCII. Although HTML is now Unicode-friendly, XML was built to support the managing of the massive amounts of content that companies now struggle with. XML is far from perfect; because it is so flexible it allows for almost too much creativity from the vendors. Still, it's the best thing going and its support for Unicode has made XML the language of choice for companies that want to "future proof" their content.
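Here is a small sketch of what that Unicode support buys you in practice, using Python's standard `xml.etree.ElementTree` module. The element names and sample greetings are made up for illustration. One XML document carries Welsh, Russian, and Japanese text side by side, while the same Russian text cannot even be encoded in Latin 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content in three scripts.
GREETINGS = [("cy", "Bore da"), ("ru", "Доброе утро"), ("ja", "おはよう")]

# Build one document holding all three languages.
root = ET.Element("greetings")
for lang, text in GREETINGS:
    element = ET.SubElement(root, "greeting", {"lang": lang})
    element.text = text

# Serialize to UTF-8 bytes, then parse the bytes back.
data = ET.tostring(root, encoding="utf-8")
parsed = ET.fromstring(data)
round_trip = {g.get("lang"): g.text for g in parsed}
print(round_trip)

# The same Russian string does not fit in Latin 1 at all.
try:
    "Доброе утро".encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False
print("Fits in Latin 1:", fits_latin1)
```

Every string survives the round trip unchanged, with no per-language encoding switches along the way.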
CNET News features an entertaining article on some of the creative ways that Microsoft has offended people around the world through cultural and linguistic blunders. These anecdotes come from a recent presentation by a Microsoft executive, who is probably now being reprimanded.
Here are two blunders from the article that are bound to be endlessly repeated by localization vendors and consultants (such as me) for years to come:
Microsoft has also managed to upset women and entire countries. A Spanish-language version of Windows XP, destined for Latin American markets, asked users to select their gender between "not specified," "male" or "bitch," because of an unfortunate error in translation.
When coloring in 800,000 pixels on a map of India, Microsoft colored eight of them a different shade of green to represent the disputed Kashmiri territory. The difference in greens meant Kashmir was shown as non-Indian, and the product was promptly banned in India. Microsoft was left to recall all 200,000 copies of the offending Windows 95 operating system software to try and heal the diplomatic wounds. "It cost millions," [Microsoft's Tom] Edwards said.
In Microsoft's defense, mistakes like these are endemic to most companies. Expanding into new markets always looks a great deal easier than it is.
If there is one lesson to be taken from Microsoft, it is that poorly managed localization is almost always more expensive in the end than no localization at all.
PS: Here's another Microsoft anecdote from the Taipei Times:
One mistake that caused catastrophic offence was a game called Kakuto Chojin, a hand-to-hand fighting game. The fighting went on with rhythmic chanting in the background which in reviewing the game Edwards noticed appeared to be Arabic.
"I checked with an Arabic speaker in the company who was also a Muslim about what the chant meant and it was from the Koran. He went ballistic. It was an incredible insult to Islam," Edwards said.
He asked for the game to be withdrawn but it was issued against his advice in the US in the belief that it would not be noticed.
Three months later, the Saudi Arabian government made a formal protest. Microsoft withdrew the game worldwide.
The title of this latest Idiom press release makes it sound as if Idiom just won the Oracle account. However, if you read past the first paragraph you'll find that Oracle is not a new account. The deal appears to be an expansion of an existing software deployment. Make no mistake, this is very good news for Idiom. Still, I wish the PR folks would turn it down a notch.
Here is the press release:
Oracle Chooses WorldServer to Help Reduce the Time, Cost and Complexity of Translation and Localization
Aug. 2 /PRNewswire/ -- Globalization Management Systems
(GMS) leader, Idiom(R) Technologies, Inc., today announced that Oracle(R), the
world's largest enterprise software company, has selected Idiom
WorldServer(TM) as an integral component of its "Translation Factory", the
translation infrastructure used by Oracle to simultaneously ship products, Web
content, collateral and documentation in 32 languages across all geographies.
Oracle first purchased WorldServer in 2002 to support a strategic
initiative to better deliver its online content globally. The success of the
Oracle.com globalization effort suggested that similar benefits might be
achieved if WorldServer was used for other types of content that required
globalization. After an evaluation of competing GMS offerings, WorldServer
was again selected for a multi-month pilot project that focused on delivering
globalized product help, documentation and training material. This extensive
pilot confirmed that the same WorldServer benefits could apply to all of
Oracle's translation and localization efforts, based on its ability to address
the following needs:
Accelerate Time-to-Market: The pilot showed that WorldServer could be
seamlessly integrated with Oracle's internally developed globalization
tools, thereby delivering the process automation needed to achieve
"SimShip".
Improve Translation Quality: Oracle also found that it was able to more
consistently reuse commonly translated terms, phrases and sentences and
that they were able to share these translation assets across more
content types. As a result, they were able to eliminate translation
inconsistencies that often result from working with multiple third
party vendors from project to project.
Simplified Vendor Management: The pilot also showed that with
WorldServer, Oracle would be able to simplify the management of its
vendor base for many content types thus reducing the workload on its
Microsoft has greatly expanded its globalization resources Web site over the past few years. Although much of it is geared toward software developers, Web developers will also find useful nuggets of information. If you haven't been, it's definitely worth a look: http://www.microsoft.com/globaldev.
According to last week's newsletter, here are a few of the latest additions to the site:
A new section called Perspectives (http://www.microsoft.com/globaldev/perspectives/default.mspx) features articles highlighting different topics and issues that are of interest to users from different parts of the world.
Language Collection and Fonts in Windows XP: Article on the fonts that ship with Windows XP (http://www.microsoft.com/globaldev/handson/dev/font_install_xp.aspx).
International Support in Outlook Express: How to set up Outlook Express to support other languages in your e-mails or newsgroup communications (http://www.microsoft.com/globaldev/handson/user/OE_setting1.mspx).
Here is a very nice article by Evelyn Olsen on the New Zealand translation industry.
Her company is NZTC, and for those who manage competitive companies, I recommend taking a look at their Web site (excerpt below).
They present bios of their teams divided by language specialty - a nice touch. Translation companies increasingly run into prospective clients who naively believe that translation work could easily be managed by computers. By emphasizing the people behind the scenes, NZTC makes it very clear what value they add to the process.
Microsoft likes to make a big deal of how global a company it is, and for good reason. The company now makes more money from outside the US than from within. But the company still has a long way to go before it is truly a global company and, as this CNET News article makes clear, the company may face a tough road getting there.
Right now, the company offers its operating system in 47 languages. This is no simple feat -- localizing software into a new language can easily exceed a million dollars in engineering and translation costs. But at 47 languages, Microsoft is still only serving a small portion of the world; there are more than a thousand languages in use today.
As the CNET article points out, most small and emerging markets have been overlooked by the folks at Microsoft.
Enter OpenOffice, an open-source alternative to Microsoft's office software suite. A grassroots effort has been gradually localizing the software for more than 30 languages, with many more on the way -- from Basque to Kinyarwanda (Rwanda).
With open source software, anybody with the time and expertise can assist with software localization. So what we are witnessing are people volunteering their time to do something that Microsoft won't spend a dime on -- creating software for people who don't speak a major language. This is a noble cause and one that will inevitably add to the growing global resentment toward Microsoft.
While I can understand why a company decides that the ROI (return on investment) of software localization doesn't add up for certain markets, I don't understand how Microsoft can justify turning its head, given how many billions of dollars it has stashed in the bank. In one year, the company could localize its Office suite into 100+ languages without breaking much of a sweat, yet it doesn't, and in not doing so it opens the door a little wider to open source software -- software that one day may lead to the downfall of Microsoft as we know it.
The translation industry used to have four vendors that towered over everyone else: Bowne Global Solutions, Berlitz GlobalNet, Lionbridge, and SDL International. In September, Bowne bought Berlitz, and then there were three.
What I find interesting is what a tough time these big vendors are having making a profit. Everyone thought consolidation was the key to success, and Bowne went on a buying spree, as did Lionbridge and SDL. Now they have loans to pay and the revenues aren't quite what they hoped. They bought market share and then the market shrank. Temporarily, we hope.
I am fairly optimistic about the big 3. And in many ways the industry needs a big 3 -- large firms that can pump serious money into serious advertising and PR, helping educate corporate America about the value of translation. (I can't tell you how many execs I talk to who think computers are going to wipe out the entire industry in a year or two.)
But those advertising plans are going to be on hold for a while, at least until the bills get paid down a bit and corporate America starts spending again.