Using Crowdsourcing for Expanding Localization of Products
UPDATE: I wanted to add that these translation APIs are all part of Microsoft Translator services and are available for developers to use and build their own localized communities. The documentation is up on MSDN for AJAX/JSON, SOAP or POX (Plain old XML) APIs you can put in your apps today. Also, be sure to check out the Microsoft Translator Blog for more technical details on the V2 APIs and translator widget.
Not everyone in the world speaks English. Such a silly thing to say, but if you live in an English-speaking country it's easy to forget that many (most?) people in the world would prefer to do their work in the language of their choice.
Microsoft ships documentation in Visual Studio that is human-translated (a huge effort) into 9 major world languages. That's millions and millions of words * 9 languages. How can we cover more languages? How can we make documentation easier for folks who are trying to learn about our products and don't speak English fluently? How can we make English interfaces easier to use for non-English speakers who want to learn English?
Last month, I spoke to members of the internationalization/globalization team in DevDiv (Developer Division) about some of the little-known stuff they are doing. I think it deserves more attention, as there are some pretty innovative things being done. Some are experimental, but there's hope to expand them if they succeed.
MSDN uses Machine Translation and Crowdsourcing for Documentation
Doing a lot of work with a few people is hard. Doing a lot of work with a lot of people is confusing and expensive. However, doing a little bit of work with a LOT of interested people can be useful, cheap and fun if you "crowd-source" rather than outsource. Check out the screenshot below or visit the Brazilian MSDN site and check out the Translation Wiki v2.
You'll see there's the English MSDN documentation on the left, and Brazilian Portuguese on the right.
Make sure to select "side-by-side" or "Lado a Lado." If you hover over a sentence on the Portuguese side, a small Edit button will appear.
Click Edit, and you can suggest a better translation, which will go into a queue for community moderators to approve. Notice also that under "Other Suggestions" you'll see existing suggested translations that are in the queue for moderation.
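To make the suggest-then-moderate flow concrete, here is a minimal Python sketch. It is not how the Translation Wiki is actually built: the `Sentence` class, the function names, and the sample Portuguese strings are all assumptions made up for illustration. It only shows the idea that every sentence starts out machine translated, readers queue up suggestions, and a moderator's approval swaps in the human text.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    """One sentence of documentation plus its current translation (hypothetical model)."""
    source: str                 # English text
    translation: str            # text currently shown; starts as machine translation
    origin: str = "machine"     # "machine" or "human"
    pending: list = field(default_factory=list)  # suggestions awaiting moderation


def suggest(sentence, text):
    """A reader proposes a better translation; it waits in the moderation queue."""
    if text and text != sentence.translation and text not in sentence.pending:
        sentence.pending.append(text)


def approve(sentence, text):
    """A moderator accepts one queued suggestion, which replaces the shown text."""
    if text not in sentence.pending:
        raise ValueError("no such pending suggestion")
    sentence.pending.remove(text)
    sentence.translation = text
    sentence.origin = "human"


def human_share(sentences):
    """Fraction of sentences whose shown translation came from a human edit."""
    if not sentences:
        return 0.0
    return sum(s.origin == "human" for s in sentences) / len(sentences)


# Sample strings are invented for the example, not taken from MSDN.
page = [
    Sentence("Start debugging.", "Começar depuração."),
    Sentence("Build the solution.", "Construir a solução."),
]
suggest(page[0], "Iniciar a depuração.")
approve(page[0], "Iniciar a depuração.")
print(human_share(page))  # 0.5
```

Keeping the suggestions in a per-sentence queue means the machine-translated text stays visible until a moderator signs off, which matches the "Other Suggestions" list described above.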
The initial Portuguese text comes from the Machine Translation team. For some reason, Portuguese is the best language that the Machine Translation team understands.
The text on the site is roughly 80% MT (Machine Translated) and 20% human-edited via this technique, and the human share is growing. There's a goal to include more languages for the next version of Visual Studio, including possibly Arabic, Czech, Polish and Turkish, although things are still a little up in the air.
If you know a Brazilian developer, spread the word about this project and encourage them to make edits to the Brazilian MSDN site and check out the Translation Wiki v2.
Big thanks to our community partners: a group of 30 CS students, partly from the team of Prof. Hirata and Prof. Forster of Instituto Tecnologico de Aeronautica and the team of Prof. Simone Barbosa from Pontifícia Universidade Católica who post-edited 1.8 million words of MT'ed content; the Brazilian Terminologist who managed the glossary project with our MVPs; and finally the Academic Evangelist Team in DPE in Brazil who gave us their support throughout the project.
It'll be interesting to see how far this project goes and what other languages can benefit from it.
Captions Language Interface Pack (CLIP) - includes 9 more partial language translations for Visual Studio
Here's a description of the CLIP from a launching page:
"The Microsoft Captions Language Interface Pack (CLIP) is a simple language translation solution that uses tooltip captions to display results. Use CLIP as a language aid, to see translations in your own dialect, update results in your own native tongue or use it as a learning tool."
This is pretty clever. It's a background application that will show balloon tooltip help in your language while you work in the English version of Visual Studio. For example, in the screenshot below, I'm hovering my mouse over Start Debugging, and the Arabic CLIP pops up with a human translation of that menu item.
It'll even help with other applications within Windows if it thinks it's got a decent translation, but for now, it is focused on correct translation for common Visual Studio options.
Even better, you can add translations of your own. In future versions, there's talk about setting up sharing (I figure you can hack it today, though, unsupported, by sharing the language database).
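As a rough sketch of the idea (not how CLIP is actually implemented), a caption lookup with user-added entries could look like the Python below. The function names, the dictionary-based "language database", and the sample Polish strings are all assumptions for illustration only.

```python
def normalize(label):
    """Normalize a UI label so 'Start Debugging' and ' start  debugging ' match."""
    return " ".join(label.split()).casefold()


def add_translation(entries, label, caption):
    """Record the user's own caption for an English UI label."""
    entries[normalize(label)] = caption


def lookup_caption(label, user_entries, shipped_entries):
    """Return the caption for a label, or None when there is no entry
    (so no tooltip is shown instead of a bad guess)."""
    key = normalize(label)
    # The user's own additions win over the shipped translations.
    for table in (user_entries, shipped_entries):
        if key in table:
            return table[key]
    return None


def merge_shared(mine, theirs):
    """Fold a colleague's entries into mine without overwriting my own."""
    for key, caption in theirs.items():
        mine.setdefault(key, caption)


# Sample strings are illustrative, not taken from the real CLIP data.
shipped = {normalize("Start Debugging"): "Rozpocznij debugowanie"}
mine = {}
add_translation(mine, "Step Over", "Przekrocz")
merge_shared(mine, {normalize("Step Into"): "Wkrocz"})

print(lookup_caption("start  debugging", mine, shipped))
print(lookup_caption("Attach to Process", mine, shipped))  # no entry, so None
```

Checking the user's table first is one way to let personal corrections override the shipped captions, and `merge_shared` hints at what "sharing the language database" might amount to.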
Visual Studio CLIP is available in these languages so far, all created with community and student help!
- Arabic (المنطقة العربية)
- with students from King Fahd University of Petroleum and Minerals (KFUPM), managed by Prof. Abdullah Al-Zamel
- Czech (Česká republika)
- with students from VŠB-TU Ostrava managed by Eng. Jan Martinovič
- Hebrew (ישראל)
- with students from the Computer Department of the College Of Management managed by Prof. Samuel Itzikowitz
- Hindi (हिन्दी) and Tamil (தமிழ்)
- with a team from the Central Institute of Indian Languages
- Malayalam (മലയാളം)
- in cooperation with a team from the Central Institute of Indian Languages
- Oriya (ଓଡ଼ିଆ)
- with students from Ravenshaw University in India, managed by Prof. Mishra
- Polish (Polska)
- with students from Wroclaw University managed by Prof. Zbigniew Fryžlewicz
- Turkish (Türkiye)
- with students from Hacettepe Üniversitesi led by Prof. Ercin Töreci
In addition to the CLIP, there's also the ability to do a Language Pack for the Visual Studio interface itself, as exemplified by the Brazilian Visual Studio Express Language Pack for SP1 that does about a 70% translation of VS into Portuguese. There's talk of doing more of these, too. That should make Carlos Quintero happy!
There are a lot of cool possibilities for all this technology, expanding MSDN and VS to as many languages as possible!
If you think this kind of thinking is pretty cool, leave a comment or blog about it and maybe we'll be heard by *ahem* the boss when he next (soon) reviews plans for this kind of community involvement. ;)
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
English is not my native language, but I still believe this effort to be somewhat misguided. I generally prefer to read documentation and programming articles in English, because it has become a uniform language that most, if not all, programmers out there can understand and communicate in. Most articles, knowledge bases, books and so on are in English, so if you want to read up on something in depth, you need to have at least basic reading skills in English. Translating tooltips inside Visual Studio could end up causing confusion, at least for new developers, as what they see on-screen might not match the tutorial/book they are following.
And don't even get me started on the translation of .Net exceptions. Because of policy at work, I run the Norwegian version of Vista, and .Net produces exception error messages in Norwegian. First, they are poorly translated (perhaps by the aforementioned machine-translation team?), so they make absolutely no sense, and second, more importantly, there are ABSOLUTELY ZERO results when you try to google them. With XP, we could at least install an English version of the .Net framework, but given that .Net is bundled with Vista, there's no way that can be done any more. I am seriously considering downgrading to XP so that I can get English error messages back. And if scottgu reads this, can I please have my English exception messages back??? Thanks.
End rant. I truly do appreciate that Microsoft is trying to make an effort, and I believe that MSDN has had a vast improvement in usability the past year or so. And the fact that MSFT is allowing community contribution is absolutely fantastic, but at least to me, the translation effort just seems a bit unnecessary.
Does Microsoft have any plans to support the Persian language?
Developers might as well get used to learning new languages (even if they aren't programming languages).
I found that gaining a more profound knowledge of the English language can also lead to a more solid understanding of certain concepts of a programming language. You'd be surprised how many developers use keywords for which they don't even know the English meaning (e.g. 'yield' in C#).
In most cases this won't cause any serious harm, but I think you are a better dev if you have at least basic English reading skills.
I believe that in an ideal world every programmer should speak and read enough English to be able to work, learn and interact. However (and specially in Latin America) this is still a long term goal. I really applaud the effort being put in by Microsoft and other companies to make resources more available for everyone.
Like Hexagon says, there is a lot of untapped talent out there, trapped by the lack of understanding of the English language. Let's make it easier for everyone, we'll get greater software in return.
MSDN redirects me to the Polish page: http://msdn.microsoft.com/pl-pl/tfs2008/default.aspx (translated version: http://translate.google.com/translate?u=http://msdn.microsoft.com/pl-pl/tfs2008/default.aspx&hl=pl&ie=UTF-8&sl=pl&tl=en). Polish is my native language.
Do you see what is on top? "Team Foundation Server software is ready!". What great news. Using this page is a waste of time. I prefer the English version:
http://msdn.microsoft.com/en-us/tfs2008/default.aspx.
In this case I don't blame Microsoft. I don't think it's possible to keep all the national pages up to date.
In my opinion every programmer has to be able to read English documentation. Everything changes so fast now that waiting a few months for a translation is too much.
1. English is the international language;
2. I don't know how it is with translations into other languages, but when I read books in Russian (my native language) I'm shocked. Our translators make up new terms and I get lost in them, even though before reading the book I thought I knew the field. So, in my opinion, it is better to read a document in English.
Hurray to application localization and globalization, but the dev tools and docs can stay in English for me.
BTW, slightly OT: The OpenID box for commenting here is translated in French as "Clic au signe dedans", which is about as clear as "A click at sign into". This tells a lot about how inadequate computer-made translations are.
Languages should coexist, side by side, and this is a good example.
But if there's a bluebadge from the CLR team reading this, any chance that you can shed some light on the reason for translating the error messages for .Net framework for localized OS versions? IMHO exception messages should NEVER be presented to end users. And developers and IT staff, which are the normal recipients for error messages, would have a much easier time finding answers on the internet if the error messages were given in a uniform language, ie. English.
Give us our exception messages back! :)
S'all good. Translation: It's all good.
However, I would agree that you need to know English in order to work on any application I am currently on. That's because we try to use the Ubiquitous Language and Domain Driven Development. If you didn't know English, you'd have a hell of a time with most everything in our code. Then again, maybe it would be a really good way to learn.
Still, there'd be a lot of fighting over what to rename the classes/methods that a non-native English speaker had named in our app :D
Discussing (blogs, references, articles) in more languages makes it harder to share information (especially with technical words).
Therefore, I would say that you "SHOULD" (a very big "SHOULD") know at least some basic english, in order to be a programmer.
Of course, having references in both english and your own native language helps a lot :-)
1) After so many years I got used to the English terms. Portuguese words in this topic sound weird to my ears, like "depurar" instead of "debugging". Yuck! It is more common among Brazilian developers to turn those terms into Portuguese-ish words, like "debugar" (to debug, as if it were a Portuguese verb);
2) A lot of books don't have a Portuguese version. And for those that do, translation quality varies a lot;
3) There is no consistency in translations. The translation for "build" may vary between software products and documents.
4) 10 years ago it was common to say that patches came first for the English version. Is that still true nowadays?
5) Global economy. What if you need to send screenshots or make a webcast for non-Portuguese speakers?
Nevertheless, this effort of providing localized information is honorable. Congratulations to the team.
The Product Group enabled a group of MVPs and other Influencers to localize TSWA into more languages than what's available out of the box. It was a great team effort between the localization group and the PG.
Not trying to sound like 'that guy'...but the author of the statement is not a native English speaker.
This discussion is very interesting. It would *seem* (totally non-scientific sampling) that the non-english speakers (as a first language anyway) tend to agree with the statement "If you don't know English, you're not a programmer" more than native english speakers.
There is an opportunity to have DSL languages for translation of code maybe?
Knowing different languages pays, be that English, French, Spanish or Chinese. If you are over 60 it even helps your brain to remain healthy.
Maybe the statement was a rebuff?
Thanks for the info on CLIPS.
We're talking exceptions here! Presumably the point of these is to convey information back to the programmer so steps can be taken to correct the problem. These aren't user-directed generic errors like "invalid password", they are things like "value passed to argument in method xxx cannot be negative at yyy in zzz". Please explain to me the logic in localizing these to the user's language! Are they going to fix my bug for me?
There doesn't even appear to be any way to get an English version of an exception manually. Localizing exceptions was a silly idea and a poor design decision, imo.