Monday, May 2, 2011

Language Translation in the Crowd: Part 1

In the past I have tried involving crowds in translating text that could be used to seed automatic translation systems (see Ambati et al. 2010; Ambati and Vogel 2010).

A few problems with the crowd:
1. Too many spammers. How do you know who is doing the right thing when you don't know what is right?
2. Too few bilingual speakers for any language pair you pick. There are about 50 major languages and some 3,950 other languages in the world. Think of creating translation systems for a 4000 x 4000 grid of language pairs!
3. How do you make it interesting for users to contribute to the site without feeling that they are being robbed of their intellectual property? (Give them money? Not feasible when you think of the thousands of language pairs you are considering.)
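To put the 4000 x 4000 figure in numbers, here is a small back-of-the-envelope sketch. It assumes the rough count of 4,000 languages mentioned above and treats each translation direction (say, A to B and B to A) as a separate system; the function names are only illustrative.

```python
def ordered_pairs(n_languages):
    """Number of source -> target pairs, excluding a language paired with itself."""
    return n_languages * (n_languages - 1)


def unordered_pairs(n_languages):
    """Number of language pairs when direction is ignored."""
    return n_languages * (n_languages - 1) // 2


all_languages = 4000    # rough figure used in this post
major_languages = 50    # rough figure used in this post

print(ordered_pairs(all_languages))    # 15996000
print(unordered_pairs(all_languages))  # 7998000
print(ordered_pairs(major_languages))  # 2450
```

Even if you only count the major languages, that is a couple of thousand directed pairs, and each one needs its own bilingual contributors.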

Now there are some projects out there that have looked at a subset of the problems I mention above, although I am not yet convinced that we have a silver bullet.
1. Monotrans: an effort from Maryland that is by now well published, with results that have been applied to translating children's books from the ICDL.
2. Duolingo has been making some noise for about a year now, but it has not yet been seen by the outside world. I hope and wish it is really good, because the success of such projects brings focus to similar efforts!

My take is that successful translation in the crowd is going to need the following (some of which I am working on and will be publishing in my research!):
1. The translation task needs to become verifiable: task breakdown
2. Involve two people rather than one: collaboration
3. Make it fun or a learning experience for all: challenging, innovative games

In a follow-up post, I will talk about some of the designs we have come up with for building collaborative, constructive and verifiable methods of involving the crowd. Hopefully they are fun and motivating enough that people contribute without feeling robbed of their time or knowledge. After all, knowledge can only be shared, and rightly it should be!
