Twitter has become a great hub that offers a sneak peek at what the world is talking about. However, we are only equipped to analyze English tweets (e.g., recent work on dialect identification on Twitter, part-of-speech analysis, sentiment analysis, etc.). Can we go beyond that and analyze other languages?
Why do we need to do this?
More than 50 percent of the messages on Twitter are non-English. In fact, the world speaks several thousand languages, and restricting ourselves to English tweets does not give us a complete picture of the world's communication.
For instance, the situation in Egypt as viewed from the United States is different from how people in Egypt see it. What about the opinions of people who are on Twitter but blog or tweet in a different language? Does this have to do with the language Twitter speaks? What if they don't tweet in English and don't follow the conventions of #hashtags?
Project:
1. Visualize the Twitter/Facebook streams for a particular query, such as "egypt revolution", and overlay them on a world map.
2. Use a different lens (an Arabic-English dictionary) to translate more of the Twitter streams word for word, and overlay those on the map as well. Does the picture look radically different?
3. Zoom in and out on different countries to see what they think about an issue.
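To make the "different lens" idea concrete, here is a rough sketch in Python of steps 1 and 2 without the map itself. Everything in it is an assumption for illustration: the tiny `AR_EN` dictionary, the made-up sample tweets, the `country` field, and the helper names are all hypothetical, and a real version would need a proper Arabic-English lexicon and real stream data. It simply counts tweets matching a query per country, first using English only and then after a word-for-word translation.

```python
from collections import Counter

# Hypothetical toy dictionary; a real lens would need a full Arabic-English lexicon.
AR_EN = {
    "مصر": "egypt",
    "ثورة": "revolution",
    "حرية": "freedom",
}


def translate_word_for_word(text, dictionary):
    """Replace each whitespace-separated token found in the dictionary; keep the rest."""
    return " ".join(dictionary.get(tok, tok) for tok in text.lower().split())


def matches_query(text, query):
    """Return True if every query term appears as a token in the text."""
    tokens = set(text.lower().split())
    return all(term in tokens for term in query.lower().split())


def counts_by_country(tweets, query, dictionary=None):
    """Count matching tweets per country, optionally translating them first."""
    counts = Counter()
    for tweet in tweets:
        text = tweet["text"]
        if dictionary is not None:
            text = translate_word_for_word(text, dictionary)
        if matches_query(text, query):
            counts[tweet["country"]] += 1
    return counts


# Made-up sample data, for illustration only.
tweets = [
    {"text": "Egypt revolution live updates", "country": "US"},
    {"text": "ثورة مصر حرية", "country": "EG"},
    {"text": "مصر ثورة", "country": "EG"},
    {"text": "good morning", "country": "GB"},
]

english_only = counts_by_country(tweets, "egypt revolution")
with_lens = counts_by_country(tweets, "egypt revolution", AR_EN)
print(english_only)
print(with_lens)
```

In this toy data the English-only view sees a single matching tweet from the US, while the dictionary lens also picks up the two tweets from Egypt. Per-country counts like these are the kind of numbers a map overlay could plot, and comparing the two views is one way to ask whether the picture looks radically different.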
Well, what about the rest of the people who don't blog, tweet, or use Facebook? That's a problem for another day.