Wednesday, June 8, 2011

CollabSum: Collaborative Summarization of Webpages

Every page on the internet is read by more than two users. We can observe a power-law distribution over the number of users who read a webpage. I am convinced at this point that the most useful pages on the internet are read by a few thousand users every day. Imagine some of those users clicking on at least one sentence in the article to tell us how important it is to the article's meaning. Now imagine accumulating that information from a few hundred such users and producing a synopsis of the article for quick consumption by everyone else.

I think this is technically an easy-to-build application that can have a far-reaching impact and change the way we consume content on the Internet.

So what do we need to achieve this?
1. A client-side plugin that records users' clicks on the sentences of an article.
2. A server-side component that records these clicks and accumulates them.
3. A method to push the accumulated summary back to the client.
4. A way to overlay the summary on the article (highlighted sentences, a heatmap, or a pop-up).
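
To make steps 2 to 4 concrete, here is a minimal sketch of the server-side tally. Everything in it is an assumption for illustration: the class name ClickStore, its methods, the in-memory storage, and the idea of identifying a sentence by its index in the article. A real system would also need to decide how to split pages into sentences and how to handle pages that change.

```python
from collections import Counter, defaultdict


class ClickStore:
    """Hypothetical in-memory tally of sentence clicks, keyed by page URL."""

    def __init__(self):
        # url -> Counter mapping sentence index -> number of clicks
        self._clicks = defaultdict(Counter)

    def record_click(self, url, sentence_index):
        """Step 2: accumulate one user's click on one sentence."""
        self._clicks[url][sentence_index] += 1

    def summary(self, url, sentences, top_k=3):
        """Step 3: return the top_k most-clicked sentences, in article order."""
        counts = self._clicks[url]
        # Ignore indices that do not match a sentence of this article.
        valid = [i for i in counts if 0 <= i < len(sentences)]
        # Most clicks first; ties go to the earlier sentence.
        ranked = sorted(valid, key=lambda i: (-counts[i], i))[:top_k]
        return [sentences[i] for i in sorted(ranked)]

    def heat(self, url, num_sentences):
        """Step 4: per-sentence weights in [0, 1] for a highlight or heatmap."""
        counts = self._clicks[url]
        peak = max(counts.values(), default=0)
        if peak == 0:
            return [0.0] * num_sentences
        return [counts[i] / peak for i in range(num_sentences)]
```

The client-side plugin from step 1 would call something like record_click once per user click, then fetch summary or heat to draw the overlay. Returning the chosen sentences in article order, rather than by popularity, is a design choice that keeps the synopsis readable.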

Who is involved:
1. User: Provides a single click, but benefits from the summaries built from everyone else's clicks. This is one way of addressing information overload.
2. Newspapers: Can understand their users better and create synopses and daily digests much faster.
3. Third parties: APIs can be provided to anyone who needs these summaries.
4. Researchers: Automatic summarization systems can benefit from such data.

What do we do with such data:
1. Improve existing summarization systems for the web
2. Build iPhone apps that can provide summarized versions of existing blogs
3. Create better RSS feeds with summaries and smarter digests for paid subscription users
4. Provide SEO support for user-generated social media by guiding AdSense programs towards relevant text
5. Kindle can now use the data to provide "highlights" for newspaper subscriptions


And all it takes is a single click from the user. At the risk of sounding cliched: "A single click a day can keep the information overload away".
