I really think this is a very challenging and motivating topic. As an example, Opfine.com shows the potential benefits of this kind of approach. I do not know how it is implemented or what kind of algorithm lies behind the site; nevertheless, they claim an accuracy of around 97% [1]. That seems very high to me.
I wonder whether, as Bing Liu suggests in his latest works [2, 3], recall is so important in this kind of algorithm, in contrast with preserving the natural distribution of the sentiments. I mean, perhaps it is more practical to filter out strange values or individuals that are hard to classify, in exchange for higher accuracy on the clearer ones. Maybe this is the reason for such a high claimed accuracy.
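To make this trade-off concrete, here is a minimal sketch in Python of how discarding low-confidence items raises accuracy on the remaining ones while lowering coverage. The function name `accuracy_with_rejection`, the confidence threshold and the toy data are all invented for illustration; I do not know whether Opfine does anything like this.

```python
def accuracy_with_rejection(examples, threshold):
    """Return (accuracy, coverage) when items below `threshold` are skipped.

    `examples` is a list of (confidence, predicted_label, true_label) tuples.
    Accuracy is computed only over the items that were kept; coverage is the
    fraction of items that were kept.
    """
    kept = [(pred, true) for conf, pred, true in examples if conf >= threshold]
    coverage = len(kept) / len(examples) if examples else 0.0
    if not kept:
        return None, coverage  # nothing was classified, accuracy is undefined
    correct = sum(1 for pred, true in kept if pred == true)
    return correct / len(kept), coverage


# Invented toy data: (classifier confidence, predicted label, actual label).
toy = [
    (0.95, "pos", "pos"),
    (0.90, "neg", "neg"),
    (0.85, "pos", "pos"),
    (0.80, "neg", "neg"),
    (0.60, "pos", "neg"),
    (0.55, "neg", "pos"),
    (0.52, "pos", "pos"),
    (0.51, "neg", "pos"),
]

for t in (0.0, 0.7):
    acc, cov = accuracy_with_rejection(toy, t)
    print(f"threshold={t:.1f}  accuracy={acc}  coverage={cov}")
```

With this toy data, raising the threshold keeps only half of the items, but all of the kept ones are classified correctly. That is the kind of effect I suspect could sit behind a very high reported accuracy.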
A recent article by Johan Bollen, Huina Mao and Xiao-Jun Zeng [4] applies the Profile of Mood States (POMS) to Twitter updates, expanded with the Web 1T 5-gram Version 1. This Google dataset contains 1,176,470,663 five-word sequences that appear at least 40 times. The enlarged lexicon for POMS, with 976 terms, is what the authors of [4] call the GPOMS lexicon. After their experiments, they conclude that the accuracy of Dow Jones predictions can be significantly improved with some specific public mood dimensions, but not with others. Among the six identified dimensions (Calm, Alert, Sure, Vital, Kind, and Happy), adding the Calm time series raised the accuracy to 87.6% in predicting the daily up and down changes in the closing values of the DJIA. This article was recently discussed in ReadWriteWeb by Sarah Perez, who also mentions two interesting websites related to the stock market and Twitter: StockTwits.com and FINIF Financial Informatics.
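The 87.6% figure is a direction accuracy: the share of days on which the predicted up or down move of the closing value matches the actual one. As a minimal sketch in Python of how such a number can be computed, the code below uses invented closing values and invented predictions, not the data or the model from [4]. The function names, and the choice to count a flat day as a down move, are my own assumptions.

```python
def directions(closes):
    """Map closing values to daily moves: +1 for up, -1 for down.

    Assumption for this sketch: a day with no change counts as -1.
    """
    return [1 if today > yesterday else -1
            for yesterday, today in zip(closes, closes[1:])]


def direction_accuracy(predicted, actual):
    """Fraction of days on which the predicted move equals the actual move."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one day to compare")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


# Invented closing values for six consecutive days (five daily moves).
closes = [100.0, 101.5, 101.0, 102.2, 103.0, 102.1]
actual = directions(closes)

# Invented predictions for the same five moves.
predicted = [1, -1, -1, 1, -1]

print(direction_accuracy(predicted, actual))
```

Here four of the five invented predictions match the actual moves. In [4] the predictions come from a model fed with past DJIA values and the mood time series, which this sketch does not attempt to reproduce.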
Of course, I have found other recent articles on sentiment analysis applied to market and political prediction. However, they are still in my pending-to-read box. Some of the most interesting are [5, 6, 7, 8].
References
[1] About Opfine.com. Available at: http://opfine.com/about.jsp
[2] Liu, B. (2010). Sentiment Analysis: A Multi-Faceted Problem. IEEE Intelligent Systems 25 (3). Available at: http://www.cs.uic.edu/~liub/FBS/IEEE-Intell-Sentiment-Analysis.pdf
[3] Liu, B. (2010). Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing. Available at: http://www.cs.uic.edu/~liub/FBS/NLP-handbook-sentiment-analysis.pdf
[4] Bollen, J., Mao, H., and Zeng, X.J. (2010). Twitter Mood Predicts the Stock Market. arXiv preprint arXiv:1010.3003
[5] Das, S.R. and Chen, M.Y. (2007). Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web. Management Science 53 (9). Available at: http://algo.scu.edu/~sanjivdas/chat_FINAL.pdf
[6] Gilbert, E. and Karahalios, K. (2010). Widespread Worry and the Stock Market. Proceedings of AAAI ICWSM'10. Available at: http://social.cs.uiuc.edu/people/gilbert/pub/icwsm10.worry.gilbert.pdf
[7] Asur, S. and Huberman, B.A. (2010). Predicting the Future with Social Media. arXiv preprint arXiv:1003.5699
[8] Tumasjan, A., Sprenger, T.O., Sandner, P.G., and Welpe, I.M. (2010). Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment. Proceedings of AAAI ICWSM'10. Available at: http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1441/1852