Monday, November 8, 2010

Is there any correlation between the social sentiment stream in microblogs and the stock market?

Since I started investigating Sentiment Analysis and Opinion Mining last September, I have found many interesting papers. However, only a few of them are related to the stock market, or even to other markets in general.

I really think that this is a very challenging and motivating topic. As an example, Opfine.com shows the potential benefits of this kind of approach. I do not know how it is implemented or what kind of algorithm lies behind the site; nevertheless, they claim an accuracy of around 97% [1]. That is quite a lot, from my point of view.

I wonder whether, as Bing Liu suggests in his latest works [2, 3], recall is really so important in this kind of algorithm, in contrast with preserving the natural distribution of the sentiments. I mean, perhaps it is more practical to filter out strange values or individuals that are hard to classify, in exchange for a higher accuracy on the clearer ones. Maybe this is the reason for such a high claimed accuracy.
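To make this trade-off concrete, here is a minimal Python sketch. It assumes a classifier that returns a confidence value with each prediction; the function name, the threshold and the toy data are all hypothetical and have nothing to do with how Opfine.com actually works. It only shows that discarding low-confidence items can raise accuracy while lowering coverage.

```python
def accuracy_and_coverage(predictions, threshold):
    """Score only the predictions whose confidence reaches the threshold.

    predictions: list of (confidence, predicted_label, true_label) tuples.
    Returns (accuracy on the kept items, fraction of items kept).
    Accuracy is None when nothing passes the threshold.
    """
    kept = [(pred, true) for conf, pred, true in predictions if conf >= threshold]
    coverage = len(kept) / len(predictions)
    if not kept:
        return None, coverage
    correct = sum(1 for pred, true in kept if pred == true)
    return correct / len(kept), coverage


# Made-up toy data: (confidence, predicted, actual). Not real results.
toy = [
    (0.95, "pos", "pos"),
    (0.90, "neg", "neg"),
    (0.85, "pos", "pos"),
    (0.60, "neg", "pos"),
    (0.55, "pos", "neg"),
    (0.52, "neg", "neg"),
]

for threshold in (0.0, 0.8):
    acc, cov = accuracy_and_coverage(toy, threshold)
    print(f"threshold={threshold}: accuracy={acc:.2f}, coverage={cov:.2f}")
```

With this toy data, keeping everything gives 4 correct out of 6, while keeping only the three most confident items gives 3 out of 3, at the price of ignoring half of the messages.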

A recent article by Johan Bollen, Huina Mao and Xiao-Jun Zeng [4] applies the Profile of Mood States (POMS) to Twitter updates, with its lexicon expanded using the Web 1T 5-gram Version 1 dataset. This Google dataset contains 1,176,470,663 five-word sequences that appear at least 40 times. The enlarged POMS lexicon, with 976 terms, is what the authors of [4] call the GPOMS lexicon. After their experiments, they conclude that the accuracy of Dow Jones predictions can be significantly improved with some specific public mood dimensions, but not with others. Among the six identified dimensions (Calm, Alert, Sure, Vital, Kind, and Happy), adding the Calm time series raised the accuracy to 87.6% in predicting the daily up and down changes in the closing values of the DJIA. This article was recently discussed in ReadWriteWeb by Sarah Perez, who also mentions two interesting sites related to the stock market and Twitter: StockTwits.com and FINIF Financial Informatics.
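Just to fix ideas, the two ingredients mentioned above (a lexicon-based daily mood series and an up/down accuracy measure) can be sketched in a few lines of Python. This is only a toy under my own assumptions and not the method of [4]: the three-word lexicon, the function names and the data are hypothetical, and the tokenization is a naive whitespace split.

```python
from collections import defaultdict

# Hypothetical mini-lexicon for a "Calm"-like dimension (NOT the GPOMS lexicon).
CALM_TERMS = {"calm", "relaxed", "peaceful"}


def daily_mood_score(tweets, lexicon):
    """Fraction of each day's tweets containing at least one lexicon term.

    tweets: iterable of (day, text) pairs.
    Tokenization is a plain lowercase whitespace split, so punctuation
    attached to a word will prevent a match.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, text in tweets:
        totals[day] += 1
        if set(text.lower().split()) & lexicon:
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in totals}


def directional_accuracy(predicted_up, closes):
    """Share of days where the predicted direction matches the real one.

    predicted_up[i] is True when closes[i + 1] is predicted to be
    higher than closes[i].
    """
    actual_up = [nxt > cur for cur, nxt in zip(closes, closes[1:])]
    if len(predicted_up) != len(actual_up):
        raise ValueError("need exactly one prediction per day-to-day change")
    matches = sum(p == a for p, a in zip(predicted_up, actual_up))
    return matches / len(actual_up)


# Made-up toy data, only to exercise the functions.
tweets = [
    ("d1", "I feel calm today"),
    ("d1", "markets are scary"),
    ("d2", "so relaxed and peaceful"),
]
closes = [100.0, 101.0, 100.5, 102.0]

print(daily_mood_score(tweets, CALM_TERMS))
print(directional_accuracy([True, True, True], closes))
```

The 87.6% reported in [4] is an accuracy of the second kind (daily up and down changes), but their predictions come from a much more elaborate model than anything shown here.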

Of course, I have found other recent articles related to sentiment analysis applied to market and politics prediction. However, they are still in my to-read pile. Some of the most interesting are [5, 6, 7, 8].

References

  1. About Opfine.com. Available at: http://opfine.com/about.jsp
  2. Liu, B. (2010). Sentiment Analysis: A Multi-Faceted Problem. IEEE Intelligent Systems 25 (3). Available at: http://www.cs.uic.edu/~liub/FBS/IEEE-Intell-Sentiment-Analysis.pdf
  3. Liu, B. (2010). Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing. Available at: http://www.cs.uic.edu/~liub/FBS/NLP-handbook-sentiment-analysis.pdf
  4. Bollen, J., Mao, H., and Zeng, X.J. (2010). Twitter Mood Predicts the Stock Market. arXiv preprint arXiv:1010.3003
  5. Das, S.R. and Chen, M.Y. (2007). Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web. Management Science 53 (9). Available at: http://algo.scu.edu/~sanjivdas/chat_FINAL.pdf
  6. Gilbert, E. and Karahalios, K. (2010). Widespread Worry and the Stock Market. Proceedings of AAAI ICWSM'10. Available at: http://social.cs.uiuc.edu/people/gilbert/pub/icwsm10.worry.gilbert.pdf
  7. Asur, S. and Huberman, B.A. (2010). Predicting the Future with Social Media. arXiv preprint arXiv:1003.5699
  8. Tumasjan, A., Sprenger, T. O., Sandner, P. G., and Welpe, I. M. (2010). Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment. Proceedings of AAAI ICWSM'10. Available at: http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1441/1852

Friday, October 29, 2010

Hey guy, where are you?

Well, as a matter of fact, I have no time to write a blog; I never have. I would like to be more active, but it is almost impossible to find the time. I will try, and at least I am now also doing a bit of microblogging on Twitter (@jbtolosa), which is far more realistic...

So, since my last post, I have finished two official European master's degrees at the University of Oviedo (Universidad de Oviedo).
Regarding publications, I have also published several papers:
  • José Barranquero Tolosa, Jose E. Labra Gayo, Ana B. Martínez Prieto, Sheila Méndez Núñez and Patricia Ordóñez de Pablos (March 2010). Interactive Web Environment for Collaborative and Extensible Diagram Based Learning. Computers in Human Behavior, 26 (2), pp. 210-217. DOI: 10.1016/j.chb.2009.10.003
  • José Barranquero Tolosa, Sergio Guadarrama (July 2010). Collecting Fuzzy Perceptions from Non-expert Users. In Proceedings of the 19th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2010), pp. 1-8, Barcelona (Spain). DOI: 10.1109/FUZZY.2010.5584816
  • José Barranquero Tolosa, Oscar Sanjuán-Martínez, Vicente García-Díaz, Cristina Pelayo G-Bustelo and Juan Manuel Cueva Lovelle (2010). Towards the Systematic Measurement of ATL Transformation Models. To appear in Software Practice and Experience (published online in Wiley Online Library). DOI: 10.1002/spe.1033
Moreover, since November 2009, I have been enjoying a full-time pre-doc scholarship (FPI) at the Artificial Intelligence Center (Gijón, Spain). This grant is funded by the Spanish Ministry of Science and Innovation (MICINN).

After completing my second master's degree, I started working on Sentiment Analysis and Opinion Mining, which is my current focus. Hence, future posts will be about this topic... so starting again with a new topic... no problem, I can deal with it... again!