
Contribution Details

Type Master's Thesis
Scope Discipline-based scholarship
Title Predicting Stock Price Correlations - An ROC Curve Based Classifier Performance Analysis
Organization Unit
  • Martin Castrischer
  • Thorsten Hens
  • Sven Christian Steude
Language English
Institution University of Zurich
Faculty Faculty of Economics, Business Administration and Information Technology
Date January 2015
Abstract This thesis investigates the predictive power of news and social media flows for stock price correlations. Most existing research focuses on the relationship between news and social media flows and stock returns, and it analyzes this information qualitatively, e.g. by classifying items as good or bad news. This approach brings some difficulties. A piece of information may be good news for one company and bad news for another. Furthermore, positive expressions are often used with a negation such as ‘no’, ‘not’ or ‘none’, which makes the categorization even more complex. To circumvent such difficulties, this work applies a purely quantitative approach to investigate the predictive information contained in news and social media flows, free of any sentiment rating. Co-Occurrences, defined as the number of news and social media items that refer to a stock pair, are used as correlation predictors. Besides investigating the direct relationship between Co-Occurrences and correlations, several measures derived from them are defined, such as the difference in the respective measure between two consecutive trading days or the variation of the stock price correlation. The underlying data consists of two kinds of information flows: news data provided by Thomson Reuters News Analytics, and social media data provided by Moreover Technologies, Inc. via Thomson Reuters. The overall analysis considers about 7.3 million Co-Occurrences related to 255 stocks between July 2011 and March 2012. ROC Curve analysis is used to investigate the relationship between Co-Occurrences and correlations. This technique is rarely used in finance but has the advantage of being free of any distributional assumptions.
Furthermore, no literature could be found that applies ROC Curve analysis to time series data, so this thesis makes a methodological contribution to the implementation of ROC Curves for this kind of data. Besides introducing an algorithm to calculate ROC Curves for big data time series, it discusses the difficulties and interpretation pitfalls that arise. Some evidence is provided that news and social media flows contain predictive information about stock price correlations. The results of this thesis are as follows:
1. The most striking results are found for the relationship between the absolute level of Co-Occurrences and the difference in correlations between two consecutive trading days. A high number of news items reporting on a stock pair seems to imply a positive shift in the stock price correlation within the subsequent trading days.
2. The thesis further demonstrates an ambiguous causality between Co-Occurrences and the variation of the stock price correlation. While the correlation variation reacts to Co-Occurrences for the social media data, the reverse direction holds for the news data. This might indicate that social media data is rather rumor based whereas news data reports on realized facts.
3. Finally, there seems to be a relationship between the performance of the model and the base level of news and social media coverage of stocks. Predictive power can be found for stock pairs that are jointly mentioned in news and social media items on a regular basis. However, the relationship between Co-Occurrences and stock price correlations seems to be weak for stock pairs that rarely appear together in news and social media items.
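The two building blocks named in the abstract, Co-Occurrence counts and ROC Curves, can be illustrated with a minimal Python sketch. It is not the algorithm developed in the thesis, which targets big data time series. The function names (`count_cooccurrences`, `roc_points`, `auc`), the tickers and all numbers are hypothetical and chosen only for illustration. The sketch assumes each news item is given as a set of tickers, and that the binary label marks whether the correlation of a stock pair rose on the following trading day.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(items):
    """Count, for every stock pair, how many items mention both stocks.

    `items` is an iterable of ticker collections, one per news item
    (an assumed input format, not the one used in the thesis).
    """
    counts = Counter()
    for tickers in items:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(tickers)), 2):
            counts[pair] += 1
    return counts


def roc_points(scores, labels):
    """Return the (false positive rate, true positive rate) points
    obtained by lowering the decision threshold over all scores.
    Observations with equal scores are processed together."""
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC Curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical news items, each listing the tickers it mentions.
items = [{"AAA", "BBB"}, {"AAA", "BBB", "CCC"}, {"CCC"}]
pair_counts = count_cooccurrences(items)

# Hypothetical Co-Occurrence counts used as scores, with labels
# marking whether the pair's correlation rose on the next day.
scores = [12, 9, 7, 7, 3, 1]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(pair_counts)
print(curve, auc(curve))
```

Because the curve is built only from the ranking of the scores, no assumption about their distribution is needed, which is the property the abstract highlights. An area close to 0.5 would indicate that the Co-Occurrence counts carry no predictive information about the label.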