Study on news trading

Aron Vizkeleti
Elöd Egyed-Zsigmond
DOI: 10.24348/coria.2021.long_16
Abstract

Stock market prediction using text mining and machine learning methods has received scientific attention in recent years. The success of these methods hinges on the efficient-market hypothesis and on the precision with which relevant information is retrieved. This paper presents and compares methods, based on informational entropy and statistical measures, for evaluating the relevance of the retrieved information used to predict stock price changes. Our proposed prediction method compares textual information from a test period with information previously retrieved over a learning period, which reduces the problem to multi-class classification. The textual data were collected from online articles and the price data from NASDAQ stock prices over the study period; the relevant information was extracted from the articles using content analysis and statistical methods. The study found that the informational entropy of the similarities between the learning and the test period correlates strongly with the accuracy of the prediction. It is also shown that the amount of information, after increasing steeply, saturates as more articles are used.
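As an illustration of the entropy measure mentioned above, the following is a minimal sketch, not the method used in the study. It assumes that the similarities between one test-period text and the learning-period texts are available as non-negative scores, normalizes them into a probability distribution, and computes its Shannon entropy in bits. The function name `similarity_entropy` and the example scores are hypothetical.

```python
import math


def similarity_entropy(similarities):
    """Shannon entropy (in bits) of a list of non-negative similarity scores.

    Assumption: the scores are normalized to sum to 1 and treated as a
    probability distribution; the study may define the quantity differently.
    """
    if any(s < 0 for s in similarities):
        raise ValueError("similarity scores must be non-negative")
    total = sum(similarities)
    if total <= 0:
        raise ValueError("at least one similarity score must be positive")
    # Zero scores contribute nothing to the entropy, so they are skipped.
    probs = [s / total for s in similarities if s > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical scores: a flat profile versus one dominated by a single match.
flat = similarity_entropy([0.5, 0.5, 0.5, 0.5])
peaked = similarity_entropy([0.9, 0.05, 0.03, 0.02])
print(flat, peaked)
```

Under this assumption, a low entropy means the test-period text resembles a few learning-period texts much more than the rest, while a high entropy means the similarities are spread evenly and carry little discriminating information.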