Stock market prediction using social media data and finding the covariance of the LASSO

Master Thesis (2014)

Author(s)

J.F. Kooijman

Contributor(s)

M. Verhaegen – Mentor

Keywords

Stock market, Prediction, LASSO, L1 regularization, SVR, Covariance, Feature selection

To reference this document use:

https://resolver.tudelft.nl/uuid:588ea23b-4723-4332-bc9f-4b7aec8f8b66

Publication Year

2014

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Stock market prediction has been a research topic for decades. Recently, efforts to increase prediction accuracy by including data from social media such as Google and Twitter have received a lot of attention. Social media can be regarded as an indicator of sentiment, and sentiment is known to influence the stock market. Current models lack interpretability: because social media data is abundant, it is difficult to determine which data is relevant for stock market prediction. A regression method that induces sparsity is thus required, so that data that is not useful is discarded automatically. The LASSO induces sparsity via L1-regularization; however, the covariance and confidence intervals of the estimated regression coefficients cannot be derived easily, even though they are important for interpretation. This thesis therefore reviews all known methods for approximating the covariance and confidence interval for the LASSO and determines their accuracy using numerical simulations. A new method based on the Unscented Transform is proposed, which outperforms all other methods in the underdetermined scenario, where there are more features than data points. Unfortunately, linear regression via the LASSO has limited use for stock markets, as the achieved prediction accuracy is low. Nonlinear models are often applied to stock market prediction to achieve higher accuracy. Therefore, a new feature selection method is proposed for the nonlinear Support Vector Regression (SVR), to select the relevant data for stock market prediction with the SVR. This method yields accurate feature selection when the number of features to select from is low.
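To illustrate how L1-regularization discards unhelpful features, the following is a minimal sketch of the LASSO solved by cyclic coordinate descent in plain Python. It is not taken from the thesis: the function names (`soft_threshold`, `lasso_coordinate_descent`), the objective scaling (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1, the synthetic data, and the value of `lam` are all assumptions chosen for illustration. The toy problem is underdetermined (more features than data points), the scenario the abstract singles out.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of n rows with p features each; y is a list of n targets.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, since w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Hypothetical underdetermined toy problem: 50 features, 20 data points,
# of which only features 3 and 17 actually influence the target.
random.seed(0)
n_samples, n_features = 20, 50
X = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
true_w = [0.0] * n_features
true_w[3], true_w[17] = 2.0, -1.5
y = [
    sum(X[i][j] * true_w[j] for j in range(n_features)) + random.gauss(0, 0.1)
    for i in range(n_samples)
]

w_hat = lasso_coordinate_descent(X, y, lam=0.3)
selected = [j for j, coef in enumerate(w_hat) if coef != 0.0]
print("features kept by the LASSO:", selected)
```

The soft-thresholding step is what produces exact zeros: any feature whose correlation with the current residual stays below `lam` receives a coefficient of exactly 0.0 and is thereby dropped. The sketch returns only a point estimate; it says nothing about the covariance or confidence of the coefficients, which is the gap the thesis addresses.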

Files

Thesis_Joep_Kooijman_Stock_mar... (pdf)

(pdf | 1.6 Mb)

License info not available