Quantitative investment (QI) products (models, tools, and systems) can provide accurate stock market predictions and help investors significantly reduce the risks of mispricing and irrational trading caused by psychological factors such as overconfidence, mental accounting, and loss aversion.
Article: A causal feature selection algorithm for stock prediction modeling
Zhang, Xiangzhou, et al. “A Causal Feature Selection Algorithm for Stock Prediction Modeling.”
Neurocomputing, Elsevier, 9 May 2014,
www.sciencedirect.com/science/article/pii/S0925231214005359.
The main issue in quantitative investment (QI) is deciding which features to include. This article uses data-driven algorithms to enhance QI and argues that causal feature selection (CFS) algorithms are the best predictors: they can identify directly influential variables and generate a feature subset from the results. Other existing algorithms include principal component analysis (PCA), decision trees (DT: CART), and the least absolute shrinkage and selection operator (LASSO). CFS is the most accurate and precise and is best suited to developing a QI product; other common stock algorithms reveal only base-level detail, not the causal relationship between stock features (inputs) and stock return (output). The study has two distinctive aspects: it identifies direct influences between variables, and it verifies the algorithm through extensive experiments (accuracy, precision, Sharpe ratio, Sortino ratio, information ratio, and maximum drawdown (MDD)). PCA can reduce a large set of variables to a smaller set of uncorrelated factors while retaining variance, but its result is a set of components rather than of the original variables, making the output hard to interpret. A DT consists of one root node and a number of branches containing only the features that contribute to the classification; it uses entropy theory to select the variables carrying the most important information. LASSO, by contrast, selects individual variables by shrinking coefficients.
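As a concrete illustration of how these alternatives differ (a generic sketch on synthetic data, not the paper's CFS algorithm or its experiments), PCA returns components rather than original variables, a decision tree ranks original features by impurity-based importance, and LASSO zeroes out individual coefficients:

```python
# Illustrative sketch on synthetic "stock feature" data; all names,
# data, and parameter choices here are hypothetical.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))          # 200 samples, 10 candidate features
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.1, size=200)

# PCA: keeps variance but returns components, not original variables
pca = PCA(n_components=3).fit(X)

# Decision tree: impurity-based importance over the original features
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

# LASSO: shrinks irrelevant coefficients to exactly zero
lasso = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(lasso.coef_ != 0).tolist()
print("LASSO keeps features:", selected)
```

Note how only LASSO and the tree point back at the original variables (features 0 and 3 here), while PCA's output needs a further interpretation step.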
Additionally, there are two categories of stock prediction: time series forecasting and trend prediction. A time series forecasting model is trained to fit the historical return/price series of an individual stock and is then used to predict its future return/price. A trend prediction model captures the relationship between various fundamental and technical variables and the movement of the stock price. Feature selection algorithms used with these datasets and prediction models include SRA, PCA, genetic algorithms (GA), information gain, and others. Inputs vary, but both technical and fundamental variables, such as economic indicators, are often present.
Other data mining algorithms exist, such as logistic regression (LR), neural networks (NN), support vector machines (SVM), and decision trees (DT). Supplementary approaches include the Bayesian network (BN) models of Zuo and Kita.
Feature selection methods can be grouped into two categories: filter and wrapper approaches. A filter uses general characteristics of the training data to select key input features. A wrapper uses the prediction performance of a specific learning algorithm to evaluate and determine the best feature subset. Evolutionary algorithms and GAs are used in the latter, allowing it to perform better; however, it is more computationally expensive.
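The filter/wrapper distinction can be sketched on toy data (everything below is synthetic and illustrative; the wrappers discussed in the article use evolutionary algorithms and GAs rather than this greedy forward search):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))
y = 3.0 * X[:, 1] + 0.5 * X[:, 4] + rng.normal(scale=0.1, size=300)

# Filter approach: rank features by a general statistic of the training
# data (here, absolute Pearson correlation with the target).
corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(6)])
filter_pick = np.argsort(corr)[::-1][:2]

# Wrapper approach: greedy forward selection scored by the error of a
# specific learner -- here the in-sample MSE of ordinary least squares
# (a real wrapper would cross-validate this score).
def fit_mse(cols):
    A = X[:, cols]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.mean((A @ w - y) ** 2)

chosen = []
for _ in range(2):
    best = min((j for j in range(6) if j not in chosen),
               key=lambda j: fit_mse(chosen + [j]))
    chosen.append(best)

print("filter picks:", sorted(filter_pick.tolist()))
print("wrapper picks:", sorted(chosen))
```

The wrapper is more expensive because it refits the learner for every candidate subset, which is the trade-off the article describes.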
Article: Using Bitcoin Pricing Data to Create a Profitable Algorithmic Trading Strategy
Uses historical GDAX price data containing the open, close, high, and low prices and trading volume at one-minute intervals over roughly the past year, approximately 450,000 data points. The project's goal was to buy and sell over an x-minute horizon by predicting the ratio of the price x minutes later to the current price, rather than predicting a standard up/down value.
The features used were: high price/current price, low price/current price, average price/current price, volume of trading in BTC, proportion of price increase each minute, proportion of convex change each minute, ratio of the price n minutes ago to the current price, and volume n minutes ago/current volume, among others.
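A minimal sketch of how such ratio features could be computed from a minute-price series (the series, the look-back n, and the feature names are all invented for illustration):

```python
import numpy as np

# Tiny synthetic minute-price series (real data had ~450,000 points).
price = np.array([100.0, 101.0, 99.5, 102.0, 103.0, 102.5])
n = 3  # look-back of n minutes (hypothetical choice)

current = price[-1]
features = {
    "high_over_current": price.max() / current,
    "low_over_current": price.min() / current,
    "avg_over_current": price.mean() / current,
    # ratio of the price n minutes ago to the current price
    "lag_n_over_current": price[-1 - n] / current,
    # minute-over-minute proportional change of the last step
    "last_return": price[-1] / price[-2] - 1.0,
}
print(features)
```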
(PCA and feature selection) WAP, VR, and R were found not to be viable features: WAP is less varied than AP while being highly correlated with it; VR has low variance and is an inaccurate predictor; and R was similar to A, but A captured more information. This left the Baseline, LinReg, LogReg, PCA, and Neural Network models.
The methods used were baselines, weighted logistic regression, principal component analysis, and neural networks. Weighted average, gains, and AUC (area under the curve, which measures the true positive rate against the false positive rate) were used as evaluation metrics. *Refer back to article for in-depth detail*
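The AUC metric they report can be illustrated on a toy set of predicted up/down probabilities (the labels and scores below are made up):

```python
from sklearn.metrics import roc_auc_score

y_true = [1, 0, 1, 1, 0, 0]               # actual up (1) / down (0) moves
y_score = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6]  # model's predicted P(up)

# AUC = probability a randomly chosen "up" example is scored higher
# than a randomly chosen "down" example; 0.5 is chance, 1.0 is perfect.
auc = roc_auc_score(y_true, y_score)
print(auc)
```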
In conclusion, all of the listed, approved models had gains that outperformed the average per-minute increase of Bitcoin, meaning that using them in market-like situations was more effective than buying and holding. PCA alone yielded small gains, but in conjunction with neural networks, significant gains were achieved. Whether the algorithm was actually reliable was questioned due to recent price spikes. *This was not tested in real time*
Article: Price Prediction Evolution: from Economic Model to Machine Learning
Used a stock index to obtain data and a comprehensive view of market tendency. They extracted four features every 5 days: max, min, mean, and standard deviation, and fed the data in using a shifting-window pattern. They first ran multilinear regressions, adding features one by one; the input features were univar, bivar, CPI, and GDP, with GDP as the main predictor of closing prices. They regretted using 5-day increments, as these may not have reflected the real behavior of the market, and concluded that adding more macroeconomic features to the algorithm diluted the results. They recommend using only closing prices and certain select macroeconomic features.
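The 5-day window extraction of max, min, mean, and standard deviation might look like this on synthetic closes (the stride of one day between windows is an assumption; the article only specifies the 5-day window):

```python
import numpy as np

close = np.arange(1.0, 16.0)  # 15 synthetic daily closes: 1, 2, ..., 15
w = 5                          # the article's 5-day window

# One view per shifted window, then the four summary features per window.
windows = np.lib.stride_tricks.sliding_window_view(close, w)
features = np.column_stack([
    windows.max(axis=1),   # Max
    windows.min(axis=1),   # Min
    windows.mean(axis=1),  # Mean
    windows.std(axis=1),   # Standard deviation
])
print(features.shape)  # one row of 4 features per window
```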
LWR is used to predict certain features in the market. A deficiency of LWR is its time lag, which means it does not reflect the most up-to-date data; however, it remains one of the most accurate predictors. The NN fed the max, min, mean, and standard deviation into a network with 2 hidden layers activated by ReLU and no output activation layer, using cross validation to tune the parameters. They split the data into a train set, dev set, and test set, importantly using the dev set to adjust the model's topology (number of layers, neurons per hidden layer, size of mini-batch), and output 500 predictions with an MSE of 481. SVR (support vector regression) overcomes difficulties with high dimensionality; built on support vectors, it was used to predict stock behavior (forecasting the curve's tendency). SVR is more flexible because it uses relaxation (slack) variables. For SVR they used libsvm in MATLAB and divided the data into 3 groups: 4,000 (train), 500 (dev), and 500 (test).
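A shape-level sketch of the described network, four features in, two ReLU hidden layers, and a linear output (the layer widths and random weights are placeholders; the article tuned the topology on the dev set):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical widths 16 and 8; the paper's actual topology was tuned.
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 8)), np.zeros(8)
W3, b3 = rng.normal(size=(8, 1)), np.zeros(1)

def relu(z):
    return np.maximum(z, 0.0)

def predict(x):                 # x: (batch, 4) rows of max/min/mean/std
    h1 = relu(x @ W1 + b1)      # hidden layer 1, ReLU activation
    h2 = relu(h1 @ W2 + b2)     # hidden layer 2, ReLU activation
    return h2 @ W3 + b3         # linear output, no activation

out = predict(rng.normal(size=(10, 4)))
print(out.shape)
```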
Article: Application of Deep Learning to Algorithmic Trading
Project was based around LSTM (Long Short-Term Memory) networks. Their algorithm forecast the next day's adjusted closing price of Intel Corporation (NASDAQ: INTC) from the information and features available on the present day, then traded Intel stock according to the strategy developed, with locally weighted regression (LWR) as the main comparison model.
The trading framework had 4 steps. (1) Input: daily trading data, technical indicators, and indexes. (2) Model: LSTM network and LWR model. (3) Output: predicted next-day price. (4) Decision: the trading decision of whether to buy or sell.
Three sets of variables were used. The first, historical daily trading data, was composed of INTC's last 5 days of adjusted closing price, log returns, open/close price, high/low price, and trading volume. The second set was composed of technical indicators demonstrating various characteristics of the stock's behavior. The final set included the S&P 500, the CBOE Volatility Index, and the PHLX Semiconductor Sector index.
The technical indicators used were: rolling average/standard deviation with 5- and 10-day windows; Bollinger Bands, two standard deviations from a moving average; Average True Range, a measure of price volatility; 1-month momentum, the difference between the current price and the price 1 month ago; Commodity Channel Index, an identifier of cyclical trends; Rate of Change, the momentum divided by the price 3 months ago; Moving Average Convergence Divergence, which displays trend-following and momentum characteristics; and Williams Percent Range, a measure of buying and selling pressure.
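Several of these indicators are simple rolling computations; a sketch on synthetic closes (using 21 trading days as a stand-in for "1 month" is our assumption, not the article's stated choice):

```python
import numpy as np
import pandas as pd

close = pd.Series(np.linspace(50.0, 60.0, 40))  # 40 synthetic daily closes

roll5 = close.rolling(5).mean()        # 5-day rolling average
std5 = close.rolling(5).std()          # 5-day rolling standard deviation
upper_bb = roll5 + 2 * std5            # Bollinger Band, upper
lower_bb = roll5 - 2 * std5            # Bollinger Band, lower
momentum = close - close.shift(21)     # ~1-month momentum (21 sessions)

print(float(roll5.iloc[-1]), float(momentum.iloc[-1]))
```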
*Reference for model, data, outcome, price prediction and return plots*
Strategy: used the trained models to compute the predicted price. If the predicted price for the next day was higher than the current price, one share of INTC was bought; if it was lower than the current price, one share was sold.
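That rule reduces to a one-line decision function (the names are ours; the tie case, where the predicted price equals the current price, is not specified in the article, so it is treated as "hold" here):

```python
def decide(predicted_next: float, current: float) -> int:
    """Return +1 to buy one share, -1 to sell one, 0 to hold (tie)."""
    if predicted_next > current:
        return 1
    if predicted_next < current:
        return -1
    return 0  # tie handling is our assumption

print(decide(51.0, 50.0), decide(49.0, 50.0))
```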
Conclusion: the LSTM network and LWR models can predict the general trend of the INTC stock price. LSTM outperforms LWR in profitability and accuracy over the 3 periods tested, is more robust, and has a smaller MSE on the dev set and test set. The LSTM-based strategy yields higher returns and a higher Sharpe ratio than the LWR-based strategy and simple buy-and-hold strategies. *Excludes dramatic price changes*. Tuning hyperparameters and adding a regularization term would improve performance, and reinforcement learning could generate more stable and higher returns.
*Reference report for LSTM equations and explanation*
Article: Predicting stock prices for large-cap technology companies
Used data from previous days and financial news articles to predict future changes in a given stock. Predictions excluded the 10 days before and after the company's earnings report dates to reduce the impact of dramatic changes. Overall, the algorithm was 59% accurate and achieved an annualized return of about 15%.
Used NASDAQ(.com) to obtain the past 5 years of data. News data was pulled from XIGNITE(.com) and processed to remove duplicates and assign an exact date to each headline.
To digest the news articles, a Naive Bayes model was used, trained on 400 days and tested on 200 days of data. The output is the probability of a positive relative price change. Fixed tokens were used as replacements for numbers, percentages, and money amounts.
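A sketch of that headline pipeline, fixed-token replacement followed by a Naive Bayes classifier (the headlines, labels, and token names below are invented; the article's exact tokenization is not specified):

```python
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

def normalize(text):
    # Replace money amounts, percentages, then bare numbers with tokens.
    text = re.sub(r"\$\d[\d,.]*", "<MONEY>", text)
    text = re.sub(r"\d+(\.\d+)?%", "<PERCENT>", text)
    text = re.sub(r"\d+(\.\d+)?", "<NUMBER>", text)
    return text

headlines = [
    "Company beats estimates, revenue up 12%",
    "Shares fall after $2.1 billion loss",
    "Profit up 8% on strong demand",
    "Stock drops 5% after weak guidance",
]
labels = [1, 0, 1, 0]  # 1 = positive relative price change (made up)

# Token pattern keeps the <...> tokens intact after lowercasing.
vec = CountVectorizer(token_pattern=r"<\w+>|\w+")
X = vec.fit_transform(normalize(h) for h in headlines)
clf = MultinomialNB().fit(X, labels)

query = vec.transform([normalize("Revenue up 30%")])
proba_up = clf.predict_proba(query)[0, 1]  # P(positive price change)
print(round(proba_up, 3))
```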
Features of the NN model: relative price changes for the same stock over the past 20 trading days, news predictors from the last 20 trading days, and the news predictor for the current day. The results show that percentage gain is the most important factor. The project achieved 56.35% prediction accuracy with news versus 52.38% without; the NN with news produced the most accurate results, with 59% accuracy and a daily gain of 0.0423%.
In reflection, the author realized that stock prediction was too complex to be captured by price changes and news alone. XIGNITE(.com) could have supplied higher-quality news. Stemming words to their base forms and removing insignificant words, among other pre-processing steps, would improve prediction accuracy, as would including industry trends, political influences, and competitor trends to capture more of the stock market's complexity.
Article: Using AI to Make Predictions on Stock Market
The goal is to create an automated tool for managing investments in a limited set of stocks. First, they designed an algorithm to predict increases and decreases over the next n days, using the stock prices and volumes of the past m days. The Alpha Vantage API was used to access the daily open, high, low, and close prices and daily volume since 2000.
Predictions were based on technical indicators: Moving Average Convergence Divergence (MACD), stochastic oscillator (STOCH), relative strength index (RSI), average directional movement index (ADX), absolute price oscillator values with SMA/EMA (APO-SMA/APO-EMA), Commodity Channel Index (CCI), Aroon, Bollinger Bands (BBANDS), Chaikin A/D line (AD), and on-balance volume (OBV).
Results: the target variable was the price difference over an n-day period, using as predictors the stock price at the end of the day, the price at the beginning of the period, the volume of the same day, and the 11 indicators from the previous section. They used a training set, dev set, and test set in a ratio of 7:2:1 and evaluated with classification of the price trend (increase/decrease) and the mean squared error. The data show that support vector regression gives the best results; however, it is computationally expensive, so linear regression is used when larger amounts of data are processed.
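The 7:2:1 split and MSE evaluation can be written out directly (treating the split as chronological, which is our assumption for time-ordered price data):

```python
import numpy as np

n = 1000
data = np.arange(n, dtype=float)       # stand-in for n daily samples

# Chronological 7:2:1 split into train / dev / test.
i_train, i_dev = int(0.7 * n), int(0.9 * n)
train, dev, test = data[:i_train], data[i_train:i_dev], data[i_dev:]

def mse(pred, actual):
    """Mean squared error between predictions and actuals."""
    return float(np.mean((pred - actual) ** 2))

print(len(train), len(dev), len(test))  # 700 200 100
print(mse(np.zeros(3), np.array([1.0, 2.0, 3.0])))
```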
Article: Optimised Prediction of Stock Prices with Newspaper Articles
Explored the predictive power of newspaper articles for the stock prices of companies. Testing accuracy reached up to 61% using supervised learning. Additionally, the prediction algorithm fed an MDP (Markov decision process) that buys and sells shares in a programmed simulation. Binary responses, machine learning, and newspaper articles were combined to predict up or down changes in stock prices. Past approaches include the Naive Bayes algorithm, SVM, the perceptron, boosting, and bag-of-words; in addition, they expanded the bag of words to include shorter and more numerous words. Term frequency–inverse document frequency (TF-IDF) was used to improve the algorithm's ability to weight words, together with cross validation. Afterwards, a Markov Decision Process (MDP) learner was created that builds on the base predictor and reinforces its ability to buy and sell shares.
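TF-IDF down-weights words that appear in many documents relative to rarer, more discriminative ones; a small scikit-learn illustration on invented headlines:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "stock rises on earnings",
    "stock falls on lawsuit",
    "earnings beat expectations",
]
vec = TfidfVectorizer()
X = vec.fit_transform(docs)

# "stock" appears in 2 of 3 docs, so its IDF weight is lower than that
# of a rarer word like "lawsuit", which appears in only 1.
vocab = vec.vocabulary_
print(X.shape, sorted(vocab))
```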
Previous work includes (1) Gidofalvi, who implemented a Naive Bayes text classifier on financial news to track short-term price fluctuations, concluding that there is a strong correlation between news articles and stock prices in the 20-minute window before and after an article's publication.
Data: accessing the New York Times and the Wall Street Journal was difficult due to access restrictions on those websites, making the newspaper research reliant on the ProQuest Newsstand (PQNS) database. The pipeline ran from search terms, to a federated search algorithm, to the PQNS XML tree, to an XML tree parsing algorithm, to PQNS article URLs, to an article parsing algorithm, to raw text, to a stemming algorithm, to stemmed text, and finally to the output.
ProQuest Newsstand archives all articles in a searchable database. With a written algorithm to generate federated search URLs, the PQNS XML tree, which contains the URLs of the full texts of articles mentioning the company of interest, can be obtained automatically. Writing an XML tree parser allows these to be compiled automatically into a list of PQNS URLs where the full texts are located. ProQuest requires a validated URL, meaning a web scraper must also be built specifically to work around the limitations of the ProQuest database, generating cURL requests so that the article-gathering algorithm can act like a real user's browser. Regular expressions scan the full texts from the web pages to obtain raw data, which is then passed to a Python stemming library for pre-processing.
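The last two stages, regex extraction of article text and stemming, might look like this in miniature (the markup and the deliberately crude suffix-stripper, standing in for the Python stemming library, are both illustrative):

```python
import re

# Toy stand-in for a fetched article page.
page = "<html><body><p>Investors were buying shares rapidly.</p></body></html>"

# Regex extraction of article text from the markup.
raw_text = " ".join(re.findall(r"<p>(.*?)</p>", page))

def crude_stem(word):
    # Deliberately crude suffix stripping, NOT a real stemmer.
    for suffix in ("ing", "ly", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

stemmed = [crude_stem(w) for w in re.findall(r"[a-z]+", raw_text.lower())]
print(stemmed)
```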
Article: Using News Articles to Predict Stock Price Movements
Short term stock price movements can be predicted by financial news articles.