Stock Market Prediction Using a Machine Learning Model (Linear Regression)
Stock markets are volatile by nature. Prices keep changing with company performance, past records, and market value, and they also depend on news and timing.
One can estimate future stock prices through trend analysis. There are several machine learning techniques that can be used to predict future stock prices.
Every stock follows a different trend, so a model trained on one stock cannot simply be applied to another. A model that gives high accuracy on one stock is not guaranteed to work on another.
Different algorithms can be used for this kind of prediction. Here I will show how Linear Regression can be used to predict the values for the next 15 days.
# Load the data directly from NSE (National Stock Exchange)
Functions from pandas_datareader.data and pandas_datareader.wb extract data from various Internet sources into a pandas DataFrame.
Take a look at the data, where xx is the name of the stock. You can get the stock name from https://in.finance.yahoo.com/quote/%5ENSEI/
# Import Python Libraries
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import pandas_datareader.data as web
import datetime as dt
stock = '^NSEI'
df1 = web.DataReader('xx.NS', 'yahoo', start='1996-01-01', end='2020-06-18')
xx.NS is the stock symbol, yahoo is the data source, and start/end are the start and end dates of the data extraction.
df1.tail()
Note: The adjusted closing price amends a stock’s closing price to accurately reflect that stock’s value after accounting for any corporate actions. The closing price is the ‘raw’ price, which is just the cash value of the last transacted price before the market closes.
# Take an explicit copy so that adding a column later does not modify a view of df1
df = df1[['Adj Close']].copy()
df[['Adj Close']].tail()
The data frame above holds the date and the stock price. Let's create one more column that holds the stock value 15 days in advance, by shifting the prices 15 rows up. After the shift, the last 15 values of the new column become NaN; we remove those rows from the data sets during cleanup.
Our aim is to build a machine learning model from the current values and the values 15 days later, and use it to predict values 15 days in advance. Ideally, the NaN values will be replaced by the predicted values for the next 15 days.
# Create one more column, Prediction, shifted 15 rows up
df['Prediction'] = df['Adj Close'].shift(-15)
#print data set
print(df)
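To make the effect of the upward shift easier to see, here is a minimal sketch on a toy data frame. The prices and the name `toy` are made up for illustration, and the shift is 2 rows instead of 15 to keep the output short.

```python
import pandas as pd

# Hypothetical toy prices, not real market data
toy = pd.DataFrame({"Adj Close": [100.0, 101.0, 102.0, 103.0, 104.0]})

# Shift up by 2 rows: each row now also holds the price 2 rows later
toy["Prediction"] = toy["Adj Close"].shift(-2)

# The last 2 rows have no future value yet, so their Prediction is NaN
labelled = toy.dropna()

print(toy)
print(labelled)
```

Only the rows that have both a current price and a future price can be used for training, which is why the last rows are dropped.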
# Create a data set X and convert it into a numpy array; it holds the actual values
X = np.array(df.drop(columns=['Prediction']))
# Remove the last 15 rows
X = X[:-15]
print(X)
# Create a data set y, which holds the values to be predicted, and convert it into a numpy array
y = np.array(df['Prediction'])
# Remove Last 15 rows
y = y[:-15]
print(y)
# Split the data into train and test with 90 & 10 % respectively
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.1)
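As a rough illustration of what `test_size=0.1` does, the sketch below splits 100 synthetic samples. The arrays `X_demo` and `y_demo` are placeholders, not stock data. By default `train_test_split` shuffles the rows; passing `shuffle=False` keeps them in order, so the test set becomes the most recent 10%, which some readers may prefer for time-ordered data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic placeholder data: 100 samples with one feature each
X_demo = np.arange(100, dtype=float).reshape(-1, 1)
y_demo = np.arange(100, dtype=float)

# Default behaviour: rows are shuffled before splitting 90/10
x_tr, x_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.1, random_state=0
)
print(x_tr.shape, x_te.shape)

# Optional variant: keep the original order, so the test set is the last 10 rows
x_tr_o, x_te_o, y_tr_o, y_te_o = train_test_split(
    X_demo, y_demo, test_size=0.1, shuffle=False
)
print(y_te_o)
```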
Apply the Linear Regression model to the training data set
# Linear Regression Model
lr = LinearRegression()
# Train the model
lr.fit(x_train, y_train)
Testing Model: Return the coefficient of determination R² of the prediction.
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
# The best possible score is 1.0
lr_confidence = lr.score(x_test, y_test)
print(lr_confidence)
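The value returned by `score` is the coefficient of determination, R² = 1 - SS_res / SS_tot. The sketch below computes it by hand on synthetic data and compares it with `score`. The data and the names `X_demo`, `y_demo`, and `model` are assumptions made only for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: a noisy straight line, not stock prices
rng = np.random.default_rng(0)
X_demo = np.arange(50, dtype=float).reshape(-1, 1)
y_demo = 2.0 * X_demo.ravel() + 5.0 + rng.normal(scale=3.0, size=50)

model = LinearRegression().fit(X_demo, y_demo)
y_hat = model.predict(X_demo)

# Residual sum of squares and total sum of squares
ss_res = np.sum((y_demo - y_hat) ** 2)
ss_tot = np.sum((y_demo - y_demo.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# Both numbers should agree up to floating-point error
print(r2_manual, model.score(X_demo, y_demo))
```

A score close to 1.0 means the line explains most of the variation in the target. On held-out data the score can also be negative when the model does worse than always predicting the mean.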
Get the last 15 rows of the original data set from the Adj Close column and convert them into a numpy array
forecast = np.array(df.drop(columns=['Prediction']))[-15:]
print(forecast)
# linear regression model predictions for the next 15 days
lr_prediction = lr.predict(forecast)
print(lr_prediction)
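The predicted array carries no dates. One possible way to label it is sketched below, assuming the last row of the data falls on 2020-06-18 and that the next 15 weekdays are all trading days (exchange holidays are ignored). The values in `demo_prediction` are placeholders standing in for `lr_prediction`.

```python
import numpy as np
import pandas as pd

# Assumption: the last available trading day in the data set
last_date = pd.Timestamp("2020-06-18")

# Placeholder values standing in for the 15 model predictions
demo_prediction = np.linspace(10000.0, 10140.0, 15)

# Next 15 weekdays after the last date (exchange holidays are not handled)
future_dates = pd.bdate_range(start=last_date + pd.offsets.BDay(1), periods=15)

forecast_table = pd.Series(
    demo_prediction, index=future_dates, name="Predicted Adj Close"
)
print(forecast_table)
```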
Thanks for reading.