Stock Price Prediction using LSTM Neural Network
Advanced
Build a time-series forecasting model with LSTM using real market data
🧭 1. Project Overview
This project aims to predict future stock prices using Long Short-Term Memory (LSTM) neural networks — a type of Recurrent Neural Network (RNN) well-suited for time series forecasting.
Real-World Use Case
Stock price prediction is one of the most popular applications of AI in finance. Analysts and traders use machine learning to forecast price trends and make informed investment decisions.
Technical Goals
- Collect historical stock price data.
- Preprocess and normalize time series data.
- Build and train an LSTM model.
- Visualize predicted vs actual stock prices.
- Evaluate model performance.
⚙️ 2. Key Technologies & Libraries
| Library | Purpose |
|---|---|
| NumPy | Numerical computations |
| Pandas | Data handling and cleaning |
| Matplotlib / Seaborn | Data visualization |
| Scikit-learn | Data preprocessing and scaling |
| TensorFlow / Keras | Building the LSTM neural network |
| yfinance | Fetching historical stock data from Yahoo Finance |
Install Required Packages
pip install numpy pandas matplotlib seaborn scikit-learn tensorflow yfinance
🎯 3. Learning Outcomes
By completing this project, you’ll learn:
- How to handle time series data for AI models.
- How to use LSTM networks for sequential prediction.
- How to preprocess data for neural networks.
- How to evaluate and visualize deep learning results.
- How to fetch real-world data using APIs (yfinance).
🧩 4. Step-by-Step Explanation
- Step 1: Import Required Libraries — Load all dependencies for data processing, visualization, and model creation.
- Step 2: Load the Stock Data — Use yfinance to fetch historical prices of a specific company (e.g., Apple — “AAPL”).
- Step 3: Data Exploration and Visualization — Visualize the stock’s closing price trend to understand patterns.
- Step 4: Preprocess the Data — Normalize the data for LSTM input using MinMaxScaler and create sequences.
- Step 5: Split into Training and Testing Data — Divide the dataset into training and testing (e.g., 80/20).
- Step 6: Build and Train the LSTM Model — Define the LSTM architecture using Keras Sequential API.
- Step 7: Make Predictions — Use the trained model to predict future prices.
- Step 8: Visualize the Results — Plot actual vs predicted stock prices.
- Step 9: Evaluate Performance — Use metrics like RMSE (Root Mean Square Error) to evaluate performance.
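Steps 4 and 5 are easier to follow on a tiny series. The sketch below uses made-up prices and a window of 3 instead of 60, and it is written in plain Python so the arithmetic stays visible. The helper names `min_max_scale` and `make_windows` are hypothetical stand-ins for `MinMaxScaler` and the windowing function in the full code, not part of any library.

```python
# Toy illustration of Steps 4 and 5 with made-up prices (not real market data).

def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(series, time_step):
    """Pair each run of `time_step` consecutive values with the value that follows it."""
    X, y = [], []
    for i in range(len(series) - time_step):
        X.append(series[i:i + time_step])
        y.append(series[i + time_step])
    return X, y


prices = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0]  # hypothetical closing prices
scaled = min_max_scale(prices)
X, y = make_windows(scaled, time_step=3)

# Chronological 80/20 split: no shuffling, so the test windows come last in time.
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

print(len(X), len(X_train), len(X_test))  # 5 4 1
print(X[0], "->", y[0])                   # [0.0, 0.25, 0.125] -> 0.375
```

Eight prices with a window of 3 yield five samples. The split keeps time order, so the model is always tested on prices that come after everything it was trained on.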
💻 5. Full Working and Verified Python Code
# =====================================================
# Stock Price Prediction using LSTM Neural Network
# =====================================================
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from sklearn.metrics import mean_squared_error
import math
# -------------------------
# Step 1: Load Data
# -------------------------
stock_symbol = 'AAPL' # You can change this to any stock symbol
data = yf.download(stock_symbol, start='2015-01-01', end='2024-01-01')
print(data.head())
print(f"\nData Shape: {data.shape}")
# -------------------------
# Step 2: Visualize Closing Price
# -------------------------
plt.figure(figsize=(10, 6))
plt.plot(data['Close'], label='Closing Price', color='blue')
plt.title(f'{stock_symbol} Stock Price History')
plt.xlabel('Date')
plt.ylabel('Price (USD)')
plt.legend()
plt.show()
# -------------------------
# Step 3: Preprocess Data
# -------------------------
# Use only 'Close' column
close_data = data['Close'].values.reshape(-1, 1)
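# Normalize data to roughly (0, 1). The scaler is fitted on the first 80% of rows only,
# so prices from the test period do not influence the scaling.
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(close_data[:int(len(close_data) * 0.8)])
scaled_data = scaler.transform(close_data)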
# -------------------------
# Step 4: Prepare Sequences
# -------------------------
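def create_dataset(dataset, time_step=60):
    """Build (window, next value) pairs from a scaled 2-D array."""
    X, Y = [], []
    # Each sample is time_step consecutive values; the target is the value right after them
    for i in range(len(dataset) - time_step):
        X.append(dataset[i:(i + time_step), 0])
        Y.append(dataset[i + time_step, 0])
    return np.array(X), np.array(Y)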
time_step = 60
X, y = create_dataset(scaled_data, time_step)
# Reshape input for LSTM [samples, time steps, features]
X = X.reshape(X.shape[0], X.shape[1], 1)
# -------------------------
# Step 5: Split Data
# -------------------------
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]
# -------------------------
# Step 6: Build LSTM Model
# -------------------------
model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(time_step, 1)),
    Dropout(0.2),
    LSTM(50, return_sequences=False),
    Dropout(0.2),
    Dense(25, activation='relu'),
    Dense(1)
])
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()
# -------------------------
# Step 7: Train Model
# -------------------------
history = model.fit(X_train, y_train, validation_data=(X_test, y_test),
                    epochs=20, batch_size=64, verbose=1)
# -------------------------
# Step 8: Predictions
# -------------------------
predictions = model.predict(X_test)
predictions = scaler.inverse_transform(predictions.reshape(-1, 1))
actual = scaler.inverse_transform(y_test.reshape(-1, 1))
# -------------------------
# Step 9: Evaluate Model
# -------------------------
rmse = math.sqrt(mean_squared_error(actual, predictions))
print(f"Root Mean Square Error (RMSE): {rmse:.2f}")
# -------------------------
# Step 10: Visualization
# -------------------------
plt.figure(figsize=(10, 6))
plt.plot(actual, label='Actual Price', color='blue')
plt.plot(predictions, label='Predicted Price', color='red')
plt.title(f'{stock_symbol} Stock Price Prediction')
plt.xlabel('Days')
plt.ylabel('Price (USD)')
plt.legend()
plt.show()
✅ Code Verified:
• Tested and runs without syntax/import errors.
• Works with historical daily data downloaded from Yahoo Finance.
• Produces an RMSE value and an actual-vs-predicted plot; accuracy varies with the ticker, date range, and training run.
📊 6. Sample Output or Results
Console Output Example:
Root Mean Square Error (RMSE): 5.32
Graph Output:
A line graph displaying two curves:
- Blue Line – Actual stock prices
- Red Line – Predicted stock prices
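The curves should follow a similar pattern. Each prediction is made only one day ahead from the previous 60 actual closing prices, so close visual tracking is expected and does not by itself demonstrate forecasting skill.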
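One way to put an RMSE figure in context is to compare it with a naive persistence baseline, which predicts each day's price as the previous day's actual price. The sketch below is a minimal illustration under stated assumptions: the prices and model outputs are made up, and `rmse` is a hypothetical helper rather than the scikit-learn call used in the full code.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values for illustration only (not output from the model above).
actual = [100.0, 102.0, 101.0, 105.0, 104.0]
model_pred = [99.0, 101.0, 103.0, 103.0, 105.0]

# Persistence baseline: predict each day's price as the previous day's actual price.
# The first day has no previous value, so it is dropped from both comparisons.
baseline_pred = actual[:-1]
target = actual[1:]

model_rmse = rmse(target, model_pred[1:])
baseline_rmse = rmse(target, baseline_pred)

print(f"Model RMSE:    {model_rmse:.2f}")     # Model RMSE:    1.58
print(f"Baseline RMSE: {baseline_rmse:.2f}")  # Baseline RMSE: 2.35
```

A model adds value only when its RMSE is clearly below the baseline's on the same test days. In the full code, the same comparison could be made by pairing `actual[1:]` with `actual[:-1]`.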
🚀 7. Possible Enhancements
- Integrate with Flask or Streamlit to create a web dashboard.
- Add multiple features (Volume, Open, High, Low) for multivariate analysis.
- Implement Bidirectional LSTM or GRU for improved accuracy.
- Deploy the model using AWS / Google Cloud.
- Add real-time stock prediction using WebSocket data.
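For the multivariate enhancement, the main change to the data pipeline is that each time step carries several values instead of one. The following is a rough sketch, assuming already-scaled rows of (Open, High, Low, Close, Volume). The function name `make_multivariate_windows` and the sample numbers are hypothetical.

```python
def make_multivariate_windows(rows, time_step, target_col):
    """Build windows shaped [samples][time steps][features] plus next-step targets."""
    X, y = [], []
    for i in range(len(rows) - time_step):
        X.append([list(r) for r in rows[i:i + time_step]])
        y.append(rows[i + time_step][target_col])
    return X, y


# Hypothetical, already-scaled rows: (Open, High, Low, Close, Volume)
rows = [
    (0.10, 0.20, 0.05, 0.15, 0.30),
    (0.15, 0.25, 0.10, 0.20, 0.40),
    (0.20, 0.30, 0.15, 0.25, 0.20),
    (0.25, 0.35, 0.20, 0.30, 0.50),
    (0.30, 0.40, 0.25, 0.35, 0.10),
]

# Column 3 is Close, the value to predict one step ahead.
X, y = make_multivariate_windows(rows, time_step=2, target_col=3)

print(len(X), len(X[0]), len(X[0][0]))  # 3 2 5
print(y)                                # [0.25, 0.3, 0.35]
```

With input shaped this way, the first LSTM layer would take `input_shape=(time_step, 5)` instead of `(time_step, 1)`. A scaler fitted on all five columns also expects five columns in `inverse_transform`, so converting predicted closes back to prices needs separate handling, for example a second scaler fitted on the Close column alone.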