Fix GA training window to include more data.
Francisco Silva committed Nov 25, 2024
1 parent 9b64e0d commit c10f147
Showing 5 changed files with 15 additions and 33 deletions.
39 changes: 9 additions & 30 deletions README.md
@@ -5,7 +5,7 @@
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)


- This project implements an intelligent dynamic stock selection system using a **Genetic Algorithm-optimized XGBoost** (GA-XGBoost) classifier to identify stocks with potential market outperformance. The model analyzes quarterly financial statements, market data, insider trading patterns and other external data to predict whether a stock will outperform the S&P 500 index over a one-year horizon over a large margin. The project includes a **Streamlit-based analytics dashboard** that provides comprehensive stock analysis tools, including technical indicators, financial metrics visualization, and model-driven insights.
+ This project implements an intelligent dynamic stock selection system using an **Adaptive Genetic Algorithm-optimized XGBoost** (GA-XGBoost) classifier to identify stocks with potential market outperformance. The model analyzes quarterly financial statements, market data, insider trading patterns and other external data to predict whether a stock will outperform the S&P 500 index over a one-year horizon over a large margin. The project includes a **Streamlit-based analytics dashboard** that provides comprehensive stock analysis tools, including technical indicators, financial metrics visualization, and model-driven insights.


## Table of Contents
@@ -23,7 +23,7 @@ This project implements an intelligent dynamic stock selection system using a **

## Project Overview

- The stock classifier is built using GA-XGBoost and trained on:
+ The stock classifier is built using Adaptive GA-XGBoost and trained on:
- Quarterly financial statements
- Market data and technical indicators
- Insider trading information
@@ -39,9 +39,9 @@ The model predicts whether a stock will outperform the S&P 500 over a one-year h
## Features

- **Model Training**
- - GA-XGBoost classifier with optimized hyperparameters
+ - Adapative GA-XGBoost classifier with optimized hyperparameters
- Feature engineering including growth ratios, financial metrics, price momentum, and volatility
- - Cross-validation and performance metrics
+ - Expanding window cross-validation and performance metrics

- **Streamlit App**
- Market overview dashboard
@@ -77,8 +77,8 @@ The model predicts whether a stock will outperform the S&P 500 over a one-year h

2. Install pre-commit hooks:
```bash
- chmod +x scripts/install-hooks.sh
- ./scripts/install-hooks.sh
+ chmod +x install-hooks.sh
+ install-hooks.sh
```

## Usage
@@ -104,11 +104,13 @@ Score stocks for a given trade date:
stocksense --score --trade-date YYYY-MM-DD
```

+ In order to evaluate for the last trading date, don't specify a trade date.
+
### Streamlit App

To open the Streamlit app:
```bash
- stocksense --app
+ stocksense-app
```


@@ -130,29 +132,6 @@ Ye, Z. J., & Schuller, B. W. (2023). Capturing dynamics of post-earnings-announc

Liu, X. Y., Yang, H., & Chen, Q. (2019). A Sustainable Quantitative Stock Selection Strategy Based on Dynamic Factor Adjustment. *Columbia University*. [[paper]](add_link_if_available)

- ```bibtex
- @article{yang2020practical,
- title={A Practical Machine Learning Approach for Dynamic Stock Recommendation},
- author={Yang, Hongyang and Liu, Xiao-Yang and Wu, Qingwei},
- institution={Columbia University},
- year={2020}
- }
- @article{ye2023capturing,
- title={Capturing dynamics of post-earnings-announcement drift using a genetic algorithm-optimized XGBoost},
- author={Ye, Zhengxin Joseph and Schuller, Bj{\"o}rn W.},
- institution={Imperial College London},
- year={2023}
- }
- @article{liu2019sustainable,
- title={A Sustainable Quantitative Stock Selection Strategy Based on Dynamic Factor Adjustment},
- author={Liu, Xiao-Yang and Yang, Hongyang and Chen, Qingwei},
- institution={Columbia University},
- year={2019}
- }
- ```
-

## License

File renamed without changes.
6 changes: 4 additions & 2 deletions stocksense/main.py
@@ -16,12 +16,14 @@ def prepare_data():
@click.option("-u", "--update", is_flag=True, help="Update stock data.")
@click.option("-t", "--train", is_flag=True, help="Train model.")
@click.option("-s", "--score", is_flag=True, help="Score stocks.")
+ @click.option("-f", "--force", is_flag=True, default=False, help="Force model retraining.")
@click.option(
"-tdq",
"--trade-date",
type=click.DateTime(formats=["%Y-%m-%d"]),
help="Trade date for model operations (format: YYYY-MM-DD)",
)
- def main(update, train, score, trade_date):
+ def main(update, train, score, force, trade_date):
"""
CLI handling.
"""
@@ -36,7 +38,7 @@ def main(update, train, score, trade_date):
stocks = DatabaseHandler().fetch_sp500_stocks()
handler = ModelHandler(stocks, trade_date)
if train:
- handler.train(data)
+ handler.train(data, retrain=force)
if score:
handler.score(data)

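The hunk above only forwards the new `--force` flag as `retrain=force`; the body of `ModelHandler.train` is not part of this diff. As a rough sketch of what such a flag commonly gates, assuming training is skipped when a saved model file already exists, the standalone `train` function and the placeholder file below are hypothetical and are not taken from the repository:

```python
import tempfile
from pathlib import Path


def train(model_file: Path, retrain: bool = False) -> bool:
    """Return True if a model was (re)trained, False if a saved one was reused."""
    # Assumed behaviour: an existing model file is reused unless retrain is set.
    if model_file.exists() and not retrain:
        return False
    # Placeholder for the real fit-and-save step.
    model_file.write_text("model placeholder")
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.txt"
    print(train(path))                # first call: nothing saved yet, so it trains
    print(train(path))                # second call: saved model is reused
    print(train(path, retrain=True))  # forced call: trains again
```

Defaulting `retrain` to `False` keeps existing callers unchanged, which matches the `default=False` on the new click option.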
2 changes: 1 addition & 1 deletion stocksense/model/genetic_algorithm.py
@@ -136,7 +136,7 @@ def get_train_val_splits(data: pl.DataFrame, stocks: list[str], min_train_years:
if i + 1 < min_train_years:
continue

- train_years = years[: i + 1]
+ train_years = years[: i + 2]
val_years = [years[i + 2], years[i + 3]]

train = data.filter(pl.col("tdq").dt.year().is_in(train_years))
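
With the old slice `years[: i + 1]`, the year at index `i + 1` fell into neither the training nor the validation set; the new slice `years[: i + 2]` makes the training window end directly before the first validation year. A minimal standalone sketch of this expanding-window split, working on a plain list of years instead of the Polars frame, is shown here. The function name `expanding_window_splits` and the loop bound are assumptions, since the surrounding loop is not visible in the diff:

```python
def expanding_window_splits(years, min_train_years):
    """Build (train_years, val_years) pairs with a growing training window."""
    splits = []
    # Assumed loop bound: stop while years[i + 3] is still a valid index.
    for i in range(len(years) - 3):
        if i + 1 < min_train_years:
            continue
        train_years = years[: i + 2]              # slice used after this commit
        val_years = [years[i + 2], years[i + 3]]  # the two years that follow
        splits.append((train_years, val_years))
    return splits


years = list(range(2010, 2018))
for train_years, val_years in expanding_window_splits(years, min_train_years=3):
    print(f"train {train_years[0]}-{train_years[-1]} -> validate {val_years}")
```

Each fold adds one more year to the training set while the two-year validation window slides forward with it.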
1 change: 1 addition & 0 deletions stocksense/model/model_handler.py
@@ -89,6 +89,7 @@ def train(self, data: pl.DataFrame, retrain: bool = False):
model.save_model(model_file)
except Exception as e:
logger.error(f"ERROR: failed to train model - {e}")
+ raise

def score(self, data):
"""

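The added bare `raise` in `model_handler.py` means a training failure is still logged but no longer swallowed, so the caller (here, the CLI) sees the exception. A small sketch of that log-and-re-raise pattern follows; it uses the standard-library `logging` module as a stand-in for whatever `logger` the project uses, and `train_model` and `failing_fit` are hypothetical names:

```python
import logging

logger = logging.getLogger("stocksense_sketch")


def train_model(fit):
    """Run `fit`, log any failure, then let the exception propagate."""
    try:
        return fit()
    except Exception as e:
        logger.error(f"ERROR: failed to train model - {e}")
        raise  # bare raise re-raises the active exception with its traceback


def failing_fit():
    raise ValueError("not enough training data")


try:
    train_model(failing_fit)
except ValueError as err:
    print(f"caller saw: {err}")
```

Without the `raise`, `train_model` would return `None` after logging and the caller could not tell that training had failed.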