STEMM Institute Press
Science, Technology, Engineering, Management and Medicine
Study on the Prediction of Shanghai and Shenzhen 300 Index Returns and Trading Strategies Based on the LSTM-ARIMA Hybrid Model and Multi-Factor Characteristics
DOI: https://doi.org/10.62517/jse.202611207
Author(s)
Yunhao Yang
Affiliation(s)
Applied Mathematics, Central South University, Changsha, Hunan, China
Abstract
This study constructs a return prediction framework for the Shanghai and Shenzhen 300 (CSI 300) Index using an LSTM-ARIMA hybrid model with multi-factor features, and develops a corresponding quantitative trading strategy. Starting from mathematical theory, it decomposes the time series into a linear part modeled by ARIMA and a nonlinear part modeled by LSTM, and fuses the two through residual learning[4]. In data processing, the ADF test is used to check time series stationarity, while LASSO regression screens effective factors from multidimensional features such as macroeconomic indicators and market sentiment, reducing dimensionality and mitigating overfitting[5]. A rolling time window cross-validation scheme is designed, and the Diebold-Mariano test is used to verify the hybrid model's predictive superiority over the single models[11]. The experimental data comprise daily market data, monthly macroeconomic data, and market sentiment indicators for the Shanghai and Shenzhen 300 Index from 2005 to 2023, sourced from the Tushare Pro API and the Wind database. Analysis and modeling are carried out in R with related packages, covering ARIMA forecasting, Keras for the LSTM networks[3], and glmnet for LASSO regression[5]. Based on the predictive model, a quantitative trading strategy is designed and backtested: trading decisions are generated from the predictive signals, and key performance indicators such as annualized return and Sharpe ratio are evaluated. Results show that the LSTM-ARIMA hybrid model significantly outperforms the single models in forecasting accuracy. The strategy based on this model generates substantial positive expected returns, with a Sharpe ratio superior to that of a traditional buy-and-hold strategy. Comparative and ablation experiments further validate the model's effectiveness and robustness, particularly its predictive ability during periods of high market volatility.
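The two evaluation steps named in the abstract, the Diebold-Mariano comparison and the Sharpe-ratio backtest, can be illustrated with a minimal sketch. It is written in Python with the standard library only, whereas the study itself works in R, so it should not be read as the author's implementation. The assumptions are: squared-error loss and one-step-ahead forecasts for the Diebold-Mariano statistic, a hypothetical long-or-flat rule (hold the index when the predicted return is positive, otherwise stay in cash), a zero risk-free rate, and 252 trading days per year. The function names and the six toy returns are invented for illustration and are not data from the paper.

```python
import math
from statistics import mean, stdev


def diebold_mariano(actual, forecast_a, forecast_b):
    """Diebold-Mariano statistic under squared-error loss, horizon h = 1.

    A negative statistic means forecast_a has the lower average loss.
    Returns the statistic and a two-sided p-value from the normal approximation.
    """
    # Loss differential d_t = e_{a,t}^2 - e_{b,t}^2
    d = [(y - fa) ** 2 - (y - fb) ** 2
         for y, fa, fb in zip(actual, forecast_a, forecast_b)]
    n = len(d)
    d_bar = mean(d)
    # For h = 1 the long-run variance reduces to the lag-0 autocovariance.
    gamma0 = sum((x - d_bar) ** 2 for x in d) / n
    dm = d_bar / math.sqrt(gamma0 / n)
    p_value = math.erfc(abs(dm) / math.sqrt(2))
    return dm, p_value


def long_flat_returns(predicted, realized):
    """Hypothetical rule: earn the realized return only when the forecast is positive."""
    return [r if p > 0 else 0.0 for p, r in zip(predicted, realized)]


def annualized_sharpe(daily_returns, periods_per_year=252, risk_free_daily=0.0):
    """Annualized Sharpe ratio from daily returns (sample standard deviation)."""
    excess = [r - risk_free_daily for r in daily_returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


# Toy daily returns, invented for illustration only.
actual = [0.01, -0.02, 0.015, -0.005, 0.02, -0.01]
hybrid = [0.008, -0.015, 0.01, -0.002, 0.015, -0.008]  # stand-in for the hybrid forecast
naive = [0.0] * len(actual)                            # stand-in for a single-model benchmark

dm, p_value = diebold_mariano(actual, hybrid, naive)
strategy = long_flat_returns(hybrid, actual)

print(f"DM statistic: {dm:.3f}, p-value: {p_value:.4f}")
print(f"Sharpe (signal strategy): {annualized_sharpe(strategy):.2f}")
print(f"Sharpe (buy and hold):    {annualized_sharpe(actual):.2f}")
```

With multi-step forecasts the variance term would need additional autocovariance lags, and a realistic backtest would also account for transaction costs, which this sketch ignores.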
Keywords
LSTM-ARIMA Hybrid Model; Multi-Factor Feature Selection; Shanghai and Shenzhen 300 Index; Yield Forecast; Quantitative Trading Strategy
References
[1] Box, G. E., & Jenkins, G. M. (1976). Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day.
[2] Fama, E. F., & French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3-56.
[3] Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735-1780.
[4] Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.
[5] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288.
[6] Fischer, T., & Krauss, C. (2018). Deep learning with LSTM networks for financial market predictions. European Journal of Operational Research, 270(2), 654-669.
[7] Choi, H. K. (2018). Stock Price Correlation Coefficient Prediction with ARIMA-LSTM Hybrid Model. arXiv preprint arXiv:1808.01560.
[8] Shi, Z., & Hu, Y. (2024). Predicting CSI 300 Index and NASDAQ Index by Simple RNN and LSTM. Advances in Economics, Management and Political Sciences, 105(1), 226-231.
[9] Wang, J., & Li, Y. (2025). The return rate prediction of China's CSI 300 index based on the ARIMA model. SHS Web of Conferences, 185, 02011.
[10] Gu, S., Kelly, B., & Xiu, D. (2020). Empirical Asset Pricing via Machine Learning. The Review of Financial Studies, 33(5), 2223-2273.
[11] Diebold, F. X., & Mariano, R. S. (1995). Comparing Predictive Accuracy. Journal of Business & Economic Statistics, 13(3), 253-263.
[12] Bao, W., Yue, J., & Rao, Y. (2017). A deep learning framework for financial time series using stacked autoencoders and LSTM. PLOS ONE, 12(7), e0180944.