Document Type: Research Article


Faculty of Management, Economics and Accounting, Yazd University, Yazd, Iran


Stock trading is a significant decision-making problem in asset management. This study introduces a financial trading system (FTS) that uses artificial intelligence (AI) techniques to automate buy and sell orders in Iran's stock market. Because labeled data are scarce in financial markets, the FTS is trained with reinforcement learning (RL), a subset of AI. The model incorporates technical analysis and a constrained policy to improve its decisions. The proposed algorithm is applied to the Tehran Securities Exchange, and its efficiency is evaluated across 45 periods using three different stock market indices. Its performance is compared, with and without transaction costs, against common strategies: buy and hold, randomly selected actions, and maintaining the initial stock portfolio. The results indicate that the FTS outperforms these strategies on performance metrics including the Sharpe ratio, PP, PF, and MDD. The findings therefore suggest that the FTS can serve as a valuable asset management tool in the Iranian financial market.
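As an illustration of two of the metrics named above, the following minimal sketch computes a per-period Sharpe ratio and a maximum drawdown (MDD) from an equity curve. It uses the standard textbook definitions, which may differ in detail (for example annualization or the risk-free rate) from those used in the study. The function names and the sample equity values are hypothetical and are not taken from the paper.

```python
import math


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation.

    Per-period value, not annualized. The risk_free default of 0 is an
    assumption made for this sketch.
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    return mean / math.sqrt(variance)


def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Hypothetical portfolio values at the end of consecutive periods.
equity = [100.0, 110.0, 99.0, 120.0, 90.0]
returns = [equity[i] / equity[i - 1] - 1 for i in range(1, len(equity))]

print("Sharpe ratio:", sharpe_ratio(returns))
print("Maximum drawdown:", max_drawdown(equity))
```

A higher Sharpe ratio and a lower MDD are generally read as better risk-adjusted performance, which is the sense in which such metrics are used to compare trading strategies.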

