1Finance

MQL5 Wizard Techniques you should know (Part 45): Reinforcement Learning with Monte-Carlo

12-08-2024 at 01:20 PM

This article continues our look at reinforcement learning by considering another algorithm, Monte Carlo. It is closely related to Q-Learning and SARSA, and arguably encompasses both, in that it can be run either on-policy or off-policy. What sets it apart is its emphasis on episodes. An episode is simply a way of batching the reinforcement learning cycle updates introduced earlier in this series, so that the Q-values of the Q-map are updated less frequently.
With the Monte Carlo algorithm, Q-values are updated only after an episode completes, where an episode is a batch of cycles. For this article, the number of cycles per episode is set by the input parameter ‘m_episodes_size’, which can be adjusted or optimized. Monte Carlo is regarded as quite robust to market variability because it can better simulate a wide range of possible market scenarios, letting traders see how different strategies perform under a variety of conditions. This helps traders understand potential tradeoffs, risks, and returns, and so make more informed decisions.
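The episode-based update described above can be sketched roughly as follows. This is a minimal Python illustration, not the article's MQL5 implementation: the class `MonteCarloQMap`, its methods, and the `alpha` and `gamma` parameters are hypothetical names chosen for this example, and `episode_size` merely mirrors the role of ‘m_episodes_size’. It assumes an every-visit, constant step-size variant in which each cycle is buffered and the Q-map is touched only once the buffer holds a full episode.

```python
from typing import List, Tuple


class MonteCarloQMap:
    """Hypothetical Q-map that is updated only at the end of each episode."""

    def __init__(self, n_states: int, n_actions: int, episode_size: int,
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
        # Q-values start at zero for every (state, action) pair.
        self.q: List[List[float]] = [[0.0] * n_actions for _ in range(n_states)]
        self.episode_size = episode_size  # plays the role of m_episodes_size
        self.alpha = alpha                # step size (assumed)
        self.gamma = gamma                # discount factor (assumed)
        self.buffer: List[Tuple[int, int, float]] = []

    def record(self, state: int, action: int, reward: float) -> bool:
        """Buffer one cycle; return True if this cycle closed an episode."""
        self.buffer.append((state, action, reward))
        if len(self.buffer) < self.episode_size:
            return False  # episode still open, Q-map untouched
        self._update_from_episode()
        self.buffer.clear()
        return True

    def _update_from_episode(self) -> None:
        # Walk the episode backwards, accumulating the discounted return G,
        # and move each visited Q-value a fraction alpha toward G.
        g = 0.0
        for state, action, reward in reversed(self.buffer):
            g = reward + self.gamma * g
            self.q[state][action] += self.alpha * (g - self.q[state][action])


qmap = MonteCarloQMap(n_states=3, n_actions=2, episode_size=3,
                      alpha=0.5, gamma=0.5)
first = qmap.record(0, 1, 1.0)   # cycle 1: buffered only
second = qmap.record(1, 0, 0.0)  # cycle 2: buffered only
third = qmap.record(2, 1, 2.0)   # cycle 3: episode complete, Q-map updated
```

Contrast this with Q-Learning or SARSA, which would adjust a Q-value on every cycle. Here the first two calls leave the map unchanged, and all three visited entries are revised together when the third cycle closes the episode.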
