In this article, we will focus on another algorithm: Soft Actor-Critic (SAC). It was first presented in the paper "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor" (January 2018). The method appeared almost simultaneously with TD3; the two algorithms share some similarities but also differ in important ways. The main goal of SAC is to maximize the expected reward while also maximizing the entropy of the policy, which allows finding ...
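For context, the maximum-entropy objective this refers to can be written as in the original paper (quoted here as the standard formulation, not from the article's own text; the temperature coefficient α is sometimes fixed to 1):

J(\pi) = \sum_t \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \big[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \big]

where \mathcal{H} is the entropy of the policy and α trades off reward maximization against exploration.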
In the previous article, we discussed relational models, which use attention mechanisms in their architecture. We used such a model to create an Expert Advisor, and the resulting EA showed good results. However, we noticed that the model trained more slowly than in our earlier experiments. This is because the transformer block used in the model is a rather complex architectural solution that performs a large number of operations. The number of these operations grows quadratically ...
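To illustrate the quadratic growth mentioned above (a standard property of self-attention, stated as a general note rather than a quote from the article): for a sequence of n elements with embedding size d, the attention block computes an n × n matrix of pairwise scores, so its cost scales roughly as

\mathcal{O}(n^2 \cdot d)

meaning that doubling the sequence length approximately quadruples the number of operations.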
We have released the most comprehensive guide to MQL5 programming, authored by experienced algorithmic trader Stanislav Korotky with MetaQuotes' support. The book is intended for programmers of all levels. Beginners will learn the fundamentals as the book introduces key development tools and basic programming concepts. With this material, you can create, compile, and run your first application in the MetaTrader 5 trading platform. Users with experience in other programming languages can ...
The book is freely available online, under the "Book" section of the MQL5.community website.
Efficient exploration of the environment is one of the main problems in reinforcement learning. We have discussed this issue more than once, and each time the proposed solution further complicated the algorithm. In most cases, we resorted to additional internal reward mechanisms that encourage the model to try new actions and search for unexplored paths.
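A common generic way to combine such an internal (intrinsic) reward with the environment's own reward (shown here as an assumed illustration, not necessarily the exact scheme used in the articles) is

r_t = r_t^{ext} + \beta \, r_t^{int}

where r_t^{ext} is the external reward from the environment, r_t^{int} is the internal exploration bonus (for example, a curiosity or novelty signal), and the coefficient \beta controls how strongly exploration is encouraged.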