Neural networks made easy (Part 47): Continuous action space
Quote:
The EA assessed the market situation at each new trading candle and decided on a trading operation. However, every upcoming bar carries risk for our account: price movement within a bar can be detrimental to our balance. This is why it is always recommended to use stop losses. This simple approach allows us to limit the risk per trade.
more...
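Below is a minimal, hedged sketch (in Python rather than the article's MQL5) of why a fixed stop loss caps the risk per trade: once the stop distance is known, the position size can be chosen so that the worst-case loss stays within a chosen fraction of the balance. The function name, the 1% risk figure and the contract value are hypothetical illustrations, not values from the EA.

```python
def position_size(balance, risk_pct, entry, stop, value_per_unit_per_lot):
    """Volume that loses at most balance*risk_pct if the stop loss is hit."""
    risk_money = balance * risk_pct                      # money we allow to lose on this trade
    loss_per_lot = abs(entry - stop) * value_per_unit_per_lot
    return risk_money / loss_per_lot

# Hypothetical example: $10,000 balance, 1% risk per trade,
# entry 1.1050, stop 1.1000, $100,000 per 1.0 price unit per standard lot
print(position_size(10_000, 0.01, 1.1050, 1.1000, 100_000))  # -> 0.2 lots
```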
Neural networks made easy (Part 48): Methods for reducing overestimation of Q-function values
Quote:
As you might remember, in DDPG, the Critic model learns the Q-function (prediction of expected reward) based on the results of interaction with the environment, while the Agent model is trained to maximize the expected reward, based only on the results of the Critic’s assessment of actions. Consequently, the quality of the Critic’s training greatly influences the Agent’s behavioral strategy and its ability to make optimal decisions.
more...
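As a rough illustration of the relationship described above, here is a minimal PyTorch-style sketch: the Critic is fitted to a TD target built from environment transitions, while the Actor's loss is nothing more than the negated Critic estimate of its own actions. Network sizes, the absence of target networks and the twin-critic minimum (a common trick against Q-value overestimation) are assumptions for illustration, not the article's MQL5 implementation.

```python
import torch
import torch.nn as nn

state_dim, action_dim, gamma = 8, 2, 0.99
actor   = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim), nn.Tanh())
critic1 = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
critic2 = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def critic_target(reward, next_state, done):
    """TD target; taking the minimum of two critics damps Q-value overestimation."""
    with torch.no_grad():
        next_action = actor(next_state)
        q_next = torch.min(critic1(torch.cat([next_state, next_action], dim=1)),
                           critic2(torch.cat([next_state, next_action], dim=1)))
        return reward + gamma * (1 - done) * q_next

def actor_loss(state):
    """The Actor is trained only through the Critic's assessment of its actions."""
    action = actor(state)
    return -critic1(torch.cat([state, action], dim=1)).mean()
```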
Neural networks made easy (Part 49): Soft Actor-Critic
Quote:
In this article, we will focus on another algorithm: Soft Actor-Critic (SAC). It was first presented in the paper "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor" (January 2018). The method was presented almost simultaneously with TD3 and shares some similarities with it, although the algorithms also differ. The main goal of SAC is to maximize the expected reward while also maximizing the entropy of the policy, which allows finding a variety of optimal solutions in stochastic environments.
more...
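A minimal sketch of the stated SAC objective, assuming a tanh-squashed Gaussian policy with a `sample` method that returns actions together with their log-probabilities, and critics that take (state, action) pairs; both interfaces are hypothetical, not the article's code. The actor loss trades off the Critic's value estimate against the entropy term weighted by a temperature `alpha`.

```python
import torch

def sac_actor_loss(policy, critic1, critic2, state, alpha=0.2):
    action, log_prob = policy.sample(state)          # reparameterized sample + log pi(a|s)
    q = torch.min(critic1(state, action), critic2(state, action))
    # Maximize expected reward AND policy entropy  <=>  minimize (alpha * log_pi - Q)
    return (alpha * log_prob - q).mean()
```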
Neural networks made easy (Part 50): Soft Actor-Critic (model optimization)
Quote:
We continue to study the Soft Actor-Critic algorithm. In the previous article, we implemented the algorithm but were unable to train a profitable model. Today we will consider possible solutions. A similar issue has already been raised in the article "Model procrastination, reasons and solutions". I propose to expand our knowledge in this area and consider new approaches, using our Soft Actor-Critic model as an example.
more...
Neural networks made easy (Part 51): Behavior-Guided Actor-Critic (BAC)
Quote:
The last two articles were devoted to the Soft Actor-Critic algorithm. As you remember, the algorithm is used to train stochastic models in a continuous action space. The main feature of this method is the introduction of an entropy component into the reward function, which allows us to adjust the balance between exploring the environment and exploiting the learned policy. At the same time, this approach imposes some restrictions on the trained models: using entropy requires an estimate of the probability of taking each action, which is quite difficult to compute directly for a continuous action space.
more...
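To make that difficulty concrete, here is a small sketch of how a continuous stochastic policy usually obtains action log-probabilities: a Gaussian sample is squashed by tanh to stay within the action bounds, which forces a change-of-variables correction on the density. The shapes and the numerical epsilon are assumptions for illustration.

```python
import torch
from torch.distributions import Normal

def sample_with_log_prob(mu, log_std):
    dist = Normal(mu, log_std.exp())
    u = dist.rsample()                     # unbounded Gaussian sample
    a = torch.tanh(u)                      # squashed into (-1, 1)
    # log pi(a|s) = log N(u) - sum log(1 - tanh(u)^2): the tanh correction term
    log_prob = dist.log_prob(u) - torch.log(1.0 - a.pow(2) + 1e-6)
    return a, log_prob.sum(dim=-1)
```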
Neural networks made easy (Part 52): Research with optimism and distribution correction
Quote:
One of the basic elements for increasing the stability of Q-function learning is the use of an experience replay buffer. Enlarging the buffer makes it possible to collect more diverse examples of interaction with the environment, which allows our model to better learn and approximate the environment's Q-function. This technique is widely used in various reinforcement learning algorithms, including those of the Actor-Critic family.
more...
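A minimal sketch of such an experience replay buffer (the capacity and the transition tuple layout are assumptions): old transitions are evicted as new ones arrive, and training batches are drawn uniformly at random.

```python
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are evicted first

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```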
Neural networks made easy (Part 53): Reward decomposition
Quote:
We continue to explore reinforcement learning methods. As you know, all algorithms for training models in this area of machine learning are based on the paradigm of maximizing rewards from the environment. The reward function plays a key role in the model training process, yet its signals are often quite ambiguous.
more...
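The idea can be sketched as follows: the environment returns several named reward components instead of one opaque scalar, each component can be analyzed (or predicted by its own Critic head) separately, and the scalar used for the policy update is their weighted sum. The component names and weights below are hypothetical, not the article's decomposition.

```python
# Hypothetical reward components and weights for illustration only
REWARD_WEIGHTS = {"profit": 1.0, "drawdown": -0.5, "holding_cost": -0.1}

def decomposed_reward(profit, drawdown, bars_held):
    components = {"profit": profit, "drawdown": drawdown, "holding_cost": bars_held}
    # Each component can be inspected or predicted separately;
    # the scalar reward for the policy update is the weighted sum.
    return components, sum(REWARD_WEIGHTS[k] * v for k, v in components.items())
```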
Neural networks made easy (Part 54): Using random encoder for efficient research (RE3)
Quote:
The issue of efficient exploration of the environment is one of the main problems of reinforcement learning. We have discussed it more than once, and each time the proposed solution made the algorithm more complex. In most cases, we resorted to additional internal reward mechanisms that encourage the model to explore new actions and search for unexplored paths.
more...
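A minimal sketch of the random-encoder idea (network sizes, the value of k and the exact reward shaping are assumptions for illustration): states are embedded by a frozen, randomly initialized encoder, and the distance to the k-th nearest neighbor in that embedding space serves as an intrinsic exploration bonus.

```python
import torch
import torch.nn as nn

state_dim, emb_dim, k = 16, 32, 3
encoder = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, emb_dim))
for p in encoder.parameters():
    p.requires_grad_(False)                    # the encoder is never trained

def intrinsic_reward(states):
    """states: (N, state_dim) batch drawn from the replay buffer."""
    with torch.no_grad():
        z = encoder(states)                    # (N, emb_dim) embeddings
        dist = torch.cdist(z, z)               # pairwise distances in embedding space
        knn_dist = dist.topk(k + 1, largest=False).values[:, k]   # k-th NN (index 0 is self)
        return torch.log(knn_dist + 1.0)       # larger distance -> more novel state
```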
Presenting the book "MQL5 Programming for Traders"
Quote:
We have released the most comprehensive guide to MQL5 programming, authored by experienced algorithmic trader Stanislav Korotky with MetaQuotes' support.
The book is freely available online, under the "Book" section of the MQL5.community website.
read more here
Experiments with neural networks (Part 7): Passing indicators
Quote:
In the current article, we will talk in more detail about the importance of passing meaningful data, namely time series, to a neural network. In particular, we will pass our favorite indicators. To achieve this, I will introduce some new concepts that I use while working with neural networks. That said, I do not think this is the limit, and over time I expect to develop a clearer view of what exactly should be passed and how.
more...
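As a rough sketch of the point about passing meaningful time series (in Python rather than the article's MQL5, with hypothetical indicator names): each indicator buffer is normalized to a comparable scale before its recent values are concatenated into the network's input vector.

```python
import numpy as np

def make_input(indicator_buffers, depth=20):
    """indicator_buffers: dict of name -> 1-D numpy array of indicator values."""
    parts = []
    for name, values in indicator_buffers.items():
        window = values[-depth:].astype(np.float64)
        mean, std = window.mean(), window.std()
        parts.append((window - mean) / (std + 1e-8))   # z-score per indicator
    return np.concatenate(parts)                        # network input vector

# Hypothetical usage with two indicator series
rsi  = np.random.uniform(20, 80, 100)
macd = np.random.normal(0, 0.001, 100)
x = make_input({"rsi": rsi, "macd": macd})
print(x.shape)   # (40,)
```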