Uncategorized

Entries with no category

  1. Neural networks made easy (Part 52): Research with optimism and distribution correction

    by , 12-29-2023 at 07:48 AM
    One of the basic techniques for stabilizing Q-function learning is the use of an experience replay buffer. A larger buffer makes it possible to collect more diverse examples of interaction with the environment, which allows the model to learn and approximate the Q-function of the environment more accurately. This technique is widely used in reinforcement learning algorithms, including those of the Actor-Critic family.
  2. Neural networks made easy (Part 51): Behavior-Guided Actor-Critic (BAC)

    by , 12-27-2023 at 07:48 AM
    The last two articles were devoted to the Soft Actor-Critic algorithm. As you remember, the algorithm is used to train stochastic models in a continuous action space. The main feature of this method is the introduction of an entropy component into the reward function, which allows us to adjust the balance between exploring the environment and exploiting the learned policy. At the same time, this approach imposes some restrictions on the trained models. Using entropy requires some idea of the probability of taking
    ...
  3. Neural networks made easy (Part 50): Soft Actor-Critic (model optimization)

    by , 12-25-2023 at 07:48 AM
    We continue to study the Soft Actor-Critic algorithm. In the previous article, we implemented the algorithm but were unable to train a profitable model. Today we will consider possible solutions. A similar question has already been raised in the article "Model procrastination, reasons and solutions". I propose to expand our knowledge in this area and consider new approaches using our Soft Actor-Critic model as an example.
  4. Neural networks made easy (Part 49): Soft Actor-Critic

    by , 12-24-2023 at 07:48 AM
    In this article, we will focus on another algorithm: Soft Actor-Critic (SAC). It was first presented in the article "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor" (January 2018), almost simultaneously with TD3. The two algorithms have some similarities, but there are also differences. The main goal of SAC is to maximize the expected reward together with the entropy of the policy, which allows finding
    ...
  5. Neural networks made easy (Part 37): Sparse Attention

    by , 12-21-2023 at 02:49 AM
    In the previous article, we discussed relational models that use attention mechanisms in their architecture. We used such a model to create an Expert Advisor, and the resulting EA showed good results. However, we noticed that the model trained more slowly than in our earlier experiments. This is because the transformer block used in the model is a rather complex architectural solution that performs a large number of operations. The number of these operations grows in a quadratic
    ...