In the previous article "Neural networks made easy (Part 39): Go-Explore, a different approach to exploration", we familiarized ourselves with the Go-Explore algorithm and its ability to explore the environment. In this article, we will take a closer look at possible optimization methods for the Go-Explore algorithm to improve its efficiency over longer training periods. more...
The Decision Transformer and all its modifications, which we discussed in recent articles, belong to the methods of Behavior Cloning (BC). We train models to repeat actions from "expert" trajectories depending on the state of the environment and the target outcomes. Thus, we teach the model to imitate the behavior of an expert in the current state of the environment in order to achieve the target. more...
PDT jointly learns an embedding space of future trajectory as well as a future prior conditioned only on past information.. By conditioning action prediction on the target future embedding, PDT is endowed with the ability to "reason over the future". This ability is naturally task-independent and can be generalized to different task specifications. To achieve efficient online fine-tuning in downstream tasks, you can easily adapt the framework to new conditions by associating each ...
Previously, we considered hierarchical models for solving problems with, so to speak, the classical approach of the Markov process. However, the advantages of using hierarchical approaches also apply to sequence analysis problems. One such algorithm is the Control Transformer presented in the article "Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling". The method authors position it as a new architecture designed to solve ...
I sometimes receive private messages from those who want to learn how to create their own Expert Advisors or indicators. Although there is a lot of material on this site and on the Internet in general, including very good resources with examples, beginners still need help. Some users seek more consistency in presentation, others require clarity or something else. Sometimes users ask: "Add comments to the code of a working Expert Advisor, I will understand everything and make the same one myself!" ...