Neural networks made easy (Part 58): Decision Transformer (DT)
09-05-2024 at 09:48 AM
In this series, we have already examined a fairly wide range of reinforcement learning algorithms. They all share the same basic approach:
- The agent analyzes the current state of the environment.
- Takes the action that is optimal under its learned policy (behavior strategy).
- Moves into a new state of the environment.
- Receives a reward from the environment for the completed transition to the new state.
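The four steps above can be sketched as a simple interaction loop. This is only a minimal illustration under stated assumptions: `ToyEnv`, `policy` and `run_episode` are hypothetical names invented for this example, and the hard-coded "always move right" rule merely stands in for a learned policy.

```python
class ToyEnv:
    """Hypothetical 1-D environment: positions 0..4, reward for reaching 4."""

    def __init__(self):
        self.state = 0

    def reset(self):
        # Return the environment to its starting position.
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to [0, 4].
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def policy(state):
    """Stand-in for a learned policy: it looks only at the current state."""
    return 1  # always move right


def run_episode(env, policy, max_steps=20):
    state = env.reset()
    trajectory = []
    for _ in range(max_steps):
        action = policy(state)                       # analyze the state, choose an action
        next_state, reward, done = env.step(action)  # move to a new state, receive a reward
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


trajectory = run_episode(ToyEnv(), policy)
for state, action, reward in trajectory:
    print(state, action, reward)
```

Note that `policy` receives only the current state. Nothing about earlier steps is passed to it, which is the assumption the classical algorithms rely on.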
This sequence is based on the principles of a Markov process. The current state of the environment is taken as the starting point, and it is assumed that there is only one optimal way out of this state, which does not depend on the path that led to it.
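In the usual notation this assumption can be written as follows. The symbols are not from the article and are introduced here only for illustration: s_t is the state at step t, a_t the action, and \pi the policy.

```latex
P(s_{t+1} \mid s_t, a_t) = P(s_{t+1} \mid s_0, a_0, \dots, s_t, a_t),
\qquad
a_t = \pi(s_t)
```

The first equality says that the next state depends only on the current state and action, not on the earlier history. The second says that the policy picks its action from the current state alone.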