HuntedRelated

Neural networks made easy (Part 61): Optimism issue in offline reinforcement learning


06-15-2024 at 11:58 AM (561 Views)

Offline reinforcement learning methods have recently become widespread and show promise for problems of varying complexity. However, one of the main problems researchers face is the optimism that can arise during training. The agent optimizes its strategy on the data in the training set and becomes confident in its actions. But the training set often cannot cover the full variety of possible states and transitions of the environment. In a stochastic environment, where the same action can lead to different outcomes, such confidence is not always justified, and the agent's optimistic strategy may lead to increased risk and undesirable consequences.
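To make the optimism issue concrete, here is a minimal Python sketch. It is not taken from the article: the tiny dataset, the action names "safe" and "risky", and both helper functions are hypothetical, chosen only to illustrate the idea. An agent that trusts the best outcome it has seen for each action prefers the risky one, while averaging over all logged outcomes reveals that the safe action is better.

```python
from collections import defaultdict

# Hypothetical offline dataset of (action, reward) pairs (an assumption for
# illustration). "risky" paid off once but usually loses; "safe" always
# gives a small reward.
DATASET = [
    ("safe", 1.0), ("safe", 1.0), ("safe", 1.0),
    ("risky", 10.0), ("risky", -5.0), ("risky", -5.0), ("risky", -5.0),
]


def group_rewards(dataset):
    """Collect the observed rewards for each action."""
    rewards = defaultdict(list)
    for action, reward in dataset:
        rewards[action].append(reward)
    return rewards


def optimistic_choice(dataset):
    """Pick the action whose single best observed outcome is highest."""
    rewards = group_rewards(dataset)
    return max(rewards, key=lambda a: max(rewards[a]))


def expected_choice(dataset):
    """Pick the action with the highest average observed reward."""
    rewards = group_rewards(dataset)
    return max(rewards, key=lambda a: sum(rewards[a]) / len(rewards[a]))


if __name__ == "__main__":
    print("optimistic:", optimistic_choice(DATASET))
    print("expected:  ", expected_choice(DATASET))
```

The optimistic rule treats one lucky outcome as if the agent's action had guaranteed it, which is the kind of unjustified confidence described above.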
