
mql5

Neural networks made easy (Part 38): Self-Supervised Exploration via Disagreement

by , 03-30-2024 at 02:24 PM (226 Views)
      
   
The algorithm uses self-supervised learning: the agent draws on information gathered while interacting with the environment to generate "intrinsic" rewards and update its strategy. It relies on several agent models that interact with the environment and each produce their own predictions. When the models disagree, the event is treated as "interesting", and the agent is incentivized to explore that region of the environment. In this way the algorithm pushes the agent toward unexplored areas of the environment, which lets it make more accurate predictions about future rewards.
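The idea of turning model disagreement into a reward can be illustrated with a minimal sketch. It is written in Python rather than MQL5 and is not the article's implementation: the function name `disagreement_reward` is hypothetical, and measuring disagreement as the variance of the models' predictions, averaged over the output dimensions, is an assumption made here for illustration.

```python
from statistics import pvariance


def disagreement_reward(predictions):
    """Return an intrinsic reward from the predictions of several models.

    predictions: list of equal-length vectors, one per model, each holding
    that model's prediction for the same state and action.
    Assumption: disagreement is the population variance across models,
    averaged over the prediction dimensions.
    """
    if len(predictions) < 2:
        raise ValueError("at least two models are needed to measure disagreement")
    # Regroup the values so that each tuple holds one dimension across all models.
    per_dimension = list(zip(*predictions))
    return sum(pvariance(values) for values in per_dimension) / len(per_dimension)


# Models that agree give no exploration bonus.
agree = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
# Models that disagree on the first dimension mark the state as "interesting".
disagree = [[0.0, 2.0], [1.0, 2.0], [2.0, 2.0]]

print(disagreement_reward(agree))
print(disagreement_reward(disagree))
```

Under these assumptions, identical predictions yield a reward of zero, while the reward grows as the predictions spread apart, so the agent is drawn toward states where the models have not yet converged.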

