
mql5

Neural networks made easy (Part 63): Unsupervised Pretraining for Decision Transformer (PDT)

by , 06-21-2024 at 07:27 AM (360 Views)
      
   
PDT jointly learns an embedding space for future trajectories and a future prior conditioned only on past information. By conditioning action prediction on the target future embedding, PDT gains the ability to "reason over the future". This ability is task-independent by construction and can be generalized to different task specifications.

To achieve efficient online fine-tuning in downstream tasks, the framework can be adapted to new conditions by associating each future embedding with its return. This is done by training a reward prediction network that maps each future embedding to the return it leads to.
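The idea of tying each future embedding to its return can be illustrated with a small sketch. This is not the article's or the paper's implementation: it swaps the reward prediction network for a plain linear model trained by stochastic gradient descent, and every name here (`fit_return_predictor`, `select_future_embedding`, the synthetic `true_w`) is hypothetical and used only for illustration. The embeddings and returns are synthetic, generated from an assumed linear relationship.

```python
import random


def predict_return(weights, bias, z):
    """Predicted return for one future embedding z (linear stand-in for a network)."""
    return sum(w * x for w, x in zip(weights, z)) + bias


def fit_return_predictor(embeddings, returns, lr=0.05, epochs=500):
    """Fit the linear return predictor with per-sample SGD on squared error."""
    dim = len(embeddings[0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        for z, r in zip(embeddings, returns):
            err = predict_return(weights, bias, z) - r
            for i in range(dim):
                weights[i] -= lr * err * z[i]
            bias -= lr * err
    return weights, bias


def select_future_embedding(weights, bias, candidates):
    """Pick the candidate future embedding with the highest predicted return."""
    return max(candidates, key=lambda z: predict_return(weights, bias, z))


# Synthetic data (assumption): returns depend linearly on a 3-dimensional embedding.
rng = random.Random(0)
true_w = [1.0, -2.0, 0.5]
embeddings = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(64)]
returns = [sum(w * x for w, x in zip(true_w, z)) for z in embeddings]

weights, bias = fit_return_predictor(embeddings, returns)

# Candidate futures, standing in for samples drawn from the learned future prior.
candidates = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
best = select_future_embedding(weights, bias, candidates)
print("selected future embedding:", best)
```

In this sketch the selected embedding would then serve as the condition for action prediction, so that fine-tuning steers the policy toward futures with higher predicted return.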


