Articles for the 1-day period starting 2019-11-10

2019-11-10

Paper introduction: Bridging the Gap Between Value and Policy Based Reinforcement Learning

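Paper introduced: Bridging the Gap Between Value and Policy Based Reinforcement Learning. Summary: a method that combines the training stability of on-policy learning with the high sample efficiency of off-policy learning. It builds on what can be derived, under the idea of entropy-regularized reinforcement learning, about the value function and the policy function…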

DDN's Library

Summaries of reinforcement learning papers and notes on small system-development projects
