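Reinforcement learning is currently an active research area. Classifying reinforcement learning methods and their papers helps clarify which method suits which application scenario. This article groups reinforcement learning methods by category and lists the corresponding papers.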

5. Memory Series

Algorithm: MFEC
Paper title: Model-Free Episodic Control
Venue: arXiv
Paper link: https://arxiv.org/abs/1606.04460
Google Scholar citations (at time of writing): 138

Algorithm: NEC
Paper title: Neural Episodic Control
Venue: ICML, 2017
Paper link: https://arxiv.org/abs/1703.01988
Google Scholar citations (at time of writing): 171

Algorithm: Neural Map
Paper title: Neural Map: Structured Memory for Deep Reinforcement Learning
Venue: ICLR, 2018
Paper link: https://arxiv.org/abs/1702.08360
Google Scholar citations (at time of writing): 173

Algorithm: MERLIN
Paper title: Unsupervised Predictive Memory in a Goal-Directed Agent
Venue: arXiv
Paper link: https://arxiv.org/abs/1803.10760
Google Scholar citations (at time of writing): 108

Algorithm: RMC
Paper title: Relational Recurrent Neural Networks
Venue: NIPS, 2018
Paper link: https://arxiv.org/abs/1806.01822
Google Scholar citations (at time of writing): 121

6. Model-Based RL Series

a. Model is Learned

Algorithm: I2A
Paper title: Imagination-Augmented Agents for Deep Reinforcement Learning
Venue: NIPS, 2017
Paper link: https://arxiv.org/abs/1707.06203
Google Scholar citations (at time of writing): 182

Algorithm: MBMF
Paper title: Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Venue: ICRA, 2018
Paper link: https://arxiv.org/abs/1708.02596
Google Scholar citations (at time of writing): 503

Algorithm: MVE
Paper title: Model-Based Value Expansion for Efficient Model-Free Reinforcement Learning
Venue: arXiv
Paper link: https://arxiv.org/abs/1803.00101
Google Scholar citations (at time of writing): 109

Algorithm: STEVE
Paper title: Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
Venue: NIPS, 2018
Paper link: https://arxiv.org/abs/1807.01675
Google Scholar citations (at time of writing): 127

Algorithm: ME-TRPO
Paper title: Model-Ensemble Trust-Region Policy Optimization
Venue: ICLR, 2018
Paper link: https://openreview.net/forum?id=SJJinbWRZ&noteId=SJJinbWRZ
Google Scholar citations (at time of writing): 195

Algorithm: MB-MPO
Paper title: Model-Based Reinforcement Learning via Meta-Policy Optimization
Venue: Conference on Robot Learning, 2018
Paper link: https://arxiv.org/abs/1809.05214
Google Scholar citations (at time of writing): 108

Algorithm: World Models
Paper title: Recurrent World Models Facilitate Policy Evolution
Venue: NIPS, 2018
Paper link: https://arxiv.org/abs/1809.01999
Google Scholar citations (at time of writing): 316

b. Model is Given

Algorithm: AlphaZero
Paper title: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
Venue: Science, 2018
Paper link: https://arxiv.org/abs/1712.01815
Google Scholar citations (at time of writing): 971

Algorithm: ExIt
Paper title: Thinking Fast and Slow with Deep Learning and Tree Search
Venue: NIPS, 2017
Paper link: https://arxiv.org/abs/1705.08439
Google Scholar citations (at time of writing): 174

Reference
https://spinningup.openai.com/en/latest/
