Abstract: Traditional performance potential-based learning algorithms can obtain optimal policies in MDP problems.

 
  • 文章摘要: 传统基于性能势的学习算法能获得马尔可夫决策问题的最优策略。
今日热词
目录 附录 查词历史