#importancesampling #offpolicy #montecarlo
In this lecture we look at off-policy control for Monte Carlo algorithms via importance sampling. We cover techniques such as discounting-aware importance sampling that help reduce the variance of our off-policy estimators, and finally wrap up our discussion of Monte Carlo algorithms.
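As a rough illustration of the idea, and not taken from the lecture itself, here is a minimal Python sketch of off-policy Monte Carlo evaluation with importance sampling. It assumes episodes are lists of `(state, action, reward)` steps and that policies are plain functions returning action probabilities; the names `importance_ratio`, `discounted_return`, and `off_policy_estimates` are hypothetical. It computes both the ordinary and the weighted importance sampling estimates of the start-state value. It does not implement the discounting-aware variant mentioned above.

```python
from typing import Callable, List, Tuple

# Illustrative types (assumptions, not from the lecture):
# an episode is a list of (state, action, reward) steps,
# a policy maps (state, action) to the probability of that action.
Episode = List[Tuple[int, int, float]]
Policy = Callable[[int, int], float]


def importance_ratio(episode: Episode, target: Policy, behavior: Policy) -> float:
    """Product over the episode of target(a|s) / behavior(a|s)."""
    rho = 1.0
    for state, action, _ in episode:
        rho *= target(state, action) / behavior(state, action)
    return rho


def discounted_return(episode: Episode, gamma: float) -> float:
    """Discounted sum of rewards from the first step of the episode."""
    g = 0.0
    for _, _, reward in reversed(episode):
        g = reward + gamma * g
    return g


def off_policy_estimates(
    episodes: List[Episode], target: Policy, behavior: Policy, gamma: float = 1.0
) -> Tuple[float, float]:
    """Return (ordinary, weighted) importance sampling estimates."""
    weighted_sum = 0.0  # sum of rho * G over episodes
    ratio_sum = 0.0     # sum of rho over episodes
    for episode in episodes:
        rho = importance_ratio(episode, target, behavior)
        weighted_sum += rho * discounted_return(episode, gamma)
        ratio_sum += rho
    ordinary = weighted_sum / len(episodes)
    # Weighted IS normalizes by the total ratio; fall back to 0 if it is zero.
    weighted = weighted_sum / ratio_sum if ratio_sum > 0 else 0.0
    return ordinary, weighted


def target_policy(state: int, action: int) -> float:
    # Toy deterministic target policy: always picks action 0.
    return 1.0 if action == 0 else 0.0


def behavior_policy(state: int, action: int) -> float:
    # Toy uniform behavior policy over two actions.
    return 0.5


# A single two-step episode generated by the behavior policy.
episodes = [[(0, 0, 1.0), (0, 0, 2.0)]]
ordinary, weighted = off_policy_estimates(
    episodes, target_policy, behavior_policy, gamma=0.5
)
print(ordinary, weighted)
```

With a single episode the ratio is 4 and the return is 2, so the ordinary estimate is inflated to 8 while the weighted estimate stays at the observed return of 2. This is a small instance of the variance problem that variance-reduction techniques for off-policy estimators aim to address.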
Reinforcement Learning - Lecture 14 (Off-policy Control for MC via Importance Sampling)