Machine Learning#

Reinforcement Learning#

Reinforcement learning (RL) is a subfield of Machine Learning associated with how an intelligent Agent takes some Actions in an Environment, thereby transitioning through States, while seeking a Reward.

The goal of reinforcement learning is for the Agent to learn a Policy (decision making process, or collection of probabilities of taking an action ‘a’ while in state ‘s’.) that maximizes its Rewards over time.

Policy Gradient Methods#

Part of the solution of the RL problem is to search the space of possible Policies when choosing the most suitable one. This can be large or infinite - so usually some proxy or surrogate of the space is needed, i.e. a stochastic optimization problem. Policy gradient methods employ gradient ascent techniques in the search.