Deep reinforcement learning is a framework in which an agent learns to choose actions that maximise its future rewards. It can be applied to sequential decision-making problems in many fields.

Key components of deep reinforcement learning

  • Environment
  • Agent
  • Observation
  • Action
  • Reward
[Figure: DRL2.png]
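
The way these components interact can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CorridorEnv class, its reset/step interface, the random agent and the discount factor gamma are all hypothetical names chosen for this example, not part of any particular library.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a corridor of cells 0..4, goal at cell 4."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start every episode in the leftmost cell and return the observation.
        self.position = 0
        return self.position

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # reward only when the goal is reached
        return self.position, reward, done


def random_agent(observation, rng):
    # An agent maps an observation to an action; this one ignores it.
    return rng.choice([0, 1])


def run_episode(env, agent, rng, gamma=0.9, max_steps=1000):
    # Agent-environment loop: observe, act, receive a reward, repeat.
    observation = env.reset()
    discounted_return, discount, steps = 0.0, 1.0, 0
    for _ in range(max_steps):
        action = agent(observation, rng)
        observation, reward, done = env.step(action)
        discounted_return += discount * reward
        discount *= gamma
        steps += 1
        if done:
            break
    return discounted_return, steps


rng = random.Random(0)
ret, steps = run_episode(CorridorEnv(), random_agent, rng)
print(f"episode finished after {steps} steps, discounted return {ret:.3f}")
```

The random agent reaches the goal eventually, but it wastes many steps, so its discounted return is usually low. Learning means replacing random_agent with something that improves from the rewards it receives.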


Estimation of functions

  • Policy function: determines the agent’s actions, typically by mapping an observation (state) to an action
  • Value function: how good each state and/or action is under the policy, measured by the expected future reward
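
For readers who prefer symbols, the two functions are commonly written as below. The notation is the standard textbook one and is an assumption here, since the post itself does not fix any symbols: s is a state, a an action, R a reward and γ (between 0 and 1) a discount factor.

```latex
% Policy: probability of choosing action a in state s
\pi(a \mid s) = \Pr(A_t = a \mid S_t = s)

% State-value function: expected discounted future reward when following \pi from s
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[ \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \,\middle|\, S_t = s \right]

% Action-value function: the same, but the first action a is fixed
Q^{\pi}(s, a) = \mathbb{E}_{\pi}\!\left[ \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \,\middle|\, S_t = s,\ A_t = a \right]
```

In deep reinforcement learning these functions are estimated with neural networks rather than stored in tables.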


Reinforcement learning algorithms

  • Value-based RL

Estimate the optimal value function, i.e. the maximum value achievable under any policy. The agent then acts by choosing the action with the highest estimated value.
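
As a rough illustration of the value-based idea, here is a minimal sketch of tabular Q-learning, one well-known value-based method. The five-state corridor, the reward of 1 at the goal and the hyperparameters (alpha, gamma, number of episodes) are assumptions made for this example. A table is used instead of a deep network to keep the code short.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal (hypothetical task)
ACTIONS = (0, 1)  # 0 = left, 1 = right


def step(state, action):
    # Deterministic corridor: move one cell, clamp at the walls.
    next_state = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Q-learning is off-policy, so a purely random behaviour policy
            # is enough to explore this small problem.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Target: reward plus the discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Acting greedily with respect to the learned values gives the policy.
greedy_policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print("Q-values:", [[round(v, 3) for v in row] for row in q])
print("greedy policy (1 = right):", greedy_policy)
```

Note that the policy is never learned directly here. It falls out of the value estimates by picking the best-valued action in each state.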

  • Policy-based RL

Search directly for the optimal policy, i.e. the policy that achieves the maximum future reward.
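
To make the policy-based idea concrete, the following is a minimal sketch of a REINFORCE-style policy-gradient update. To stay short it uses a one-step problem with two actions, so the "future reward" is just the immediate reward. The reward values, the softmax parameterisation and the learning rate are assumptions for illustration only.

```python
import math
import random


def softmax(preferences):
    # Turn action preferences into probabilities (shifted for numerical stability).
    m = max(preferences)
    exps = [math.exp(p - m) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(steps=2000, learning_rate=0.1, seed=0):
    rng = random.Random(seed)
    rewards = [0.0, 1.0]  # hypothetical: action 1 is the better action
    theta = [0.0, 0.0]    # policy parameters, one preference per action
    for _ in range(steps):
        probs = softmax(theta)
        action = 0 if rng.random() < probs[0] else 1  # sample from the policy
        reward = rewards[action]
        for a in range(2):
            # Gradient of log pi(action) w.r.t. theta[a] for a softmax policy.
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += learning_rate * reward * grad_log_pi
    return softmax(theta)


final_probs = reinforce_bandit()
print("action probabilities after training:", [round(p, 3) for p in final_probs])
```

No value function appears anywhere in this sketch. The parameters of the policy are adjusted directly so that rewarded actions become more likely.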

  • Mixed approach

Combine policy-based and value-based RL, learning a policy and a value function together (as in actor-critic methods).