mlpy.learners.online.rl.RLDTLearner.learn¶

RLDTLearner.learn(experience=None)[source]¶

Learn a policy from the experience.

A policy is learned from the experience by building the MDP model.

Parameters:

experience : Experience

The actor’s current experience consisting of previous state, the action performed in that state, the current state, and the reward awarded.