mlpy.mdp.discrete.UnknownBonusExplorer.update
UnknownBonusExplorer.update(model)
Update the reward model.
Update the reward model according to an RMax-based exploration policy. States for which the decision tree was unable to predict a reward are considered unknown. These states are given a bonus of RMax to drive exploration.
Parameters: model : StateActionInfo
The state-action information.
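The bonus rule described above can be illustrated with a minimal, standalone sketch. It does not use mlpy's API: the function `apply_unknown_bonus`, the `predict_reward` callable, and the convention that an unknown prediction is signalled by `None` are all assumptions made for illustration, not the library's actual implementation.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional, Tuple

StateAction = Tuple[Hashable, Hashable]


def apply_unknown_bonus(
    state_actions: Iterable[StateAction],
    predict_reward: Callable[[Hashable, Hashable], Optional[float]],
    rmax: float,
) -> Dict[StateAction, float]:
    """Return a reward for each (state, action) pair.

    Hypothetical helper: `predict_reward` stands in for the decision-tree
    reward model and is assumed to return None when it cannot predict.
    """
    rewards: Dict[StateAction, float] = {}
    for state, action in state_actions:
        predicted = predict_reward(state, action)
        # Unknown pairs receive the optimistic RMax bonus to drive exploration;
        # known pairs keep the model's predicted reward.
        rewards[(state, action)] = rmax if predicted is None else predicted
    return rewards


# Usage: one known pair and one pair the (toy) model cannot predict.
known = {("s0", "a0"): 0.5}
rewards = apply_unknown_bonus(
    [("s0", "a0"), ("s1", "a0")],
    lambda s, a: known.get((s, a)),
    rmax=10.0,
)
```

In this sketch the pair `("s1", "a0")` is treated as unknown and assigned `rmax`, so a planner that maximizes reward is drawn toward it until the model can predict its reward.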