mlpy.planners.explorers.discrete.EGreedyExplorer
class mlpy.planners.explorers.discrete.EGreedyExplorer(epsilon=None, decay=None)
Bases: mlpy.planners.explorers.discrete.DiscreteExplorer
The ε-greedy explorer.
The ε-greedy explorer policy chooses the action with the highest Q-value as the next action. With probability ε, however, it chooses a random action instead, which drives exploration of unknown states.
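The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not mlpy's implementation: the function name `egreedy_choice`, the `rng` argument, and the assumption that `actions` and `qvalues` are parallel sequences are all hypothetical, and the library's actual data structures may differ.

```python
import random


def egreedy_choice(actions, qvalues, epsilon=0.5, rng=random):
    """Pick an action epsilon-greedily (illustrative sketch, not mlpy code).

    Assumes `actions` and `qvalues` are sequences of equal length, where
    qvalues[i] is the Q-value of actions[i].
    """
    # With probability epsilon, explore: pick any action uniformly at random.
    if rng.random() < epsilon:
        return rng.choice(actions)
    # Otherwise exploit: pick the action with the highest Q-value.
    best = max(range(len(actions)), key=lambda i: qvalues[i])
    return actions[best]
```

With `epsilon=0.0` the sketch always returns the greedy action, and with `epsilon=1.0` it always returns a uniformly random one. Ties between equal Q-values are broken here in favor of the first action, which is a choice of this sketch only.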
Parameters:
epsilon : float, optional
The probability ε of choosing a random action. Default is 0.5.
decay : float, optional
The factor by which ε decays. This value should be between 0 and 1. The probability ε decreases over time by a factor of decay. Set this value to 1 if ε should remain the same throughout the experiment. Default is 1.
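To make the effect of decay concrete, the following sketch applies a multiplicative decay to ε. The helper name `decayed_epsilon` is hypothetical. The source does not state when the library applies the decay (per step, per episode, or per call), so one multiplication per update is an assumption here.

```python
def decayed_epsilon(epsilon, decay, updates):
    """Return epsilon after `updates` multiplicative decay updates.

    Illustrative sketch only: assumes one multiplication by `decay`
    per update, which the documentation does not confirm.
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay should be between 0 and 1")
    for _ in range(updates):
        epsilon *= decay
    return epsilon
```

Under this assumption, a decay of 1 leaves ε unchanged no matter how many updates occur, while a decay of 0.5 halves ε on every update, so exploration fades quickly.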
Methods
activate() Turn on exploration mode.
choose_action(actions, qvalues) Choose the next action.
deactivate() Turn off exploration mode.