mlpy.planners.explorers.discrete.EGreedyExplorer

class mlpy.planners.explorers.discrete.EGreedyExplorer(epsilon=None, decay=None)

Bases: mlpy.planners.explorers.discrete.DiscreteExplorer

The ε-greedy explorer.

The ε-greedy explorer policy chooses the action with the highest q-value as the next action; with probability ε, however, it chooses a random action instead to drive exploration of unknown states.

Parameters:
epsilon : float, optional
    The probability ε of choosing a random action. Default is 0.5.
decay : float, optional
    The value by which ε decays. This value should be between 0 and 1. The probability ε decreases over time by a factor of decay. Set this value to 1 if ε should remain the same throughout the experiment. Default is 1.

Methods
activate()                       Turn on exploration mode.
choose_action(actions, qvalues)  Choose the next action.
deactivate()                     Turn off exploration mode.
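
The policy described above can be sketched in a few lines of plain Python. This is an illustration only, not the mlpy implementation: the class name EpsilonGreedySketch and the rng argument are hypothetical, and the sketch assumes that qvalues[i] is the q-value of actions[i], that ε is multiplied by decay once per call while exploration is active, and that a deactivated explorer always acts greedily. The library may differ on each of these points.

```python
import random


class EpsilonGreedySketch:
    """Illustrative epsilon-greedy policy (hypothetical, not mlpy code)."""

    def __init__(self, epsilon=0.5, decay=1.0, rng=None):
        self.epsilon = epsilon          # probability of a random action
        self.decay = decay              # multiplicative factor in [0, 1]
        self.active = True              # exploration mode flag
        self.rng = rng or random.Random()

    def activate(self):
        """Turn on exploration mode."""
        self.active = True

    def deactivate(self):
        """Turn off exploration mode (assumed to mean purely greedy)."""
        self.active = False

    def choose_action(self, actions, qvalues):
        """Pick a random action with probability epsilon, else the greedy one."""
        if self.active and self.rng.random() < self.epsilon:
            choice = self.rng.choice(actions)
        else:
            # Assumption: qvalues is aligned index-by-index with actions.
            best = max(range(len(actions)), key=lambda i: qvalues[i])
            choice = actions[best]
        if self.active:
            # Assumption: epsilon shrinks by the factor `decay` on every call.
            self.epsilon *= self.decay
        return choice
```

Under these assumptions, an explorer created with epsilon=1.0 and decay=0.5 has ε = 0.125 after three calls to choose_action, while decay=1 leaves ε unchanged, matching the parameter description above.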