mlpy.planners.explorers.discrete.EGreedyExplorer

class mlpy.planners.explorers.discrete.EGreedyExplorer(epsilon=None, decay=None)

Bases: mlpy.planners.explorers.discrete.DiscreteExplorer

The $\epsilon$-greedy explorer.

The $\epsilon$-greedy explorer policy chooses the action with the highest Q-value as the next action. With probability $\epsilon$, however, it chooses a random action instead, which drives exploration of unknown states.
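The selection rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the mlpy implementation: the function name `epsilon_greedy`, the `rng` parameter, and the assumption that `qvalues` is a sequence aligned index-by-index with `actions` are all hypothetical.

```python
import random


def epsilon_greedy(actions, qvalues, epsilon=0.5, rng=random):
    """Return a random action with probability epsilon, else the greedy one.

    Assumption (not taken from mlpy): qvalues[i] is the Q-value of actions[i].
    """
    # Explore: rng.random() is uniform on [0, 1), so this branch is taken
    # with probability epsilon.
    if rng.random() < epsilon:
        return rng.choice(actions)
    # Exploit: pick the action whose Q-value is largest.
    best_index = max(range(len(actions)), key=lambda i: qvalues[i])
    return actions[best_index]


# Hypothetical usage with three actions and their Q-values.
actions = ["left", "right", "stay"]
qvalues = [0.1, 0.9, 0.3]
greedy_choice = epsilon_greedy(actions, qvalues, epsilon=0.0)
```

With `epsilon=0.0` the sketch never explores and always returns the highest-valued action; with `epsilon=1.0` every choice is uniformly random.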

Parameters:

epsilon : float, optional

The probability $\epsilon$ with which a random action is chosen. Default is 0.5.

decay : float, optional

The factor by which $\epsilon$ decays. This value should be between 0 and 1. The probability $\epsilon$ decreases over time by a factor of decay. Set this value to 1 if $\epsilon$ should remain the same throughout the experiment. Default is 1.
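To make the effect of decay concrete, the sketch below assumes the decay is applied multiplicatively, so that $\epsilon$ becomes $\epsilon \cdot \text{decay}^n$ after $n$ updates. When mlpy actually applies the update (per action choice, per episode, or otherwise) is not stated here, and the helper name `decayed_epsilon` is hypothetical.

```python
def decayed_epsilon(epsilon, decay, steps):
    """Epsilon after `steps` multiplicative decay updates.

    Assumption (not confirmed by the mlpy docs): each update multiplies
    epsilon by `decay`, so the result is epsilon * decay ** steps.
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay should be between 0 and 1")
    return epsilon * decay ** steps


# With decay=1 the exploration probability never changes.
constant = decayed_epsilon(0.5, 1.0, steps=100)

# With decay<1 the exploration probability shrinks toward 0 over time.
shrinking = [decayed_epsilon(0.5, 0.9, steps=n) for n in range(4)]
```

Under this assumption, a decay close to 1 keeps the explorer exploring for a long time, while a small decay makes the policy nearly greedy after only a few updates.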

Methods

`activate`()	Turn on exploration mode.
`choose_action`(actions, qvalues)	Choose the next action.
`deactivate`()	Turn off exploration mode.