mlpy.planners.explorers.discrete.EGreedyExplorer

class mlpy.planners.explorers.discrete.EGreedyExplorer(epsilon=None, decay=None)

Bases: mlpy.planners.explorers.discrete.DiscreteExplorer

The \epsilon-greedy explorer.

The \epsilon-greedy explorer policy selects the action with the highest q-value as the next action. With probability \epsilon, however, it selects a random action instead, which drives exploration of unknown states.
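The selection rule described above can be sketched in a few lines of standalone Python. This is an illustration only, not the mlpy implementation: the function name `epsilon_greedy`, its `rng` argument, and the tie-breaking behavior are assumptions made for the example.

```python
import random


def epsilon_greedy(actions, qvalues, epsilon=0.5, rng=random):
    """Hypothetical helper: pick an action by the epsilon-greedy rule.

    `actions` and `qvalues` are parallel sequences; `qvalues[i]` is the
    estimated value of `actions[i]`. None of this is part of mlpy.
    """
    # With probability epsilon, explore: pick any action uniformly at random.
    if rng.random() < epsilon:
        return rng.choice(actions)
    # Otherwise exploit: pick the action with the highest q-value.
    # Assumption: ties are broken in favor of the first such action.
    best = max(range(len(actions)), key=lambda i: qvalues[i])
    return actions[best]


actions = ["left", "right", "stay"]
qvalues = [0.1, 0.7, 0.2]

# epsilon=0 never explores, so the greedy action is always returned.
greedy_choice = epsilon_greedy(actions, qvalues, epsilon=0.0)
# epsilon=1 always explores, so any of the actions may be returned.
random_choice = epsilon_greedy(actions, qvalues, epsilon=1.0)
```

With the default of 0.5, roughly half of the calls would return a uniformly random action, which may itself happen to be the greedy one.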

Parameters:

epsilon : float, optional

The \epsilon probability. Default is 0.5.

decay : float, optional

The factor by which \epsilon decays. This value should be between 0 and 1. The probability \epsilon decreases over time by a factor of decay. Set this value to 1 if \epsilon should remain the same throughout the experiment. Default is 1.
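To get a feel for how the decay parameter shapes exploration, the sketch below assumes that \epsilon is multiplied by decay once per step. That multiplicative schedule, and what counts as a step (each action choice, each episode, or something else), are assumptions for illustration; the documentation above does not specify them, and `decayed_epsilon` is a hypothetical helper, not an mlpy function.

```python
def decayed_epsilon(epsilon, decay, steps):
    """Hypothetical helper: epsilon after `steps` multiplicative decays.

    Assumes epsilon is multiplied by `decay` once per step, so the
    result is epsilon * decay ** steps.
    """
    return epsilon * decay ** steps


# decay=1 leaves epsilon unchanged, matching the documented default.
constant = decayed_epsilon(0.5, 1.0, 10)

# decay=0.5 halves epsilon at every step.
halved_twice = decayed_epsilon(0.5, 0.5, 2)

# A decay slightly below 1 shifts gradually from exploring to exploiting.
schedule = [decayed_epsilon(0.5, 0.99, t) for t in (0, 100, 500)]
```

Under this assumption, a decay close to 1 keeps exploration high for many steps, while a small decay makes the policy almost purely greedy after only a few steps.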

Methods

activate() Turn on exploration mode.
choose_action(actions, qvalues) Choose the next action.
deactivate() Turn off exploration mode.