mlpy.planners.IPlanner

class mlpy.planners.IPlanner(explorer=None)

Bases: mlpy.modules.UniqueModule

The planner interface class.

Parameters:

explorer : Explorer

The exploration strategy to employ. Available explorers are:

EGreedyExplorer

With probability $\epsilon$, a random action is chosen; otherwise, the action with the highest Q-value is selected.

SoftmaxExplorer

The softmax explorer varies the action probabilities as a graded function of estimated value. The greedy action still receives the highest selection probability, but all other actions are ranked and weighted according to their value estimates.
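The two strategies described above can be sketched in a few lines of plain Python. This is only an illustration of the selection rules, not mlpy's implementation: the function names, the `q_values` list, and the temperature parameter `tau` are all assumptions introduced for this example.

```python
import math
import random


def egreedy_action(q_values, epsilon, rng=random):
    """Hypothetical helper: epsilon-greedy selection over a list of Q-values."""
    # With probability epsilon, explore by picking a uniformly random action.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Otherwise exploit: pick the index of the highest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probabilities(q_values, tau=1.0):
    """Hypothetical helper: Boltzmann (softmax) probabilities for each action.

    tau is an assumed temperature parameter; smaller values make the
    distribution greedier, larger values make it more uniform.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / tau) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_action(q_values, tau=1.0, rng=random):
    """Hypothetical helper: sample an action from the softmax distribution."""
    probs = softmax_probabilities(q_values, tau)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

With `epsilon=0.0` the first function always returns the greedy action, while the softmax variant still assigns every action a nonzero probability, ordered by its value estimate.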

Attributes

`mid`	The module’s unique identifier.

Methods

`activate_exploration`()	Turn the explorer on.
`create_policy`([func])	Create a policy (i.e., a state-action association).
`deactivate_exploration`()	Turn the explorer off.
`get_best_action`(state)	Choose the best next action for the agent to take.
`get_next_action`(state[, use_policy])	Return the optimal action for a state according to the current policy.
`load`(filename)	Load the state of the module from file.
`plan`()	Plan for the optimal policy.
`save`(filename)	Save the current state of the module to file.
`visualize`()	Visualize the planning data.