class mlpy.agents.modules.LearningModule(learner_type, cb_get_reward=None, *args, **kwargs)

Bases: mlpy.agents.modules.IAgentModule

Learning agent module.

The learning agent module allows the agent to learn from past experiences.


learner_type : str

The learning type. Based on the type the appropriate learner module is created. Valid learning types are:


The learner performs Q-learning, a reinforcement learning variant (QLearner).


The learner performs reinforcement learning with decision trees (RLDT), a method introduced by Hester, Quinlan, and Stone which builds a generalized model for the transitions and rewards of the environment (RLDTLearner).


The learner performs apprenticeship learning via inverse reinforcement learning, a method introduced by Abbeel and Ng which strives to imitate the demonstrations given by an expert (ApprenticeshipLearner).


The learner incrementally performs apprenticeship learning via inverse reinforcement learning. Inverse reinforcement learning assumes knowledge of the underlying model, which is not always available. The incremental apprenticeship learner therefore updates its model after every iteration by executing the current policy (IncrApprenticeshipLearner).

cb_get_reward : callable, optional

A callback function to retrieve the reward based on the current state and action. Default is None.

The function must have the following signature:

>>> def callback(state, action):
...     pass
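
As an illustration of the callback signature above, here is a minimal sketch of a reward callback. The factory `make_goal_reward`, the tuple-valued states, the string action, and the reward values are all hypothetical and chosen only for this example; the source confirms only the `(state, action)` signature and that the callback is used to retrieve the reward, so returning a plain number is an assumption.

```python
def make_goal_reward(goal_state, goal_reward=1.0, step_cost=-0.01):
    """Build a reward callback with the (state, action) signature.

    Hypothetical helper: the reward rule below is an assumption made for
    illustration, not something prescribed by the library.
    """
    def callback(state, action):
        # Assumed rule: pay goal_reward in the goal state, otherwise
        # charge a small cost. The action is accepted to match the
        # required signature but is not used by this simple rule.
        if state == goal_state:
            return goal_reward
        return step_cost

    return callback


# The resulting function is what would be passed as cb_get_reward.
cb_get_reward = make_goal_reward(goal_state=(3, 3))
```

Wrapping the callback in a factory keeps task-specific values such as the goal state out of global scope while still exposing the two-argument function the module expects.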

learner_params : dict, optional

Parameters passed to the learner for initialization. See the appropriate learner type for more information. Default is None.


mid                 The module’s unique identifier.


enter(t)            Enter the module and perform initialization tasks.
execute(state)      Execute the learner.
exit()              Exit the module and perform cleanup tasks.
get_next_action()   Return the next action.
is_complete()       Check if the agent module has completed.
load(filename)      Load the state of the module from file.
reset(t, **kwargs)  Reset the module for the next iteration.
save(filename)      Save the current state of the module to file.
terminate(value)    Set the termination flag.
update(dt)          Update the module at every delta time step dt.
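
The methods listed above suggest a lifecycle of entering the module, stepping it, and exiting. The sketch below shows one way a caller might drive such a module. It does not use mlpy itself: `StubLearningModule` is a hypothetical stand-in that only mirrors the method names from the list, and both the call order in `run_iteration` and the stub's behavior (a fixed action, completion after a step budget) are assumptions, not the library's documented control flow.

```python
class StubLearningModule:
    """Hypothetical stand-in that mimics the method names listed above."""

    def __init__(self, max_steps=3):
        self._max_steps = max_steps
        self._steps = 0
        self._complete = False
        self._last_state = None

    def enter(self, t):
        # Initialization: clear counters before the first step.
        self._steps = 0
        self._complete = False

    def update(self, dt):
        # Count time steps and flag completion once the budget is used up.
        self._steps += 1
        if self._steps >= self._max_steps:
            self.terminate(True)

    def execute(self, state):
        # A real learner would learn from the state here; the stub only
        # remembers it.
        self._last_state = state

    def get_next_action(self):
        # The stub always proposes the same placeholder action.
        return "noop"

    def is_complete(self):
        return self._complete

    def terminate(self, value):
        self._complete = value

    def exit(self):
        # Cleanup hook; nothing to release in the stub.
        pass


def run_iteration(module, states, t=0.0, dt=0.1):
    """Drive a module through one iteration (assumed call order)."""
    actions = []
    module.enter(t)
    for state in states:
        if module.is_complete():
            break
        module.update(dt)
        module.execute(state)
        actions.append(module.get_next_action())
    module.exit()
    return actions


actions = run_iteration(StubLearningModule(max_steps=3), states=[0, 1, 2, 3, 4])
```

With a step budget of three, the loop stops before the fourth state because `is_complete()` is checked at the top of each pass. `reset`, `save`, and `load` are omitted here to keep the sketch short.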