mlpy.mdp.stateaction.RewardFunction

class mlpy.mdp.stateaction.RewardFunction

Bases: object

The reward function.

The reward function is responsible for calculating the proper value of the reward. Callback functions can be specified for custom calculation of the reward value.

Notes

To ensure that the correct reward value is used, do not access the class variables directly. Use the set method to store the reward and the get method to retrieve it, so that any registered callbacks are applied.
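The difference between direct attribute access and access through get and set can be illustrated with a minimal sketch. The class `SketchReward` below is a hypothetical stand-in written for this illustration, not mlpy's implementation; in particular, the assumption that the get callback receives the stored reward as its first argument is a guess made for the sketch.

```python
class SketchReward(object):
    """Hypothetical stand-in for illustration; NOT mlpy's RewardFunction."""

    reward = 0.0   # class variable, shared by all instances
    cb_get = None  # optional callback applied on retrieval (assumed signature)
    cb_set = None  # optional callback applied on assignment (assumed signature)

    def set(self, value, *args, **kwargs):
        # Let the callback transform the value before it is stored.
        cb = type(self).cb_set
        if cb is not None:
            value = cb(value, *args, **kwargs)
        type(self).reward = value

    def get(self, *args, **kwargs):
        # Let the callback compute the returned value from the stored reward.
        cb = type(self).cb_get
        if cb is not None:
            return cb(type(self).reward, *args, **kwargs)
        return type(self).reward


rf = SketchReward()
rf.set(2.0)
plain = rf.get()  # no callback registered: the stored value is returned

# Register a callback that scales the stored reward by a passed-in factor.
SketchReward.cb_get = staticmethod(lambda r, scale: r * scale)
scaled = rf.get(3.0)          # goes through the callback
direct = SketchReward.reward  # direct access bypasses the callback
```

Here `scaled` reflects the callback while `direct` still holds the raw stored value, which is why reading the class variable directly can yield a different number than get.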

Examples

>>> import numpy as np
>>> from mlpy.mdp.stateaction import RewardFunction
>>> RewardFunction.cb_get = staticmethod(lambda r, s: np.dot(s, RewardFunction.reward))

In this case the reward is calculated by taking the dot product of the stored reward and a passed-in value.

>>> RewardFunction.reward = [0.1, 0.9, 1.0, 0.0]

This sets the reward for all instances of the reward function.

>>> reward_func = RewardFunction()
>>> print reward_func.get([0.9, 0.5, 0.0, 1.0])
0.54

This calculates the reward r according to the previously defined callback function.
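The printed value 0.54 can be checked by hand: it is the dot product of the stored reward [0.1, 0.9, 1.0, 0.0] and the passed-in vector [0.9, 0.5, 0.0, 1.0], that is 0.1*0.9 + 0.9*0.5 + 1.0*0.0 + 0.0*1.0. The sketch below reproduces that arithmetic in plain Python, without mlpy or NumPy; the variable names are illustrative only.

```python
stored = [0.1, 0.9, 1.0, 0.0]  # value assigned to RewardFunction.reward
passed = [0.9, 0.5, 0.0, 1.0]  # argument given to get()

# Element-wise products, then their sum: the definition of the dot product.
terms = [s * p for s, p in zip(stored, passed)]
total = sum(terms)

# Rounded to absorb floating-point noise in the individual products.
print(round(total, 2))  # 0.54
```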

Attributes

bonus	The bonus added to the reward to encourage exploration.

cb_get	(callable) Callback function to retrieve the reward value.
cb_set	(callable) Callback function to set the reward value.
reward	(float) The reward value.
rmax	(float) The maximum possible reward.
activate_bonus	(bool) Flag activating/deactivating the bonus.

Methods

`get`(*args, **kwargs)	Retrieve the reward value.
`set`(value, *args, **kwargs)	Set the reward value.