mlpy.optimize.algorithms.EM

class mlpy.optimize.algorithms.EM(n_iter=None, thresh=None, verbose=None)

Bases: object

Expectation-Maximization module base class.

Representation of the expectation-maximization (EM) model. This class allows for the execution of the expectation-maximization algorithm by providing functionality for random restarts and convergence checking.

See the instance documentation for details specific to a particular implementation of the EM algorithm.

Parameters:

n_iter : int, optional

The number of iterations to perform. Default is 100.

thresh : float, optional

The convergence threshold. Default is 1e-4.

verbose : bool, optional

Controls if debug information is printed to the console. Default is False.

See also

HMM, GMM

Notes

Classes deriving from the EM base class must override the following private methods:

_initialize(obs, init_count)

Perform initialization before entering the EM algorithm. The expected parameters are:

obs : array_like, shape (n, ni, nfeatures)
List of observation sequences, where n is the number of sequences, ni is the length of the i-th observation sequence, and each observation has nfeatures features.
init_count : int
Restart counter.
_estep(obs)

Perform the expectation step of the EM algorithm and return the log likelihood of the observation obs. The expected parameters are:

obs : array_like, shape (n, ni, nfeatures)
List of observation sequences, where n is the number of sequences, ni is the length of the i-th observation sequence, and each observation has nfeatures features.
_mstep()

Perform the maximization step of the EM algorithm.

Optionally, the private method _plot can be overridden to visualize the results at each iteration. The _plot method is called by the EM algorithm before the maximization step is performed.

The derived class must call the private method _em(x, n_init=None) to initiate the EM algorithm. Pass the following parameters:

x : array_like, shape (n, ni, ndim)
List of data sequences, where n is the number of sequences, ni is the length of the i-th sequence, and each data point in the sequence has ndim dimensions.
n_init : int, optional
Number of random restarts, used to avoid getting stuck in a poor local optimum. Default is 1.

The method returns the log likelihood of the data sequences x.
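
The exact logic of _em is implementation specific, but the behavior described above can be pictured with the following sketch of a restart and convergence loop. This is only a conceptual illustration, not mlpy's actual code: the driver function name, the absolute-difference convergence test, and the choice to return the best restart's log likelihood are assumptions; only the _initialize, _estep, _mstep, and _plot hooks and the n_iter, thresh, and n_init parameters come from this documentation.

    import numpy as np

    def em_driver(model, x, n_iter=100, thresh=1e-4, n_init=1, verbose=False):
        # Conceptual sketch of an EM driver; not mlpy's implementation.
        best_loglik = -np.inf
        for init_count in range(n_init):
            model._initialize(x, init_count)         # re-initialize for each restart
            prev_loglik = -np.inf
            loglik = -np.inf
            for it in range(n_iter):
                loglik = model._estep(x)             # E-step: returns the log likelihood
                model._plot()                        # optional hook, called before the M-step
                model._mstep()                       # M-step: update the model parameters
                if verbose:
                    print("restart %d, iteration %d: log likelihood %f"
                          % (init_count, it, loglik))
                if abs(loglik - prev_loglik) < thresh:   # assumed convergence test
                    break
                prev_loglik = loglik
            best_loglik = max(best_loglik, loglik)   # keep the best restart (assumption)
        return best_loglik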

Examples

>>> from mlpy.optimize.algorithms import EM
>>>
>>> class MyEM(EM):
...     def _initialize(self, obs, init_count):
...         pass    # set up the model parameters for this restart
...
...     def _estep(self, obs):
...         pass    # compute expectations and return the log likelihood of `obs`
...
...     def _mstep(self):
...         pass    # update the model parameters
...
...     def _plot(self):
...         pass    # optional: visualize the current state of the model
...
...     def fit(self, x):
...         return self._em(x, n_init=5)    # run EM with 5 random restarts
...

This creates a new class capable of performing the expectation-maximization algorithm.
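
As a purely illustrative extension (not part of mlpy), the stubs could be filled in to run EM on a one-dimensional mixture of two Gaussians. Everything below besides the EM hooks and the constructor parameters is an assumption made for the sake of the example:

>>> import numpy as np
>>>
>>> class TwoGaussianEM(EM):
...     """Illustrative subclass: EM for a 1-D mixture of two Gaussians."""
...     def _initialize(self, obs, init_count):
...         # Pool all sequences into one flat sample; pick random means per restart.
...         self._x = np.concatenate([np.ravel(o) for o in obs])
...         rng = np.random.RandomState(init_count)
...         self._mu = rng.choice(self._x, size=2, replace=False)
...         self._var = np.array([self._x.var(), self._x.var()])
...         self._pi = np.array([0.5, 0.5])
...
...     def _estep(self, obs):
...         # Responsibilities and log likelihood under the current mixture.
...         dens = np.exp(-0.5 * (self._x[:, None] - self._mu) ** 2 / self._var)
...         dens /= np.sqrt(2 * np.pi * self._var)
...         weighted = self._pi * dens
...         total = weighted.sum(axis=1)
...         self._resp = weighted / total[:, None]
...         return float(np.log(total).sum())
...
...     def _mstep(self):
...         # Responsibility-weighted updates of the means, variances, and weights.
...         nk = self._resp.sum(axis=0)
...         self._mu = (self._resp * self._x[:, None]).sum(axis=0) / nk
...         self._var = (self._resp * (self._x[:, None] - self._mu) ** 2).sum(axis=0) / nk
...         self._pi = nk / nk.sum()
...
...     def fit(self, x):
...         return self._em(x, n_init=3)
...
>>> data = [np.random.randn(20, 1), np.random.randn(30, 1) + 4.0]
>>> model = TwoGaussianEM(n_iter=50, thresh=1e-5)
>>> loglik = model.fit(data)

Here data follows the (n, ni, nfeatures) layout described above, with two sequences of one-dimensional points.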

Note

Adapted from:

Copyright (2010) Kevin Murphy and Matt Dunham
License: MIT