ostore.oil
Class HMM

java.lang.Object
  |
  +--ostore.oil.HMM
All Implemented Interfaces:
Model, QuickSerializable, SegmentedModel

public class HMM
extends Object
implements SegmentedModel

Hidden Markov Model: a statistical tool which models a chain of observed variables as the output of a chain of hidden states. Given such a chain of observations, it can deduce the relationships among the hidden states and between the hidden states and the output variables. It can thus also predict future hidden states and output variables.

Version:
$Id: HMM.java,v 1.31 2002/07/20 19:38:03 srhea Exp $
Author:
Dennis Geels
See Also:
Model

Nested Class Summary
static class HMM.Delta
          An HMM.Delta object contains a diff of the sufficient statistics for an HMM.
static class HMM.Prediction
          A Prediction object contains an array of states and a matching array of weights.
static class HMM.Segment
          A Segment object contains the internal model state associated with a single state.
 
Nested classes inherited from class ostore.oil.SegmentedModel
 
Nested classes inherited from class ostore.oil.Model
 
Field Summary
protected  Matrix emissions
           
static int INIT_CODE_CLUSTERING
          Initialize HMM with priors which are useful for a clustering application.
static int INIT_TRANSITIONS_CLUSTER
          Use random transition priors, but weight self-transitions heavily.
static int INIT_TRANSITIONS_RANDOM
          Use random transition priors.
protected  Array marginals
           
protected  QSVector states
           
protected  Matrix transitions
           
 
Constructor Summary
HMM(InputBuffer buffer)
          Constructs an HMM from its serialized form.
HMM(int type, int num_states, int num_outputs, double epsilon)
          Create a new HMM.
 
Method Summary
 void add_delta(Model.Delta d)
          Incorporate the information from a Delta into this Model.
 void add_segment(SegmentedModel.Segment s)
          Incorporates the portion of the model contained in the specified Segment.
 void choose_segments(int num)
          Selects the num most relevant Segments, discarding the rest.
 void clear()
          Forget all observations recorded since the last call to clear or recalculate.
 void clear(int num)
          Forget the least-recent recorded observations.
 double loglikelihood()
          Calculates the log-likelihood of all recorded observations.
 QuickSerializable[] outliers()
          Returns the recent observations which the model assigns the lowest likelihood.
 Model.Prediction predict(int horizon)
          Estimates the current and future occurrences of each state.
 Model.Delta recalculate()
          Updates the internal model parameters using the EM algorithm.
 void record(QuickSerializable output)
          Notes an observation for later processing.
 void record(QuickSerializable[] observations)
          Notes a group of observations for later processing.
 void serialize(OutputBuffer buffer)
          Add the object to the buffer.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INIT_TRANSITIONS_RANDOM

public static final int INIT_TRANSITIONS_RANDOM
Use random transition priors.

See Also:
Constant Field Values

INIT_TRANSITIONS_CLUSTER

public static final int INIT_TRANSITIONS_CLUSTER
Use random transition priors, but weight self-transitions heavily.

See Also:
Constant Field Values
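
As an illustration of what "random transition priors with heavily weighted self-transitions" could look like, here is a minimal sketch. It is not taken from the HMM source: the class name TransitionPriorSketch, the selfWeight parameter, and the small floor added to each entry are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Random;

final class TransitionPriorSketch {
    /**
     * Builds a row-stochastic matrix of random transition priors.
     * selfWeight (hypothetical parameter) is added to each diagonal entry
     * before normalizing, so a large value favors staying in the same state.
     */
    static double[][] randomPriors(int numStates, double selfWeight, Random rng) {
        double[][] p = new double[numStates][numStates];
        for (int i = 0; i < numStates; i++) {
            double rowSum = 0.0;
            for (int j = 0; j < numStates; j++) {
                p[i][j] = rng.nextDouble() + 1e-3; // small floor avoids an all-zero row
                if (i == j) {
                    p[i][j] += selfWeight;
                }
                rowSum += p[i][j];
            }
            for (int j = 0; j < numStates; j++) {
                p[i][j] /= rowSum; // each row now sums to 1
            }
        }
        return p;
    }

    public static void main(String[] args) {
        // selfWeight = 0 gives plain random priors; a larger value gives cluster-style priors.
        for (double[] row : randomPriors(3, 3.0, new Random(42))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With selfWeight set to 0 the same routine produces unbiased random priors, which is one way to read the difference between the two INIT_TRANSITIONS_* codes.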

INIT_CODE_CLUSTERING

public static final int INIT_CODE_CLUSTERING
Initialize HMM with priors which are useful for a clustering application.

See Also:
Constant Field Values

states

protected QSVector states

marginals

protected Array marginals

transitions

protected Matrix transitions

emissions

protected Matrix emissions
Constructor Detail

HMM

public HMM(int type,
           int num_states,
           int num_outputs,
           double epsilon)
Create a new HMM. Initialize the states and sufficient statistic priors using the specified prototype.

Parameters:
type - A code specifying how the sufficient statistic priors should be generated. (See the INIT_* constants above.)
num_states - The number of hidden states to assume.
num_outputs - The total number of outputs to expect (this variable is not binding; it just produces better emission priors).
epsilon - The convergence threshold. The inference algorithm will iterate until the log-likelihood converges to within epsilon times the sequence length.
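
One plausible reading of the epsilon parameter is that iteration stops once the change in log-likelihood between two successive passes falls below epsilon times the sequence length. The sketch below shows that test in isolation. The class and method names are hypothetical, the log-likelihood values in main are made up for illustration, and the actual stopping rule inside HMM may differ in detail.

```java
final class ConvergenceSketch {
    /**
     * Returns true when the log-likelihood changed by less than
     * epsilon * sequenceLength between two successive iterations.
     */
    static boolean converged(double previous, double current,
                             double epsilon, int sequenceLength) {
        return Math.abs(current - previous) < epsilon * sequenceLength;
    }

    public static void main(String[] args) {
        // Made-up log-likelihoods from successive iterations of some inference loop.
        double[] logLiks = {-250.0, -180.0, -171.0, -170.6};
        int sequenceLength = 100;
        double epsilon = 0.01;
        for (int i = 1; i < logLiks.length; i++) {
            if (converged(logLiks[i - 1], logLiks[i], epsilon, sequenceLength)) {
                System.out.println("converged after iteration " + i);
                break;
            }
        }
    }
}
```

Scaling the threshold by the sequence length keeps the criterion comparable across short and long observation sequences, since the log-likelihood itself grows roughly linearly with the number of observations.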

HMM

public HMM(InputBuffer buffer)
    throws QSException
Constructs an HMM from its serialized form.

Method Detail

record

public void record(QuickSerializable output)
Description copied from interface: Model
Notes an observation for later processing. This method should execute very quickly, so that OIL does not create a bottleneck in a critical path. Any significant processing should be postponed until the recalculate method is called.

Specified by:
record in interface Model
Parameters:
output - Any event, value, etc. that this Model understands.

record

public void record(QuickSerializable[] observations)
Description copied from interface: Model
Notes a group of observations for later processing. This method allows the Model to process a group of observations in bulk, potentially saving resources.

Specified by:
record in interface Model
Parameters:
observations - An array of events, values, etc., not necessarily all the same type, that this Model understands.

clear

public void clear(int num)
Description copied from interface: Model
Forget the least-recent recorded observations.

Specified by:
clear in interface Model

clear

public void clear()
Description copied from interface: Model
Forget all observations recorded since the last call to clear or recalculate.

Specified by:
clear in interface Model
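
The record and clear methods above describe a buffer of pending observations: recording is cheap, and the expensive work is deferred. A minimal sketch of such a buffer follows. ObservationLogSketch and its helper methods size and oldest are hypothetical names for this example only, and the real HMM may store observations differently.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class ObservationLogSketch<T> {
    // Oldest observation at the head, newest at the tail.
    // Note: ArrayDeque does not accept null elements.
    private final Deque<T> pending = new ArrayDeque<>();

    /** Constant-time append; no model computation happens here. */
    void record(T observation) {
        pending.addLast(observation);
    }

    /** Bulk variant: appends every element in order. */
    void record(T[] observations) {
        for (T o : observations) {
            pending.addLast(o);
        }
    }

    /** Forgets up to num of the least-recent observations. */
    void clear(int num) {
        for (int i = 0; i < num && !pending.isEmpty(); i++) {
            pending.removeFirst();
        }
    }

    /** Forgets everything recorded so far. */
    void clear() {
        pending.clear();
    }

    int size() {
        return pending.size();
    }

    T oldest() {
        return pending.peekFirst();
    }
}
```

A caller would record observations on the critical path and only later hand the accumulated buffer to an expensive update step.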

outliers

public QuickSerializable[] outliers()
Description copied from interface: Model
Returns the recent observations which the model assigns the lowest likelihood. This method is useful both for standard outlier detection and for swapping in model segments using reverse lookup.

Specified by:
outliers in interface Model
Returns:
an array containing recent observations which the model did not expect.

add_segment

public void add_segment(SegmentedModel.Segment s)
Description copied from interface: SegmentedModel
Incorporates the portion of the model contained in the specified Segment.

Specified by:
add_segment in interface SegmentedModel
Parameters:
s - The Segment to add.
See Also:
SegmentedModel.choose_segments(int)

add_delta

public void add_delta(Model.Delta d)
Description copied from interface: Model
Incorporate the information from a Delta into this Model. The resulting Model should be equivalent to that which produced the Delta, given that they began in the same state. Subclasses of Model employ specific subclasses of Delta.

Specified by:
add_delta in interface Model
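
The equivalence property described above can be illustrated with plain count matrices standing in for sufficient statistics. In this sketch a delta is simply the element-wise difference between two snapshots of the statistics; DeltaSketch, diff, and addDelta are hypothetical names, and the real HMM.Delta class may carry more than one matrix.

```java
final class DeltaSketch {
    /** Element-wise difference after - before, i.e. the statistics gained by an update. */
    static double[][] diff(double[][] before, double[][] after) {
        double[][] delta = new double[before.length][];
        for (int i = 0; i < before.length; i++) {
            delta[i] = new double[before[i].length];
            for (int j = 0; j < before[i].length; j++) {
                delta[i][j] = after[i][j] - before[i][j];
            }
        }
        return delta;
    }

    /** Adds a delta into a replica's statistics in place. */
    static void addDelta(double[][] stats, double[][] delta) {
        for (int i = 0; i < stats.length; i++) {
            for (int j = 0; j < stats[i].length; j++) {
                stats[i][j] += delta[i][j];
            }
        }
    }

    public static void main(String[] args) {
        double[][] before = {{1.0, 2.0}, {3.0, 4.0}};
        double[][] after = {{1.5, 2.0}, {3.0, 6.0}};
        // A replica that started from the same statistics catches up by applying the delta.
        double[][] replica = {{1.0, 2.0}, {3.0, 4.0}};
        addDelta(replica, diff(before, after));
        System.out.println(java.util.Arrays.deepToString(replica));
    }
}
```

The sketch only works because both copies start from identical statistics, which mirrors the "given that they began in the same state" condition in the description.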

loglikelihood

public double loglikelihood()
Description copied from interface: Model
Calculates the log-likelihood of all recorded observations. Only observations recorded since the last call to clear or recalculate are considered.

Specified by:
loglikelihood in interface Model
Returns:
a non-positive number equal to the logarithm of the probability of the observations (D) given the current model (M); log(p(D|M)). If no observations exist, it returns Double.NaN.
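
For a discrete HMM, log(p(D|M)) is commonly computed with the scaled forward algorithm. The sketch below shows that standard technique on plain arrays. It assumes integer-coded observations and explicit initial, transition, and emission probability tables, none of which are part of the documented HMM interface, and it is not a copy of this class's implementation.

```java
final class ForwardSketch {
    /**
     * Scaled forward algorithm.
     * initial[i]  = P(first state = i)
     * trans[i][j] = P(next state = j | state = i)
     * emit[j][o]  = P(observation = o | state = j)
     * Returns log p(obs | model), or Double.NaN when obs is empty.
     */
    static double logLikelihood(double[] initial, double[][] trans,
                                double[][] emit, int[] obs) {
        if (obs.length == 0) {
            return Double.NaN;
        }
        int n = initial.length;
        double[] alpha = new double[n];
        double logLik = 0.0;
        for (int t = 0; t < obs.length; t++) {
            double[] next = new double[n];
            double scale = 0.0;
            for (int j = 0; j < n; j++) {
                double prior = 0.0;
                if (t == 0) {
                    prior = initial[j];
                } else {
                    for (int i = 0; i < n; i++) {
                        prior += alpha[i] * trans[i][j];
                    }
                }
                next[j] = prior * emit[j][obs[t]];
                scale += next[j];
            }
            if (scale == 0.0) {
                return Double.NEGATIVE_INFINITY; // sequence is impossible under the model
            }
            for (int j = 0; j < n; j++) {
                next[j] /= scale; // renormalize to avoid underflow on long sequences
            }
            logLik += Math.log(scale);
            alpha = next;
        }
        return logLik;
    }
}
```

Because every scale factor is a probability, each log term is at most zero, which is why the result is non-positive.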

recalculate

public Model.Delta recalculate()
Updates the internal model parameters using the EM algorithm. Newly calculated Expected Sufficient Statistics are bundled and returned as a Delta. They are also added to this model's sufficient statistics, so they will affect future calculations.

Specified by:
recalculate in interface Model
Returns:
a Delta containing the expected transition, emission, and marginal counts for each state. If no observations have been made, a null delta will be returned.
See Also:
Model.recalculate()
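
To make "expected marginal counts" concrete, the following sketch computes them for a discrete HMM with the standard scaled forward-backward recursion (the E-step of EM). It covers only the per-state marginal counts, not the transition or emission counts that a Delta also carries. The array-based model representation and all names are assumptions for this example, and the sketch assumes the observation sequence has non-zero probability under the model.

```java
final class ExpectedCountsSketch {
    /**
     * Returns, for each state, the expected number of time steps spent in it
     * given the observations (the sum over t of the posterior gamma_t).
     */
    static double[] expectedStateCounts(double[] initial, double[][] trans,
                                        double[][] emit, int[] obs) {
        int n = initial.length;
        int len = obs.length;
        double[] counts = new double[n];
        if (len == 0) {
            return counts; // nothing observed, all counts stay zero
        }
        double[][] alpha = new double[len][n];
        double[][] beta = new double[len][n];
        double[] scale = new double[len];
        // Forward pass, normalizing alpha at every step.
        for (int t = 0; t < len; t++) {
            for (int j = 0; j < n; j++) {
                double prior = 0.0;
                if (t == 0) {
                    prior = initial[j];
                } else {
                    for (int i = 0; i < n; i++) {
                        prior += alpha[t - 1][i] * trans[i][j];
                    }
                }
                alpha[t][j] = prior * emit[j][obs[t]];
                scale[t] += alpha[t][j];
            }
            for (int j = 0; j < n; j++) {
                alpha[t][j] /= scale[t];
            }
        }
        // Backward pass, reusing the forward scale factors.
        for (int i = 0; i < n; i++) {
            beta[len - 1][i] = 1.0;
        }
        for (int t = len - 2; t >= 0; t--) {
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int j = 0; j < n; j++) {
                    sum += trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j];
                }
                beta[t][i] = sum / scale[t + 1];
            }
        }
        // With this scaling, gamma_t(i) = alpha[t][i] * beta[t][i].
        for (int t = 0; t < len; t++) {
            for (int i = 0; i < n; i++) {
                counts[i] += alpha[t][i] * beta[t][i];
            }
        }
        return counts;
    }
}
```

The counts always sum to the number of observations, since each time step contributes a posterior distribution over states that sums to one.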

choose_segments

public void choose_segments(int num)
Selects the num most relevant Segments, discarding the rest. This method chooses the top Segments found by predict(int).

Specified by:
choose_segments in interface SegmentedModel
Parameters:
num - The number of Segments to retain.

predict

public Model.Prediction predict(int horizon)
Estimates the current and future occurrences of each state. This method uses much of the same algorithm as recalculate to estimate marginal counts for each state, but extends the current sequence of observations with a series of imaginary observations. Thus the states expected to occur in the near future (estimated by transition probabilities) are considered in addition to the states believed to have just occurred.

Specified by:
predict in interface Model
Parameters:
horizon - The number of observations to look into the future.
Returns:
a Prediction containing the estimated state marginal counts, sorted.
See Also:
Model.predict(int)
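
A simplified way to picture the prediction step is to take a belief over the current state, push it through the transition matrix once per step of the horizon, and accumulate the expected count of each state along the way. The sketch below does exactly that and then sorts states by weight. It deliberately skips the full inference over recorded observations that the description mentions. PredictSketch and its fields are hypothetical, chosen to mirror the "array of states and matching array of weights" shape of a Prediction.

```java
import java.util.Arrays;
import java.util.Comparator;

final class PredictSketch {
    final int[] states;     // state indices, highest weight first
    final double[] weights; // expected counts, parallel to states

    private PredictSketch(int[] states, double[] weights) {
        this.states = states;
        this.weights = weights;
    }

    /**
     * current[i]  = belief that the model is in state i right now
     * trans[i][j] = P(next state = j | state = i)
     * horizon     = number of future steps to include
     */
    static PredictSketch predict(double[] current, double[][] trans, int horizon) {
        int n = current.length;
        double[] counts = current.clone(); // the current step counts too
        double[] belief = current.clone();
        for (int step = 0; step < horizon; step++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    next[j] += belief[i] * trans[i][j];
                }
            }
            belief = next;
            for (int j = 0; j < n; j++) {
                counts[j] += belief[j];
            }
        }
        // Sort state indices by descending expected count.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -counts[i]));
        int[] states = new int[n];
        double[] weights = new double[n];
        for (int k = 0; k < n; k++) {
            states[k] = order[k];
            weights[k] = counts[order[k]];
        }
        return new PredictSketch(states, weights);
    }
}
```

Since the result is sorted, taking the first few entries yields the states with the highest expected counts over the horizon.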

serialize

public void serialize(OutputBuffer buffer)
Description copied from interface: QuickSerializable
Add the object to the buffer.

Specified by:
serialize in interface QuickSerializable
Parameters:
buffer - the output buffer to add the object to