We will assume that we have access to some training data. The training data consists of a set of examples where each example is a sequence consisting of the observations, every observation being associated with a state. Given this data, how do we estimate the parameters of the model?