Really great. I have some questions:
- What hardware configuration was used to train the CRF on the large dataset above?
- In feature engineering, given a feature like “previous tag + current word”, how do I get the “previous tag” at training time?
- Does it use incremental training, or do I need to retrain from scratch when new samples are added?