Trains models for sequences using the perceptron algorithm. Each outcome is represented as
a binary perceptron classifier. This supports standard (integer) weighting as well as
averaged weighting. Sequence information is used in a simplified form of the approach described in:
Discriminative Training Methods for Hidden Markov Models: Theory and Experiments
with the Perceptron Algorithm. Michael Collins, EMNLP 2002.
Specifically, updates are applied only to tokens that were incorrectly tagged by the sequence tagger,
rather than to all features across the sequence that differ from the training sequence.
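
The token-level update described above can be sketched roughly as follows. This is a minimal illustration and not this trainer's actual code: the names features, tag and train, the greedy left-to-right tagger, and the two-feature context (current word plus previously predicted tag) are all assumptions made for the example. It uses plain integer weights only and leaves out averaging.

```python
from collections import defaultdict

def features(tokens, i, prev_tag):
    # Hypothetical feature set: the current word and the previously predicted tag.
    return [f"w={tokens[i]}", f"prev={prev_tag}"]

def tag(tokens, weights, outcomes):
    # Greedy left-to-right tagger (an assumption for this sketch).
    # Each outcome has its own weight vector, i.e. one binary perceptron per outcome.
    tags, prev = [], "<s>"
    for i in range(len(tokens)):
        feats = features(tokens, i, prev)
        prev = max(outcomes, key=lambda o: sum(weights[o][f] for f in feats))
        tags.append(prev)
    return tags

def train(data, outcomes, iterations=5):
    # Integer weights, one vector per outcome.
    weights = {o: defaultdict(int) for o in outcomes}
    for _ in range(iterations):
        for tokens, gold in data:
            predicted = tag(tokens, weights, outcomes)
            for i, (g, p) in enumerate(zip(gold, predicted)):
                if g == p:
                    continue  # correctly tagged tokens trigger no update
                # Assumption: reuse the context the tagger saw at this position.
                prev = predicted[i - 1] if i > 0 else "<s>"
                for f in features(tokens, i, prev):
                    weights[g][f] += 1  # promote the correct outcome
                    weights[p][f] -= 1  # demote the predicted outcome
    return weights
```

In this sketch, only positions where the predicted tag differs from the training tag change any weights. The full method in the cited paper instead compares feature counts over the whole predicted and training sequences.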