Class PerceptronTrainer

java.lang.Object
opennlp.tools.ml.AbstractTrainer<opennlp.tools.util.TrainingParameters>
opennlp.tools.ml.AbstractEventTrainer<opennlp.tools.util.TrainingParameters>
opennlp.tools.ml.perceptron.PerceptronTrainer
All Implemented Interfaces:
opennlp.tools.commons.Trainer<opennlp.tools.util.TrainingParameters>, opennlp.tools.ml.EventTrainer<opennlp.tools.util.TrainingParameters>

public class PerceptronTrainer extends opennlp.tools.ml.AbstractEventTrainer<opennlp.tools.util.TrainingParameters>
Trains models using the perceptron algorithm.

Each outcome is represented as a binary perceptron classifier. This supports standard (integer) weighting as well as average weighting, as described in:

Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with the Perceptron Algorithm. Michael Collins, EMNLP 2002.
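
The following is a minimal, self-contained sketch of the idea described above: one weight vector per outcome, with optional averaging of the weights over iterations. It is an illustration only and not the OpenNLP implementation. The class name AveragedPerceptronSketch, its methods, and the use of int arrays of active feature indices are assumptions made for this example.

```java
/** Hypothetical illustration of a per-outcome perceptron; not part of OpenNLP. */
final class AveragedPerceptronSketch {

  /** Returns the index of the highest-scoring outcome for the given active features. */
  static int predict(double[][] weights, int[] activeFeatures) {
    int best = 0;
    double bestScore = Double.NEGATIVE_INFINITY;
    for (int outcome = 0; outcome < weights.length; outcome++) {
      double score = 0.0;
      for (int f : activeFeatures) {
        score += weights[outcome][f];
      }
      if (score > bestScore) {
        bestScore = score;
        best = outcome;
      }
    }
    return best;
  }

  /**
   * Trains one weight vector per outcome. With useAverage set, the result is the
   * mean of the weights observed at the end of each iteration.
   */
  static double[][] train(int[][] events, int[] labels, int numOutcomes,
                          int numFeatures, int iterations, boolean useAverage) {
    if (iterations <= 0) {
      throw new IllegalArgumentException("iterations must be positive");
    }
    double[][] weights = new double[numOutcomes][numFeatures];
    double[][] summed = new double[numOutcomes][numFeatures];
    for (int it = 0; it < iterations; it++) {
      for (int e = 0; e < events.length; e++) {
        int predicted = predict(weights, events[e]);
        if (predicted != labels[e]) {
          for (int f : events[e]) {
            weights[labels[e]][f] += 1.0; // promote the correct outcome
            weights[predicted][f] -= 1.0; // demote the wrongly predicted outcome
          }
        }
      }
      // Accumulate the weights seen at the end of this iteration.
      for (int o = 0; o < numOutcomes; o++) {
        for (int f = 0; f < numFeatures; f++) {
          summed[o][f] += weights[o][f];
        }
      }
    }
    if (!useAverage) {
      return weights;
    }
    for (int o = 0; o < numOutcomes; o++) {
      for (int f = 0; f < numFeatures; f++) {
        summed[o][f] /= iterations;
      }
    }
    return summed;
  }
}
```

In this sketch the unit update of 1.0 keeps the non-averaged weights integer-valued, while averaging generally produces fractional weights.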

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final String
    PERCEPTRON_VALUE
     
    static final double
     

    Fields inherited from class opennlp.tools.ml.AbstractEventTrainer

    DATA_INDEXER_ONE_PASS_REAL_VALUE, DATA_INDEXER_ONE_PASS_VALUE, DATA_INDEXER_PARAM, DATA_INDEXER_TWO_PASS_VALUE

    Fields inherited from interface opennlp.tools.ml.EventTrainer

    EVENT_VALUE
  • Constructor Summary

    Constructors
    Constructor
    Description
    PerceptronTrainer()
    Instantiates a PerceptronTrainer with default training parameters.
    PerceptronTrainer(opennlp.tools.util.TrainingParameters parameters)
    Instantiates a PerceptronTrainer with specific TrainingParameters.
  • Method Summary

    Modifier and Type
    Method
    Description
    opennlp.tools.ml.model.AbstractModel
    doTrain(opennlp.tools.ml.model.DataIndexer<opennlp.tools.util.TrainingParameters> indexer)
     
    boolean
    isSortAndMerge()
     
    void
    setSkippedAveraging(boolean averaging)
    Enables skipped averaging. This flag replaces standard averaging with a special averaging scheme.
    void
    setStepSizeDecrease(double decrease)
    Enables and sets step size decrease.
    void
    setTolerance(double tolerance)
    Specifies the tolerance.
    opennlp.tools.ml.model.AbstractModel
    trainModel(int iterations, opennlp.tools.ml.model.DataIndexer<opennlp.tools.util.TrainingParameters> di, int cutoff)
    Trains a PerceptronModel with given parameters.
    opennlp.tools.ml.model.AbstractModel
    trainModel(int iterations, opennlp.tools.ml.model.DataIndexer<opennlp.tools.util.TrainingParameters> di, int cutoff, boolean useAverage)
    Trains a PerceptronModel with given parameters.
    void
    validate()

    Methods inherited from class opennlp.tools.ml.AbstractEventTrainer

    getDataIndexer, train, train

    Methods inherited from class opennlp.tools.ml.AbstractTrainer

    getAlgorithm, getCutoff, getIterations, getTrainingConfiguration, init, init

    Methods inherited from class Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface opennlp.tools.commons.Trainer

    init, init
  • Field Details

  • Constructor Details

    • PerceptronTrainer

      public PerceptronTrainer()
      Instantiates a PerceptronTrainer with default training parameters.
    • PerceptronTrainer

      public PerceptronTrainer(opennlp.tools.util.TrainingParameters parameters)
      Instantiates a PerceptronTrainer with specific TrainingParameters.
      Parameters:
      parameters - The parameters to use.
  • Method Details

    • validate

      public void validate()
      Overrides:
      validate in class opennlp.tools.ml.AbstractEventTrainer<opennlp.tools.util.TrainingParameters>
      Throws:
      IllegalArgumentException - Thrown if the algorithm name is not equal to PERCEPTRON_VALUE.
    • isSortAndMerge

      public boolean isSortAndMerge()
      Specified by:
      isSortAndMerge in class opennlp.tools.ml.AbstractEventTrainer<opennlp.tools.util.TrainingParameters>
    • doTrain

      public opennlp.tools.ml.model.AbstractModel doTrain(opennlp.tools.ml.model.DataIndexer<opennlp.tools.util.TrainingParameters> indexer) throws IOException
      Specified by:
      doTrain in class opennlp.tools.ml.AbstractEventTrainer<opennlp.tools.util.TrainingParameters>
      Throws:
      IOException
    • setTolerance

      public void setTolerance(double tolerance)
      Specifies the tolerance. If the change in training set accuracy is less than this, stop iterating.
      Parameters:
      tolerance - The level of tolerance. Must not be negative.
      Throws:
      IllegalArgumentException - Thrown if parameters are invalid.
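
      The stopping rule described above can be sketched as follows. This is a simplified illustration that compares the accuracy of consecutive iterations only; the actual trainer may use a different comparison. The class ToleranceStopSketch and its method are hypothetical names introduced for this example.

```java
/** Hypothetical illustration of tolerance-based early stopping; not part of OpenNLP. */
final class ToleranceStopSketch {

  /**
   * Returns the number of iterations that run before the change in training
   * accuracy between two consecutive iterations drops below the tolerance.
   */
  static int iterationsUntilConverged(double[] accuracyPerIteration, double tolerance) {
    if (tolerance < 0) {
      throw new IllegalArgumentException("tolerance must not be negative");
    }
    for (int i = 1; i < accuracyPerIteration.length; i++) {
      double change = Math.abs(accuracyPerIteration[i] - accuracyPerIteration[i - 1]);
      if (change < tolerance) {
        return i + 1; // stop after the iteration that showed too little change
      }
    }
    return accuracyPerIteration.length; // never converged within the given iterations
  }
}
```

      With a tolerance of 0 this sketch never stops early, because the absolute change can never be less than zero.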
    • setStepSizeDecrease

      public void setStepSizeDecrease(double decrease)
      Enables and sets step size decrease. The step size is decreased every iteration by the specified value.
      Parameters:
      decrease - The step size decrease in percent. Must not be negative.
      Throws:
      IllegalArgumentException - Thrown if parameters are invalid.
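
      As a rough illustration of a per-iteration step size decrease, the sketch below assumes that the percentage is applied multiplicatively to the current step size and that the caller supplies the initial step size. Both are assumptions for this example, not confirmed behavior of the trainer. The class StepSizeDecaySketch is hypothetical.

```java
/** Hypothetical illustration of a percentage-based step size decrease; not part of OpenNLP. */
final class StepSizeDecaySketch {

  /** Returns the step size used in each iteration, shrinking by decreasePercent each time. */
  static double[] schedule(double initialStepSize, double decreasePercent, int iterations) {
    if (decreasePercent < 0) {
      throw new IllegalArgumentException("decrease must not be negative");
    }
    double[] steps = new double[iterations];
    double step = initialStepSize;
    for (int i = 0; i < iterations; i++) {
      steps[i] = step;
      step *= 1.0 - decreasePercent / 100.0; // assumed multiplicative decrease
    }
    return steps;
  }
}
```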
    • setSkippedAveraging

      public void setSkippedAveraging(boolean averaging)
      Enables skipped averaging. This flag replaces standard averaging with a special averaging scheme.

      If averaging is enabled and the current iteration is one of the first 20 or is a perfect square, the summed parameters are updated.

      The reason we don't take all of them is that the parameters change less toward the end of training, so they drown out the contributions of the more volatile early iterations. The use of perfect squares allows us to sample from successively farther apart iterations.

      Parameters:
      averaging - Whether to use skipped averaging.
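
      The sampling rule for skipped averaging can be sketched as a small predicate. The sketch assumes zero-based, non-negative iteration numbers; the class SkippedAveragingSketch and its method name are hypothetical and not part of the library.

```java
/** Hypothetical illustration of the skipped-averaging sampling rule; not part of OpenNLP. */
final class SkippedAveragingSketch {

  /** Returns true if the parameters of this iteration are added to the running sum. */
  static boolean contributesToAverage(int iteration) {
    if (iteration < 20) {
      return true; // the first 20 iterations always contribute
    }
    int root = (int) Math.sqrt(iteration);
    return root * root == iteration; // later iterations contribute only on perfect squares
  }
}
```

      Under this rule the contributing iterations after the first 20 are 25, 36, 49, and so on, so the samples grow successively farther apart.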
    • trainModel

      public opennlp.tools.ml.model.AbstractModel trainModel(int iterations, opennlp.tools.ml.model.DataIndexer<opennlp.tools.util.TrainingParameters> di, int cutoff)
      Trains a PerceptronModel with given parameters.
      Parameters:
      iterations - The number of iterations to use for training.
      di - The DataIndexer used as data input.
      cutoff - The Parameters.CUTOFF_PARAM value to use for training.
      Returns:
      A valid, trained perceptron model.
    • trainModel

      public opennlp.tools.ml.model.AbstractModel trainModel(int iterations, opennlp.tools.ml.model.DataIndexer<opennlp.tools.util.TrainingParameters> di, int cutoff, boolean useAverage)
      Trains a PerceptronModel with given parameters.
      Parameters:
      iterations - The number of iterations to use for training.
      di - The DataIndexer used as data input.
      cutoff - The Parameters.CUTOFF_PARAM value to use for training.
      useAverage - Whether to use 'averaging', or not. See setSkippedAveraging(boolean) for details.
      Returns:
      A valid, trained perceptron model.