Package opennlp.tools.ml.model
Class TwoPassDataIndexer
java.lang.Object
  opennlp.tools.ml.model.AbstractDataIndexer
    opennlp.tools.ml.model.TwoPassDataIndexer
All Implemented Interfaces:
DataIndexer
public class TwoPassDataIndexer extends AbstractDataIndexer
Collects event and context counts by making two passes over the events. The first pass determines which contexts will be used by the model, and the second pass creates the events in memory containing only those contexts. This greatly reduces the amount of memory required for storing the events. During the first pass, a temporary event file is created, which is then read during the second pass.
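The two-pass idea described above can be illustrated with a minimal, self-contained sketch. This is not the OpenNLP implementation: the class TwoPassSketch, its Indexed result type, the space-separated temporary file format, the "count >= cutoff" rule, and the dropping of events left without any context are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of two-pass indexing; not OpenNLP code. */
final class TwoPassSketch {

    /** Result: the surviving contexts (predicates) and the events that use them. */
    static final class Indexed {
        final Map<String, Integer> predicateIndex = new LinkedHashMap<>();
        final List<String> outcomes = new ArrayList<>();
        final List<int[]> contexts = new ArrayList<>();
    }

    /**
     * Each event is {outcome, predicate1, predicate2, ...}.
     * Assumption: tokens contain no whitespace, so a space can separate them on disk.
     */
    static Indexed index(Iterable<String[]> events, int cutoff) {
        try {
            Path tmp = Files.createTempFile("events", ".tmp");
            try {
                return indexWithTempFile(events, cutoff, tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Indexed indexWithTempFile(Iterable<String[]> events, int cutoff, Path tmp)
            throws IOException {
        // Pass 1: count every predicate and spill the raw events to a temporary file,
        // so the full event set never has to be held in memory.
        Map<String, Integer> counts = new LinkedHashMap<>();
        try (BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            for (String[] event : events) {
                for (int i = 1; i < event.length; i++) {
                    counts.merge(event[i], 1, Integer::sum);
                }
                out.write(String.join(" ", event));
                out.newLine();
            }
        }

        // Assign an integer id to each predicate seen at least `cutoff` times (assumed rule).
        Indexed result = new Indexed();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= cutoff) {
                result.predicateIndex.put(e.getKey(), result.predicateIndex.size());
            }
        }

        // Pass 2: re-read the temporary file and keep only the surviving predicates.
        try (BufferedReader in = Files.newBufferedReader(tmp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] tokens = line.split(" ");
                List<Integer> ids = new ArrayList<>();
                for (int i = 1; i < tokens.length; i++) {
                    Integer id = result.predicateIndex.get(tokens[i]);
                    if (id != null) {
                        ids.add(id);
                    }
                }
                // Assumption of this sketch: events left with no context are dropped.
                if (!ids.isEmpty()) {
                    result.outcomes.add(tokens[0]);
                    result.contexts.add(ids.stream().mapToInt(Integer::intValue).toArray());
                }
            }
        }
        return result;
    }
}
```

For example, calling TwoPassSketch.index with the events {A, x, y}, {B, x, z} and {A, w} and a cutoff of 2 keeps only the predicate x, so two events survive and the third is dropped. The memory saving comes from the fact that only the filtered, integer-encoded events are ever held in memory.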
See Also:
DataIndexer, AbstractDataIndexer
Field Summary
Fields inherited from class opennlp.tools.ml.model.AbstractDataIndexer
CUTOFF_DEFAULT, CUTOFF_PARAM, SORT_DEFAULT, SORT_PARAM
Constructor Summary
Constructor             Description
TwoPassDataIndexer()
Method Summary
Modifier and Type    Method                                    Description
void                 index(ObjectStream<Event> eventStream)    Performs the data indexing.
Methods inherited from class opennlp.tools.ml.model.AbstractDataIndexer
getContexts, getNumEvents, getNumTimesEventsSeen, getOutcomeLabels, getOutcomeList, getPredCounts, getPredLabels, getValues, init
Method Detail
index
public void index(ObjectStream<Event> eventStream) throws IOException
Performs the data indexing.
Note: Make sure the DataIndexer.init(TrainingParameters, Map) method is called first.
Parameters:
eventStream - An ObjectStream of events used as input.
Throws:
IOException - Thrown if IO errors occurred during indexing.