Package opennlp.tools.util.featuregen
Class BrownCluster
java.lang.Object
  opennlp.tools.util.featuregen.BrownCluster
All Implemented Interfaces: SerializableArtifact
public class BrownCluster extends Object implements SerializableArtifact
Class to load a Brown cluster document: word\tword_class\tprob. Originally available at: http://metaoptimize.com/projects/wordreprs/. Further details can be found in the related research paper.
The file containing the clustering lexicon has to be passed as the value of the dict attribute of each BrownCluster feature generator.
Nested Class Summary
- static class BrownCluster.BrownClusterSerializer
Constructor Summary
- BrownCluster(InputStream in): Generates the token-to-cluster map from a Brown cluster InputStream.
Method Summary
- Class<?> getArtifactSerializerClass(): Retrieves the class which can serialize and recreate this artifact.
- String lookupToken(String string): Checks whether a token is in the Brown token-to-cluster map.
- void serialize(OutputStream out)
Constructor Detail
BrownCluster
public BrownCluster(InputStream in) throws IOException
Generates the token-to-cluster map from a Brown cluster InputStream.
Note: only tokens with a frequency greater than 5 are added.
- Parameters: in - A valid, open InputStream to read from.
- Throws: IOException - if an I/O error occurs.
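The loading rule described above can be sketched without OpenNLP on the classpath. This is a minimal illustration, not the library's implementation: the class BrownClusterSketch is hypothetical, the column order (token, cluster, count) is assumed from the class description, the third column is assumed to be an integer count, and returning null for an unknown token is also an assumption. Some Brown clustering tools emit the cluster path first, in which case the two column indices would swap.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the OpenNLP class.
class BrownClusterSketch {

  // Threshold taken from the note above: keep tokens with frequency > 5.
  private static final int MIN_FREQUENCY = 5;

  private final Map<String, String> tokenToCluster = new HashMap<>();

  // Assumes lines of the form token<TAB>cluster<TAB>count with an integer count.
  BrownClusterSketch(InputStream in) throws IOException {
    BufferedReader reader =
        new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    String line;
    while ((line = reader.readLine()) != null) {
      String[] cols = line.split("\t");
      if (cols.length != 3) {
        continue; // this sketch simply skips lines that do not have three columns
      }
      int count = Integer.parseInt(cols[2].trim());
      if (count > MIN_FREQUENCY) {
        tokenToCluster.put(cols[0], cols[1]);
      }
    }
  }

  // Returns the cluster for the token, or null when the token was not kept
  // (the null result is an assumption of this sketch).
  String lookupToken(String token) {
    return tokenToCluster.get(token);
  }

  public static void main(String[] args) throws IOException {
    String data = "london\t0110\t12\nzyzzyva\t1011\t2\n";
    BrownClusterSketch clusters = new BrownClusterSketch(
        new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8)));
    System.out.println(clusters.lookupToken("london"));  // frequent token: kept
    System.out.println(clusters.lookupToken("zyzzyva")); // rare token: filtered out
  }
}
```

With the real class, the equivalent calls would be new BrownCluster(in) followed by lookupToken(token), as documented on this page.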
Method Detail
lookupToken
public String lookupToken(String string)
Checks whether a token is in the Brown token-to-cluster map.
- Parameters: string - the token to look up
- Returns: the Brown class if the token is in the Brown cluster map
serialize
public void serialize(OutputStream out) throws IOException
- Throws: IOException
getArtifactSerializerClass
public Class<?> getArtifactSerializerClass()
Description copied from interface: SerializableArtifact
Retrieves the class which can serialize and recreate this artifact.
Note: The serializer class must have a public zero argument constructor or an exception is thrown during model serialization/loading.
- Specified by: getArtifactSerializerClass in interface SerializableArtifact
- Returns: The corresponding ArtifactSerializer class.
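The zero-argument constructor requirement can be illustrated with plain reflection. This is a sketch under assumptions: how OpenNLP actually instantiates serializers is not described on this page, and the classes GoodSerializer and BadSerializer are hypothetical. It only shows why a class without a public no-argument constructor fails when an instance is created from its Class object.

```java
import java.lang.reflect.Constructor;

class ZeroArgConstructorSketch {

  // Hypothetical serializer with a public zero-argument constructor.
  public static class GoodSerializer {
    public GoodSerializer() { }
  }

  // Hypothetical serializer that only offers a one-argument constructor.
  public static class BadSerializer {
    public BadSerializer(String config) { }
  }

  // Creates an instance from a Class object using its public no-arg constructor.
  static Object instantiate(Class<?> serializerClass)
      throws ReflectiveOperationException {
    // Throws NoSuchMethodException when no public zero-arg constructor exists.
    Constructor<?> ctor = serializerClass.getConstructor();
    return ctor.newInstance();
  }

  static boolean canInstantiate(Class<?> serializerClass) {
    try {
      instantiate(serializerClass);
      return true;
    } catch (ReflectiveOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(canInstantiate(GoodSerializer.class)); // true
    System.out.println(canInstantiate(BadSerializer.class));  // false
  }
}
```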