Class BrownCluster

java.lang.Object
opennlp.tools.util.featuregen.BrownCluster
All Implemented Interfaces:
SerializableArtifact

public class BrownCluster extends Object implements SerializableArtifact
Loads a Brown cluster document in which each line holds tab-separated fields: word\tword_class\tprob

Originally available at: http://metaoptimize.com/projects/wordreprs/. Further details can be found in the related research paper.

The file containing the clustering lexicon has to be passed as the value of the dict attribute of each BrownCluster feature generator.

  • Constructor Details

    • BrownCluster

      public BrownCluster(InputStream in) throws IOException
      Generates the token-to-cluster map from a Brown cluster InputStream.

      Note: only tokens with a frequency greater than 5 are added to the map.

      Parameters:
      in - A valid, open InputStream to read from.
      Throws:
      IOException - if an I/O error occurs while reading from the stream
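The loading behaviour described for the constructor can be illustrated with a minimal, stand-alone sketch. It is not the OpenNLP implementation: the class name `BrownClusterSketch`, the helper `fromString`, and the policy of silently skipping malformed lines are all hypothetical. The column layout assumed here (cluster path, token, integer frequency) is the one written by common Brown clustering tools; treat it as an assumption and check it against the actual file, since the class summary names the columns differently.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in that mimics the documented behaviour; not the OpenNLP class. */
public class BrownClusterSketch {

    // Cut-off taken from the note above: only frequencies greater than 5 are kept.
    private static final int FREQUENCY_CUTOFF = 5;

    // Assumed column positions (cluster path, token, frequency); verify against real data.
    private static final int COL_CLUSTER = 0;
    private static final int COL_TOKEN = 1;
    private static final int COL_FREQUENCY = 2;

    private final Map<String, String> tokenToCluster = new HashMap<>();

    public BrownClusterSketch(InputStream in) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] cols = line.split("\t");
                if (cols.length != 3) {
                    continue; // assumed policy: ignore lines without three columns
                }
                int frequency;
                try {
                    frequency = Integer.parseInt(cols[COL_FREQUENCY].trim());
                } catch (NumberFormatException e) {
                    continue; // assumed policy: ignore lines with a non-integer frequency
                }
                if (frequency > FREQUENCY_CUTOFF) {
                    tokenToCluster.put(cols[COL_TOKEN], cols[COL_CLUSTER]);
                }
            }
        }
    }

    /** Returns the cluster for the token, or null if the token was not loaded. */
    public String lookupToken(String token) {
        return tokenToCluster.get(token);
    }

    /** Convenience helper for experiments: builds the map from an in-memory string. */
    public static BrownClusterSketch fromString(String content) {
        try {
            return new BrownClusterSketch(
                    new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this sketch, a line such as `0010\tLondon\t12` makes `lookupToken("London")` return `0010`, while a token whose frequency is 5 or lower is never stored and so is not found.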
  • Method Details

    • lookupToken

      public String lookupToken(String string)
      Checks whether a token is present in the token-to-Brown-path map.
      Parameters:
      string - the token to look up
      Returns:
      the Brown class if the token is in the Brown cluster map, or null otherwise
    • serialize

      public void serialize(OutputStream out) throws IOException
      Throws:
      IOException
    • getArtifactSerializerClass

      public Class<?> getArtifactSerializerClass()
      Description copied from interface: SerializableArtifact
      Retrieves the class which can serialize and recreate this artifact.

      Note: The serializer class must have a public zero-argument constructor; otherwise an exception is thrown during model serialization/loading.

      Specified by:
      getArtifactSerializerClass in interface SerializableArtifact
      Returns:
      The corresponding ArtifactSerializer class.