Class MorfologikDictionaryBuilder


  • public class MorfologikDictionaryBuilder
    extends Object
    Utility class to build Morfologik dictionaries from a tab-separated values (TSV) file.

    The first column is the word, the second its lemma, and the third a POS tag (base,inflected,tag). If there is no lemma information, leave the second column empty.
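
As an illustration of the input layout described above, the following minimal sketch writes a small tab-separated file using only the Java standard library. The class `TsvDictionaryInput`, its `row` helper, the sample words, and the tag values are all hypothetical and not part of OpenNLP or Morfologik; the column order follows the prose above (word, lemma, POS tag).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration; not part of OpenNLP or Morfologik.
class TsvDictionaryInput {

    // Builds one tab-separated row: word, lemma, POS tag.
    // A null lemma leaves the second column empty.
    static String row(String word, String lemma, String tag) {
        String lemmaColumn = (lemma == null) ? "" : lemma;
        return word + "\t" + lemmaColumn + "\t" + tag;
    }

    public static void main(String[] args) throws IOException {
        // Sample entries and tags are invented for this sketch.
        List<String> rows = List.of(
            row("casas", "casa", "NCFP000"),
            row("casa", "casa", "NCFS000"),
            row("xyz", null, "NP00000"));

        // Join with "\n" explicitly so no CR bytes are written on any platform.
        Path input = Files.createTempFile("dictionary", ".txt");
        Files.writeString(input, String.join("\n", rows) + "\n", StandardCharsets.UTF_8);
        System.out.println("Wrote " + rows.size() + " rows to " + input);
    }
}
```

Writing the line terminators explicitly keeps the file free of `\r` bytes, which matters when CR bytes are not accepted by the builder.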

    • Constructor Detail

      • MorfologikDictionaryBuilder

        public MorfologikDictionaryBuilder()
    • Method Detail

      • build

        public Path build(Path input,
                          boolean overwrite,
                          boolean validate,
                          boolean acceptBom,
                          boolean acceptCr,
                          boolean ignoreEmpty)
                   throws Exception
        Helper to compile a morphological dictionary automaton.
        Parameters:
        input - The input file (base,inflected,tag). An associated metadata (*.info) file must exist.
        overwrite - Whether to overwrite the output file if it already exists.
        validate - Whether to validate the input to make sure it makes sense.
        acceptBom - Whether to accept leading BOM bytes (UTF-8).
        acceptCr - Whether to accept CR bytes (\r) in input sequences.
        ignoreEmpty - Whether to ignore empty lines in the input.
        Returns:
        The resulting dictionary Path.
        Throws:
        Exception - Thrown if an error occurs during dictionary compilation.
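
The acceptBom, acceptCr, and ignoreEmpty flags each concern a property of the raw input bytes. The sketch below shows one way to inspect a file for those properties before choosing the flags. It is a standalone pre-check written for this page, not the library's own validation logic, and the class `InputPreCheck` and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical pre-check for illustration; not the library's validation logic.
class InputPreCheck {

    // True if the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
            && (data[0] & 0xFF) == 0xEF
            && (data[1] & 0xFF) == 0xBB
            && (data[2] & 0xFF) == 0xBF;
    }

    // True if any carriage return byte (\r) occurs in the data.
    static boolean containsCr(byte[] data) {
        for (byte b : data) {
            if (b == '\r') {
                return true;
            }
        }
        return false;
    }

    // True if a line feed is reached with zero bytes on the current line.
    // A line holding only a CR byte is not counted as empty here.
    static boolean containsEmptyLine(byte[] data) {
        int lineLength = 0;
        for (byte b : data) {
            if (b == '\n') {
                if (lineLength == 0) {
                    return true;
                }
                lineLength = 0;
            } else {
                lineLength++;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] sample = "casas\tcasa\tNCFP000\n\ncasa\tcasa\tNCFS000\n"
            .getBytes(StandardCharsets.UTF_8);
        System.out.println("BOM: " + startsWithUtf8Bom(sample));
        System.out.println("CR: " + containsCr(sample));
        System.out.println("Empty line: " + containsEmptyLine(sample));
    }
}
```

If such a check reports a BOM, CR bytes, or empty lines, the corresponding flag would presumably need to be `true` for compilation to proceed, or the input could be cleaned first.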
      • build

        public Path build(Path input)
                   throws Exception
        Helper to compile a morphological dictionary automaton using default parameters.
        Parameters:
        input - The input file (base,inflected,tag). An associated metadata (*.info) file must exist.
        Returns:
        The resulting dictionary Path.
        Throws:
        Exception - Thrown if an error occurs during dictionary compilation.
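
Because both build methods require an associated metadata (*.info) file, a caller has to prepare that file alongside the input. The sketch below is one possible way to do so, under two assumptions that this page does not confirm: that the metadata file sits next to the input and shares its base name, and that the Morfologik property keys `fsa.dict.separator`, `fsa.dict.encoding`, and `fsa.dict.encoder` with the values shown suit the dictionary. The class `MetadataLocation` and the method `infoFileFor` are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of OpenNLP or Morfologik.
class MetadataLocation {

    // Assumption: the metadata file is a sibling of the input file with the
    // same base name and an ".info" extension.
    static Path infoFileFor(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return input.resolveSibling(base + ".info");
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("morfologik-demo");
        Path input = dir.resolve("dictionary.txt");
        Files.writeString(input, "casas\tcasa\tNCFP000\n", StandardCharsets.UTF_8);

        // Assumed metadata values; adjust them to the actual dictionary.
        String metadata = String.join("\n",
            "fsa.dict.separator=+",
            "fsa.dict.encoding=utf-8",
            "fsa.dict.encoder=prefix") + "\n";
        Path info = infoFileFor(input);
        Files.writeString(info, metadata, StandardCharsets.UTF_8);

        System.out.println("Input:    " + input);
        System.out.println("Metadata: " + info);

        // With the OpenNLP Morfologik add-on on the classpath, compilation
        // with default parameters would then be:
        //   Path dictionary = new MorfologikDictionaryBuilder().build(input);
    }
}
```

The builder call is left as a comment so that the sketch compiles without the add-on; the returned Path would point to the compiled dictionary.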