Package opennlp.tools.tokenize
Class WhitespaceTokenizer
java.lang.Object
opennlp.tools.tokenize.WhitespaceTokenizer
- All Implemented Interfaces:
Tokenizer
-
Field Summary
Fields
Modifier and Type | Field | Description
static final WhitespaceTokenizer | INSTANCE | Use this static reference to retrieve an instance of the WhitespaceTokenizer.
-
Method Summary
Modifier and Type | Method | Description
void | setKeepNewLines(boolean keepNewLines) | Switches whether to keep new lines or not.
String[] | tokenize(String s) | Splits a string into its atomic parts.
Span[] | tokenizePos(String d) | Finds the boundaries of atomic parts in a string.
-
Field Details
-
INSTANCE
Use this static reference to retrieve an instance of the WhitespaceTokenizer.
-
-
Method Details
-
tokenizePos
Description copied from interface: Tokenizer
Finds the boundaries of atomic parts in a string.
- Parameters:
d - The string to be tokenized.
- Returns:
The spans (offsets into s) for each token as the individual array elements.
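To make "boundaries of atomic parts" concrete, the following is a minimal sketch of whitespace-based boundary detection in plain Java. It is an illustrative stand-in, not the OpenNLP implementation: the class SpanSketch and the method tokenSpans are hypothetical names, offsets are returned as int pairs rather than Span objects, and treating the end offset as exclusive is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in, not OpenNLP code: finds whitespace-delimited token boundaries.
class SpanSketch {

    // Returns {start, end} pairs; the end offset is exclusive (an assumption of this sketch).
    static List<int[]> tokenSpans(String d) {
        List<int[]> spans = new ArrayList<>();
        int tokStart = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < d.length(); i++) {
            if (Character.isWhitespace(d.charAt(i))) {
                if (tokStart >= 0) {
                    spans.add(new int[] {tokStart, i}); // token ended just before i
                    tokStart = -1;
                }
            } else if (tokStart < 0) {
                tokStart = i; // first character of a new token
            }
        }
        if (tokStart >= 0) {
            spans.add(new int[] {tokStart, d.length()}); // last token runs to end of input
        }
        return spans;
    }

    public static void main(String[] args) {
        String d = "The quick  fox";
        for (int[] span : tokenSpans(d)) {
            // substring(start, end) recovers the token text from its offsets
            System.out.println(span[0] + ".." + span[1] + " " + d.substring(span[0], span[1]));
        }
    }
}
```

Because the result holds offsets rather than copies of the text, a caller can map each token back to its exact position in the original string, which is useful when annotating or highlighting the input.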
-
tokenize
Description copied from interface: Tokenizer
Splits a string into its atomic parts.
-
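As a rough illustration of what splitting on whitespace alone implies, the sketch below approximates the behaviour with the standard String.split method. It is a stand-in under stated assumptions, not OpenNLP code: the class WhitespaceSplitSketch is a hypothetical name, and the regular expression \s+ may not cover exactly the same set of whitespace characters as the library.

```java
import java.util.Arrays;

// Illustrative stand-in, not OpenNLP code: splits on runs of whitespace only.
class WhitespaceSplitSketch {

    static String[] tokenize(String s) {
        String trimmed = s.trim();
        if (trimmed.isEmpty()) {
            return new String[0]; // avoid the single empty token that split() would return
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("Hello, world!  How are you?")));
    }
}
```

Note that punctuation stays attached to the neighbouring word: the input above yields the tokens "Hello," and "world!" rather than separate punctuation tokens. With the library itself, the corresponding call would be WhitespaceTokenizer.INSTANCE.tokenize(...).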
setKeepNewLines
public void setKeepNewLines(boolean keepNewLines)
Switches whether to keep new lines or not.
- Parameters:
keepNewLines - True if new lines are kept, false otherwise.
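The description does not say how a kept new line appears in the output. The sketch below shows one plausible reading, in which each line-break character becomes its own one-character token when the switch is on and is discarded like any other whitespace when it is off. This is an assumption made for illustration, not documented behaviour, and KeepNewLinesSketch is a hypothetical class, not part of OpenNLP.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in, not OpenNLP code: one plausible effect of a keep-new-lines switch.
class KeepNewLinesSketch {

    static List<String> tokenize(String s, boolean keepNewLines) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                if (current.length() > 0) {
                    tokens.add(current.toString()); // whitespace closes the current token
                    current.setLength(0);
                }
                // Assumption: a kept line break becomes its own one-character token.
                if (keepNewLines && (c == '\n' || c == '\r')) {
                    tokens.add(String.valueOf(c));
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // flush the final token
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "first line\nsecond line";
        System.out.println(tokenize(text, false).size());
        System.out.println(tokenize(text, true).size());
    }
}
```

Under this reading, keeping new lines lets downstream code recover line or paragraph structure from the token stream, at the cost of tokens that contain no visible text.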
-