ModelParameterChunker (Apache OpenNLP Tools 2.1.1 API)

java.lang.Object
- opennlp.tools.ml.model.ModelParameterChunker

```
public final class ModelParameterChunker
extends Object
```
A helper class that handles Strings whose encoded length exceeds 64 kB (65535 bytes). This is achieved by placing the signature SIGNATURE_CHUNKED_PARAMS at the beginning of the String data written to a DataOutputStream.
Background: In OpenNLP, models trained on large(r) corpora can have (UTF String) parameters that exceed the limit of MAX_CHUNK_SIZE_BYTES (65535) bytes that DataOutputStream.writeUTF imposes. To write and read those models, such String instances have to be split into 64 kB blocks and recombined correctly when a (binary) model file is read.
The problem was raised in ticket OPENNLP-1366.
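The limit described above can be reproduced with the JDK alone. The following minimal sketch is not part of OpenNLP; the class name WriteUtfLimitDemo and the helper fitsInWriteUTF are hypothetical. It uses ASCII characters, which encode to exactly one byte each, so the character count equals the encoded byte count.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;

// Hypothetical demo class, not part of OpenNLP.
final class WriteUtfLimitDemo {

    /** Returns true if DataOutputStream.writeUTF accepts a string of n ASCII characters. */
    static boolean fitsInWriteUTF(int n) throws IOException {
        String s = "a".repeat(n); // each ASCII character encodes to exactly one byte
        try (DataOutputStream dos = new DataOutputStream(new ByteArrayOutputStream())) {
            dos.writeUTF(s);
            return true;
        } catch (UTFDataFormatException e) {
            // Thrown when the encoded form of s would exceed 65535 bytes.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fitsInWriteUTF(65535)); // prints "true"
        System.out.println(fitsInWriteUTF(65536)); // prints "false"
    }
}
```

Note that the limit applies to the encoded byte length, not to the number of characters: a string with non-ASCII characters reaches it with fewer characters.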
Solution strategy:
- If writing parameters to a DataOutputStream fails with a UTFDataFormatException, the large String instance is split into chunks, which are written as blocks of appropriate size.
- To indicate that chunking was applied, the output starts with the SIGNATURE_CHUNKED_PARAMS indicator, directly followed by the number of chunks used. This way, when reading in chunked model parameters, recombination is achieved transparently.
Note: Both existing (binary) model files and newly trained models that do not require the chunking technique are supported as in previous OpenNLP versions.
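The strategy above can be sketched with plain JDK streams. This is an illustration under stated assumptions, not the OpenNLP implementation: the class ChunkedUtfSketch, the marker value "#CHUNKED#", the choice to store the chunk count as an int, and the character-based chunk size are all hypothetical. Instead of reacting to a UTFDataFormatException, the sketch uses a conservative length check: a Java char never needs more than 3 bytes in the encoding used by writeUTF, so chunks of at most 21845 characters always fit into 65535 bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the OpenNLP implementation.
final class ChunkedUtfSketch {

    // Assumed marker; the actual value of SIGNATURE_CHUNKED_PARAMS is not shown in this page.
    static final String SIGNATURE = "#CHUNKED#";
    // At most 3 encoded bytes per char, so 21845 chars never exceed 65535 bytes.
    static final int MAX_CHARS_PER_CHUNK = 65535 / 3;

    static void writeUTF(DataOutputStream dos, String s) throws IOException {
        if (s.length() <= MAX_CHARS_PER_CHUNK) {
            dos.writeUTF(s); // small enough: written 'as is'
            return;
        }
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < s.length(); i += MAX_CHARS_PER_CHUNK) {
            chunks.add(s.substring(i, Math.min(s.length(), i + MAX_CHARS_PER_CHUNK)));
        }
        dos.writeUTF(SIGNATURE);     // marks the parameter as chunked
        dos.writeInt(chunks.size()); // number of chunks that follow (assumed format)
        for (String chunk : chunks) {
            dos.writeUTF(chunk);
        }
    }

    static String readUTF(DataInputStream dis) throws IOException {
        String first = dis.readUTF();
        if (!SIGNATURE.equals(first)) {
            return first; // unchunked parameter
        }
        int count = dis.readInt();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(dis.readUTF()); // recombine chunks in order
        }
        return sb.toString();
    }

    /** Writes s to an in-memory buffer and reads it back. */
    static String roundTrip(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bytes)) {
            writeUTF(dos, s);
        }
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return readUTF(dis);
        }
    }

    public static void main(String[] args) throws IOException {
        String large = "\u00fc".repeat(100_000); // 200000 encoded bytes, far above the limit
        System.out.println(large.equals(roundTrip(large)));     // prints "true"
        System.out.println("small".equals(roundTrip("small"))); // prints "true"
    }
}
```

The length check in this sketch chunks some strings that would still have fit into a single writeUTF call; that trade-off keeps the example short and avoids computing the encoded byte length. A parameter whose value equals the marker itself would be misread as chunked, which a production format has to rule out.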
Author:

Martin Wiesner, Mark Struberg

Field Summary

Fields
Modifier and Type	Field	Description
`static String`	`SIGNATURE_CHUNKED_PARAMS`	

Method Summary

Modifier and Type	Method	Description
`static String`	`readUTF(DataInputStream dis)`	Reads model parameters from `dis`.
`static void`	`writeUTF(DataOutputStream dos, String s)`	Writes the model parameter `s` to `dos`.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - SIGNATURE_CHUNKED_PARAMS
```
public static final String SIGNATURE_CHUNKED_PARAMS
```
    See Also:
    
    Constant Field Values
- Method Detail
  - readUTF
```
public static String readUTF(DataInputStream dis)
                      throws IOException
```
    Reads model parameters from dis. If the stream starts with SIGNATURE_CHUNKED_PARAMS, the number of chunks is read and the original large parameter string is reconstructed from those chunks.
    
    Parameters:
    
    dis - The stream from which the model parameter is read.
    
    Throws:
    
    IOException
  - writeUTF
```
public static void writeUTF(DataOutputStream dos,
                            String s)
                     throws IOException
```
    Writes the model parameter s to dos. If s exceeds MAX_CHUNK_SIZE_BYTES in length, the chunking mechanism is used; otherwise the parameter is written 'as is'.
    
    Parameters:
    
    dos - The DataOutputStream which will be used to persist the model.
    
    s - The input string that is checked for length and chunked if MAX_CHUNK_SIZE_BYTES is exceeded.
    
    Throws:
    
    IOException