Information about the format can be found on this web site.
A 1-million-word corpus can be found on this web site.
The NKJP schema can be found here.
static NKJPTextDocument parse(InputStream is) throws IOException
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
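The following is a minimal sketch of how a caller might use a static method of the shape parse(InputStream is) that throws IOException. The name of the class that declares parse is not given here, so the sketch defines its own stand-ins: ParseUsageSketch, the nested NKJPTextDocument stub, its byteCount field, the load helper, and the "<doc/>" input are all hypothetical and only mirror the documented signature. The stub counts bytes instead of parsing NKJP XML; only the calling pattern (try-with-resources around the stream, IOException propagated to the caller) is the point.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ParseUsageSketch {

    // Hypothetical stand-in for the library's NKJPTextDocument.
    // The real class is not described here; this stub only records a byte count.
    static final class NKJPTextDocument {
        final int byteCount;

        NKJPTextDocument(int byteCount) {
            this.byteCount = byteCount;
        }
    }

    // Hypothetical stand-in with the same shape as the documented method:
    //   static NKJPTextDocument parse(InputStream is) throws IOException
    // It merely drains the stream; the real method parses NKJP data.
    static NKJPTextDocument parse(InputStream is) throws IOException {
        int count = 0;
        byte[] buffer = new byte[4096];
        int read;
        while ((read = is.read(buffer)) != -1) {
            count += read;
        }
        return new NKJPTextDocument(count);
    }

    // Caller pattern: own the stream in try-with-resources so it is closed
    // even if parse fails, and let IOException propagate to the caller.
    static NKJPTextDocument load(byte[] data) throws IOException {
        try (InputStream is = new ByteArrayInputStream(data)) {
            return parse(is);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder input, not a valid NKJP document.
        byte[] sample = "<doc/>".getBytes(StandardCharsets.UTF_8);
        NKJPTextDocument doc = load(sample);
        System.out.println("Read " + doc.byteCount + " bytes");
    }
}
```

With the real library, the ByteArrayInputStream would typically be replaced by a stream over a corpus file, and the stubbed parse by the library's own static method.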