public class ChunkString
extends java.lang.Object
A string-based encoding of a particular chunking of a text. For example::

    {<DT><JJ><NN>}<VBN><IN>{<DT><NN>}<.>{<DT><NN>}<VBD><.>

``ChunkString`` objects are created from tagged texts (i.e., lists of
``tokens`` whose type is ``TaggedType``). Initially, nothing is
chunked.
The chunking of a ``ChunkString`` can be modified with the ``xform()``
method, which uses a regular expression to transform the string
representation. These transformations should only add and remove
braces; they should *not* modify the sequence of angle-bracket
delimited tags.
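
The brace-only rule above can be illustrated with a minimal sketch. It does not use ``ChunkString`` or its ``Regex`` type, whose definitions are not shown here; it assumes instead a plain ``String`` holding the encoding and the standard ``java.util.regex`` package. The class and method names (``BraceOnlyTransformSketch``, ``chunkDtNn``, ``tagsOnly``) are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: operates on a plain String, not on ChunkString itself.
final class BraceOnlyTransformSketch {
    // Matches a determiner tag immediately followed by a noun tag.
    private static final Pattern DT_NN = Pattern.compile("<DT><NN>");

    /** Removes every brace, leaving only the angle-bracket tag sequence. */
    static String tagsOnly(String encoding) {
        return encoding.replace("{", "").replace("}", "");
    }

    /**
     * Wraps each <DT><NN> pair in braces. "$0" refers to the whole match,
     * so the tags themselves are copied through unchanged.
     * Assumes the input is not chunked yet.
     */
    static String chunkDtNn(String encoding) {
        return DT_NN.matcher(encoding).replaceAll("{$0}");
    }

    public static void main(String[] args) {
        String before = "<DT><JJ><NN><VBN><IN><DT><NN><.><DT><NN><VBD><.>";
        String after = chunkDtNn(before);
        System.out.println(after);
        // A valid transformation leaves the tag sequence untouched.
        System.out.println(tagsOnly(after).equals(before));
    }
}
```

Comparing the brace-stripped strings before and after a transformation is one simple way to check that only braces were added or removed.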
:type _str: str
:ivar _str: The internal string representation of the text's
    encoding. This string representation contains a sequence of
    angle-bracket delimited tags, with chunking indicated by
    braces. An example of this encoding is::

        {<DT><JJ><NN>}<VBN><IN>{<DT><NN>}<.>{<DT><NN>}<VBD><.>

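
To make the encoding concrete, the following sketch reads an encoded string and lists the tags inside each pair of braces. It is only an illustration of the format described above, not the way ``to_chunkstruct`` is implemented; the class name ``ChunkEncodingReaderSketch`` and the method ``chunks`` are hypothetical, and the sketch assumes braces are never nested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: reads the brace/angle-bracket encoding from a plain String.
final class ChunkEncodingReaderSketch {
    // A chunk is a brace pair with no braces inside it (assumption: no nesting).
    private static final Pattern CHUNK = Pattern.compile("\\{([^{}]*)\\}");
    // A tag is any non-empty text between angle brackets.
    private static final Pattern TAG = Pattern.compile("<([^<>]+)>");

    /** Returns the tags of each chunk, in order of appearance. */
    static List<List<String>> chunks(String encoding) {
        List<List<String>> result = new ArrayList<>();
        Matcher chunk = CHUNK.matcher(encoding);
        while (chunk.find()) {
            List<String> tags = new ArrayList<>();
            Matcher tag = TAG.matcher(chunk.group(1));
            while (tag.find()) {
                tags.add(tag.group(1));
            }
            result.add(tags);
        }
        return result;
    }

    public static void main(String[] args) {
        String encoding = "{<DT><JJ><NN>}<VBN><IN>{<DT><NN>}<.>{<DT><NN>}<VBD><.>";
        // prints [[DT, JJ, NN], [DT, NN], [DT, NN]]
        System.out.println(chunks(encoding));
    }
}
```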
Constructor and Description |
---|
ChunkString(Node chunk_struct): Construct a new ``ChunkString`` that encodes the chunking of the text in ``chunk_struct``. |
Modifier and Type | Method and Description |
---|---|
Node | to_chunkstruct(java.lang.String label): Return the chunk structure encoded by this ``ChunkString``. |
java.lang.String | toString() |
void | xform(Regex regexp, java.lang.String repl): Apply the given transformation to the string encoding of this ``ChunkString``. |
public ChunkString(Node chunk_struct)
chunk_struct:
- The chunk structure to be further chunked.

public Node to_chunkstruct(java.lang.String label)
label:
- A String: the label to use for chunk nodes.

public void xform(Regex regexp, java.lang.String repl)
regexp:
- A regular expression matching the substring
that should be replaced. This will typically include a
named group, which can be used by ``repl``.

repl:
- An expression specifying what should replace the
matched substring. Typically, this will include a named
replacement group, specified by ``regexp``.

public java.lang.String toString()
Overrides:
- toString in class java.lang.Object
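
The named-group relationship between the ``regexp`` and ``repl`` parameters of ``xform()`` can be sketched as follows. Because the ``Regex`` type in the signature is not documented here, the sketch assumes ``java.util.regex.Pattern`` as a stand-in and works on a plain ``String``; the class name ``NamedGroupTransformSketch``, the method ``apply``, and the group name ``chunk`` are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: java.util.regex stands in for the undocumented Regex type.
final class NamedGroupTransformSketch {
    // The named group "chunk" captures a determiner, any number of
    // adjectives, and a noun.
    private static final Pattern REGEXP =
            Pattern.compile("(?<chunk><DT>(?:<JJ>)*<NN>)");

    // "${chunk}" refers back to the named group; the outer braces are literal
    // and mark the captured tags as a chunk.
    private static final String REPL = "{${chunk}}";

    /** Adds braces around every match without altering the tags. */
    static String apply(String encoding) {
        return REGEXP.matcher(encoding).replaceAll(REPL);
    }

    public static void main(String[] args) {
        String unchunked = "<DT><JJ><NN><VBN><IN><DT><NN><.><DT><NN><VBD><.>";
        // prints {<DT><JJ><NN>}<VBN><IN>{<DT><NN>}<.>{<DT><NN>}<VBD><.>
        System.out.println(apply(unchunked));
    }
}
```

Referring to the match through a named group keeps the replacement readable and makes it explicit that the tags are copied through unchanged while only braces are added.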