Configuring Woodstox XMLStreamReaders
Woodstox readers have two kinds of configurable properties: those defined by the
StAX specification, and those added by Woodstox itself.
StAX 1.0 specified properties
All property ids in this property group refer to constants defined in javax.xml.stream.XMLInputFactory.
The StAX specification explains the properties only to some degree; the following table lists how Woodstox actually implements each of them.
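As a minimal sketch of how these constants are used, the following sets IS_COALESCING on a factory and lists the text events a reader then reports. It relies only on the standard javax.xml.stream API, so it runs against whichever StAX implementation is on the classpath. The names CoalescingDemo and textEvents are made up for this illustration, and the checked XMLStreamException is wrapped only to keep the sketch short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CoalescingDemo {
    /** Returns the text of every text event found in xml, one list entry per event. */
    static List<String> textEvents(String xml, boolean coalescing) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Property ids are String constants on XMLInputFactory; values are Boolean objects.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.valueOf(coalescing));
        List<String> segments = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int type = reader.next();
                if (type == XMLStreamConstants.CHARACTERS
                        || type == XMLStreamConstants.CDATA
                        || type == XMLStreamConstants.SPACE) {
                    segments.add(reader.getText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrapped for brevity; real code would usually propagate the checked exception.
            throw new IllegalStateException("Parsing failed", e);
        }
        return segments;
    }
}
```

With coalescing enabled, a call such as textEvents("&lt;a&gt;x&lt;![CDATA[y]]&gt;z&lt;/a&gt;", true) should produce a single segment; with it disabled, the number of segments is up to the reader.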
<table border="1" frame="border" rules="all">
<tr valign="top">
<th align="left">Property id
<br /> Value space
<br /> Default value
</th>
<th width="*">Effects</th>
</tr>
<tr>
<td><a name="IS_COALESCING"></a><span class="propertyId">IS_COALESCING</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> False
</td>
<td>
Whether the reader should combine all adjacent text
events (events of type CHARACTERS, CDATA and SPACE) into a single event.
<p>
If set to true, the reader will combine all such adjacent text events
into a single event of type CHARACTERS.
<p>
If set to false, Woodstox will not combine events of different
types (and may in fact, depending on the setting of
<a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>, see below, split
individual physical events into multiple returned events).
<p>
Turning this option on may slightly reduce performance.
</td>
</tr>
<tr>
<td><a name="IS_NAMESPACE_AWARE"></a><span class="propertyId">IS_NAMESPACE_AWARE</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> False
</td>
<td>
Whether the parser will do namespace handling (as specified by the XML Namespaces
specification) or not.
<p>
If true, the reader will do namespace handling: namespace URIs are
resolved using prefixes and namespace declarations, and local names
can not contain colons.
<p>
If false, namespace declarations get no special
handling but are included as normal attributes, and full element names
(prefix and local name) are accessible via the "local name" accessor methods.
Prefix accessors will then always return null, and namespace URI accessors
return the empty String (indicating the default namespace).
</td>
</tr>
<tr>
<td><a name="IS_REPLACING_ENTITY_REFERENCES"></a><span
class="propertyId">IS_REPLACING_ENTITY_REFERENCES</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> False
</td>
<td>
Whether the Reader will automatically expand general parsed entities or not.
<p>
If set to true, the reader will automatically resolve the reference if
necessary (external entities), and then expand the entity value.
<p>
If set to false, the reader will return all general entities
(except for the five pre-defined entities <code>&amp;amp;</code>, <code>&amp;apos;</code>,
<code>&amp;lt;</code>, <code>&amp;gt;</code> and <code>&amp;quot;</code>)
as events of type ENTITY_REFERENCE.
<p>
Note: whether external (parsed) entities are handled at all depends
on value of property <a href="#IS_SUPPORTING_EXTERNAL_ENTITIES"
>IS_SUPPORTING_EXTERNAL_ENTITIES</a>.
<br />
Note: this does NOT affect the way <b>character entities</b> (like
<code>&amp;amp;</code>) are handled; they are always automatically expanded.
</td>
</tr>
<tr>
<td><a name="IS_SUPPORTING_EXTERNAL_ENTITIES"></a><span
class="propertyId">IS_SUPPORTING_EXTERNAL_ENTITIES</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> True
</td>
<td>
Whether the reader will support references to general external (parsed)
entities or not.
<p>
If true, the reader will support such entities normally, either
automatically resolving and replacing such entities (if enabled by
property <a href="#IS_REPLACING_ENTITY_REFERENCES"
>IS_REPLACING_ENTITY_REFERENCES</a>), or returning an ENTITY_REFERENCE
event.
<p>
If false, the reader will not support such references, and will throw an
exception if one is encountered. It is legal to define such entities,
but not to use (refer to) them.
</td>
</tr>
<tr>
<td><a name="SUPPORT_DTD"></a><span class="propertyId">SUPPORT_DTD</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> True
</td>
<td>
Whether the reader will do any handling of internal and external DTD
subsets.
<p>
If true, the reader will parse both the internal and external DTD subsets,
reading all constructs. General entities declared can then be used by
the document; external DTD subsets read may also be cached (if property
<a href="#CACHE_DTDS">P_CACHE_DTDS</a> is set to true; internal subsets
are never cached as there is no way to reliably identify reuse).
Also, if property
<a href="#IS_VALIDATING">IS_VALIDATING</a> is set to true, document
will be validated.
<p>
Note: Turning this feature off will also prevent
<a href="#IS_VALIDATING">IS_VALIDATING</a> from having any effect as
DTD subsets will not be read.
</td>
</tr>
<tr>
<td><a name="IS_VALIDATING"></a><span class="propertyId">IS_VALIDATING</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> False
</td>
<td>
Whether the reader will validate the XML document against the DTD specified
by the document or not.
<p>
If true (and property <a href="#SUPPORT_DTD">SUPPORT_DTD</a>
is true), and the document contains a DTD declaration (a DOCTYPE directive
that refers to a DTD that can be read and/or has an embedded internal DTD subset),
the reader will try to do the following things:
<ul>
<li>Validate element structure using content declarations of ELEMENT entries
found in DTD
</li>
<li>Resolve attribute types and default values for attributes, accessible
via StAX accessors.
</li>
<li>Validate attribute values against definitions, including id uniqueness
checks, if any.
</li>
<li>Recognize type of whitespace, so that <b>ignorable whitespace</b>
can be detected.
</li>
</ul>
<p>
If false, the reader may still process the DTD declaration (internal and external
subsets), but will not validate, nor access or use attribute type or default
value information. Ignorable white space detection may be done by the
reader if that is feasible [note: need to clarify exact rules]
</td>
</tr>
<tr>
<td><span class="propertyId">REPORTER</span>
<br /> <span class="className">javax.xml.stream.XMLReporter</span>
<br /> null
</td>
<td>
Object to use for notifying calling application about recoverable problems
the document has. These include things like multiple ENTITY and ATTLIST
declarations in DTDs. If null, no problem notifications are sent.
</td>
</tr>
<tr>
<td><span class="propertyId">RESOLVER</span>
<br /> <span class="className">javax.xml.stream.XMLResolver</span>
<br /> null
</td>
<td>
Object that will be called to try to resolve external references to
the external DTD subset, general entities and parameter entities.
If set to non-null value, this resolver will be called first, before
the default resolution mechanism. If resolver returns a valid return
value (see below), it will be used as the source for the entity
value.
<p>
Note: Currently Woodstox only supports return values of type
<code>java.io.InputStream</code>; other types that StAX API suggests
are not accepted, and will cause an Exception to be thrown.
<br />
Note: if using Woodstox reader, it is recommended that the specific
<a href="#P_ENTITY_RESOLVER">P_ENTITY_RESOLVER</a>
(for general entity references)
and
<a href="#P_DTD_RESOLVER">P_DTD_RESOLVER</a>
(for external DTD subset and parameter entities)
properties are used instead: the <code>XMLResolver</code> interface
unfortunately lacks some of the context handling features it should have
(a problem with the StAX specification). In fact, internally Woodstox
will just wrap this Object and set it as the
<a href="#P_ENTITY_RESOLVER">P_ENTITY_RESOLVER</a> value.
</td>
</tr>
<tr>
<td><span class="propertyId">ALLOCATOR</span>
<br /> <span class="className">javax.xml.stream.util.XMLEventAllocator</span>
<br /> null
</td>
<td>
Defines the factory object used to create the event objects created by
the Event API (<code>XMLEventReader</code>). If left to null, Woodstox
will use the default implementation,
<code>com.ctc.wstx.stax.event.DefaultEventAllocator</code>.
<p>
Note: although it is possible to implement instances from scratch, it
is strongly encouraged that instances used with Woodstox are created by
extending
<code>com.ctc.wstx.stax.event.DefaultEventAllocator</code>,
mostly because it takes care of some of the problems with the specification.
Specifically, some data needed for some events is not available via the
basic StAX API; as a result, the default implementation accesses some
information via the extended Woodstox API when used with a Woodstox event
reader.
</td>
</tr>
</table>
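<p>
The RESOLVER entry above notes that only <code>java.io.InputStream</code> return values are accepted. The following is a minimal sketch of such a resolver, serving entity content from an in-memory map and installing it through the standard <code>setXMLResolver</code> call. The class name, the helper methods and the <code>mem:</code> system ids used to try it out are hypothetical; only the javax.xml.stream calls are part of the standard API.
</p>

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamException;

class InMemoryResolverDemo {
    /**
     * Builds a resolver that serves entity content from a map keyed by system id.
     * Returning null leaves resolution to the parser's default mechanism.
     */
    static XMLResolver resolverFor(Map<String, String> contentBySystemId) {
        return (publicId, systemId, baseUri, namespace) -> {
            String content = contentBySystemId.get(systemId);
            if (content == null) {
                return null; // unknown id: let the default mechanism handle it
            }
            // Hand back an InputStream, the return type the table says is supported.
            return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        };
    }

    /** Creates a factory with external entities enabled and the resolver installed. */
    static XMLInputFactory factoryWith(XMLResolver resolver) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        factory.setXMLResolver(resolver); // stored under the RESOLVER property
        return factory;
    }

    /** Helper for trying a resolver out: returns the resolved text, or null if unresolved. */
    static String resolveAsText(XMLResolver resolver, String systemId) {
        try {
            Object result = resolver.resolveEntity(null, systemId, null, null);
            if (result == null) {
                return null;
            }
            // This sketch assumes the resolver only ever returns InputStream instances.
            try (InputStream in = (InputStream) result) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (XMLStreamException | IOException e) {
            throw new IllegalStateException("Resolution failed", e);
        }
    }
}
```

<p>
A reader created from <code>factoryWith(...)</code> would consult the resolver first whenever it meets an external reference; the <code>resolveAsText</code> helper exists only so the resolver can be exercised without a document.
</p>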
StAX2 (v1.0) specified properties
All property ids in this property group refer to constants defined in org.codehaus.stax2.XMLInputFactory2
- P_REPORT_ALL_TEXT_AS_CHARACTERS
  - Short desc:
  - Type: java.lang.Boolean
  - Default value: **
- P_INTERN_NAMES
  - Short desc:
  - Type: java.lang.Boolean
  - Default value: **
- P_INTERN_NS_URIS
  - Short desc:
  - Type: java.lang.Boolean
  - Default value: **
- P_PRESERVE_LOCATION
  - Short desc:
  - Type: java.lang.Boolean
  - Default value: **
- P_REPORT_PROLOG_WHITESPACE
  - Short desc:
  - Type: java.lang.Boolean
  - Default value: **
Woodstox custom properties
All property ids in this property group refer to constants defined in
interface <code>com.ctc.wstx.stax.WstxInputProperties</code>.
</p>
<p>
Default values are current as of version 0.8.8.
</p><p>
Note, also, that in some cases there may be more detailed information
available about specific properties in
<a href="../curr/javadocs/index.html">Javadocs</a> for
classes:
</p>
<ul>
<li><code>com.ctc.wstx.stax.WstxInputFactory</code></li>
<li><code>com.ctc.wstx.stax.WstxInputProperties</code></li>
<li><code>com.ctc.wstx.stax.ReaderConfig</code></li>
</ul>
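<p>
Because the properties in this group are implementation-specific, code that may also run against another StAX implementation can guard each call with the standard <code>isPropertySupported</code> check. The helper below is a minimal sketch of that idea; its class and method names are made up for this illustration, and it deliberately takes the property id as a parameter instead of assuming any particular Woodstox constant value.
</p>

```java
import javax.xml.stream.XMLInputFactory;

class OptionalProperties {
    /**
     * Sets a property only when the factory reports that it understands it.
     * Returns true if the value was applied, false if it was skipped.
     */
    static boolean setIfSupported(XMLInputFactory factory, String propertyId, Object value) {
        if (!factory.isPropertySupported(propertyId)) {
            return false; // the factory does not know this id, so leave it alone
        }
        factory.setProperty(propertyId, value);
        return true;
    }
}
```

<p>
With Woodstox on the classpath, the id passed in would be one of the constants from <code>WstxInputProperties</code> listed in the table below, for example <code>P_NORMALIZE_LFS</code>.
</p>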
<table border="1" frame="border" rules="all">
<tr valign="top">
<th align="left">Property id
<br /> Value space
<br /> Default value
</th>
<th width="*">Effects</th>
</tr>
<tr>
<td><a name="P_NORMALIZE_LFS"></a><span
class="propertyId">P_NORMALIZE_LFS</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> True
</td><td>
Whether the reader should normalize linefeeds in textual content (text, CDATA sections,
processing instruction data, comments, CDATA attribute values) according to XML
specifications.
<p>
If set to true, the reader will normalize such linefeeds; if false,
will leave the linefeeds as they are in the input.
<p>
The main reason for setting this to false is to preserve native linefeeds on platforms
that do not use the 'standard' XML linefeed (Windows, MacOS). It may also slightly
improve performance.
</td>
</tr>
<tr>
<td><a name="P_NORMALIZE_ATTR_VALUES"></a><span
class="propertyId">P_NORMALIZE_ATTR_VALUES</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> True
</td><td>
Whether the reader should normalize white space in attribute values
according to XML specifications. This is in addition to (optional) linefeed conversion,
which may or may not be done depending on value of
<a href="#P_NORMALIZE_LFS">P_NORMALIZE_LFS</a>.
<br />
If set to true, the reader will normalize such white space; if set to false, it will not
normalize (except for the optional linefeed conversion, if enabled).
<p>
The main reason for setting this to false is to minimize any changes to the input document
format. It may also slightly improve performance.
</td>
</tr>
<tr>
<td><a name="P_REPORT_PROLOG_WHITESPACE"></a><span
class="propertyId">P_REPORT_PROLOG_WHITESPACE</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> False
</td>
<td>
Whether the reader should report (ignorable) white space events in the XML document prolog
and epilog, i.e. outside the actual XML tree.
<p>
If set to true, will return SPACE events to indicate the ignorable white space; if set
to false will quietly just skip the white space.
<p>
The main reason to turn this property on is to minimize changes to the input document
formatting. Setting it to true may have a slight performance overhead.
</td>
</tr>
<tr>
<td><a name="P_REPORT_ALL_TEXT_AS_CHARACTERS"></a><span
class="propertyId">P_REPORT_ALL_TEXT_AS_CHARACTERS</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> False
</td>
<td>
Whether the reader should report all text events inside the document content as being of
type CHARACTERS or not (note that prolog/epilog white space will always be reported as
SPACE).
<p>
If true, all text (including ignorable white space) is to be reported
as type CHARACTERS; if false, the real type is returned for all cases but
for coalesced CDATA (which is always reported as CHARACTERS). [note: CDATA can only be
coalesced when <a href="#IS_COALESCING">IS_COALESCING</a> is set to True].
</td>
</tr>
<tr>
<td><a name="P_INTERN_URIS"></a><span
class="propertyId">P_INTERN_NS_URIS</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> True
</td><td>
Whether the reader should intern the values of namespace URIs or not.
<p>
If set to true, the reader will call String.intern() on all parsed namespace URIs.
If not, URI Strings are left as is, and since they are constructed from parsed data
are generally never intern()ed.
<p>
Usually there is no need to set this feature to false, since intern()ing overhead should
not be significant. Having this option set to true is good for performance especially
when accessing namespace prefixed attribute values.
<p>
Note: this option only matters if namespaces are supported, i.e.
<a href="#IS_NAMESPACE_AWARE">IS_NAMESPACE_AWARE</a> is set to True.
</td>
</tr>
<tr>
<td><span class="propertyId">P_VALIDATE_TEXT_CHARS</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> False
</td>
<td>
Whether the Reader should validate text content (text segments, CDATA sections,
processing instruction data, comment contents, attribute values [attribute type CDATA])
according to XML 1.1 rules or not.
<p>
If set to True, should verify that all characters included are in valid XML character
range (not just valid Unicode characters); if False will only do basic null character
checks but otherwise assume content is ok.
<p>
Note: Turning this option on will impose some processing overhead on parsing.
<br />
<b>NOTE</b>: Not yet fully implemented.
</td>
</tr>
<tr>
<td><a name="CACHE_DTDS"></a><a name="P_CACHE_DTDS"></a><span
class="propertyId">P_CACHE_DTDS</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> True
</td>
<td>
Whether the Reader should cache external DTD subsets read (which is done for
documents that have such subsets when <a href="#SUPPORT_DTD">SUPPORT_DTD</a>
is set to True) or not.
<p>
If set to True, will cache limited set of external DTD subsets (in the order
of 20 - 50 subsets max., depending on whether J2ME implementation or
'normal' one is used) in hopes of being able to reuse them;
if false, will do no caching.
<p>
Setting this option to false will prevent DTD subset reuse; setting it to True
will add some memory overhead for cached DTDs.
<p>
Note: as DTDs are cached on per-factory basis, it is important to try to reuse input
factory instances for parsing.
</td>
</tr>
<tr>
<td><a name="P_LAZY_PARSING"></a>
<span class="propertyId">P_LAZY_PARSING</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> True
</td>
<td>
Whether the reader is allowed to do so-called "lazy" parsing, i.e. only parse
those parts of the content that the calling application has requested.
Benefit of this approach is most significant for long textual content
that is skipped (often the case for comments, ignorable white space,
sometimes for CDATA and text segments); if so there is no need to
allocate memory for reading the textual content. The (only?) downside is
that this may also lead to "lazy exceptions"; since full parsing may not
be done on call to <code>reader.next()</code>, but later on, exceptions
are also only thrown when the problem is encountered later on.
<p>
If set to True, will allow Reader to only parse/load data as needed;
if set to False will force Reader to always read in all the data.
</td>
</tr>
<tr>
<td><a name="P_PRESERVE_LOCATION"></a><span
class="propertyId">P_PRESERVE_LOCATION</span>
<br /> <span class="className">java.lang.Boolean</span>
<br /> True
</td>
<td>
Whether the Event objects (created when using Event API, using
<code>javax.xml.stream.XMLEventReader</code>) will store actual
accurate Location information (private
<code>javax.xml.stream.Location</code> Object) or not.
<p>
Turning this feature off reduces memory usage somewhat, as well
as increases performance due to lessened garbage collection
time (when reclaiming discarded Event objects). Performance
improvement may be up to 25% in some cases.
</td>
</tr>
<tr>
<td><a name="P_INPUT_BUFFER_LENGTH"></a><span
class="propertyId">P_INPUT_BUFFER_LENGTH</span>
<br /> <span class="className">java.lang.Integer</span>
<br /> 4000/2000 (J2SE/J2ME)
</td>
<td>
Determines the size of the input buffers the readers use for reading XML content
(for input streams, size in bytes; for stream readers in characters).
<p>
Setting this property to a low value helps in saving some memory, but negatively
impacts performance. Setting this property to reasonably high value may help
in improving performance, but the benefit decreases for bigger buffer sizes.
</td>
</tr>
<tr>
<td><span class="propertyId">P_TEXT_BUFFER_LENGTH</span>
<br /> <span class="className">java.lang.Integer</span>
<br /> 2000/1000 (J2SE/J2ME)
</td>
<td>
Determines the initial text buffer segment size used internally to hold (processed)
text segment (text, CDATA, comment, processing instruction) contents. As with
<a href="#P_INPUT_BUFFER_LENGTH">P_INPUT_BUFFER_LENGTH</a>, this has some effect on
performance. However, this property is less critical since the segment size
will be increased incrementally on an as-needed basis (it has to be, in order to
be able to store the whole text segment in question).
</td>
</tr>
<tr>
<td><a name="P_MIN_TEXT_SEGMENT"></a><span
class="propertyId">P_MIN_TEXT_SEGMENT</span>
<br /> <span class="className">java.lang.Integer</span>
<br /> 64
</td>
<td>
Determines the shortest text segment (text, CDATA) length that the reader is
allowed to return to caller (but only if <a href="#IS_COALESCING">IS_COALESCING</a>
is set to False!).
<p>
Setting this property to a high value prevents splitting of physical text segments
into multiple events, but may slightly decrease parser performance. Leaving the
value to reasonably low value will let the reader optimize segmentation.
<p>
Note: setting this to a low value does not guarantee that the reader will only return
short segments; it just allows it to do so. The actual length of segments depends on the
reader's internal state and the size of the text buffer (see <a href="#P_TEXT_BUFFER_LENGTH"
>P_TEXT_BUFFER_LENGTH</a>).
</td>
</tr>
<tr>
<td><span class="propertyId">P_CUSTOM_INTERNAL_ENTITIES</span>
<br /> <span class="className">Map</span>
<br /> null
</td>
<td>
This property allows the calling application to specify further pre-defined general
internal entity values, in addition to the standard ones (amp, lt, gt, apos, quot). Note that
the values need to be normally encoded, as if they were actually declared in
a DTD subset; this means they get re-parsed properly and thus can refer to other
entities (character and general entities).
<p>
Note: These entities are not used as parameter entities in DTD subsets.
</td>
</tr>
<tr>
<td><a name="P_DTD_RESOLVER"></a><span class="propertyId">P_DTD_RESOLVER</span>
<br /> <span class="className">com.ctc.wstx.stax.WstxInputResolver</span>
<br /> null
</td>
<td>
Resolver object that will be used as the primary DTD reference resolver, instead
of the default one. This will thus get called when resolving reference to the
external DTD subset, AND when resolving external parsed parameter entities (entities
declared in DTD subsets that have '%' prefix).
<p>
Note that the default entity resolver will always be used after calling this resolver,
if this resolver returns null.
</td>
</tr>
<tr>
<td><a name="P_ENTITY_RESOLVER"></a><span class="propertyId">P_ENTITY_RESOLVER</span>
<br /> <span class="className">com.ctc.wstx.stax.WstxInputResolver</span>
<br /> null
</td>
<td>
Resolver object that will be used as the primary entity reference resolver, instead
of the default one. This will thus get called when resolving declared external parsed
entity references (but not external DTD subset reference or parameter entities – see
<a href="#P_DTD_RESOLVER">P_DTD_RESOLVER</a>).
<p>
Note that the default entity resolver will always be used after calling this resolver,
if this resolver returns null.
</td>
</tr>
<tr>
<td><a name="P_BASE_URL"></a><span class="propertyId">P_BASE_URL</span>
<br /> <span class="className">java.net.URL</span>
<br /> null
</td>
<td>
Basic reference location that can be used by the resolvers (DTD, ENTITY) when resolving
relative references. If set to non-null, will be used as the main context for resolution;
if left as null, system id (if any passed) will be used instead, assuming it is either
a valid URL, or reference from the current directory at the server.
<p>
It is often a good idea to set this property when the application will be run as
a managed service (in an application server etc.), to ensure that the 'root' location
is well-known.
</td>
</tr>
</table>
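<p>
Since a reader may hand back one physical text segment as several events (see <a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>), calling code should not assume that one <code>getText()</code> call returns everything. The following is a minimal sketch of a loop that gathers consecutive text events, using only the standard StAX API; the class and method names are hypothetical, and the loop simply stops at the first non-text event.
</p>

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SegmentedTextDemo {
    /** Returns the leading text of the first element named elementName, or null if absent. */
    static String textOf(String xml, String elementName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            String result = null;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    result = collectText(reader);
                    break;
                }
            }
            reader.close();
            return result;
        } catch (XMLStreamException e) {
            // Wrapped for brevity; real code would usually propagate the checked exception.
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    /** Appends consecutive text events, however the reader chose to split them. */
    private static String collectText(XMLStreamReader reader) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        while (reader.hasNext()) {
            int type = reader.next();
            if (type == XMLStreamConstants.CHARACTERS
                    || type == XMLStreamConstants.CDATA
                    || type == XMLStreamConstants.SPACE) {
                text.append(reader.getText());
            } else {
                break; // first non-text event (end tag, child element, comment, ...)
            }
        }
        return text.toString();
    }
}
```

<p>
For elements that contain only text, the standard <code>XMLStreamReader.getElementText()</code> method serves the same purpose; the explicit loop is shown because it makes the segmentation visible.
</p>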
<h2>Profiles (property groups)</h2>
In addition to being able to set individual values separately, Woodstox also
allows for using "profiles": pre-set values for a group of properties that
optimize readers for a specific goal.
<p>
To use profiles, you need to use Woodstox-specific method calls, since
the StAX API does not have a similar concept.
</p><p>
As with Woodstox-specific properties, in some cases there may be more
detailed information available about specific profiles in
<a href="../curr/javadocs/index.html">Javadocs</a> for
classes:
</p>
<ul>
<li><code>com.ctc.wstx.stax.WstxInputFactory</code></li>
<li><code>com.ctc.wstx.stax.ReaderConfig</code></li>
</ul>
<table border="1" frame="border" rules="all">
<tr valign="top">
<th>Method call</th>
<th width="*">Effects</th>
</tr>
<tr>
<td>configureForMaxConformance()</td>
<td>
Profile that will try to make processing as close to the one defined
by the XML specification as possible. This may have some (usually slight)
performance overhead, but no increased memory usage.
<p>
Will set following property values:
<ul>
<li><a href="#P_NORMALIZE_LFS">P_NORMALIZE_LFS</a>: True
</li>
<li><a href="#P_NORMALIZE_ATTR_VALUES">P_NORMALIZE_ATTR_VALUES</a>: True
</li>
</ul>
</td>
</tr>
<tr>
<td>configureForMaxConvenience()</td>
<td>
Profile that will try to use the settings that will make parsing and
exception handling "as easy as possible". This means trying to ensure
that some of the things that might be done to optimize performance
are disabled; things like splitting of text segments (disabled with
this profile), as well as to suppress reporting things that are usually
not very useful (like ignorable white space in prolog/epilog).
<p>
Will set following property values:
<ul>
<li><a href="#IS_COALESCING">IS_COALESCING</a>: True
</li>
<li><a href="#P_REPORT_ALL_TEXT_AS_CHARACTERS">P_REPORT_ALL_TEXT_AS_CHARACTERS</a>: True
</li>
<li><a href="#IS_REPLACING_ENTITY_REFERENCES">IS_REPLACING_ENTITY_REFERENCES</a>: True
</li>
<li><a href="#P_REPORT_PROLOG_WHITESPACE">P_REPORT_PROLOG_WHITESPACE</a>: False (seldom interesting; only useful for round-tripping)
</li>
<li><a href="#P_LAZY_PARSING">P_LAZY_PARSING</a>: False (to prevent "lazy exceptions")
</li>
<li><a href="#P_PRESERVE_LOCATION">P_PRESERVE_LOCATION</a>: True (to make sure Event objects [if used] will have accurate Location information if needed)
</li>
</ul>
</td>
</tr>
<tr>
<td>configureForMaxSpeed()</td>
<td>
Profile that will try to optimize performance of the reader (ie. make
parsing as fast as possible), possibly by increasing memory usage
somewhat.
<p>
Will set following property values:
<ul>
<li><a href="#IS_COALESCING">IS_COALESCING</a>: False
</li>
<li><a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>: 8 (characters)
[to allow reader versatility in reporting short segments]
</li>
<li><a href="#P_INPUT_BUFFER_LENGTH">P_INPUT_BUFFER_LENGTH</a>: 8000
(twice the default; can use even bigger values, but values above 64k
are unlikely to yield additional improvements)
</li>
<li><a href="#P_TEXT_BUFFER_LENGTH">P_TEXT_BUFFER_LENGTH</a>: 4000
(twice the default; increasing this amount is unlikely to yield
significant performance improvements)
</li>
<li><a href="#P_NORMALIZE_LFS">P_NORMALIZE_LFS</a>: False (most likely
to affect performance on Windows platform)
</li>
<li><a href="#P_NORMALIZE_ATTR_VALUES">P_NORMALIZE_ATTR_VALUES</a>: False
</li>
<li><a href="#P_INTERN_URIS">P_INTERN_NS_URIS</a>: True
</li>
<li><a href="#P_CACHE_DTDS">P_CACHE_DTDS</a>: True
</li>
<li><a href="#P_LAZY_PARSING">P_LAZY_PARSING</a>: True
</li>
<li><a href="#P_PRESERVE_LOCATION">P_PRESERVE_LOCATION</a>: False
(to minimize memory usage, and thereby increasing speed directly and via
reduced GC activity).
</li>
</ul>
</td>
</tr>
<tr>
<td>configureForMinMemUsage()</td>
<td>
Profile that will try to minimize memory usage of the reader, possibly
at some expense of performance.
<p>
Will set following property values:
<ul>
<li><a href="#IS_COALESCING">IS_COALESCING</a>: False (coalescing may
require use of longer text buffers)
</li>
<li><a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>: 16 (default)
</li>
<li><a href="#P_INPUT_BUFFER_LENGTH">P_INPUT_BUFFER_LENGTH</a>: 512
</li>
<li><a href="#P_TEXT_BUFFER_LENGTH">P_TEXT_BUFFER_LENGTH</a>: 512
</li>
<li><a href="#P_CACHE_DTDS">P_CACHE_DTDS</a>: False
</li>
<li><a href="#P_LAZY_PARSING">P_LAZY_PARSING</a>: True (in addition to
possible performance improvement, lazy parsing also prevents having to
store unneeded data in memory before needed)
</li>
<li><a href="#P_PRESERVE_LOCATION">P_PRESERVE_LOCATION</a>: False
(fewer Location objects used by Events, so less memory usage)
</li>
</ul>
</td>
</tr>
<tr>
<td>configureForRoundTripping()</td>
<td>
Profile that will try to keep the changes made to formatting during
input processing and parsing as small as possible, to allow for output to resemble
the input as closely as possible (where structure is not changed). This
means suppressing all mandated character conversions, for one thing.
<p>
Will set following property values:
<ul>
<li><a href="#P_REPORT_PROLOG_WHITESPACE">P_REPORT_PROLOG_WHITESPACE</a>: True
</li>
<li><a href="#P_NORMALIZE_LFS">P_NORMALIZE_LFS</a>: False
</li>
<li><a href="#P_NORMALIZE_ATTR_VALUES">P_NORMALIZE_ATTR_VALUES</a>: False
</li>
<li><a href="#IS_COALESCING">IS_COALESCING</a>: False (to prevent reader from
combining adjacent CDATA/text sections)
</li>
<li><a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>:
<code>Integer.MAX_VALUE</code> (read "unlimited", to prevent reader from
chopping text/CDATA sections into smaller chunks)
</li>
<li><a href="#IS_REPLACING_ENTITY_REFERENCES">IS_REPLACING_ENTITY_REFERENCES</a>: False (to allow writer to output unexpanded entities)
</li>
<li><a href="#P_REPORT_ALL_TEXT_AS_CHARACTERS">P_REPORT_ALL_TEXT_AS_CHARACTERS</a>: False (to preserve CDATA type when output)
</li>
</ul>
</td>
</tr>
</table>
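<p>
The profiles themselves are applied by calling the methods listed above on the Woodstox input factory. For code that has to stay on the plain StAX API, the sketch below mirrors only the two standard properties that the convenience and round-tripping rows mention (<code>IS_COALESCING</code> and <code>IS_REPLACING_ENTITY_REFERENCES</code>). It does not reproduce the Woodstox-specific settings, and the class and method names are made up for this illustration.
</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;

class StandardProfileSketch {
    /** Standard StAX settings listed in the table for the convenience profile. */
    static Map<String, Object> convenience() {
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        settings.put(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        return settings;
    }

    /** Standard StAX settings listed in the table for the round-tripping profile. */
    static Map<String, Object> roundTripping() {
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        settings.put(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
        return settings;
    }

    /** Creates a factory and applies each setting through the standard setProperty call. */
    static XMLInputFactory newFactory(Map<String, Object> settings) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        settings.forEach(factory::setProperty);
        return factory;
    }
}
```

<p>
Keeping the settings in a map makes it easy to log or compare what a given profile changes before applying it.
</p>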
