Misc. Stax Helpers
This page contains (or links to) other StAX-based or StAX-related deliverables that do not fall under any other category (they are part of neither core Woodstox nor any of its sub-projects). Many classes here could be moved into the packages they are adapters for, or to the Stax-utils page.
All code here is licensed under the LGPL (Lesser General Public License). If necessary, it can also be licensed under alternate licenses (ASL, BSD).
JDOM Adapter classes
(Last update: 17-Oct-2005)
These are classes that allow using StAX-processors with the JDOM XML-processor. JDOM is a processor that allows building, modifying and outputting documents stored as document object models, similar to the standard DOM, but with more convenient Java bindings. Check out the JDOM home page for details.
Currently these Java classes need to be used with JDOM:
- For building JDOM documents using StAX readers:
  - org.jdom.input.StAXBuilder is the actual builder (it also has some inner classes that support it).
  - org.jdom.input.StAXTextModifier is a supporting class used by StAXBuilder; it allows modifying the contents of text nodes while the tree is being built. It is usually used for trimming white space, but can be used to do all kinds of text manipulation efficiently, as part of the tree building process.
- For serializing JDOM documents using StAX writers:
  - org.jdom.output.StAXOutputter is the serializer.
These classes allow building JDOM documents from a javax.xml.stream.XMLStreamReader instance. They have been tested against the JDOM 1.0 release codebase and appear to work, but no extensive testing has been done.
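To illustrate the idea behind modifying text while the tree is being built, here is a minimal sketch that uses only the standard javax.xml.stream API. It is not the StAXTextModifier API: the class name TextModifierSketch, the method collectText and the use of a UnaryOperator are assumptions made up for this example. It simply applies a caller-supplied function to each text segment as it is read, which is roughly what a text modifier does during tree building.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only; this is NOT the StAXTextModifier API.
final class TextModifierSketch {

    /** Applies 'modifier' to every text segment and keeps the non-empty results. */
    static List<String> collectText(String xml, UnaryOperator<String> modifier) {
        List<String> result = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Coalescing makes the reader report adjacent text as one segment.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.CHARACTERS
                            || event == XMLStreamConstants.CDATA
                            || event == XMLStreamConstants.SPACE) {
                        // Modify the text before it would be added to a tree.
                        String text = modifier.apply(reader.getText());
                        if (!text.isEmpty()) {
                            result.add(text);
                        }
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return result;
    }
}
```

Calling collectText(xml, String::trim) trims each segment and drops the ones that consisted only of indentation white space, so no extra pass over a finished tree is needed.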
The following problems are known to exist with the builder:
- Due to the StAX 1.0 API's poor support for the DTD event, the tree that is built will not have enough DTD information to fully reconstruct the document type declaration from the JDOM tree. :-/ (Note: the StAX2 interface might allow solving this problem!)
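The DTD limitation can be seen with a plain XMLStreamReader. In the sketch below (the class and method names are made up for this example), the only standard accessor that yields anything for a DTD event is getText(); there are no standard accessors for the root name, public ID or system ID. Exactly what the returned string contains may vary between implementations.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
final class DtdEventSketch {

    /** Returns the text of the first DTD event, or null if none is reported. */
    static String firstDtdText(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.DTD) {
                        // A single string is all that StAX 1.0 exposes here.
                        return reader.getText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
    }
}
```

A builder that only sees this string would have to parse it again itself to recover the individual pieces of the declaration.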
DOM (plain old) Adapter classes
These are classes that allow using StAX-processors with standard DOM XML-processors (like Xerces and Crimson).
Currently these helper classes exist:
- For building DOM documents using StAX readers:
  - Stax2DomBuilder.java is the builder that constructs DOM tree objects using an XMLStreamReader.
- For serializing DOM documents using StAX writers:
  - Dom2StaxSerializer.java can serialize DOM documents using XMLStreamWriters.
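As a rough idea of what building a DOM tree from a StAX reader involves, here is a minimal sketch using only standard JDK APIs. It is not the code of Stax2DomBuilder.java; the class name StaxToDomSketch and its methods are assumptions made up for this example, and DTD events are simply ignored.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch; NOT the actual Stax2DomBuilder implementation.
final class StaxToDomSketch {

    static Document build(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().newDocument();
            Node current = doc; // node that receives the next child
            while (reader.hasNext()) {
                switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT: {
                    Element element = startElement(doc, reader);
                    current.appendChild(element);
                    current = element;
                    break;
                }
                case XMLStreamConstants.END_ELEMENT:
                    current = current.getParentNode();
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                case XMLStreamConstants.SPACE:
                    if (current != doc) { // DOM forbids text directly under Document
                        current.appendChild(doc.createTextNode(reader.getText()));
                    }
                    break;
                case XMLStreamConstants.COMMENT:
                    current.appendChild(doc.createComment(reader.getText()));
                    break;
                case XMLStreamConstants.PROCESSING_INSTRUCTION:
                    current.appendChild(doc.createProcessingInstruction(
                            reader.getPITarget(), reader.getPIData()));
                    break;
                default:
                    break; // DTD and other events are ignored in this sketch
                }
            }
            reader.close();
            return doc;
        } catch (XMLStreamException | ParserConfigurationException e) {
            throw new IllegalStateException("Could not build DOM", e);
        }
    }

    private static Element startElement(Document doc, XMLStreamReader reader) {
        Element element = doc.createElementNS(emptyToNull(reader.getNamespaceURI()),
                qualify(reader.getPrefix(), reader.getLocalName()));
        // Namespace declarations become xmlns attributes in DOM.
        for (int i = 0; i < reader.getNamespaceCount(); i++) {
            String prefix = reader.getNamespacePrefix(i);
            String uri = reader.getNamespaceURI(i);
            element.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                    qualify("xmlns", prefix).replaceFirst("^xmlns:$", "xmlns"),
                    uri == null ? "" : uri);
        }
        for (int i = 0; i < reader.getAttributeCount(); i++) {
            element.setAttributeNS(emptyToNull(reader.getAttributeNamespace(i)),
                    qualify(reader.getAttributePrefix(i), reader.getAttributeLocalName(i)),
                    reader.getAttributeValue(i));
        }
        return element;
    }

    private static String qualify(String prefix, String localName) {
        if (localName == null || localName.isEmpty()) {
            return prefix; // e.g. default namespace declaration: plain "xmlns"
        }
        return (prefix == null || prefix.isEmpty()) ? localName : prefix + ":" + localName;
    }

    private static String emptyToNull(String s) {
        return (s == null || s.isEmpty()) ? null : s;
    }
}
```

The sketch keeps a "current" node and moves it down on START_ELEMENT and back up on END_ELEMENT, which is the core of any streaming tree builder.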
As with JDOM, DOM has poor support for creating DTD constructs (it seems that it is not even possible to easily create a DOCTYPE declaration using just the standard DOM interfaces).
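The serializing direction can be sketched in the same spirit. The code below is not Dom2StaxSerializer.java; DomToStaxSketch and its methods are names made up for this example. It walks a DOM tree recursively and emits the matching XMLStreamWriter calls. To stay short it ignores namespaces (names are written exactly as DOM reports them) and skips document type nodes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch; NOT the actual Dom2StaxSerializer implementation.
final class DomToStaxSketch {

    /** Parses 'xml' into a DOM tree, then serializes it through a StAX writer. */
    static String parseAndSerialize(String xml) {
        try {
            Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringWriter out = new StringWriter();
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            write(doc, writer);
            writer.flush();
            writer.close();
            return out.toString();
        } catch (ParserConfigurationException | SAXException
                | IOException | XMLStreamException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    /** Writes 'node' and its descendants using the given StAX writer. */
    static void write(Node node, XMLStreamWriter writer) throws XMLStreamException {
        switch (node.getNodeType()) {
        case Node.DOCUMENT_NODE:
            writer.writeStartDocument();
            writeChildren(node, writer);
            writer.writeEndDocument();
            break;
        case Node.ELEMENT_NODE:
            writer.writeStartElement(node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                writer.writeAttribute(attr.getNodeName(), attr.getNodeValue());
            }
            writeChildren(node, writer);
            writer.writeEndElement();
            break;
        case Node.TEXT_NODE:
            writer.writeCharacters(node.getNodeValue());
            break;
        case Node.CDATA_SECTION_NODE:
            writer.writeCData(node.getNodeValue());
            break;
        case Node.COMMENT_NODE:
            writer.writeComment(node.getNodeValue());
            break;
        case Node.PROCESSING_INSTRUCTION_NODE:
            writer.writeProcessingInstruction(node.getNodeName(), node.getNodeValue());
            break;
        default:
            break; // document type and other node types are skipped
        }
    }

    private static void writeChildren(Node parent, XMLStreamWriter writer)
            throws XMLStreamException {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            write(c, writer);
        }
    }
}
```

A real serializer would also need to handle namespace declarations and prefixes explicitly, which this sketch leaves out.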
XmlPull Adapter classes
These are components that help in using StAX-processors with XmlPull XML-processors. The XmlPull API specifies XML parsers and serializers (for inputting and outputting XML documents, respectively).
There are currently 3 classes that together implement an "XmlPull-on-StAX" wrapper, which allows running code that expects the XmlPull API on top of a StAX implementation:
- com.ctc.xpp.XppOnStaxFactory (extends org.xmlpull.v1.XmlPullParserFactory).
- com.ctc.xpp.XppParserOnStaxReader (implements org.xmlpull.v1.XmlPullParser).
- com.ctc.xpp.XppSerializerOnStaxWriter (implements org.xmlpull.v1.XmlSerializer).
Currently this combination achieves reasonable compatibility for normal XML processing, but 100% compatibility is not possible, due to the following incompatibilities between the APIs:
- XmlPull expects fairly sophisticated event merging from all of the nextXxx methods except nextToken(). StAX does not mandate such coalescing, and it is rather tricky to implement on top of StAX, except perhaps by peek()ing at the next event; but that would require using an event reader, which does not seem to be worth the hassle. A related problem is the XmlPull requirement to report unexpanded entities with certain settings: this is similarly hard to do when other merging needs to be done as well.
- Namespace context and scoping rules are different. XPP allows access at any point; StAX only gives access while a START_ELEMENT or END_ELEMENT event is the current event. As a result, the wrapper only gives access when a START_TAG or END_TAG event is the current event.
- StAX does not have a (standard) feature for manually defined entities; as a result, the wrapper does not make use of them. Note that some StAX implementations (like Woodstox) do support the concept, so it would be possible to extend this wrapper for specific implementations.
- StAX does not have a (standard) feature that would allow reporting namespace declarations as attributes, except by disabling namespace support altogether. Individual implementations could define such support if it seems useful, but a generic wrapper cannot easily support it.
- StAX does not have a (standard) feature that would allow checking whether a start element is an empty element.
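The peek()-based merging mentioned in the first item can be sketched with a standard XMLEventReader. This is only an illustration of the approach, not something the wrapper classes do; the class name TextMergeSketch and its methods are made up for this example. Coalescing is switched off so that the reader is free to report text in several pieces.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch of merging adjacent text events by peeking ahead.
final class TextMergeSketch {

    /** Returns the first run of adjacent text events merged into one string. */
    static String firstMergedText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
            XMLEventReader reader =
                    factory.createXMLEventReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (isText(event)) {
                        StringBuilder merged =
                                new StringBuilder(event.asCharacters().getData());
                        // Keep consuming while the NEXT event is also text.
                        while (reader.hasNext() && isText(reader.peek())) {
                            merged.append(reader.nextEvent().asCharacters().getData());
                        }
                        return merged.toString();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
    }

    private static boolean isText(XMLEvent event) {
        int type = event.getEventType();
        return type == XMLStreamConstants.CHARACTERS
                || type == XMLStreamConstants.CDATA
                || type == XMLStreamConstants.SPACE;
    }
}
```

The cost is that the whole wrapper would have to sit on the event API rather than on the cursor API, which is the trade-off the item above describes.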
For what it's worth, the combination of the Woodstox StAX processor and these classes achieves about a 50% pass rate on the XmlPull conformance test suite. If an empty-element accessor method were added, this could probably get closer to 75%.
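The missing empty-element check can be demonstrated with the standard cursor API: an empty-element tag and a start tag followed directly by an end tag produce the same event sequence, so a generic wrapper cannot tell them apart. The class and method names below are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
final class EmptyElementSketch {

    /** Returns the event type codes reported by next() for the given document. */
    static List<Integer> eventTypes(String xml) {
        List<Integer> types = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    types.add(reader.next());
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return types;
    }
}
```

Comparing eventTypes("<a/>") with eventTypes("<a></a>") gives equal lists: both consist of a START_ELEMENT, an END_ELEMENT and the END_DOCUMENT event.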