5 Minute Tutorial for StaxMate

As described in the introduction, StaxMate is designed to allow:

  • Reading XML content efficiently, conveniently and correctly
  • Writing XML content efficiently, conveniently and correctly

To see what this means in practice, let's look at a few simple use cases.

Writing an XML document

Let's start with a particularly simple and common use case: writing (that is, generating) XML content. Content can come from a variety of sources; here we consider only the generation itself, not where the data comes from.

Let's say we want to output an XML document like this one:

empl.xml

Let's first look at one possible piece of code to output such a document:

WriteEmployee/StaxMate

So how does that work? Here is what is being done and why:

  1. First we create a StaxMate output factory. Here we rely on the auto-discovery that Stax's XMLOutputFactory offers to find whichever implementation is plugged in.
    • This output factory is fully thread-safe once configured, and should be reused: usually a single(ton) instance is enough for the whole application or service.
  2. Create the document output object. It represents the document itself, not the root element; the root element is added under it (as could comments and processing instructions). In this case we write to an XML file.
  3. We can also choose to "pretty print" the output document by enabling indentation.
  4. It is often useful to add XML comments carrying developer-readable information about the generating process; XML readers can easily ignore them.
  5. Add Employee element
  6. Add attribute 'id' with typed value (StaxMate can convert from number to String)
  7. Add 'name' element
  8. Add both 'first' element and its textual contents
    • (note: in StaxMate 2.0 one could use "name.addElementWithCharacters()" to simplify this!)
  9. Similarly, add 'last' and its textual contents
  10. Important: you MUST close the root-level object; otherwise start elements may not get closed and contents may not be flushed to the file.

Some things to consider:

  • Although we use XMLOutputFactory implementation auto-discovery here, it is often preferable to pass the factory in from outside, perhaps using a Dependency Injection framework (one can then specify which implementation to use; the recommended one is com.ctc.wstx.stax.WstxOutputFactory, for Woodstox).
  • Indentation usually should NOT be used for production systems – since it just adds 20-30% to document size without any useful additional information – but it can be convenient during development and debugging.
  • Instead of writing contents to a file, we could have used a ByteArrayOutputStream, a StringWriter, or a servlet's OutputStream as well; there are many convenience methods for typical targets.
  • Typed conversion for the 'id' attribute value is just one example of the ability to use Java types for output without having to convert them to Strings first.
  • Methods that add child containers (SMOutputElement usually) can be chained, if the element itself is not needed for anything else; this can shorten the code nicely without reducing readability.
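To put the indentation overhead in concrete terms, here is a small sketch that compares an indented and a compact rendering of an employee document shaped like the one described above. The element values (123, Jane, Doe) and the class and method names are made up for illustration, and the compact form is derived with a simple regular expression rather than with any StaxMate call.

```java
import java.nio.charset.StandardCharsets;

class IndentationOverhead {
    // Hypothetical sample document, indented with two spaces per level.
    public static final String INDENTED =
          "<employee id=\"123\">\n"
        + "  <name>\n"
        + "    <first>Jane</first>\n"
        + "    <last>Doe</last>\n"
        + "  </name>\n"
        + "</employee>";

    // Drop whitespace that sits between two tags; text content is untouched.
    public static String compact(String xml) {
        return xml.replaceAll(">\\s+<", "><");
    }

    public static int utf8Size(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Extra size of the indented form, as a percentage of the compact form.
    public static double overheadPercent(String indented) {
        int compactSize = utf8Size(compact(indented));
        return 100.0 * (utf8Size(indented) - compactSize) / compactSize;
    }

    public static void main(String[] args) {
        System.out.println("compact bytes:  " + utf8Size(compact(INDENTED)));
        System.out.println("indented bytes: " + utf8Size(INDENTED));
        System.out.println("overhead %:     " + overheadPercent(INDENTED));
    }
}
```

For this tiny document the indented form is 95 bytes against 78 for the compact one, roughly 22% larger. Real documents will differ depending on nesting depth and on how much text each element carries.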

Reading XML content

Now that we have written some XML content, let's read it back in.
Let's start with the code, which reads the XML document we saw earlier:

ReadEmployee.java

So how does that work? Here is what is being done and why:

  1. First we create a StaxMate input factory, much as we constructed the output factory
  2. Create the root cursor; it only traverses over the root element and ignores non-elements like comments
  3. Cursors are initially not positioned over an event, so we need to advance
  4. Since we know the cursor must now point to the root element, we can access the employee id attribute
  5. Create a child cursor for traversing, filtering out everything except "name" elements
  6. Advance to the first (and only) "name" child element
  7. Construct the innermost cursor for traversing the immediate children of "name"
  8. Advance to the first child ("first")
  9. Collect all textual content
  10. Advance to the second child ("last")
  11. Collect all textual content
  12. Close the underlying stream reader (important, if a bit clumsy; something to address in StaxMate 2.0+)

Here are some more things to consider:

  • As with the output factory, it is usually better to inject a specific factory.
  • Typed access works for cursors as well as for output elements. Note, too, that both typed attribute values and typed element values can be handled (the example only had typed attribute values).
  • Cursors initially do not point to an event (there may not even be any events to point to), so one must always advance a cursor after constructing it.
    • With StaxMate 2.0 this can be simplified with the SMInputCursor.advance() call, which is chainable (equivalent to 'next()', but it returns the cursor itself instead of the event type) and therefore works nicely with child-cursor construction calls.
  • The code above is not very robust: specifically, it does not verify that the elements are as expected. For example, what if the "first" and "last" elements were switched? Production code should add a few more lines for checking.
  • The last line, closing the underlying stream reader, is currently important: it ensures that the underlying input source (the File input stream) gets closed.
    • This is one area where the StaxMate API should be improved in the future.
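The injection advice can be sketched with plain Stax types. The class EmployeeIdReader and its readId method are hypothetical names used only for illustration; the point is merely that the factory arrives through the constructor instead of being discovered inside the reading code. The same pattern should carry over to a StaxMate factory built on top of the injected Stax factory.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical reader that is handed its factory instead of looking one up.
class EmployeeIdReader {
    private final XMLInputFactory factory;

    public EmployeeIdReader(XMLInputFactory factory) {
        this.factory = factory;
    }

    // Returns the 'id' attribute of the root element as an int.
    public int readId(String xml) throws XMLStreamException {
        XMLStreamReader sr = factory.createXMLStreamReader(new StringReader(xml));
        try {
            sr.nextTag(); // move from START_DOCUMENT to the root start element
            return Integer.parseInt(sr.getAttributeValue(null, "id"));
        } finally {
            // Plain Stax close() releases the reader's own resources only;
            // it does not close the underlying input source.
            sr.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // The caller (or a DI container) decides which implementation to use.
        EmployeeIdReader reader = new EmployeeIdReader(XMLInputFactory.newInstance());
        System.out.println(reader.readId("<employee id=\"123\"/>"));
    }
}
```

A caller that prefers a particular implementation could construct that factory class directly and pass it in, without touching the reading code.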

Better than Stax 1.0?

Since the original claim was that StaxMate makes things more convenient, let's see what the equivalent code for writing would look like if we didn't have StaxMate:

WriteEmployee/Stax 1.0
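A plain Stax version might look roughly like the following sketch. It uses only javax.xml.stream; the class name, the write helper and the sample values are assumptions made for illustration. To keep it self-contained it writes to a StringWriter instead of a file, and it does not attempt indentation.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class WriteEmployeeStax {
    // Hypothetical helper: builds the employee document and returns it as a String.
    public static String write(int id, String first, String last) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter sw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        sw.writeStartDocument();
        sw.writeComment(" generated by WriteEmployeeStax ");
        sw.writeStartElement("employee");
        // No typed access here: the number has to be converted by hand.
        sw.writeAttribute("id", String.valueOf(id));
        sw.writeStartElement("name");
        sw.writeStartElement("first");
        sw.writeCharacters(first);
        sw.writeEndElement(); // </first>
        sw.writeStartElement("last");
        sw.writeCharacters(last);
        sw.writeEndElement(); // </last>
        sw.writeEndElement(); // </name>
        sw.writeEndElement(); // </employee>
        sw.writeEndDocument();
        sw.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(write(123, "Jane", "Doe"));
    }
}
```

Each writeEndElement() call carries a comment naming the element it closes, because nothing in the API ties an end call to its start call.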

So what does this tell us?

  • Code with Stax 1.0 is quite a bit more verbose: with indentation it takes more than twice as many lines, and even without it, 50% more.
    • And although the code is longer, it is no more readable for it; if anything it is less so (StaxMate's compactness tends to improve readability)
  • The non-scoped nature of writing adds redundancy, since end elements must be written explicitly, and this can easily cause bugs ("which start element did this match with again?"; hence the comments above)

Similarly we could show the alternative for reading XML content: but unfortunately that code would be even more verbose. So to keep this tutorial brief, we'll leave that exercise to readers.
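For readers who want a starting point for that exercise, here is one possible sketch using only javax.xml.stream. The class names, the read helper and the sample values are assumptions for illustration. It checks element names and order with require(), so it covers somewhat more than the StaxMate reading code described earlier, and it reads from a String instead of a file.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ReadEmployeeStax {
    // Hypothetical value holder for what we read.
    static final class Employee {
        final int id;
        final String first;
        final String last;
        Employee(int id, String first, String last) {
            this.id = id; this.first = first; this.last = last;
        }
    }

    public static Employee read(String xml) throws XMLStreamException {
        XMLStreamReader sr = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            sr.nextTag(); // skips comments and whitespace before the root
            sr.require(XMLStreamConstants.START_ELEMENT, null, "employee");
            int id = Integer.parseInt(sr.getAttributeValue(null, "id"));
            sr.nextTag();
            sr.require(XMLStreamConstants.START_ELEMENT, null, "name");
            sr.nextTag();
            sr.require(XMLStreamConstants.START_ELEMENT, null, "first");
            String first = sr.getElementText(); // leaves reader at </first>
            sr.nextTag();
            sr.require(XMLStreamConstants.START_ELEMENT, null, "last");
            String last = sr.getElementText(); // leaves reader at </last>
            sr.nextTag();
            sr.require(XMLStreamConstants.END_ELEMENT, null, "name");
            sr.nextTag();
            sr.require(XMLStreamConstants.END_ELEMENT, null, "employee");
            return new Employee(id, first, last);
        } finally {
            sr.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        Employee e = read("<employee id=\"123\"><name><first>Jane</first>"
                + "<last>Doe</last></name></employee>");
        System.out.println(e.id + ": " + e.first + " " + e.last);
    }
}
```

If "first" and "last" arrive in the wrong order, the matching require() call throws an XMLStreamException. The end tags also have to be stepped over explicitly, which is the kind of bookkeeping that scoped cursors take care of.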

Advanced Use Cases

Here are links to some use cases that show more advanced usage:

TO BE COMPLETED
