Added by jgarnett, last edited by jgarnett on Aug 14, 2007

At a certain level we all want the same thing: the ability to use normal Java objects instead of a sticky XML mess.

GeoTools XML parsing code falls into three categories:

  • SAX Based
  • DOM Based
  • Schema Assisted

The library has two forms of XML production available:

  • XML Transform
  • Schema Assisted

The GeoTools library has spent a lot of time and energy on this problem; this document introduces you to what is available.

Choosing an XML Technology

SAX Parser

The SAX parsers work using callbacks, passing control between several hard-coded implementations. For basic use you create your own SAX parser (say, one responding to a new Geometry being parsed), pass control off to one of the GeoTools implementations, and wait for it to call you back.
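The callback-and-delegate pattern can be sketched with nothing but the SAX API that ships with the JDK. This is an illustration of the idea rather than the GeoTools implementation: the class names `SaxDelegationSketch`, `FeatureHandler` and `CoordinatesHandler`, and the sample XML, are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxDelegationSketch {

    /** Hypothetical child handler: collects the text of one <coordinates> element. */
    static class CoordinatesHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        String result() {
            return text.toString().trim();
        }
    }

    /** Hypothetical parent handler: hands events to the child while inside <coordinates>. */
    static class FeatureHandler extends DefaultHandler {
        final List<String> coordinates = new ArrayList<>();
        private CoordinatesHandler delegate; // non-null only while delegating

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if ("coordinates".equals(qName)) {
                delegate = new CoordinatesHandler(); // pass control off
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (delegate != null) {
                delegate.characters(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if ("coordinates".equals(qName) && delegate != null) {
                coordinates.add(delegate.result()); // the child "calls back" with its result
                delegate = null;
            }
        }
    }

    static List<String> parse(String xml) throws Exception {
        FeatureHandler handler = new FeatureHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.coordinates;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<features><Point><coordinates>1,2</coordinates></Point>"
                + "<Point><coordinates>3,4</coordinates></Point></features>";
        System.out.println(parse(xml));
    }
}
```

Note that the parent decides which child to use by testing the element name in code; that hard-wired decision is what makes this style awkward to extend to new specifications.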

Pros

  • Reuse of a SAX parser is possible
  • Allows streaming for large content.

Cons

  • Rather tricky to set up
  • Very tricky to reuse
  • The SAX parsers are rather "brittle" and difficult to maintain
  • Currently hard coded to pass control between themselves, making support for new specifications tricky

The later Schema Assisted parsers are an attempt to let mere mortals create a tree of handlers on the fly (using the schema document to do a bunch of the grunt work), rather than relying on the tree that is hard coded into the SAX parsers.

DOM Parser

The DOM parsers wander through an in-memory DOM, doing their best to extract content. Delegation is hard coded in much the same way as with the SAX parsers.
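As a rough picture of what "wandering through a DOM" looks like, the sketch below uses the JDK's DOM API to pull `gml:coordinates` values out of a small document. It is not the GeoTools DOM parser: `DomWalkSketch` and `readPoints` are hypothetical names, and the code assumes each `coordinates` element holds a single `x,y` pair.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class DomWalkSketch {
    static final String GML_NS = "http://www.opengis.net/gml";

    /** Loads the whole document into memory, then extracts every gml:coordinates pair. */
    static List<double[]> readPoints(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for the namespace-based lookup below
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        NodeList nodes = doc.getElementsByTagNameNS(GML_NS, "coordinates");
        List<double[]> points = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // Assumption: a single "x,y" tuple per element.
            String[] xy = nodes.item(i).getTextContent().trim().split(",");
            points.add(new double[] {
                Double.parseDouble(xy[0]), Double.parseDouble(xy[1]) });
        }
        return points;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<gml:Point xmlns:gml=\"" + GML_NS + "\">"
                + "<gml:coordinates>1.5,2.5</gml:coordinates></gml:Point>";
        double[] p = readPoints(xml).get(0);
        System.out.println(p[0] + " " + p[1]);
    }
}
```

Because `builder.parse` builds the complete tree before any content is extracted, memory use grows with document size.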

Scope

  • Filter 1.0
  • GML2
  • Style 1.0

Pros

  • accessible technology
  • very nice for quick examples
  • obvious how to use the parser

Cons

  • solution does not scale to large content, as streaming is not available
  • can only handle direct use of GML
  • additional coding is always required to parse your own content

XML Transform

The traditional Transform idiom: traverse a GeoTools data structure to produce XML.
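The general shape of this idiom can be sketched with the JDK's `javax.xml.transform` API: walk a data structure and emit SAX events into a `TransformerHandler`, which serializes them as they arrive. This is only an illustration under stated assumptions; `TransformSketch`, `encode`, and the `points`/`point` element names are hypothetical and are not the GeoTools transform classes.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

class TransformSketch {

    /** Traverses the array and writes one <point> element per entry. */
    static String encode(double[][] points) throws Exception {
        // The JDK's default factory supports SAX; other factories may not.
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler out = factory.newTransformerHandler();
        out.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter buffer = new StringWriter();
        out.setResult(new StreamResult(buffer)); // must be set before startDocument()

        AttributesImpl noAttributes = new AttributesImpl();
        out.startDocument();
        out.startElement("", "points", "points", noAttributes);
        for (double[] p : points) {
            // Each item is written as it is visited; nothing is accumulated as a tree.
            out.startElement("", "point", "point", noAttributes);
            char[] text = (p[0] + "," + p[1]).toCharArray();
            out.characters(text, 0, text.length);
            out.endElement("", "point", "point");
        }
        out.endElement("", "points", "points");
        out.endDocument();
        return buffer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(encode(new double[][] { { 1, 2 }, { 3, 4 } }));
    }
}
```

The sketch buffers into a `StringWriter` for convenience; pointing the `StreamResult` at an output stream instead lets the output be written without holding it in memory.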

Scope

  • Filter 1.0
  • GML2
  • Style 1.0

Pros

  • fast
  • scalable (does not load features into memory)

Cons

  • not open ended
  • need to carefully provide hints before use
  • may revisit data several times (for bounding box and then content)

XML Data Objects (XDO)

This is the third generation of our schema assisted parser idea (where the SAX bindings are referenced by the XML Schema rather than directly hard coded). It is a fast, scalable solution that supports reading and writing.

This technology has been spun out as a separate SourceForge project headed up by David Zwiers.

Scope

  • WFS Read Capabilities
  • WFS Write Requests
  • WMS Read Capabilities
  • Filter Read/Write
  • GML2 Read/Write

Pros

  • fast and proven
  • ability to handle MASSIVE content like FeatureCollections

Cons

  • library has forked from GeoTools and we are not currently in sync
  • how to create new bindings is not obvious

GTXML

The fourth generation schema assisted parser, using the XML Schema data structure (rather than hard coding) to figure out what binding to call.
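To make "using the schema to figure out what binding to call" concrete, here is a deliberately small sketch that uses only JDK classes. It is not the GTXML API: `BindingLookupSketch`, the `ELEMENT_TYPES` table (a stand-in for information that would really come from a parsed XML Schema) and the `BINDINGS` table are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class BindingLookupSketch {
    // Stand-in for schema knowledge: element name -> declared type name.
    static final Map<String, String> ELEMENT_TYPES = new HashMap<>();
    // Bindings are registered per type, not per element.
    static final Map<String, Function<String, Object>> BINDINGS = new HashMap<>();

    static {
        ELEMENT_TYPES.put("count", "xs:int");
        ELEMENT_TYPES.put("area", "xs:double");
        ELEMENT_TYPES.put("name", "xs:string");
        BINDINGS.put("xs:int", text -> Integer.valueOf(text));
        BINDINGS.put("xs:double", text -> Double.valueOf(text));
        BINDINGS.put("xs:string", text -> text);
    }

    static List<Object> parse(String xml) throws Exception {
        List<Object> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                // Look the type up in the "schema", then dispatch to its binding.
                String type = ELEMENT_TYPES.get(qName);
                if (type != null) {
                    values.add(BINDINGS.get(type).apply(text.toString().trim()));
                }
                text.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<f><count>3</count><area>2.5</area><name>lake</name></f>"));
    }
}
```

In this sketch, supporting a new element only requires a new entry in the element-to-type table as long as its type already has a binding; the handler code itself does not change.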

Scope

  • Filter 1.0
  • Filter 1.1
  • GML2
  • GML3 Simple Feature Profile

Pros

  • schema aware allowing use of new content without additional coding
  • code generator for making custom bindings
  • streaming content for MASSIVE content like feature collections
  • support for content generation