Generic Framework for XML with two important and difficult goals:
- Read in XML to produce Java Objects (Optional Validation)
- Translate Java Objects into an XML Document
Left to its own devices, the Framework will give you something like xBeans. But as we will see, it gives you the opportunity to be much smarter, re-using your code automatically.
The Diagram on the left is a graphical representation of two Geometry Objects.
The Diagram on the left is a portion of the XML representation of these Geometry Objects.
If we had used the default xBeans implementation, we would have represented all the data. Instead, we included some extra smarter algorithms and created JTS Java Objects. This provides us with all the optimized operations included in a JTS Object, such as checking for a Geometric Intersection between the Line and the Polygon instances.
<gml:coordinates decimal="." cs="," ts=" ">
<Description>My office building</Description>
In the example to the left, with the default xBean implementation, the coordinates would be unparsed and the ‘Location’ would not be a JTS Geometry. But since the schema for Building (included later) defines Location to be a Geometry, the framework's default implementation returns a Bean with a ‘Geometry getLocation()’ method. This is all due to the framework's ability to automatically recognize and link in your additional functionality.
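To give a feel for what parsing the coordinates involves, here is a minimal sketch that splits the text of a gml:coordinates element into numeric tuples, using the decimal, cs (coordinate separator) and ts (tuple separator) attributes shown above. The class name GmlCoordinates and its parse method are hypothetical and are not part of the framework; a real binding would pass the result on to JTS rather than return raw arrays.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
public class GmlCoordinates {

    /**
     * Splits coordinate text into tuples of ordinates.
     * decimal, cs and ts mirror the gml:coordinates attributes of the same name.
     */
    public static List<double[]> parse(String text, String decimal, String cs, String ts) {
        List<double[]> tuples = new ArrayList<>();
        // One or more tuple separators may sit between two tuples.
        String tupleSplit = "(?:" + Pattern.quote(ts) + ")+";
        for (String tuple : text.trim().split(tupleSplit)) {
            if (tuple.isEmpty()) {
                continue; // empty input or a leading separator
            }
            String[] parts = tuple.split(Pattern.quote(cs));
            double[] ordinates = new double[parts.length];
            for (int i = 0; i < parts.length; i++) {
                // Normalise the declared decimal symbol to '.' before parsing.
                ordinates[i] = Double.parseDouble(parts[i].replace(decimal, "."));
            }
            tuples.add(ordinates);
        }
        return tuples;
    }

    public static void main(String[] args) {
        for (double[] t : parse("10.5,20 30,40.25", ".", ",", " ")) {
            System.out.println(t[0] + " " + t[1]);
        }
    }
}
```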
Thus far we have introduced the notion of combining your smart directives with the default xBean implementation using XML schemas to produce real Java Objects. We have also seen that the framework will automatically reuse your hard work to parse XML representations of your object within other XML documents.
XML inheritance in the XML Framework
<xs:element name="Building" type="myBuildingType" substitutionGroup="gml:_Feature"/>
<xs:element name="Location" type="gml:pointProperty"/>
<xs:element name="Height" type="xs:double"/>
Let's take a look at the Schema for our building example above. In the second and third lines we set up some XML inheritance. Inheritance is used by the XML framework to carry on doing smart operations. When you sub-class from the original parent instance, the framework will continue to attempt to complete your smart operations.
If we were to compare XML inheritance and Java Inheritance, we would see some similarities. The inheritance trees are similar to Interfaces in Java, except you may only implement a single interface. In XML the names of the interfaces are declared separately from the actual internals of the interface. Names are declared as abstract elements, and the internals of the interface are represented as an extensible complex type, which the abstract element refers to.
In the example above, the two interfaces declared in the GML schema are used, “_Feature” and “pointProperty”. In the first example, we see “_Feature” declared as the interface name. From examining the GML namespace we found that the associated complex type was “AbstractFeatureType”. In line two above, we set up a Java ‘cast’ operation with the substitutionGroup declaration, and extend the “AbstractFeatureType” interface with two additional fields.
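The Java side of this analogy can be sketched as follows. This is only an illustration of the comparison, not code generated by the framework: the Feature interface stands in for “_Feature” and “AbstractFeatureType”, the getId() member is a hypothetical placeholder, and the location is held as a plain array where a real binding would use a JTS Geometry.

```java
// Illustrative analogy only; all names here are assumptions.
public class InheritanceAnalogy {

    /** Stands in for the abstract element _Feature and its complex type. */
    public interface Feature {
        String getId(); // hypothetical member, for illustration only
    }

    /** Stands in for myBuildingType: the parent plus two additional fields. */
    public static class Building implements Feature {
        private final String id;
        private final double[] location; // a real binding would use a JTS Geometry
        private final double height;

        public Building(String id, double[] location, double height) {
            this.id = id;
            this.location = location;
            this.height = height;
        }

        @Override
        public String getId() { return id; }
        public double[] getLocation() { return location; }
        public double getHeight() { return height; }
    }

    /** Written once against the parent type, so it keeps working for sub-types. */
    public static String describe(Feature feature) {
        return "feature " + feature.getId();
    }

    public static void main(String[] args) {
        // Comparable to substitutionGroup: a Building is accepted where a Feature is expected.
        Feature f = new Building("b1", new double[] {1.0, 2.0}, 30.0);
        System.out.println(describe(f));
    }
}
```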
The interesting part of this example with respect to the XML framework is that once a smart algorithm has been implemented for “_Feature” and “AbstractFeatureType”, in most cases you should not need to implement any Java code for the types that extend them!
Streaming in the XML Framework
When your application deals with a lot of data, it is often important not to be memory bound. This has typically been solved using a buffering algorithm. In my implementation I found it was simplest to implement an Iterator that sits over a buffer. The Iterator executes in the main program thread and spawns a secondary IO thread, which populates the buffer. Careful manipulation of both threads' execution patterns can result in an efficient and sane streaming parser.
The diagram included above depicts my solution to creating a streaming parser. The arrows show data flow with respect to the thread’s activity status. Notice the consumer thread only accesses the data after allowing the producer thread to start population of the buffer, and that at no time do both threads execute in tandem.
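The Iterator-over-a-buffer idea can be sketched roughly as below. This is not my actual implementation: it swaps the hand-managed thread scheduling for a bounded java.util.concurrent.ArrayBlockingQueue, so the producer simply blocks when the buffer is full and the consumer blocks when it is empty. The class name BufferedStream and the END marker are assumptions made for the example, and error handling for a failing source is left out.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch; the source must not yield null elements.
public class BufferedStream<T> implements Iterator<T> {
    private static final Object END = new Object(); // marks the end of the stream
    private final BlockingQueue<Object> buffer;
    private Object next; // look-ahead slot filled by hasNext()

    public BufferedStream(Iterator<T> source, int capacity) {
        buffer = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    buffer.put(source.next()); // blocks while the buffer is full
                }
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop producing when interrupted
            }
        }, "xml-producer");
        producer.setDaemon(true);
        producer.start();
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            try {
                next = buffer.take(); // blocks until the producer has delivered something
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                next = END;
            }
        }
        return next != END;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T value = (T) next;
        next = null;
        return value;
    }

    public static void main(String[] args) {
        Iterator<String> parsed = Arrays.asList("Building", "Road", "River").iterator();
        BufferedStream<String> stream = new BufferedStream<>(parsed, 2);
        while (stream.hasNext()) {
            System.out.println(stream.next());
        }
    }
}
```

The buffer capacity bounds memory use: the producer can never run more than that many objects ahead of the consumer.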
Some words of warning:
- Be careful not to pause too long, as your IO connection may time out
- If you choose to manage your own IO timeouts, be careful to avoid infinite loops waiting on stalled IO.
- When killing loops, remember to both interrupt the IO and to terminate the thread’s execution.
For managing my own timeouts, I set a counter for the number of no-op yields in the producer thread, and reset the counter every time I produce an Object.
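A minimal sketch of that counter follows, assuming a non-blocking source that returns null when nothing is available yet. The names IdleCounter, drain and maxIdleYields are hypothetical, and a real producer would read from the IO connection instead of a Supplier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of a yield-counting timeout.
public class IdleCounter {

    /**
     * Polls until 'expected' objects were produced, and gives up once more than
     * maxIdleYields consecutive polls returned nothing.
     */
    public static <T> List<T> drain(Supplier<T> poll, int expected, int maxIdleYields) {
        List<T> produced = new ArrayList<>();
        int idle = 0; // consecutive no-op yields
        while (produced.size() < expected) {
            T item = poll.get(); // null means "nothing available yet"
            if (item != null) {
                produced.add(item);
                idle = 0; // reset every time an Object is produced
            } else if (++idle > maxIdleYields) {
                // Bail out instead of looping forever on stalled IO.
                throw new IllegalStateException("IO stalled after " + idle + " idle yields");
            } else {
                Thread.yield(); // no-op yield: let the consumer run
            }
        }
        return produced;
    }
}
```

Counting yields rather than measuring wall-clock time keeps the check cheap, at the cost of the limit depending on how quickly the loop spins.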
For killing Threads I use the Thread.interrupt() method, and a combination of a well-known exception (to catch in the consumer thread) and an internal state variable.
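The combination can be sketched like this. The names StoppableProducer and StreamStoppedException are assumptions for the example, and Thread.sleep stands in for the blocking IO. Note that Thread.interrupt() only wakes interruptible operations such as sleep or wait; a blocking read on a classic java.io stream usually also needs the stream to be closed.

```java
// Hypothetical sketch of stopping a producer thread.
public class StoppableProducer {

    /** Well-known exception the consumer thread can catch. */
    public static class StreamStoppedException extends RuntimeException {
        public StreamStoppedException(String message) {
            super(message);
        }
    }

    private volatile boolean stopped = false; // internal state variable
    private final Thread worker;

    public StoppableProducer() {
        worker = new Thread(() -> {
            while (!stopped) {
                try {
                    Thread.sleep(1000); // stands in for blocking IO
                } catch (InterruptedException e) {
                    // interrupt() woke us up; the loop condition decides whether to exit
                }
            }
        }, "xml-producer");
        worker.start();
    }

    /** Sets the state flag, interrupts the pending operation, and waits for the thread to end. */
    public void stop() {
        stopped = true;
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Called from the consumer side before touching the buffer. */
    public void checkOpen() {
        if (stopped) {
            throw new StreamStoppedException("stream was stopped");
        }
    }

    public boolean isAlive() {
        return worker.isAlive();
    }
}
```

The flag is what actually ends the loop; the interrupt only makes sure the thread notices it promptly.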