
Multiple namespaces for Castor's source generator

Background

I need to marshal and unmarshal documents like the following, using Java classes generated by Castor's source generator:

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF PUBLIC "-//DUBLIN CORE//DCMES DTD 2002/07/31//EN"
    "http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-xml-dtd.dtd">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.ilrt.bristol.ac.uk/people/cmdjb/">
    <dc:title>Dave Beckett's Home Page</dc:title>
    <dc:creator>Dave Beckett</dc:creator>
    <dc:publisher>ILRT, University of Bristol</dc:publisher>
    <dc:date>2002-07-31</dc:date>
  </rdf:Description>
</rdf:RDF>

This is the format proposed for Dublin Core (http://www.dublincore.org). It's a standard, so I don't have the option of changing the XML format (and I don't want to). I think the provision of access to existing standards is an excellent niche for Castor to fill, in addition to providing valuable support for the standards themselves (imagine if you could go to castor.org and download the JAR to read and write your favourite standard...). I also believe that generating code from XML Schema is an excellent way of formalising data modelling.

Clearly, for this example, two schemas are needed: one for the RDF namespace and one for the DC namespace. The schemas published on the Dublin Core website for this purpose (http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-dc.xsd and http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-rdf.xsd) work perfectly in Oxygen (the XML editor I use) when I compose example files, so they appear to be valid XML Schema. But since they make extensive use of XML Schema's substitution groups, the Castor source generator cannot handle them.

I've spent some time trying to produce Castor-friendly schemas equivalent to these official ones. I'm no expert on XML Schema or Castor, but after quite a lot of experimentation I came up with the solution below.

The solution

As a solution, all that speaks for this is that it works. Elegant, it is not. What I wanted to do was define an RDF schema and a DC schema, with references running in one direction only: backwards from the DC schema to the RDF one. The way it works out, though, each schema references the other.

OK. First of all the DC schema, because - on its own - it seems quite reasonable.

Schema for DC
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
	xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	xmlns:dc="http://purl.org/dc/elements/1.1/" targetNamespace="http://purl.org/dc/elements/1.1/">

	<xsd:annotation>
		<xsd:documentation xml:lang="en" source="http://xml.vimia.com/cms/sitetree/meta-dc.xsd">
			Based on dcmes-dc.xsd from www.dublincore.org and modified to allow processing with
			Castor. </xsd:documentation>
	</xsd:annotation>

	<!-- note this reference to the RDF schema: it's a backward reference, so I would regard it
		as kosher. -->

	<!-- Import the RDF namespace schema for rdf:resource -->
	<xsd:import namespace="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
		schemaLocation="meta-rdf.xsd" />

	<!-- this is not necessary: all the following complexTypes could extend
		rdf:commonType directly, but it gives me an abstract class in Java
		which will be useful as a supertype of all the other DC classes -->

	<xsd:complexType name="abstractRoot" abstract="true">
		<xsd:complexContent>
			<xsd:extension base="rdf:commonType"> </xsd:extension>
		</xsd:complexContent>
	</xsd:complexType>

	<!-- I define different types for each kind of element -->

	<xsd:complexType name="titleType">
		<xsd:complexContent>
			<xsd:extension base="dc:abstractRoot"> </xsd:extension>
		</xsd:complexContent>
	</xsd:complexType>
	
	<xsd:complexType name="creatorType">
		<xsd:complexContent>
			<xsd:extension base="dc:abstractRoot"> </xsd:extension>
		</xsd:complexContent>
	</xsd:complexType>
	
	<!-- and as many further types as are needed to support the DC types -->

	<!-- define an element for each kind of DC tag -->

	<xsd:element name="title" type="dc:titleType" />

	<xsd:element name="creator" type="dc:creatorType" />

	<!-- and as many further element definitions as are needed ... -->

</xsd:schema>

Now for the nasty RDF schema, which unfortunately includes a forward reference to the DC schema. This scuppers any chance of reusability. Because the two schemas used here are so simple, this is no big deal, but for larger schemas it would be important.

Schema for RDF
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
	xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	targetNamespace="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

	<xsd:annotation>
		<xsd:documentation xml:lang="en" source="http://xml.vimia.com/cms/sitetree/meta-rdf.xsd">
			Based on dcmes-rdf.xsd from www.dublincore.org and modified to allow processing
			with Castor.
		</xsd:documentation>
	</xsd:annotation>


	<!-- Import the XML namespace schema for xml:lang -->
	<xsd:import namespace="http://www.w3.org/XML/1998/namespace"
		schemaLocation="http://www.w3.org/2001/xml.xsd" />

	<!-- this is a forward reference to the DC schema, which contains a backward reference to
		here. If this were the only inelegant bit, it would be OK. But see below for worse stuff... -->
	
	<!-- Import the namespace schema for dc:* -->
	<xsd:import namespace="http://purl.org/dc/elements/1.1/" schemaLocation="meta-dc.xsd" />

	<xsd:element name="RDF">
		<xsd:complexType>
			<xsd:sequence>
				<xsd:element ref="rdf:Description" minOccurs="0" maxOccurs="unbounded"/>
			</xsd:sequence>
		</xsd:complexType>
	</xsd:element>
	
	<!-- this is horrible. Description belongs to the RDF namespace, so that element is in the right
		place here. But the references to stuff from the DC namespace are awful. If substitutionGroups
		worked, the DC stuff could be encapsulated in the DC schema -->

	<xsd:element name="Description">
		<xsd:complexType>
			<xsd:choice minOccurs="0" maxOccurs="unbounded">
				<xsd:element ref="dc:title" />
				<xsd:element ref="dc:creator" />
				<!-- etc. -->
			</xsd:choice>
			<xsd:attribute ref="rdf:about" use="optional" />
		</xsd:complexType>
	</xsd:element>

	<xsd:complexType name="commonType">
		<xsd:attribute ref="xml:lang" use="optional" />
		<xsd:attribute ref="rdf:resource" use="optional" />
	</xsd:complexType>
	
	<xsd:attribute name="about" type="xsd:anyURI" />

	<xsd:attribute name="resource" type="xsd:anyURI" />

</xsd:schema>

It appears that this solution could be extended to more than two namespaces, but I haven't tried this.
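
As an untested illustration of what such an extension might look like, the fragment below adds a third namespace to the RDF schema. The DC Terms namespace and the dcterms:modified element are used only as an example, the file name meta-dcterms.xsd is hypothetical, and the xsd:schema element would also need a matching xmlns:dcterms="http://purl.org/dc/terms/" declaration. I have not run this through the source generator.

```xml
<!-- Additional import in meta-rdf.xsd; "meta-dcterms.xsd" is a hypothetical file name -->
<xsd:import namespace="http://purl.org/dc/terms/" schemaLocation="meta-dcterms.xsd" />

<xsd:element name="Description">
	<xsd:complexType>
		<xsd:choice minOccurs="0" maxOccurs="unbounded">
			<xsd:element ref="dc:title" />
			<xsd:element ref="dc:creator" />
			<!-- elements from the third namespace join the same choice (assumed, untested) -->
			<xsd:element ref="dcterms:modified" />
		</xsd:choice>
		<xsd:attribute ref="rdf:about" use="optional" />
	</xsd:complexType>
</xsd:element>
```

The drawback is the same as before, only larger: every additional namespace adds another forward reference to the RDF schema.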

There are other ways of writing valid XML Schema, but this is the only solution I have found which Castor's source generator works with. If there are any XSD wizards out there who would like to comment, I'd be very grateful for the feedback.

The clean solution to this problem is for the source generator to handle substitutionGroup, which would allow use of the "official" Dublin Core schemas, which unfortunately do not work with Castor at the moment (version 1.0).
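
For comparison, here is a minimal sketch of how the two schemas above could be decoupled if substitution groups were supported. This is not the official Dublin Core schema; the head element name descriptionItem is made up for the illustration, and the types are the ones defined earlier on this page. According to the statement above, Castor 1.0's source generator would not process it.

```xml
<!-- In meta-rdf.xsd (sketch): an abstract head element.
     "descriptionItem" is a hypothetical name. -->
<xsd:element name="descriptionItem" type="rdf:commonType" abstract="true" />

<xsd:element name="Description">
	<xsd:complexType>
		<xsd:sequence>
			<!-- any member of the substitution group may appear here -->
			<xsd:element ref="rdf:descriptionItem" minOccurs="0" maxOccurs="unbounded" />
		</xsd:sequence>
		<xsd:attribute ref="rdf:about" use="optional" />
	</xsd:complexType>
</xsd:element>

<!-- In meta-dc.xsd (sketch): each DC element names the head it can stand in for.
     This is allowed because the DC types derive from rdf:commonType. -->
<xsd:element name="title" type="dc:titleType" substitutionGroup="rdf:descriptionItem" />
<xsd:element name="creator" type="dc:creatorType" substitutionGroup="rdf:descriptionItem" />
```

With this arrangement the RDF schema would no longer need to import the DC schema at all, so the only remaining reference would be the backward one from DC to RDF.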

castorbuilder.properties

The following addition to the castorbuilder.properties file shows how the schema namespaces used above can be mapped to Java package names. This is not necessary, but helpful.

# XML namespace mapping to Java packages
#
org.exolab.castor.builder.nspackages=\
	http://www.w3.org/1999/02/22-rdf-syntax-ns#=com.vimia.cms.sitetree.rdf, \
	http://purl.org/dc/elements/1.1/=com.vimia.cms.sitetree.dc