Geographic Information Systems have an interesting problem domain. Much like programming, GIS hacks are involved in modeling the real world.
There is more than one way to do it: programmers used to use procedures, geographers used to use pen and paper. Let's get on with it.
Ack - try and stay awake. Nothing is as silly as trying to explain why creating a model is useful. Why create a model when you could solve a real problem?
Well, for starters, models are simpler and more flexible.
One can also view "proper" Object-Oriented programming as constructing a model of a problem space in the real world - and then using it to solve actual problems. There is even a branch of practice, Model Driven Development and Domain Driven Design, that tries to steer us back in the direction of OO principles.
As for a geographer - they can create a map (a visualization) of their model, which is a bit cheaper than jumping up in a hot air balloon and taking pictures. And we can make maps for things that don't exist yet (such as someone running around and wrecking the dikes that keep a river in check).
Different modeling systems allow for different ranges of expression. It is quite hard to capture the relationships on "Days of Our Lives" using the C programming language. The Java language would do a bit better as it supports a class system (but may still have a tough time with the inheritance relationships in Deliverance). The closer our means of expression is to the problem domain, the less work we have to do.
For geographers the story is exactly the same: depending on the modeling system they use, they will be able to accomplish different things. It is fairly hard to model projections using a CAD program; a modeling system that supports projections is often better suited to GIS work.
The point? We can only evaluate a modeling system with respect to a problem domain, or a task.
Stop talking about models and get some work done ... i.e. a blunt example: based on a schema, construct a query to access data.
What do we want (aka what is our task) - Geography Markup Language (GML):
Here are some of the constructs in GML that we need to ensure we handle well:
These are informal working definitions; for a formal definition consult a specification.
We would like to maintain the following separation of concerns:
| Metadata | XMLSchema | Table | FeatureType |
| Data | XML | Row | Feature |
| Query | XPath | SQL | Expression |
Different FeatureModels handle the idea of id differently:
(sorry for the use of database language here - it is what GeoAPI and DeeGree do)
| | ID |
| | Key |
| | Keys |
| | Key Attribute |
| | Key Attributes |
Evaluation: Is the idea of a unique, unmodifiable ID maintained?
Key Attribute - does this violate encapsulation?
If, on the off chance, the attribute was modifiable, people would be able to change their ID (and the world would end).
The concept of KEY (and UNIQUE) seems to be available in GML3; it does not have any relationship to FeatureID.
We can consider support of KEY and UNIQUE validation constraints; this does not need to imply any relationship to the generated GML2 FID or GML3 ID.
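The encapsulation concern above can be sketched in Java. This is only an illustration under assumptions: `IdentifiedFeature` is a hypothetical class, not an API from GeoAPI, GeoTools or Deegree. The ID is fixed at construction and kept outside the attribute map, so changing a key attribute cannot change the identity of the feature.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class for illustration only; not taken from any toolkit reviewed here.
final class IdentifiedFeature {
    private final String id; // assigned once, no setter
    private final Map<String, Object> attributes = new LinkedHashMap<>();

    IdentifiedFeature(String id) {
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("a feature needs an id");
        }
        this.id = id;
    }

    String getId() {
        return id;
    }

    // Attributes, including any attribute acting as a key, remain modifiable.
    void setAttribute(String name, Object value) {
        attributes.put(name, value);
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }
}
```

With this shape a KEY or UNIQUE constraint could still be validated against attribute values without tying it to the feature ID.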
| | Access attribute by an integer index |
| | Use attribute name to access content |
| | Use qualified name, or attribute object |
| | Use XPath to access deeply nested content |
XPath seems to be too complex for implementations to support; aim for a distinct separation between data model and query model.
Index and name both run into limitations when used with super types.
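A rough Java sketch of index and name access, showing one way the super type limitation can show up. `IndexedType` and `IndexedFeature` are hypothetical names, and the rule that inherited attributes come first is an assumption made for the example: under that rule the index of `line` depends on what the super type contributes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration only.
final class IndexedType {
    private final List<String> names = new ArrayList<>();

    // Assumption: attributes inherited from the super type are listed first.
    IndexedType(IndexedType superType, String... ownNames) {
        if (superType != null) {
            names.addAll(superType.names);
        }
        names.addAll(Arrays.asList(ownNames));
    }

    int indexOf(String name) {
        return names.indexOf(name);
    }

    int size() {
        return names.size();
    }
}

final class IndexedFeature {
    private final IndexedType type;
    private final Object[] values;

    IndexedFeature(IndexedType type, Object... values) {
        if (values.length != type.size()) {
            throw new IllegalArgumentException("expected " + type.size() + " values");
        }
        this.type = type;
        this.values = values.clone();
    }

    // Access by integer index.
    Object get(int index) {
        return values[index];
    }

    // Access by attribute name, resolved through the type.
    Object get(String name) {
        int index = type.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("no attribute named " + name);
        }
        return values[index];
    }
}
```

Code written against index 0 for `line` would break as soon as the type gained a super type that contributes `label`.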
| | Type information through extension of a single super type |
| | Type information from multiple super types |
A type system does not have to imply reuse.
Java interfaces are a pure type system; the class system is for code reuse.
XML has the same separation by way of substitution groups.
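The Java half of that statement can be shown in a few lines. All names here (`Named`, `Located`, `NamedSupport`, `RoadSketch`) are made up for the example: the interfaces carry only type information and allow multiple super types, while the abstract class exists purely so the name handling is written once.

```java
// Pure type information: no implementation, multiple super types allowed.
interface Named {
    String getName();
}

interface Located {
    String getLocationText();
}

// Code reuse: a single class super type providing a shared implementation.
abstract class NamedSupport implements Named {
    private final String name;

    NamedSupport(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }
}

// One class super type (reuse), two interface super types (typing).
final class RoadSketch extends NamedSupport implements Located {
    private final String line;

    RoadSketch(String name, String line) {
        super(name);
        this.line = line;
    }

    @Override
    public String getLocationText() {
        return line;
    }
}
```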
We need to ensure that we can determine the schema of children allowed in a feature collection:
| | Collection schema references child schema |
| | Collection schema references schema of children |
| | Children represented as a nested feature attribute with multiplicity |
In addition there are the following tradeoffs:
| | Parent contains references (or filter) defining contained children |
| | Children contain a back reference to the collection; this prevents children from being in more than one collection |
| | FeatureType |
|---|---|
| | Attributes providing name and type information |
| | Concept of multiplicity and nillable |
| | Allow for restrictions on attributes (such as length) |
| | "Complex" |
| | Allow for choices |
| | Allow for complex attributes |
| | Complex attribute is a feature |
| | FeatureCollection Type |
| | Access to child feature type |
| | FeatureCollections with mixed content |
| | FeatureCollection child modeled as a complex attribute |
Needs to be able to describe the contents of the previous section:
Evaluation: Is the schema "rich" enough to generate XPath expressions into the content?
Here is a description of a "reference" GML document; we can use it as a reference point to see how capable the different modeling systems are.
Simple feature content as used by JUMP & Shapefile
header
document
  collection1
    road.1= label=hwy1, line=..., group=1, title=Hiway 1
    road.2= label=hwy1, line=..., group=7, title=Hiway 2
    ...
    road.N=...
Support for complex types, usually represented as an object. Support is required at the schema level to enable the construction of XPath queries into the complex content.
header
document
  collection
    road.1= label=hwy1, line=..., address=(name=fred,zip=1234)
    road.2= label=hwy1, line=..., address=(name=fred,zip=1234)
    ...
    road.N=...
Features contain repeating (or optional) elements:
Attribute multiplicity: min/max are used to constrain the number of times attributes are able to occur
List multiplicity: Values are represented as a list, min/max are used to bound the list
header
document
  collection
    road.1= label=hwy1, line=..., group=1, title=Hiway 1
    road.2= label=hwy1, line=..., group=7, title=Hiway 2, type=7, title=Lovers Lane
    ...
    road.N=...
Required for:

Note the list representation agrees directly with the XPath query model, but does not allow for validation of complex features. It is presented due to its use by several of the toolkits (not because it represents a valid model for handling GML3 content).
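The min/max idea behind both multiplicity styles can be sketched as a small Java value class. `Multiplicity` is a hypothetical name and the rule `0 <= min <= max` is an assumption for the example, not something taken from a toolkit or from the GML specification. The same bounds check serves either a count of repeated attributes or the length of a list value.

```java
import java.util.List;

// Hypothetical value class for illustration only.
final class Multiplicity {
    static final int UNBOUNDED = Integer.MAX_VALUE;

    private final int min;
    private final int max;

    Multiplicity(int min, int max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("need 0 <= min <= max");
        }
        this.min = min;
        this.max = max;
    }

    // Attribute multiplicity: how many times the attribute occurs in a feature.
    boolean accepts(int occurrences) {
        return occurrences >= min && occurrences <= max;
    }

    // List multiplicity: the values are held as one list, bounded by min/max.
    boolean accepts(List<?> values) {
        return accepts(values.size());
    }

    // min of zero means the attribute may be left out entirely.
    boolean isOptional() {
        return min == 0;
    }
}
```

For the `road.2` entry above, a `title` attribute declared with bounds 1..2 would accept the two titles, while bounds 1..1 would reject them.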
A form of complex content in which Features are allowed to contain other Features.
Support nested features as normal attributes; depending on implementation, no additional support may be required over and above that required for complex content.
header
document
  collection
    road.1= label=hwy1, line=...,
    road.2= label=hwy1, line=..., crosses=(label=pond, polygon=polygon(coordinates))
    ...
    road.N=...
Unsure what metadata support is required to represent this kind of document?
header
document
  collection1
    road.1= label=hwy1, line=..., group=1, title=Hiway 1
    road.2= label=hwy1, line=..., group=7, title=Hiway 2
    ...
    road.N=...
  collection2
    lake.1= label=pond, polygon=polygon(coordinates)
    ...
    lake.N=...
Same problem as before, what captures the schema information for a GML Document?
header
document
  collection1
    road.1= label=hwy1, line=..., group=1, title=Hiway 1, crosses=#lake.1
    road.2= label=hwy1, line=..., group=7, title=Hiway 2
    ...
    road.N=...
  collection2
    lake.1= label=pond, polygon=polygon(coordinates)
    ...
    lake.N=...
Note: a reference should be handled as a lazy Feature (requiring another pass through the data when resolved?); either that, or we are stuck with loading the document into memory.
Allow children of several types within a collection. GML may limit this to children within one substitution group.
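One possible reading of "lazy Feature", sketched in Java. `FeatureIndex` and `LazyFeatureRef` are hypothetical names, and the in-memory index is an assumption made for the example. The reference stores only the target ID when `crosses=#lake.1` is read, and is resolved later, once the target has been seen.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical index of features by ID, filled in while the document is read.
final class FeatureIndex {
    private final Map<String, Object> byId = new HashMap<>();

    void register(String id, Object feature) {
        byId.put(id, feature);
    }

    Optional<Object> lookup(String id) {
        return Optional.ofNullable(byId.get(id));
    }
}

// Hypothetical lazy reference: keeps the target ID, not the target feature.
final class LazyFeatureRef {
    private final String targetId;

    LazyFeatureRef(String href) {
        // "#lake.1" refers to the feature whose ID is "lake.1"
        this.targetId = href.startsWith("#") ? href.substring(1) : href;
    }

    String getTargetId() {
        return targetId;
    }

    // Empty until the target has been registered in the index.
    Optional<Object> resolve(FeatureIndex index) {
        return index.lookup(targetId);
    }
}
```

The index is where the memory cost in the note comes from: it either holds every feature that might be referenced, or has to be rebuilt with a second pass over the data.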
header
document
  collection
    road.1= label=hwy1, line=..., group=1, title=Hiway 1
    lake.1= label=pond, polygon=polygon(coordinates)
    road.2= label=hwy1, line=..., group=7, title=Hiway 2
    ...
Allow collection contents to be collections themselves.
header
document
  collection1
    road.1= label=hwy1, line=..., group=1, title=Hiway 1
    collection2
      lake.1= label=pond, polygon=polygon(coordinates)
      ...
      lake.N=...
    road.2= label=hwy1, line=..., group=7, title=Hiway 2
    ...
Note chart is based on documented intent, not ability.
| Name | SuperType | ID | Feature Access | FeatureType | FeatureCollectionType | Feature |
|---|---|---|---|---|---|---|
| JUMP | | | | | | |
| GeoAPI 2.0 | | | | | | |
| GeoTools 2.0 | | | | | | |
| GeoTools 2.1 | | | | | | |
| Deegree | | | | | | |
We require the feature model to be "big enough" in modeling power to support:
The exact requirements are outlined here:
In brief:
There is some discussion of references, and of dealing with multiple feature collections and feature references.
Our FeatureType system must be complete enough to allow the generation of XPath expressions as used by Expressions in SLD documents.
The XPath view of a document is limited to the following:
To support this model we do not need the full set of validation constructs required for GML3 (such as choice, restrictions or even multiplicity).
Let's practice with the following document:
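As a rough illustration of why names and nesting are enough for this purpose, here is a Java sketch that walks a type description and emits child location paths. `TypeNode` is a hypothetical stand-in for a FeatureType or attribute type. It records only a qualified name and its children, with no choice, restriction or multiplicity information.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical schema node for illustration only: a name plus nested children.
final class TypeNode {
    private final String name;
    private final List<TypeNode> children = new ArrayList<>();

    TypeNode(String name, TypeNode... children) {
        this.name = name;
        this.children.addAll(Arrays.asList(children));
    }

    // Relative location paths for this node and everything nested under it.
    List<String> paths() {
        List<String> out = new ArrayList<>();
        collect("", out);
        return out;
    }

    private void collect(String prefix, List<String> out) {
        String path = prefix.isEmpty() ? name : prefix + "/" + name;
        out.add(path);
        for (TypeNode child : children) {
            child.collect(path, out);
        }
    }
}
```

The generated paths are relative to the feature element; a real document would still need the path from the root down to each feature.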
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="application/xml" href="sco.xsl"?>
<gml:FeatureCollection
xmlns:wfs="http://www.opengis.net/wfs" xmlns:sco="http://online.socialchange.net.au"
xmlns:gml="http://www.opengis.net/gml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://online.socialchange.net.au 03_multiple_geometric_and_non_geometric_attributes.xsd
http://www.opengis.net/gml ../schemas.opengis.net/gml/3.1.1/base/gml.xsd">
<!--this case has multiple geometry properties and repeated non-geometry properties (one scalar, one complex)-->
<gml:boundedBy>
<gml:Envelope srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
<gml:coordinates decimal="." cs="," ts=" ">8.89999962,143.53399658 200,143.53399658</gml:coordinates>
</gml:Envelope>
</gml:boundedBy>
<gml:featureMember>
<sco:wq_plus gml:id="_41010901">
<sco:sitename>BALRANALD WEIR</sco:sitename>
<sco:measurement gml:id="_16JAN94002001002003000000">
<sco:determinand_description>16/JAN/94</sco:determinand_description>
<sco:result>Turbidity</sco:result>
</sco:measurement>
<sco:measurement gml:id="_24JAN94002001002003000000">
<sco:determinand_description>24/JAN/94</sco:determinand_description>
<sco:result>Turbidity</sco:result>
</sco:measurement>
<sco:location>
<gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
<gml:coordinates decimal="." cs="," ts=" ">22,143.53399658</gml:coordinates>
</gml:Point>
</sco:location>
<sco:nearestSlimePit>
<gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
<gml:coordinates decimal="." cs="," ts=" ">22.1,143.53399658</gml:coordinates>
</gml:Point>
</sco:nearestSlimePit>
<sco:sitename>RWWQ0004</sco:sitename>
</sco:wq_plus>
</gml:featureMember>
<gml:name>My Boreholes</gml:name>
</gml:FeatureCollection>
Node breakdown of example:
Root:
Points of interest:
Location Paths:
Similar to a Unix shell:
| Root node | / | Selects the root node |
|---|---|---|
| Child element | gml:Point | Point( 22, 143.53399658, EPSG:4283 ), Point( 22.1, 143.53399658, EPSG:4283 ) |
| | | Note this is relative |
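The location paths above can be tried out with the XPath support in the standard Java library (`javax.xml.xpath`). This is a sketch under assumptions: `XPathSketch` is a hypothetical class, and the XML string is a trimmed copy of the example document (the measurements, srsName attributes and schema locations are left out). The parser has to be namespace aware and the `gml` and `sco` prefixes have to be bound before a path such as `gml:Point` means anything.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
final class XPathSketch {
    static final String GML = "http://www.opengis.net/gml";
    static final String SCO = "http://online.socialchange.net.au";

    // Trimmed copy of the example document.
    static final String XML =
        "<gml:FeatureCollection xmlns:gml='" + GML + "' xmlns:sco='" + SCO + "'>"
        + "<gml:featureMember><sco:wq_plus gml:id='_41010901'>"
        + "<sco:sitename>BALRANALD WEIR</sco:sitename>"
        + "<sco:location><gml:Point>"
        + "<gml:coordinates>22,143.53399658</gml:coordinates>"
        + "</gml:Point></sco:location>"
        + "<sco:nearestSlimePit><gml:Point>"
        + "<gml:coordinates>22.1,143.53399658</gml:coordinates>"
        + "</gml:Point></sco:nearestSlimePit>"
        + "</sco:wq_plus></gml:featureMember>"
        + "</gml:FeatureCollection>";

    // Evaluates the expression against the document node and returns the matches.
    static NodeList select(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed names to match
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    if ("gml".equals(prefix)) return GML;
                    if ("sco".equals(prefix)) return SCO;
                    return XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    return null; // reverse lookup is not needed for this sketch
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });
            return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }
}
```

Because `gml:Point` is a relative path, evaluating it from the document node selects nothing; `//gml:Point` (or a path evaluated from `sco:location`) is needed to reach the two points.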
This review will focus on the different modeling systems' ability to capture the Shapefile and/or GML document mentioned above.
The other thing we should look at is the humble shapefile; it was used to drive the expressiveness of the existing GeoTools modeling abilities. We really don't want to backtrack (although the shapefile is so simple it will be hard to).
| Model | shapefiles, one per feature type |
|---|---|
| FeatureType | shapefile, details in header |
| Feature | entry in shp file, using shx to find a row in dbf file |
| FeatureCollection | n/a |
| Schema | Model | Collection | ID |
|---|---|---|---|
| | | | |
| restriction | flat | child | row number |
Database land is all wrapped up in the Simple Features for SQL specification. The main benefit is a definition of the basic Geometry types we use in GeoTools (implemented by the JTS library).
| Model | Relational |
|---|---|
| FeatureType | table (or view) details in metadata |
| Feature | Row |
| FeatureCollection | Result Set |
| Schema | Model | Collection | ID |
|---|---|---|---|
| | | | |
| | | | |
| | | | |
| | | | fidmapper function |
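To make the "fidmapper function" entry concrete, here is a guess at the simplest form such a function could take, written in Java. `SketchFidMapper` is a hypothetical class and is not the GeoTools interface. The `typeName.key` layout is an assumption borrowed from the `road.1` style IDs in the reference documents, with a single numeric primary key.

```java
// Hypothetical mapper between a table's primary key and a feature ID.
final class SketchFidMapper {
    private final String typeName;

    SketchFidMapper(String typeName) {
        this.typeName = typeName;
    }

    // Row with primary key 1 in table "road" becomes feature ID "road.1".
    String toFid(long primaryKey) {
        return typeName + "." + primaryKey;
    }

    // Reverse mapping, used when a query asks for a feature by ID.
    long toPrimaryKey(String fid) {
        String prefix = typeName + ".";
        if (!fid.startsWith(prefix)) {
            throw new IllegalArgumentException(fid + " does not belong to " + typeName);
        }
        return Long.parseLong(fid.substring(prefix.length()));
    }
}
```

A table with a composite or non-numeric key would need a different encoding, which is presumably why the mapping is left as a function.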
Note: Not all databases support SFSQL (and Oracle supports a few interesting things like curves).
DJB: you should also mention object-relational DB techniques (like Hibernate) or just simple multiplicity-by-table-joining.
| Model | Object-Oriented |
|---|---|
| FeatureType | Class |
| Feature | Object |
| FeatureCollection | Aggregation |
| Model | XML |
|---|---|
| FeatureType | XMLSchema |
| FeatureCollection | complex type that extends gml:FeatureCollection |
| Feature | element extending gml:Feature |
| | child of gml:FeatureCollection's gml:featureMembers element |
| Schema | Model | Collection | ID |
|---|---|---|---|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
The GML2 geometry model is similar to JTS (although based on some ISO specification).
| Model | XML |
|---|---|
| FeatureType | Direct/indirect extension of gml:AbstractFeatureType |
| | Attribute is a local xsd:element |
| FeatureCollection | complex type that extends gml:FeatureCollection |
| Feature | element realization of GF_FeatureType |
| | child of gml:_FeatureCollection's featureMember element |
| | member of the gml:_Feature substitution group |
| Schema | Model | Collection | ID |
|---|---|---|---|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
Reference: GML-3.1.pdf
GML3 includes a more complicated Geometry model than that supported by JTS. Higher order geometries (like surface) are supported in addition to curves and topologies.
What is interesting is that "Annex E" describes a mechanical way to port a UML model to GML (the bridge between programmer and geographer has been crossed before us).