Table of Contents
- Introduction
- Modeling
- How to Evaluate
- Comparison
- Reference
Introduction
Geographic Information Systems have an interesting problem domain: much like programmers, GIS hacks are involved in modeling the real world.
- Programmers model with Software (and use abstractions like Object and Class).
- Geographers model with Maps (and use abstractions like Feature and FeatureType).
There is more than one way to do it: programmers used to use procedures, geographers used to use pen and paper. Let's get on with it.
Modeling
Ack - try and stay awake. Nothing is as silly as trying to explain why creating a model is useful. Why create a model when you could solve a real problem?
Well, for starters, models are simpler and more flexible.
- A geographer could model what it would be like to flood a river; this is much more pragmatic than driving around with sticks of dynamite and blowing up a few dikes (although perhaps not as much fun). Hacking models is easier than hacking reality.
- A programmer can model code (using UML diagrams and such); hacking the model is often a lot quicker than hacking the code.
One can also view "proper" Object-Oriented programming as constructing a model of a problem space in the real world, and then using it to solve actual problems. There are even branches of practice, Model Driven Development and Domain Driven Design, that try to steer us back in the direction of OO principles.
As for a geographer: they can create a Map (a visualization) of their model, which is a bit cheaper than jumping up in a hot air balloon and taking pictures. And we can make maps for things that don't exist yet (such as someone running around and wrecking the dikes that keep a river in check).
How to evaluate modeling systems?
Different modeling systems allow for different ranges of expression. It is quite hard to capture the relationships on "Days of Our Lives" using the C programming language. The Java language would do a bit better as it supports a class system (but may still have a tough time with the inheritance relationships in Deliverance). The closer our means of expression is to the problem domain, the less work we have to do.
For geographers the story is exactly the same: depending on the modeling system they use, they will be able to accomplish different things. It is fairly hard to model projections using a CAD program; a modeling system that supports projections is often better suited to GIS work.
The point? We can only evaluate a modeling system with respect to a problem domain, or a task.
So what is it we want to do?
Stop talking about models and get some work done ... i.e. a blunt example: based on a schema, construct a query to access data.
What do we want (aka what is our task)? Geography Markup Language:
- GML3 - this is the "mandate" mentioned at an IRC meeting a couple of months ago.
- Programmers would use UML2 these days (right?)
Here are some of the constructs in GML that we need to ensure we handle well:
- Feature - something that can be drawn on a map
  - Programmers would call these objects (or instances if that makes you happy)
- FeatureType - kind of thing on a map
  - Programmers would call these classes
- FeatureCollection
  - a bunch of features, can also be considered a "Feature" (because you can draw it on a map)
  - Programmers would consider this a data structure
- id - formally FID or "FeatureID"
  - used to uniquely identify the Feature, the real world object
These are informal working definitions; for a formal definition consult a specification.
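The working definitions above can be pictured with a few lines of Python. This is only a rough sketch: the class and field names (`fid`, `attribute_names`, `members`) are made up for illustration and do not come from GML or from any of the toolkits reviewed here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class FeatureType:
    """Kind of thing on a map (a programmer would say: a class)."""
    name: str
    attribute_names: Tuple[str, ...]


@dataclass
class Feature:
    """Something that can be drawn on a map (an object or instance)."""
    fid: str                      # identifies the real world object
    feature_type: FeatureType
    values: Dict[str, Any] = field(default_factory=dict)


@dataclass
class FeatureCollection(Feature):
    """A bunch of features that is itself a Feature."""
    members: List[Feature] = field(default_factory=list)


# Hypothetical sample data, for illustration only.
road_type = FeatureType("road", ("name", "geometry"))
road = Feature("road.1", road_type, {"name": "Main St", "geometry": "LINESTRING(0 0, 1 1)"})
roads = FeatureCollection("roads", FeatureType("roads", ()), members=[road])
```

Making `FeatureCollection` a subclass of `Feature` is one way to express "a collection is also a Feature"; composition would work just as well.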
Formal separation of Concerns
We would like to maintain the following separation of concerns:
| Metadata | XMLSchema | Table | FeatureType |
|---|---|---|---|
| Data | XML | Row | Feature |
| Query | XPath | SQL | Expression |
Handling of ID
Different FeatureModels handle the idea of id differently:
(sorry for the use of database language here - it is what GeoAPI and DeeGree do)
- ID
- Key
- Keys
- Key Attribute
- Key Attributes
Evaluation: Is the idea of a unique, unmodifiable ID maintained?
Key Attribute - does this violate encapsulation?
If, on the off chance, the attribute was modifiable, people would be able to change their ID (and the world would end).
The concept of KEY (and UNIQUE) seems to be available in GML3; it does not have any relationship to FeatureID.
We can consider supporting KEY and UNIQUE validation constraints; this does not need to imply any relationship to the generated GML2 FID or GML3 ID.
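To make the "unmodifiable ID" question concrete, here is a minimal sketch in Python. It assumes the ID is kept apart from the attributes, so a key-like attribute can change without touching the identity. The class and the sample values are hypothetical and not taken from any of the toolkits.

```python
class Feature:
    """Feature whose id is fixed at construction; attributes stay editable."""

    def __init__(self, fid, attributes):
        self._fid = fid
        self.attributes = dict(attributes)

    @property
    def fid(self):
        # Read-only: there is no setter, so assignment raises AttributeError.
        return self._fid


site = Feature("site.42", {"code": "RWWQ004", "name": "BALRANALD WEIR"})
site.attributes["code"] = "RWWQ005"   # changing a key-like attribute is allowed

try:
    site.fid = "site.43"              # changing the identity is not
except AttributeError:
    id_was_protected = True
else:
    id_was_protected = False
```

If the ID were instead derived from a key attribute, the first assignment above would silently change the identity, which is the encapsulation worry raised above.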
Attribute Access
- Access attribute by an integer index
- Use attribute name to access content
- Use qualified name, or attribute object
- Use XPath to access deeply nested content
XPath seems to be too complex for implementations to support; aim for a distinct separation between data model and query model.
Index and name both run into limitations when used with super types.
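The access styles listed above can be compared side by side in a toy sketch. The `Feature` class below is hypothetical, and `by_path` is a plain slash-separated lookup standing in for XPath, not an XPath implementation.

```python
class Feature:
    """Toy feature: parallel lists of attribute names and values."""

    def __init__(self, names, values):
        self.names = list(names)
        self.values = list(values)

    def by_index(self, index):
        # Fast, but the index shifts if a super type adds attributes in front.
        return self.values[index]

    def by_name(self, name):
        return self.values[self.names.index(name)]

    def by_path(self, path):
        # First step names an attribute; later steps descend into nested dicts.
        first, *rest = path.split("/")
        current = self.by_name(first)
        for step in rest:
            current = current[step]
        return current


# Hypothetical sample data, for illustration only.
site = Feature(
    ["sitename", "location"],
    ["BALRANALD WEIR", {"srs": "EPSG:4283", "coordinates": (22.0, 134.5399658)}],
)
```

Only the path form reaches into the nested `location` value; index and name stop at the top level of the feature.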
Type System
- Type information through extension of a single super type
- Type information from multiple super types
A type system does not have to imply reuse:
Java interfaces are a pure type system, the class system is for code reuse.
XML has the same separation by way of substitution groups.
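Here is a small sketch of type information without any code reuse, assuming a feature type simply records its super types. The `FeatureType` class and the type names are invented for illustration.

```python
class FeatureType:
    """Carries type relationships only; nothing is inherited or reused."""

    def __init__(self, name, supers=()):
        self.name = name
        self.supers = tuple(supers)

    def is_a(self, other):
        """True if `other` is this type or one of its (transitive) super types."""
        return self is other or any(s.is_a(other) for s in self.supers)


feature = FeatureType("Feature")
road = FeatureType("Road", [feature])
structure = FeatureType("Structure", [feature])
bridge = FeatureType("Bridge", [road, structure])   # two super types
```

With a single super type, `supers` would hold at most one entry; allowing several is what lets `bridge` answer yes to both `road` and `structure`.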
Handling of FeatureCollection
We need to ensure that we can determine the schema of children allowed in a feature collection:
- Collection schema references child schema
- Collection schema references schema of children
- Children represented as a nested feature attribute with multiplicity
In addition there are the following tradeoffs:
- Parent contains references (or filter) defining contained children
- Children contain a back reference to collection, which prevents children from being in more than one collection
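The two tradeoffs can be seen in a few lines. This is a sketch with hypothetical class names, not the design of any particular toolkit.

```python
class ReferencingCollection:
    """Parent holds ids of its children; a child may appear in many collections."""

    def __init__(self, name):
        self.name = name
        self.member_ids = set()

    def add(self, feature_id):
        self.member_ids.add(feature_id)


class BackReferencedFeature:
    """Child points back at its one collection; joining another replaces it."""

    def __init__(self, fid):
        self.fid = fid
        self.collection = None


wells = ReferencingCollection("wells")
monitored = ReferencingCollection("monitored")
wells.add("site.1")
monitored.add("site.1")         # fine: two parents reference the same child

site = BackReferencedFeature("site.1")
site.collection = "wells"
site.collection = "monitored"   # the first membership is lost
```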
Handling of Schema
FeatureType
- Attributes providing name and type information
- Concept of multiplicity, and nillable
- Allow for restrictions on attributes (such as length)
"Complex"
- Allow for choices
- Allow for complex attributes
- Complex attribute is a feature
FeatureCollection Type
- Access to child feature type
- FeatureCollections with mixed content
- FeatureCollection child modeled as a complex attribute
Needs to be able to describe the contents of the previous section:
Evaluation: Is the schema "rich" enough to generate XPath expressions into the content?
Handling of Content
Here is a description of a "reference" GML document; we can use it as a reference point to see how capable the different modeling systems are.
Flat
Simple feature content as used by JUMP & Shapefile
Complex Content
Support for complex types, usually represented as an object. Support is required at the schema level to enable the construction of XPath queries into the complex content.
Multiplicity
Features contain repeating (or optional) elements:
Attribute Multiplicity: min/max used to constrain the number of times attributes are able to occur
List multiplicity: Values are represented as a list, min/max are used to bound the list
Required for:

Note that the list representation agrees directly with the XPath query model, but does not allow for validation of complex features. It is presented due to its use by several of the toolkits (not because it represents a valid model for handling GML3 content).
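As a sketch of attribute multiplicity, the descriptor below bounds how many times an attribute may occur. The names `min_occurs` and `max_occurs` echo the XML Schema `minOccurs` and `maxOccurs` facets; the class itself is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeDescriptor:
    name: str
    min_occurs: int = 1
    max_occurs: Optional[int] = 1   # None means unbounded

    def accepts(self, values):
        """Check that the number of values falls within min/max."""
        count = len(values)
        if count < self.min_occurs:
            return False
        return self.max_occurs is None or count <= self.max_occurs


sitename = AttributeDescriptor("sitename")   # exactly once
measurement = AttributeDescriptor("measurement", min_occurs=0, max_occurs=None)
```

Under list multiplicity the same bounds would apply to a single list-valued attribute instead of to repeated occurrences.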
Nested Features
A form of complex content in which Features are allowed to contain other Features.
Support nested features as normal attributes; depending on implementation, no additional support may be required over and above that required for complex content.
Multiple Collections
Unsure what metadata support is required to represent this kind of document?
Feature References
Same problem as before, what captures the schema information for a GML Document?
Note: Reference should be handled as a lazy Feature (require another pass through the data when resolved?); either that, or we are stuck with loading the document into memory.
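A lazy reference might look something like the sketch below. The `lookup` callable stands in for whatever "another pass through the data" turns out to be; here it is only a dictionary lookup, and all names are hypothetical.

```python
class FeatureReference:
    """Stands in for a feature until someone asks for it."""

    def __init__(self, target_id, lookup):
        self.target_id = target_id
        self._lookup = lookup       # callable: id -> feature
        self._resolved = None

    def resolve(self):
        # Only the first call pays for the lookup; later calls reuse the result.
        if self._resolved is None:
            self._resolved = self._lookup(self.target_id)
        return self._resolved


# Hypothetical stand-in for a second pass over the document.
features_by_id = {"site.1": {"sitename": "BALRANALD WEIR"}}
lookups = []


def lookup(fid):
    lookups.append(fid)
    return features_by_id[fid]


ref = FeatureReference("site.1", lookup)
first = ref.resolve()
second = ref.resolve()
```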
Mixed Content
Allow children of several types within a collection. GML may limit this to children within one substitution group.
Nested Collections
Allow collection contents to be collections themselves.
Comparison
Note: the chart is based on documented intent, not ability.
| Name | SuperType | ID | Feature Access | FeatureType | FeatureCollectionType | Feature |
|---|---|---|---|---|---|---|
| JUMP | | | | | | |
| GeoAPI 2.0 | | | | | | |
| GeoTools 2.0 | | | | | | |
| GeoTools 2.1 | | | | | | |
| Deegree | | | | | | |
Requirements
We require the feature model to be "big enough" in modeling power to support:
- Level 0 Profile of GML3 for WFS
- XPath and SLD Documents
Level 0 Profile of GML3
The exact requirements are outlined here:
In brief:
- complex type support
- subset of geometry types
There is also some discussion of dealing with multiple feature collections and feature references.
XPath and SLD Documents
Our FeatureType system must be complete enough to allow the generation of XPath expressions as used by Expressions in SLD documents.
The XPath view of a document is limited to the following:
- nodelists
To support this model we do not need the full set of validation constructs required for GML3 (such as choice, restrictions or even multiplicity).
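Generating XPath expressions from type information could be sketched as follows. The schema is assumed to be a plain nested dictionary: `None` marks a simple attribute and a nested dictionary marks a complex one. The function name and this representation are illustrative only; the element names echo the example further down.

```python
def location_paths(type_name, attributes, prefix=""):
    """Yield one relative location path per leaf attribute."""
    base = f"{prefix}{type_name}"
    for name, nested in attributes.items():
        if nested is None:
            yield f"{base}/{name}"
        else:
            # Complex attribute: recurse, extending the path with this step.
            yield from location_paths(name, nested, prefix=f"{base}/")


# Hypothetical schema description, for illustration only.
wq_plus = {
    "sco:sitename": None,
    "sco:measurement": {
        "sco:determineand_description": None,
        "sco:result": None,
    },
    "sco:location": None,
}
paths = list(location_paths("sco:wq_plus", wq_plus))
```

Nothing here needs choice, restrictions or multiplicity: names and nesting are enough to produce the paths.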
Example
Let's practice with the following document:
Node breakdown of example:
Root:
- xml-stylesheet type="application/xml" href="sco.xsl"
- gml:FeatureCollection
- gml:boundedBy: Envelope
- gml:featureMember
- sco:wq_plus
- sco:sitename: BALRANALD WEIR
- sco:measurement
- sco:determineand_description: 16/JAN/94
- sco:result: Turbidity
- sco:measurement
- sco:determineand_description: 24/JAN/94
- sco:result: Turbidity
- sco:location: Point( 22, 134.5399658, EPSG:4283 )
- sco:nearestSlimePit: Point( 22.1,143.53399658, EPSG:4283 )
- sco:sitename: RWWQ004
- sco:wq_plus
- gml:name: My Boreholes
Points of interest:
- Root node is not the same as the root element
- Root node contains the entire document including the xml-stylesheet (xmlns and xmlns:prefix are not covered though)
Location Paths:
Similar to a Unix shell:
| Root node | / | Selects the root node |
|---|---|---|
| Child element | gml:Point | Point( 22, 134.5399658, EPSG:4283 ), Point( 22.1,143.53399658, EPSG:4283 ) |
| | | Note this is relative |
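Relative location paths can be tried out with Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath. The document below is a cut-down stand-in built from the node breakdown above, not the original example file, and the `sco` namespace URI is a placeholder.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map; the sco URI is made up for this sketch.
NS = {"gml": "http://www.opengis.net/gml", "sco": "http://example.com/sco"}

DOCUMENT = """<gml:FeatureCollection
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:sco="http://example.com/sco">
  <gml:name>My Boreholes</gml:name>
  <gml:featureMember>
    <sco:wq_plus>
      <sco:sitename>BALRANALD WEIR</sco:sitename>
      <sco:measurement><sco:result>Turbidity</sco:result></sco:measurement>
      <sco:measurement><sco:result>Turbidity</sco:result></sco:measurement>
    </sco:wq_plus>
  </gml:featureMember>
</gml:FeatureCollection>"""

# fromstring returns the root element (gml:FeatureCollection), not the root node.
root = ET.fromstring(DOCUMENT)

# Paths are relative to the element they are evaluated against.
sitenames = [e.text for e in root.findall("gml:featureMember/sco:wq_plus/sco:sitename", NS)]
results = root.findall(".//sco:result", NS)
collection_name = root.find("gml:name", NS).text
```

Each query returns a list of nodes, which matches the nodelist view of a document described above.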
Reference
Data Format Reference
This review will focus on the different modeling systems' ability to capture the Shapefile and/or GML document mentioned above.
Shapefile
The other thing we should look at is the humble shapefile; this was used to drive the expressiveness of the existing GeoTools modeling abilities. We really don't want to backtrack (although shapefile is so simple it will be hard to).
| Model | shapefiles, one per feature type |
|---|---|
| FeatureType | shapefile, details in header |
| Feature | entry in shp file, using shx to find a row in dbf file |
| FeatureCollection | n/a |
| Schema | Model | Collection | ID |
|---|---|---|---|
| | | | |
| restriction | flat | child | row number |
Databases
Database land is all wrapped up in the Simple Features for SQL specification. The main benefit is a definition of the basic Geometry types we use in GeoTools (implemented by the JTS library).
| Model | Relational |
|---|---|
| FeatureType | table (or view) details in metadata |
| Feature | Row |
| FeatureCollection | Result Set |
| Schema | Model | Collection | ID |
|---|---|---|---|
| | | | |
| | | | |
| | | | |
| | | | fidmapper function |
Note: Not all databases support SFSQL (and Oracle supports a few interesting things like curves).
DJB: you should also mention object-relational DB techniques (like Hibernate) or just simple multiplicity-by-table-joining.
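The "fidmapper function" in the ID column above can be pictured as a pair of functions translating between a primary key and a feature ID. The sketch below assumes a single-column key and a `typename.key` ID format; both are assumptions for illustration, and the code is not the GeoTools API.

```python
def make_fid_mapper(type_name):
    """Return (to_fid, from_fid) for a table with a single primary key column."""
    prefix = type_name + "."

    def to_fid(primary_key):
        return prefix + str(primary_key)

    def from_fid(fid):
        if not fid.startswith(prefix):
            raise ValueError(f"{fid!r} does not belong to {type_name!r}")
        # The key comes back as text; the caller converts it to the column type.
        return fid[len(prefix):]

    return to_fid, from_fid


to_fid, from_fid = make_fid_mapper("roads")
```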
Unified Modeling Language
| Model | Object-Oriented |
|---|---|
| FeatureType | Class |
| Feature | Object |
| FeatureCollection | Aggregation |
GML2
| Model | XML |
|---|---|
| FeatureType | XMLSchema |
| FeatureCollection | complex type that extends gml:FeatureCollection |
| Feature | element extending gml:Feature |
| | child of gml:FeatureCollection's gml:featureMembers element |
| Schema | Model | Collection | ID |
|---|---|---|---|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
The GML2 geometry model is similar to JTS (although based on some ISO specification).
GML3
| Model | XML |
|---|---|
| FeatureType | Direct/indirect extension of gml:AbstractFeatureType |
| | Attribute is a local xsd:element |
| FeatureCollection | complex type that extends gml:FeatureCollection |
| Feature | element realization of GF_FeatureType |
| | child of gml:_FeatureCollection's featureMember element |
| | member of the gml:_Feature substitution group |
| Schema | Model | Collection | ID |
|---|---|---|---|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
Reference: GML-3.1.pdf
- Page 67 for Feature
- Page 68 & 528 for FeatureCollection
- Page 530 for "Annex E UML-to-GML Application Schema Encoding Rules"
GML3 includes a more complicated Geometry model than supported by JTS. Higher order geometries (like surface) are supported in addition to curves and topologies.
What is interesting is that "Annex E" describes a mechanical way to port a UML model to GML (the bridge between programmer and geographer has been crossed before us).
Reviews
GeoTools 2.0
GeoTools 2.1
GeoAPI 1.0
Not yet reviewed
GeoAPI 2.0
DeeGree
GPL toolkit actually able to take on complex GML, used as an input to the GeoAPI model. The model supports complex types but the implementations do not seem to follow, at least not on the head of their repository.
JUMP
GML2 & GML3
GML3 makes the transition to ISO based Geometries.