Introduction

Geographic Information Systems have an interesting problem domain: much like programming, GIS hacks are involved in modeling the real world.

There is more than one way to do it: programmers used to use procedures, geographers used to use pen and paper. Let's get on with it.

Modeling

Ack - try and stay awake. Nothing is as silly as trying to explain why creating a model is useful. Why create a model when you could solve a real problem?
Well, for starters, models are simpler and more flexible.

One can also view "proper" Object-Oriented programming as constructing a model of a problem space in the real world - and then using it to solve actual problems. There are even branches of practice, Model Driven Development and Domain Driven Design, that try to steer us back in the direction of OO principles.

As for a geographer - they can create a map (a visualization) of their model, which is a bit cheaper than jumping up in a hot air balloon and taking pictures. And we can make maps for things that don't exist yet (such as someone running around and wrecking the dikes that keep a river in check).

How to evaluate modeling systems?

Different modeling systems allow for different ranges of expression. It is quite hard to capture the relationships on "Days of Our Lives" using the C programming language. The Java language would do a bit better as it supports a class system (but may still have a tough time with the inheritance relationships in Deliverance). The closer our means of expression is to the problem domain, the less work we have to do.

For geographers the story is exactly the same: depending on the modeling system they use, they will be able to accomplish different things. It is fairly hard to model projections using a CAD program; a modeling system that supports projections is often better suited to GIS work.

The point? We can only evaluate a modeling system with respect to a problem domain, or a task.

So what is it we want to Do?

Stop talking about models and get some work done ... i.e. a blunt example: based on a schema, construct a query to access data.

What do we want (aka what is our task) - Geography Markup Language:

Here are some of the constructs in GML that we need to ensure we handle well:

These are informal working definitions; for a formal definition consult a specification.

Formal separation of Concerns

We would like to maintain the following separation of concerns:

Metadata | XMLSchema | Table | FeatureType
Data | XML | Row | Feature
Query | XPath | SQL | Expression

Handling of ID

Different FeatureModels handle the idea of id differently:
(sorry for the use of database language here - it is what GeoAPI and DeeGree do)

ID
Based on "magic" (like shapefile row)
This is the ideal; everything else is an approximation.
DJB: shapefile row is a very bad example - this is the worst possible example of an ID since it changes whenever you delete something. The PostGIS DataStore using the system's tuple id (OID) is a much better example.

Key
A single KEY column is treated as an ID
column is no longer available as an attribute to be queried

Keys
Several KEY columns are used to derive an ID
Columns are not available as an attribute to be queried

Key Attribute
A single KEY column is marked as the Feature ID
Column remains available as an attribute

Key Attributes
Several KEY columns are used to derive an ID
Columns remain available as attributes

(question) Evaluation: Is the idea of a unique, unmodifiable ID maintained?
(question) Key Attribute - does this violate encapsulation?
(warning) If on the off chance the attribute was modifiable, people would be able to change their ID (and the world would end).
(info) The concept of KEY (and UNIQUE) seems to be available in GML3; it does not have any relationship to FeatureID.
(lightbulb) We can consider support of KEY and UNIQUE validation constraints; this does not need to imply any relationship to the generated GML2 FID or GML3 ID
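
The difference between the "Key" and "Key Attribute" strategies above can be sketched roughly as follows. This is an illustration in Python, not the API of GeoAPI, DeeGree or GeoTools: the `Feature` class, the `make_feature` helper, and the `typename.key` ID format are all assumptions made for the example.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Sequence


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature: an unmodifiable ID plus read-only attributes."""
    fid: str
    attributes: Mapping[str, object]


def make_feature(type_name: str, row: Mapping[str, object],
                 key_columns: Sequence[str], keep_keys: bool) -> Feature:
    """Derive an ID from one or more KEY columns (assumed ID format).

    keep_keys=False models "Key"/"Keys": the key columns are hidden.
    keep_keys=True models "Key Attribute(s)": they stay queryable.
    """
    key = ".".join(str(row[c]) for c in key_columns)
    attrs = {name: value for name, value in row.items()
             if keep_keys or name not in key_columns}
    # MappingProxyType is a read-only view, so a visible key attribute
    # cannot be reassigned through the feature to change its identity.
    return Feature(f"{type_name}.{key}", MappingProxyType(attrs))


row = {"id": 1, "label": "hwy1", "group": 1}
hidden = make_feature("road", row, ["id"], keep_keys=False)
visible = make_feature("road", row, ["id"], keep_keys=True)
print(hidden.fid, sorted(hidden.attributes))
print(visible.fid, sorted(visible.attributes))
```

In this sketch the (warning) above is handled by making the attribute view read-only; a toolkit that allows attribute edits would need some other guard on the key columns.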

Attribute Access

Access attribute by an integer index

use attribute name to access content

use qualified name, or attribute object

use xpath to access deeply nested content

(warning) XPath seems to be too complex for implementations to support; aim for a distinct separation between data model and query model
(warning) index and name both run into limitations when used with super types
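
A small sketch of the first three access styles, and of how plain names become ambiguous once a super type contributes attributes. The `QName` and `SimpleFeature` classes and the `http://example.org/roads` namespace are hypothetical, made up for this illustration only.

```python
from typing import NamedTuple, Sequence


class QName(NamedTuple):
    """Hypothetical qualified name: namespace URI plus local part."""
    namespace: str
    local: str


class SimpleFeature:
    def __init__(self, names: Sequence[QName], values: Sequence[object]):
        self._names = list(names)
        self._values = list(values)

    def by_index(self, index: int) -> object:
        # Positions shift if a super type later gains an attribute.
        return self._values[index]

    def by_name(self, local: str) -> object:
        # Ambiguous when a super type contributes an attribute with the
        # same local name from a different namespace.
        matches = [i for i, n in enumerate(self._names) if n.local == local]
        if len(matches) != 1:
            raise KeyError(f"{local!r} matches {len(matches)} attributes")
        return self._values[matches[0]]

    def by_qname(self, name: QName) -> object:
        return self._values[self._names.index(name)]


GML = "http://www.opengis.net/gml"
APP = "http://example.org/roads"   # assumed application namespace
names = [QName(GML, "name"), QName(APP, "name"), QName(APP, "line")]
road = SimpleFeature(names, ["Highway 1", "hwy1", "LINESTRING(...)"])

print(road.by_index(2))
print(road.by_qname(QName(APP, "name")))
```

Here `road.by_name("name")` raises a `KeyError` because two attributes share that local name, which is the super type limitation noted above; the qualified name still resolves.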

Type System

Type information through extension of a single super type

Type information from multiple super types

(warning) A type system does not have to imply reuse.
(info) Java interfaces are a pure type system; the class system is for code reuse.
(info) XML has the same separation by way of substitution groups

Handling of FeatureCollection

We need to ensure that we can determine the schema of children allowed in a feature collection:

collection schema references child schema

collection schema references schema of children

children represented as a nested feature attribute with multiplicity

In addition there are the following tradeoffs:

Parent contains references (or filter) defining contained children

Children contain a back reference to collection, which prevents children from being in more than one collection

Handling of Schema

 

FeatureType
  • Attributes providing name and type information
  • Concept of multiplicity, and nillable
  • Allow for restrictions on attributes (such as length)

"Complex"
  • Allow for choices
  • Allow for complex attributes
  • complex attribute is a feature

FeatureCollection Type
  • access to child feature type
  • FeatureCollections with mixed content
  • FeatureCollection child modeled as a complex attribute

Needs to be able to describe the contents of the previous section:

(question) Evaluation: Is the schema "rich" enough to generate XPath expressions into the content?
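
One way to read the question above: given only the schema, can we enumerate the paths a query could use? The sketch below assumes a made-up schema model (`AttributeType`, `ComplexType`) that carries names, types, multiplicity and nillable flags; none of these classes come from the toolkits under review.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Union


@dataclass
class AttributeType:
    name: str
    type: Union[str, "ComplexType"]      # simple type name or nested type
    min_occurs: int = 1
    max_occurs: Optional[int] = 1        # None means unbounded
    nillable: bool = False


@dataclass
class ComplexType:
    name: str
    attributes: List[AttributeType] = field(default_factory=list)


def xpaths(ctype: ComplexType, prefix: str = "") -> Iterator[str]:
    """Yield a relative XPath for every attribute reachable from ctype."""
    for attr in ctype.attributes:
        path = f"{prefix}{attr.name}"
        yield path
        if isinstance(attr.type, ComplexType):
            # Descend into complex content; a self-referencing type
            # would need a depth limit, which this sketch omits.
            yield from xpaths(attr.type, path + "/")


address = ComplexType("Address", [AttributeType("name", "string"),
                                  AttributeType("zip", "string")])
road = ComplexType("road", [
    AttributeType("label", "string"),
    AttributeType("title", "string", min_occurs=0, max_occurs=None),
    AttributeType("address", address, nillable=True),
])

print(list(xpaths(road)))
```

Note that path generation only needs attribute names and nesting; multiplicity and nillable are carried along for validation but play no part in building the paths.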

Handling of Content

Here is a description of a "reference" GML document, we can use it as a reference point to see how capable the different modeling systems are.

Flat

Simple feature content as used by JUMP & Shapefile

header
document
 collection1
  road.1= label=hwy1, line=..., group=1, title=Hiway 1
  road.2= label=hwy1, line=..., group=7, title=Hiway 2
  ...
  road.N=...

Complex Content

Support for complex types, usually represented as an object. Support is required at the schema level to enable the construction of XPath queries into the complex content.

header
document
 collection
  road.1= label=hwy1, line=..., address=(name=fred,zip=1234)
  road.2= label=hwy1, line=..., address=(name=fred,zip=1234)
  ...
  road.N=...

Multiplicity

Features contain repeating (or optional) elements:
Attribute multiplicity: min/max are used to constrain the number of times attributes are able to occur
List multiplicity: Values are represented as a list, min/max are used to bound the list

header
document
 collection
  road.1= label=hwy1, line=..., group=1, title=Hiway 1
  road.2= label=hwy1, line=..., group=7, title=Hiway 2, type=7, title=Lovers Lane
  ...
  road.N=...

Required for:

(warning) Note that the list representation agrees directly with the XPath query model, but does not allow for validation of complex features. It is presented due to its use by several of the toolkits (not because it represents a valid model for handling GML3 content).
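
Attribute multiplicity can be checked mechanically once min/max bounds are known. The following is a minimal sketch: the `check_multiplicity` function and the bounds chosen for the road type are assumptions for illustration, since the listing above does not state the actual limits.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# name -> (min_occurs, max_occurs); None means unbounded.
Bounds = Dict[str, Tuple[int, Optional[int]]]


def check_multiplicity(content: List[Tuple[str, object]],
                       bounds: Bounds) -> List[str]:
    """Return a list of violations; an empty list means the content fits."""
    counts = Counter(name for name, _ in content)
    problems = []
    for name, (lo, hi) in bounds.items():
        n = counts.get(name, 0)
        if n < lo:
            problems.append(f"{name}: {n} occurrence(s), need at least {lo}")
        if hi is not None and n > hi:
            problems.append(f"{name}: {n} occurrence(s), at most {hi} allowed")
    for name in counts.keys() - bounds.keys():
        problems.append(f"{name}: not declared in schema")
    return problems


# Assumed bounds: title may repeat, group and type are optional.
road_bounds: Bounds = {"label": (1, 1), "line": (1, 1),
                       "group": (0, 1), "type": (0, 1), "title": (1, None)}

# road.2 from the listing above: "title" occurs twice.
road_2 = [("label", "hwy1"), ("line", "..."), ("group", 7),
          ("title", "Hiway 2"), ("type", 7), ("title", "Lovers Lane")]

print(check_multiplicity(road_2, road_bounds))
```

The content is kept as an ordered list of (name, value) pairs rather than a dictionary so that repeated attributes survive, which is the attribute-multiplicity view rather than the list view.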

Nested Features

A form of complex content in which Features are allowed to contain other Features.

Support nested features as normal attributes; depending on the implementation, no additional support may be required over and above that required for complex content.

header
document
 collection
  road.1= label=hwy1, line=...,  
  road.2= label=hwy1, line=..., crosses=(label=pond, polygon=polygon(coordinates))
  ...
  road.N=...

Multiple Collections

It is unclear what metadata support is required to represent this kind of document.

header
document
 collection1
  road.1= label=hwy1, line=..., group=1, title=Hiway 1
  road.2= label=hwy1, line=..., group=7, title=Hiway 2
  ...
  road.N=...
 collection2
  lake.1= label=pond, polygon=polygon(coordinates)
  ...
  lake.N=...

Feature References

Same problem as before, what captures the schema information for a GML Document?

header
document
 collection1
  road.1= label=hwy1, line=..., group=1, title=Hiway 1, crosses=#lake.1
  road.2= label=hwy1, line=..., group=7, title=Hiway 2
  ...
  road.N=...
 collection2
  lake.1= label=pond, polygon=polygon(coordinates)
  ...
  lake.N=...

Note: A reference should be handled as a lazy Feature (requiring another pass through the data when resolved?); otherwise we are stuck with loading the document into memory.
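
The lazy-reference idea in the note can be sketched as a placeholder object that only re-reads the data when someone asks for the target. Everything here is hypothetical: `LazyFeatureRef`, the `read_document` stand-in for a streaming parser, and the record layout are assumptions for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

Record = Tuple[str, Dict[str, object]]      # (feature id, attributes)


class LazyFeatureRef:
    """Placeholder for `#id`; scans the source only when first resolved."""

    def __init__(self, href: str, open_source: Callable[[], Iterable[Record]]):
        self.fid = href.lstrip("#")
        self._open_source = open_source
        self._resolved: Optional[Dict[str, object]] = None

    def resolve(self) -> Dict[str, object]:
        if self._resolved is None:
            # The extra pass: re-read the stream until the target is found.
            for fid, attrs in self._open_source():
                if fid == self.fid:
                    self._resolved = attrs
                    break
            else:
                raise LookupError(f"no feature with id {self.fid!r}")
        return self._resolved


passes = 0


def read_document() -> Iterator[Record]:
    """Stand-in for a streaming parser; counts how often it is restarted."""
    global passes
    passes += 1
    yield "road.1", {"label": "hwy1", "crosses": "#lake.1"}
    yield "road.2", {"label": "hwy1"}
    yield "lake.1", {"label": "pond"}


# Read only the first feature; its reference stays unresolved for now.
first_id, first = next(iter(read_document()))
ref = LazyFeatureRef(first["crosses"], read_document)
print(passes)
print(ref.resolve())
print(passes)
```

The trade-off is visible in the counter: resolving costs one more pass over the source, but nothing beyond the current feature has to be held in memory until then.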

Mixed Content

Allow children of several types within a collection. GML may limit this to children within one substitution group.

header
document
 collection
  road.1= label=hwy1, line=..., group=1, title=Hiway 1
  lake.1= label=pond, polygon=polygon(coordinates)
  road.2= label=hwy1, line=..., group=7, title=Hiway 2
  ...

Nested Collections

Allow collection contents to be collections themselves.

header
document
 collection1
  road.1= label=hwy1, line=..., group=1, title=Hiway 1
  collection2
   lake.1= label=pond, polygon=polygon(coordinates)
   ...
   lake.N=...
  road.2= label=hwy1, line=..., group=7, title=Hiway 2
  ...

Comparison

Note: the chart is based on documented intent, not ability.

Name | SuperType | ID | Feature Access | FeatureType | FeatureCollectionType | Feature
JUMP
GeoAPI 2.0
GeoTools 2.0
GeoTools 2.1
Deegree

Requirements

We require the feature model to be "big enough" in modeling power to support:

Level 0 Profile of GML3

The exact requirements are outlined here:

In brief:

There is some discussion of references and of dealing with multiple feature collections and feature references.

XPath and SLD Documents

Our FeatureType system must be complete enough to allow the generation of XPath expressions as used by Expressions in SLD documents.

The XPath view of a document is limited to the following:

To support this model we do not need the full set of validation constructs required for GML3 (such as choice, restrictions or even multiplicity).

Example

Let's practice with the following document:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="application/xml" href="sco.xsl"?>
<gml:FeatureCollection 
  xmlns:wfs="http://www.opengis.net/wfs" xmlns:sco="http://online.socialchange.net.au"
  xmlns:gml="http://www.opengis.net/gml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://online.socialchange.net.au 03_multiple_geometric_and_non_geometric_attributes.xsd
                      http://www.opengis.net/gml ../schemas.opengis.net/gml/3.1.1/base/gml.xsd">
  <!--this case has multiple geometry properties and repeated non-geometry properties (one scalar, one complex)-->
  <gml:boundedBy>
    <gml:Envelope srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
      <gml:coordinates decimal="." cs="," ts=" ">8.89999962,143.53399658 200,143.53399658</gml:coordinates>
    </gml:Envelope>
  </gml:boundedBy>
  <gml:featureMember>
    <sco:wq_plus gml:id="_41010901">
      <sco:sitename>BALRANALD WEIR</sco:sitename>
      <sco:measurement gml:id="_16JAN94002001002003000000">
        <sco:determinand_description>16/JAN/94</sco:determinand_description>
        <sco:result>Turbidity</sco:result>
      </sco:measurement>
      <sco:measurement gml:id="_24JAN94002001002003000000">
        <sco:determinand_description>24/JAN/94</sco:determinand_description>
        <sco:result>Turbidity</sco:result>
      </sco:measurement>
      <sco:location>
        <gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
          <gml:coordinates decimal="." cs="," ts=" ">22,143.53399658</gml:coordinates>
        </gml:Point>
      </sco:location>
      <sco:nearestSlimePit>
        <gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
          <gml:coordinates decimal="." cs="," ts=" ">22.1,143.53399658</gml:coordinates>
        </gml:Point>
      </sco:nearestSlimePit>
      <sco:sitename>RWWQ0004</sco:sitename>
    </sco:wq_plus>
  </gml:featureMember>
  <gml:name>My Boreholes</gml:name>
</gml:FeatureCollection>

Node breakdown of example:

Root:

Points of interest:

Location Paths:

Similar to unix shell:

Root node | / | Selects the root node
Child element | gml:Point | Point( 22, 143.53399658, EPSG:4283 ), Point( 22.1, 143.53399658, EPSG:4283 ) | Note this is relative
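
To make the location paths concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath. The embedded `DOC` is a trimmed copy of the example document (one feature, both point properties, measurements omitted), and the variable names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "gml": "http://www.opengis.net/gml",
    "sco": "http://online.socialchange.net.au",
}

# Trimmed copy of the example document.
DOC = """\
<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml"
                       xmlns:sco="http://online.socialchange.net.au">
  <gml:featureMember>
    <sco:wq_plus gml:id="_41010901">
      <sco:sitename>BALRANALD WEIR</sco:sitename>
      <sco:location>
        <gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
          <gml:coordinates>22,143.53399658</gml:coordinates>
        </gml:Point>
      </sco:location>
      <sco:nearestSlimePit>
        <gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
          <gml:coordinates>22.1,143.53399658</gml:coordinates>
        </gml:Point>
      </sco:nearestSlimePit>
      <sco:sitename>RWWQ0004</sco:sitename>
    </sco:wq_plus>
  </gml:featureMember>
</gml:FeatureCollection>
"""

root = ET.fromstring(DOC)   # the document element, gml:FeatureCollection

# Descendant search from the root: both points in the document.
all_points = [c.text for c in root.findall(".//gml:Point/gml:coordinates", NS)]

# Relative path: the result depends on the context node.
feature = root.find("gml:featureMember/sco:wq_plus", NS)
location = feature.find("sco:location", NS)
local_points = [c.text
                for c in location.findall("gml:Point/gml:coordinates", NS)]

# Repeated scalar property, returned in document order.
sitenames = [e.text for e in feature.findall("sco:sitename", NS)]

print(all_points)
print(local_points)
print(sitenames)
```

The same relative path `gml:Point` yields one point when evaluated from `sco:location` and none when evaluated from the feature itself, which is why a relative path has to be read together with its context node.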

Reference

Data Format Reference

This review will focus on the different modeling systems' ability to capture the Shapefile and/or GML document mentioned above.

Shapefile

The other thing we should look at is the humble shapefile; this was used to drive the expressiveness of the existing GeoTools modeling abilities. We really don't want to backtrack (although shapefile is so simple it will be hard to).

Model | shapefiles, one per feature type
FeatureType | shapefile, details in header
Feature | entry in shp file, using shx to find a row in dbf file
FeatureCollection | n/a

Schema | Model | Collection | ID
restriction | flat | child | row number

Databases

Database land is all wrapped up in the Simple Features for SQL specification. The main benefit is a definition of the basic Geometry types we use in GeoTools (implemented by the JTS library).

Model | Relational
FeatureType | table (or view) details in metadata
Feature | Row
FeatureCollection | Result Set

Schema | Model | Collection | ID
 |  |  | fidmapper function

Note: Not all databases support SFSQL (and Oracle supports a few interesting things like curves)

DJB: you should also mention object-relational DB techniques (like Hibernate) or just simple multiplicity-by-table-joining.

Unified Modeling Language

Model | Object-Oriented
FeatureType | Class
Feature | Object
FeatureCollection | Aggregation

GML2

Model | XML
FeatureType | XMLSchema
FeatureCollection | complex type that extends gml:FeatureCollection
Feature | element extending gml:Feature
 | child of gml:FeatureCollection's gml:featureMembers element

Schema | Model | Collection | ID

The GML2 geometry model is similar to JTS (although based on some ISO specification).

GML3

Model | XML
FeatureType | Direct/indirect extension of gml:AbstractFeatureType
 | Attribute is a local xsd:element
FeatureCollection | complex type that extends gml:FeatureCollection
Feature | element realization of GF_FeatureType
 | child of gml:_FeatureCollection's featureMember element
 | member of the gml:_Feature substitution group

Schema | Model | Collection | ID

Reference: GML-3.1.pdf

GML3 includes a more complicated Geometry model than is supported by JTS. Higher order geometries (like surface) are supported in addition to curves and topologies.

What is interesting is that "Annex E" describes a mechanical way to port a UML model to GML (the bridge between programmer and geographer has been crossed before us).

Reviews

GeoTools 2.0

Schema | Model | Collection | ID

  • As you can see, the GeoTools 2.0 schema model far outstrips its modeling abilities.
  • Multiplicity is supported as attribute arrays
  • Collections are not considered Feature in their own right.
    Schema indicates multiplicity support, resulting values indistinguishable from support of complex content such as a list.
  • supertype support and name/index attribute access are inconsistent; in practical terms super type is unsupported
  • Feature indicates xpath support (but not implemented)
  • FeatureType indicates xpath support (but not implemented)

GeoTools 2.1

Schema | Model | Collection | ID

  • GeoTools 2.1 added support for restrictions
    • by use of facets (defined as a series of Filter associated with an AttributeType)
  • Collections are now considered a Feature
    • Cannot determine allowable child features from collection feature type schema.

GeoAPI 1.0

Schema | Model | Collection | ID

Not yet reviewed

GeoAPI 2.0

Schema | Model | Collection | ID

  • FeatureType is mutable? Until frozen/used?
  • Support for Object Attributes
  • FeatureCollection is considered a Feature
  • FeatureCollectionType incomplete
    • Cannot determine allowable child features

DeeGree

Schema | Model | Collection | ID

A GPL toolkit that is actually able to take on complex GML; it was used as an input to the GeoAPI model. The model supports complex types but the implementations do not seem to follow, at least not on the head of their repository.

JUMP

Schema | Model | Collection | ID

  • Simple FeatureType model; FeatureType is immutable.
  • Support for Object Attributes
  • Collections are allowed children of a single type.
  • Collections are not considered Feature in their own right.
  • Collections are held in memory

GML2 & GML3

Schema | Model | Collection | ID

GML3 makes the transition to ISO based Geometries.