Introduction

Geographic Information Systems have an interesting problem domain: much like programmers, GIS hackers are involved in modeling the real world.

  • Programmers model with Software (and use abstractions like Object and Class).
  • Geographers model with Maps (and use abstractions like Feature and FeatureType).

There is more than one way to do it: programmers used to use procedures, geographers used to use pen and paper. Let's get on with it.

Modeling

Ack - try and stay awake. Nothing is as silly as trying to explain why creating a model is useful. Why create a model when you could solve a real problem?
Well, for starters, models are simpler and more flexible.

  • A Geographer could model what it would be like to flood a river; this is much more pragmatic than driving around with sticks of dynamite and blowing up a few dikes (although perhaps not as much fun). Hacking models is easier than hacking reality.
  • A programmer can model code (using UML diagrams and such); hacking the model is often a lot quicker than hacking the code.

One can also view "proper" Object-Oriented programming as constructing a model of a problem space in the real world, and then using it to solve actual problems. There is even a branch of practice called Model Driven Development & Domain Driven Design to try and steer us back in the direction of OO principles.

As for a Geographer: they can create a Map (a visualization) of their model, which is a bit cheaper than jumping up in a hot air balloon and taking pictures. And we can make maps for things that don't exist yet (such as someone running around and wrecking the dikes that keep a river in check).

How to evaluate modeling systems?

Different modeling systems allow for different ranges of expression. It is quite hard to capture the relationships on "Days of Our Lives" using the C programming language. The Java language would do a bit better as it supports a class system (but may still have a tough time with the inheritance relationships in Deliverance). The closer our means of expression is to the problem domain, the less work we have to do.

For geographers the story is exactly the same: depending on the modeling system they use, they will be able to accomplish different things. It is fairly hard to model projections using a CAD program; a modeling system that supports projections is often better suited to GIS work.

The point? We can only evaluate a modeling system with respect to a problem domain, or a task.

So what is it we want to Do?

Stop talking about models and get some work done ... i.e. a blunt example: based on a schema, construct a query to access data.

What do we want (aka what is our task) - Geography Markup Language:

  • GML3 - this is the "mandate" mentioned at an IRC meeting a couple of months ago.
    • Programmers would use UML2 these days (right?)

Here are some of the constructs in GML that we need to ensure we handle well:

  • Feature - something that can be drawn on a map
    • Programmers would call these objects (or instances if that makes you happy)
  • FeatureType - kind of thing on a map.
    • Programmers would call these classes
  • FeatureCollection
    • a bunch of features, can also be considered a "Feature" (because you can draw it on a map)
    • Programmers would consider this a data structure
  • id - formally FID or "FeatureID"
    • used to uniquely identify the Feature, the real world object

These are informal working definitions; for a formal definition consult a specification.
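
As a rough illustration of the informal definitions above, here is a minimal sketch in Python (chosen only for brevity). The class and field names are made up for this page and are not the GeoTools, GeoAPI or GML API; the point is simply that a FeatureCollection can itself be treated as a Feature.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class FeatureType:
    """Kind of thing on a map; plays the role a class plays for programmers."""
    name: str
    attribute_names: tuple


@dataclass
class Feature:
    """Something that can be drawn on a map; an instance of a FeatureType."""
    id: str  # uniquely identifies the real world object
    type: FeatureType
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class FeatureCollection(Feature):
    """A bunch of features that is itself a Feature."""
    members: List[Feature] = field(default_factory=list)


# Hypothetical sample data, only for illustration.
road = FeatureType("road", ("name", "lanes"))
roads = FeatureType("roads", ())
main_street = Feature("road.1", road, {"name": "Main Street", "lanes": 2})
network = FeatureCollection("roads.1", roads, members=[main_street])
```

Because FeatureCollection extends Feature in this sketch, anything that can draw a Feature can also be handed a collection.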

Formal separation of Concerns

We would like to maintain the following separation of concerns:

Metadata | XMLSchema | Table | FeatureType
Data | XML | Row | Feature
Query | XPath | SQL | Expression

Handling of ID

Different FeatureModels handle the idea of id differently:
(sorry for the use of database language here - it is what GeoAPI and DeeGree do)

ID
Based on "magic" (like shapefile row)
This is the ideal, everything else is an approximation. DJB: shapefile row is a very bad example - this is the worst possible example of an ID since it changes whenever you delete something. The PostGIS DataStore using the system's tuple id (OID) is a much better example.

Key
A single KEY column is treated as an ID
Column is no longer available as an attribute to be queried

Keys
Several KEY columns are used to derive an ID
Columns are not available as attributes to be queried

Key Attribute
A single KEY column is marked as the Feature ID
Column remains available as an attribute

Key Attributes
Several KEY columns are used to derive an ID
Columns remain available as attributes

(question) Evaluation: Is the idea of a unique unmodifiable ID maintained?
(question) Key Attribute - does this violate encapsulation?
(warning) if on the off chance the attribute was modifiable people would be able to change their ID (and the world would end).
(info) The concept of KEY (and Unique) seems to be available in GML3, it does not have any relationship to FeatureID.
(lightbulb) We can consider support of KEY and UNIQUE validation constraints, this does not need to imply any relationship to the generated GML2 FID or GML3 ID
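
The ID strategies above can be sketched roughly as follows. This is a Python illustration under stated assumptions: the function names, the "type.key" ID format and the sample rows are invented for this page, and none of it is the GeoTools FID mapping code. It also shows the row-number problem raised in the DJB comment: deleting a row shifts the IDs of the rows after it.

```python
def row_number_ids(rows):
    """ "Magic" ID taken from position, as with a shapefile row number."""
    return {f"row.{i}": row for i, row in enumerate(rows, start=1)}


def key_ids(type_name, rows, key_columns, keep_as_attributes):
    """Derive an ID from one or more KEY columns.

    keep_as_attributes=False models "Key" / "Keys" (columns hidden);
    keep_as_attributes=True models "Key Attribute(s)" (columns stay visible).
    """
    features = {}
    for row in rows:
        fid = type_name + "." + ".".join(str(row[c]) for c in key_columns)
        if keep_as_attributes:
            attributes = dict(row)
        else:
            attributes = {k: v for k, v in row.items() if k not in key_columns}
        features[fid] = attributes
    return features


# Hypothetical rows, only for illustration.
rows = [
    {"code": "A1", "year": 1994, "name": "BALRANALD WEIR"},
    {"code": "B2", "year": 1994, "name": "RWWQ004"},
]

before = row_number_ids(rows)
after = row_number_ids(rows[1:])  # first row deleted: "row.1" now means B2

key = key_ids("site", rows, ["code"], keep_as_attributes=False)
key_attributes = key_ids("site", rows, ["code", "year"], keep_as_attributes=True)
```

In this sketch "row.1" refers to a different real world object before and after the delete, while the key-derived IDs stay put as long as the key columns are not modified.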

Attribute Access

Access attribute by an integer index

use attribute name to access content

use qualified name, or attribute object

use xpath to access deeply nested content

(warning) XPath seems to be too complex for implementations to support; aim for a distinct separation between data model and query model
(warning) index and name both run into limitations when used with super types
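
One way the super type limitation could show up is sketched below in Python. The classes are hypothetical and assume that inherited attributes are simply placed in front of the type's own attributes; real toolkits may lay things out differently.

```python
class SimpleType:
    """Hypothetical flat type: inherited attributes first, then its own."""

    def __init__(self, name, attribute_names, super_type=None):
        self.name = name
        self.super_type = super_type
        inherited = super_type.attribute_names if super_type else []
        self.attribute_names = list(inherited) + list(attribute_names)

    def index_of(self, attribute_name):
        # list.index returns the first match only
        return self.attribute_names.index(attribute_name)


class SimpleFeature:
    def __init__(self, feature_type, values):
        self.feature_type = feature_type
        self.values = list(values)

    def get_by_index(self, index):
        return self.values[index]

    def get_by_name(self, attribute_name):
        return self.values[self.feature_type.index_of(attribute_name)]


site = SimpleType("site", ["name"])
standalone = SimpleType("borehole", ["depth"])
derived = SimpleType("borehole", ["depth"], super_type=site)

# Index access: the same attribute moves once a super type is introduced.
standalone_index = standalone.index_of("depth")
derived_index = derived.index_of("depth")

# Name access: a sub type that redeclares "name" hides one of the two values.
shadowed = SimpleType("borehole", ["name"], super_type=site)
feature = SimpleFeature(shadowed, ["inherited name", "own name"])
by_name = feature.get_by_name("name")
```

Here code written against index 0 for "depth" breaks when the super type is added, and the second "name" can only be reached by index. A qualified name or an attribute object avoids both problems.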

Type System

Type information through extension of a single super type

Type information from multiple super types

(warning) A Type system does not have to imply reuse.
(info) Java interfaces are pure type system, the class system is for code reuse.
(info) XML has the same separation by way of substitution groups

Handling of FeatureCollection

We need to ensure that we can determine the schema of children allowed in a feature collection:

collection schema references child schema

collection schema references schema of children

children represented as a nested feature attribute with multiplicity

In addition there are the following tradeoffs:

Parent contains references (or filter) defining contained children

Children contain a back reference to collection, prevents children from being in more than one collection
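
The two tradeoffs can be compared with a small Python sketch. The dictionaries and function names are invented for illustration; they only show why a single back reference limits a child to one collection.

```python
# Option 1: the parent holds references to its children.
# A child id may appear under any number of parents.
parent_members = {
    "highways": ["road.1", "road.2"],
    "toll_roads": ["road.2"],
}


def collections_of(fid):
    """Find every collection that references the given feature id."""
    return sorted(name for name, ids in parent_members.items() if fid in ids)


# Option 2: each child carries one back reference to its collection.
child_owner = {}


def add_to_collection(fid, collection):
    """Set the back reference; returns the collection that was displaced."""
    previous = child_owner.get(fid)
    child_owner[fid] = collection
    return previous


shared = collections_of("road.2")

add_to_collection("road.2", "highways")
displaced = add_to_collection("road.2", "toll_roads")
```

With parent-held references "road.2" sits in both collections; with the back reference the second assignment silently replaces the first.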

Handling of Schema

 

FeatureType

  • Attributes providing name and type information
  • Concept of multiplicity, and nillable
  • Allow for restrictions on attributes (such as length)

"Complex"

  • Allow for choices
  • Allow for complex attributes
  • complex attribute is a feature

FeatureCollection Type

  • access to child feature type
  • FeatureCollections with mixed content
  • FeatureCollection child modeled as a complex attribute

Needs to be able to describe the contents of the previous section:

(question) Evaluation: Is the schema "rich" enough to generate XPath expressions into the content?

Handling of Content

Here is a description of a "reference" GML document, we can use it as a reference point to see how capable the different modeling systems are.

Flat

Simple feature content as used by JUMP & Shapefile

Complex Content

Support for complex types, usually represented as an object. Support is required at the schema level to enable the construction of XPath queries into the complex content.

Multiplicity

Features contain repeating (or optional) elements:
Attribute multiplicity: min/max used to constrain the number of times attributes are able to occur
List multiplicity: values are represented as a list, min/max are used to bound the list

Required for:

(warning) Note the list representation agrees directly with the XPath query model, but does not allow for validation of complex features. It is presented due to its use by several of the toolkits (not because it represents a valid model for handling GML3 content).
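
Attribute multiplicity can be checked with very little machinery. The Python sketch below is only an illustration: AttributeDescriptor and check_multiplicity are hypothetical names, and the min/max bounds given to sitename, measurement and location are assumptions made for the example, not values taken from a real schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeDescriptor:
    name: str
    min_occurs: int = 1
    max_occurs: Optional[int] = 1  # None stands for "unbounded"


def check_multiplicity(descriptors, content):
    """content is a list of (name, value) pairs in document order."""
    counts = {}
    for name, _value in content:
        counts[name] = counts.get(name, 0) + 1
    problems = []
    for descriptor in descriptors:
        found = counts.get(descriptor.name, 0)
        if found < descriptor.min_occurs:
            problems.append(
                f"{descriptor.name}: expected at least "
                f"{descriptor.min_occurs}, found {found}"
            )
        if descriptor.max_occurs is not None and found > descriptor.max_occurs:
            problems.append(
                f"{descriptor.name}: expected at most "
                f"{descriptor.max_occurs}, found {found}"
            )
    return problems


schema = [
    AttributeDescriptor("sitename"),
    AttributeDescriptor("measurement", min_occurs=0, max_occurs=None),
    AttributeDescriptor("location"),
]
content = [
    ("sitename", "BALRANALD WEIR"),
    ("sitename", "RWWQ004"),
    ("measurement", "16/JAN/94"),
    ("measurement", "24/JAN/94"),
]
problems = check_multiplicity(schema, content)
```

A plain list of values would hold the two measurements just as well, but without the per-attribute min/max it could not report the repeated sitename or the missing location.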

Nested Features

A form of complex content in which Features are allowed to contain other Features.

Support nested features as normal attributes; depending on implementation, no additional support may be required over and above that required for complex content.

Multiple Collections

Unsure what metadata support is required to represent this kind of document?

Feature References

Same problem as before, what captures the schema information for a GML Document?

Note: Reference should be handled as a lazy Feature (require another pass through the data when resolved?); either that, or we are stuck with loading the document into memory.
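
A lazy reference might look something like the Python sketch below. LazyFeatureReference and second_pass are hypothetical names, and the "document" is a small in-memory list standing in for a second pass over the real data.

```python
class LazyFeatureReference:
    """Holds only the target id until somebody asks for the feature."""

    def __init__(self, target_id, lookup):
        self.target_id = target_id
        self._lookup = lookup
        self._resolved = None
        self._is_resolved = False

    def resolve(self):
        if not self._is_resolved:
            self._resolved = self._lookup(self.target_id)
            self._is_resolved = True
        return self._resolved


passes = []  # records every extra pass through the data


def second_pass(target_id):
    """Stand-in for re-reading the document to find one feature."""
    passes.append(target_id)
    document = [
        ("site.1", {"sitename": "BALRANALD WEIR"}),
        ("site.2", {"sitename": "RWWQ004"}),
    ]
    for fid, attributes in document:
        if fid == target_id:
            return attributes
    return None


reference = LazyFeatureReference("site.2", second_pass)
passes_before_resolve = list(passes)
first = reference.resolve()
second = reference.resolve()  # served from the cached result
```

Nothing is read until resolve() is called, and the extra pass happens once per reference; the cost is that every unresolved reference may trigger its own pass.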

Mixed Content

Allow children of several types within a collection. GML may limit this to children within one substitution group.

Nested Collections

Allow collection contents to be collections themselves.

Comparison

Note chart is based on documented intent, not ability.

Name | SuperType | ID | Feature Access | FeatureType | FeatureCollectionType | Feature
JUMP
GeoAPI 2.0
GeoTools 2.0
GeoTools 2.1
Deegree

Requirements

We require the feature model to be "big enough" in modeling power to support:

  • Level 0 Profile of GML3 for WFS
  • XPath and SLD Documents

Level 0 Profile of GML3

The exact requirements are outlined here:

In brief:

  • complex type support
  • subset of geometry types

There is some discussion of references and dealing with multiple feature collections and feature references.

XPath and SLD Documents

Our FeatureType system must be complete enough to allow the generation of XPath expressions as used by Expressions in SLD documents.

The XPath view of a document is limited to the following:

  • nodelists

To support this model we do not need the full set of validation constructs required for GML3 (such as choice, restrictions or even multiplicity).
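
Generating XPath expressions from type information needs little more than the nesting of names. The Python sketch below assumes a schema is a plain nested dictionary (None for simple attributes), which is an invention for this example; the element names are borrowed from the example document on this page.

```python
def xpath_expressions(type_name, schema):
    """Walk a nested schema and list a relative path for every attribute."""
    paths = []

    def walk(prefix, attributes):
        for name, nested in attributes.items():
            path = f"{prefix}/{name}"
            paths.append(path)
            if nested:  # complex attribute: descend into its children
                walk(path, nested)

    walk(type_name, schema)
    return paths


# Hypothetical schema description; no choice, restriction or multiplicity.
wq_plus = {
    "sco:sitename": None,
    "sco:measurement": {
        "sco:determineand_description": None,
        "sco:result": None,
    },
    "sco:location": None,
}

expressions = xpath_expressions("sco:wq_plus", wq_plus)
```

Note that the sketch never looks at choice, restrictions or multiplicity: names and nesting are enough to build the paths.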

Example

Let's practice with the following document:

Node breakdown of example:

Root:

  • xml-stylesheet type="application/xml" href="sco.xsl"
    • gml:FeatureCollection
    • gml:boundedBy: Envelope
    • gml:featureMember
      • sco:wq_plus
        • sco:sitename: BALRANALD WEIR
        • sco:measurement
          • sco:determineand_description: 16/JAN/94
          • sco:result: Turbidity
        • sco:measurement
          • sco:determineand_description: 24/JAN/94
          • sco:result: Turbidity
        • sco:location: Point( 22, 134.5399658, EPSG:4283 )
        • sco:nearestSlimePit: Point( 22.1,143.53399658, EPSG:4283 )
        • sco:sitename: RWWQ004
    • gml:name: My Boreholes

Points of interest:

  • Root node is not the same as the root element
  • Root node contains the entire document including the xml-stylesheet (xmlns and xmlns:prefix are not covered though)

Location Paths:

Similar to unix shell:

Root node | / | Selects the root node
Child element | gml:Point | Point( 22, 134.5399658, EPSG:4283 ), Point( 22.1,143.53399658, EPSG:4283 ) | Note this is relative
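
To try relative location paths by hand, the Python sketch below uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The XML is a cut-down reconstruction based on the node breakdown above, not the original document: the sco namespace URI and the gml:coordinates encoding of the points are assumptions.

```python
import xml.etree.ElementTree as ET

# The gml URI is the usual GML namespace; the sco URI is a placeholder.
NS = {
    "gml": "http://www.opengis.net/gml",
    "sco": "http://example.org/sco",
}

DOCUMENT = """<gml:FeatureCollection
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:sco="http://example.org/sco">
  <gml:featureMember>
    <sco:wq_plus>
      <sco:sitename>BALRANALD WEIR</sco:sitename>
      <sco:location>
        <gml:Point srsName="EPSG:4283">
          <gml:coordinates>22,134.5399658</gml:coordinates>
        </gml:Point>
      </sco:location>
      <sco:nearestSlimePit>
        <gml:Point srsName="EPSG:4283">
          <gml:coordinates>22.1,143.53399658</gml:coordinates>
        </gml:Point>
      </sco:nearestSlimePit>
    </sco:wq_plus>
  </gml:featureMember>
  <gml:name>My Boreholes</gml:name>
</gml:FeatureCollection>"""

# fromstring returns the root element, which is not the XPath root node.
root = ET.fromstring(DOCUMENT)

# A bare child step is relative to the context node.
points_under_root = root.findall("gml:Point", NS)
location = root.find(".//sco:location", NS)
points_under_location = location.findall("gml:Point", NS)

# Descendant search from the root element finds both points.
all_points = root.findall(".//gml:Point", NS)
coordinates = [p.find("gml:coordinates", NS).text for p in all_points]
collection_name = root.find("gml:name", NS).text
```

The same step, gml:Point, selects nothing from the collection element and one point from sco:location, which is what "relative" means here.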

Reference

Data Format Reference

This review will focus on the different modeling systems' ability to capture the Shapefile and/or GML document mentioned above.

Shapefile

The other thing we should look at is the humble shapefile; it was used to drive the expressiveness of the existing GeoTools modeling abilities. We really don't want to backtrack (although shapefile is so simple it will be hard to).

Model | shapefiles, one per feature type
FeatureType | shapefile, details in header
Feature | entry in shp file, using shx to find a row in dbf file
FeatureCollection | n/a

Schema | Model | Collection | ID
restriction | flat | child | row number

Databases

Database land is all wrapped up in the Simple Features for SQL specification. The main benefit is a definition of the basic Geometry types we use in GeoTools (implemented by the JTS library).

Model | Relational
FeatureType | table (or view) details in metadata
Feature | Row
FeatureCollection | Result Set

Schema | Model | Collection | ID

fidmapper function

Note: Not all databases support SFSQL (and Oracle supports a few interesting things like curves)

DJB: you should also mention object-relational DB techniques (like Hibernate) or just simple multiplicity-by-table-joining.

Unified Modeling Language

Model | Object-Oriented
FeatureType | Class
Feature | Object
FeatureCollection | Aggregation

GML2

Model | XML
FeatureType | XMLSchema
FeatureCollection | complex type that extends gml:FeatureCollection
Feature | element extending gml:Feature; child of gml:FeatureCollection's gml:featureMembers element

Schema | Model | Collection | ID

The GML2 geometry model is similar to JTS (although based on some ISO specification).

GML3

Model | XML
FeatureType | Direct/indirect extension of gml:AbstractFeatureType; attribute is a local xsd:element
FeatureCollection | complex type that extends gml:FeatureCollection
Feature | element realization of GF_FeatureType; child of gml:_FeatureCollection's featureMember element; member of the gml:_Feature substitution group

Schema | Model | Collection | ID

Reference: GML-3.1.pdf

  • Page 67 for Feature
  • Page 68 & 528 for FeatureCollection
  • Page 530 for "Annex E UML-to-GML Application Schema Encoding Rules"

GML3 includes a more complicated Geometry model than supported by JTS. Higher order geometries (like surface) are supported in addition to curves and topologies.

What is interesting is that "Annex E" describes a mechanical way to port a UML model to GML (the bridge between programmer and geographer has been crossed before us).

Reviews

GeoTools 2.0

Schema | Model | Collection | ID

  • As you can see the GeoTools 2.0 schema model far outstrips its Modeling abilities.
  • Multiplicity is supported as attribute arrays
  • Collections are not considered Feature in their own right.
    Schema indicates multiplicity support, resulting values indistinguishable from support of complex content such as a list.
  • supertype support and name/index attribute access inconsistent, in practical terms super type is unsupported
  • Feature indicates xpath support (but not implemented)
  • FeatureType indicates xpath support (but not implemented)

GeoTools 2.1

Schema | Model | Collection | ID

  • GeoTools 2.1 added support for restrictions
    • by use of facets (defined as a series of Filter associated with an AttributeType)
  • Collections are now considered a Feature
    • Cannot determine allowable child features from collection feature type schema.

GeoAPI 1.0

Schema | Model | Collection | ID

Not yet reviewed

GeoAPI 2.0

Schema | Model | Collection | ID

  • FeatureType is mutable? Until frozen/used?
  • Support for Object Attributes
  • FeatureCollection is considered a Feature
  • FeatureCollectionType incomplete
    • Cannot determine allowable child features

DeeGree

Schema | Model | Collection | ID

GPL toolkit actually able to take on complex GML, used as an input to the GeoAPI model. The model supports complex types but the implementations do not seem to follow, at least not on the head of their repository.

JUMP

Schema | Model | Collection | ID

  • Simple FeatureType model, FeatureType is non mutable.
  • Support for Object Attributes
  • Collections are allowed children of a single type.
  • Collections are not considered Feature in their own right.
  • Collections are held in memory

GML2 & GML3

Schema | Model | Collection | ID

GML3 makes the transition to ISO based Geometries.
