Added by Christian Mueller, last edited by Christian Mueller on May 17, 2009

This plugin makes it possible to use pregeneralized features to reduce CPU and memory usage at runtime. For the client, the effect is simply a better response time.

Motivation

Spatial features have at least one geometry, which usually contains a large number of points. As an example, the border of Austria (which is a small country) is a polygon constructed from 380000 points.
Drawing this border on a screen with a resolution of 1280x1024 will draw each pixel many, many times. The same holds true for printing on a sheet of paper.

The idea is to generalize this geometry, for example by stating that a minimum distance of 500 meters between two points of the polygon is sufficient. The generalized geometry still has enough points to be drawn on the screen or on a sheet of paper.
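
To make the minimum distance idea concrete, here is a deliberately naive sketch in plain Java. It is not necessarily the algorithm used to produce the pregeneralized geometries for this plugin; the class name MinDistanceThinning and the representation of points as double[]{x, y} in a metric coordinate system are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Naive illustration only (hypothetical class, not part of the plugin):
 * an interior point is kept when it is at least minDistance away from
 * the previously kept point. Start and end points are always kept.
 */
final class MinDistanceThinning {

    /** Each point is a double[]{x, y} in a metric coordinate system. */
    static List<double[]> thin(List<double[]> points, double minDistance) {
        List<double[]> kept = new ArrayList<>();
        if (points.isEmpty()) {
            return kept;
        }
        kept.add(points.get(0)); // always keep the start point
        for (int i = 1; i < points.size() - 1; i++) {
            double[] last = kept.get(kept.size() - 1);
            double[] p = points.get(i);
            if (Math.hypot(p[0] - last[0], p[1] - last[1]) >= minDistance) {
                kept.add(p);
            }
        }
        if (points.size() > 1) {
            // always keep the end point, even if it is closer than minDistance
            kept.add(points.get(points.size() - 1));
        }
        return kept;
    }

    public static void main(String[] args) {
        // a straight line with one point every 100 meters, from x=0 to x=1000
        List<double[]> line = new ArrayList<>();
        for (int x = 0; x <= 1000; x += 100) {
            line.add(new double[] {x, 0});
        }
        System.out.println(thin(line, 500).size()); // prints 3
    }
}
```

In the example, 11 input points shrink to 3 (x = 0, 500 and 1000), which is the kind of reduction the plugin exploits when a coarse geometry is sufficient.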

Idea

The idea is to have a data store / feature source / feature reader implementation which acts as a wrapper for the original features and their generalized geometries. Additionally, there is a new hint, GEOMETRY_DISTANCE, whose value is the required minimum distance between two points. This hint can be passed within the Query object.
The wrappers themselves behave like the original objects, except that the geometries they return depend on the new hint. If no hint is given, the original geometries are returned.

Relationship between GEOMETRY_DISTANCE and generalized geometries

Assume the original geometries have an accuracy of 1 meter and we have generalizations for all geometries with 5m, 10m, 20m and 50m.

Requested distance (dist)   Returned geometry
dist < 5                    original geometry
5 <= dist < 10              geometry generalized to 5 m
10 <= dist < 20             geometry generalized to 10 m
20 <= dist < 50             geometry generalized to 20 m
50 <= dist                  geometry generalized to 50 m
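
The selection rule in the table can be sketched in plain Java as a lookup of the largest configured distance that does not exceed the requested one. This is only an illustration of the rule, not the plugin's implementation; the class name GeneralizationSelector and the use of strings such as "g(5)" to stand for geometries are assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper illustrating the distance lookup rule. */
final class GeneralizationSelector {
    private final TreeMap<Double, String> byDistance = new TreeMap<>();
    private final String original;

    GeneralizationSelector(String original) {
        this.original = original;
    }

    /** Registers a generalization for the given distance. */
    GeneralizationSelector add(double distance, String name) {
        byDistance.put(distance, name);
        return this;
    }

    /** A null request means "no hint given" and yields the original. */
    String select(Double requested) {
        if (requested == null) {
            return original;
        }
        // largest configured distance <= requested distance, if any
        Map.Entry<Double, String> entry = byDistance.floorEntry(requested);
        return entry == null ? original : entry.getValue();
    }

    public static void main(String[] args) {
        GeneralizationSelector selector = new GeneralizationSelector("original")
                .add(5, "g(5)").add(10, "g(10)").add(20, "g(20)").add(50, "g(50)");
        System.out.println(selector.select(22.0)); // prints g(20)
    }
}
```

With the distances from the table, a request of 4.9 yields the original geometry, 5 yields g(5), and anything from 50 upwards yields g(50).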


Physical layout

Definition: The base feature is the original feature, which is the starting point.

A geometry generalized to m meters will be written as g(m), e.g. g(5) is a geometry generalized to a minimum distance of 5 meters. The original geometry will be written as g(1).

A feature has the following components

  1. FID (Feature Identifier)
  2. a set of data attributes
  3. a set of geometry attributes (mostly only one, depending on the data store)

The streams feature from gt-sample has

FID CAT_ID CAT_DESCR the_geom

CAT_ID and CAT_DESCR are integers, the_geom is a linestring.

Vertical layout

For each generalization, the whole feature set is duplicated, resulting in

streams

FID CAT_ID CAT_DESCR the_geom
streams.1 1 4711 g1(1)
streams.2 2 4712 g2(1)

streams_5

FID CAT_ID CAT_DESCR the_geom
streams.1 1 4711 g1(5)
streams.2 2 4712 g2(5)

streams_10

FID CAT_ID CAT_DESCR the_geom
streams.1 1 4711 g1(10)
streams.2 2 4712 g2(10)

streams_20

FID CAT_ID CAT_DESCR the_geom
streams.1 1 4711 g1(20)
streams.2 2 4712 g2(20)

streams_50

FID CAT_ID CAT_DESCR the_geom
streams.1 1 4711 g1(50)
streams.2 2 4712 g2(50)

The only difference between these 5 feature sets is the geometry: the original one and the versions generalized to 5, 10, 20 and 50 meters.

A new Feature

GenStreams

FID CAT_ID CAT_DESCR the_geom

hides all other features and uses them depending on the GEOMETRY_DISTANCE hint.

The disadvantage is the duplication of all attribute values for each generalization. If the features are stored as shape files, there is no alternative, because a shape file allows only one geometry. Storing the features in a database makes it possible to use SQL views and avoid this redundancy.

Horizontal layout

The generalized geometries are added as additional attributes.

streams

FID CAT_ID CAT_DESCR the_geom the_geom_5 the_geom_10 the_geom_20 the_geom_50
streams.1 1 4711 g1(1) g1(5) g1(10) g1(20) g1(50)
streams.2 2 4712 g2(1) g2(5) g2(10) g2(20) g2(50)

Again, there is a new feature

GenStreams

FID CAT_ID CAT_DESCR the_geom

The generalized geometry attributes are hidden and are not part of the GenStreams feature type.

Mixed layout

A combination of horizontal and vertical design

streams

FID CAT_ID CAT_DESCR the_geom
streams.1 1 4711 g1(1)
streams.2 2 4712 g2(1)

streams_5_10

FID CAT_ID CAT_DESCR the_geom_5 the_geom_10
streams.1 1 4711 g1(5) g1(10)
streams.2 2 4712 g2(5) g2(10)

streams_20_50

FID CAT_ID CAT_DESCR the_geom_20 the_geom_50
streams.1 1 4711 g1(20) g1(50)
streams.2 2 4712 g2(20) g2(50)

The new feature is
GenStreams

FID CAT_ID CAT_DESCR the_geom

Independent of the physical layout used, the feature type of GenStreams is always the same. All other feature types are not visible and are called backend features.

Creating a pregeneralized data store

The starting point is the class org.geotools.data.gen.PreGeneralizedDataStore. Before creating an object of this class, you need an object of class org.geotools.data.Repository and one of class org.geotools.data.gen.info.GeneralizationInfos.

GeneralizationInfos

Each PreGeneralizedDataStore has exactly one object of type GeneralizationInfos. This object holds a collection of org.geotools.data.gen.info.GeneralizationInfo objects. The size of this collection is equal to the number of org.geotools.data.gen.PreGeneralizedFeatureSource objects contained in the data store.

Properties:

  • infoMap
    Mapping from names of generalized features to the corresponding GeneralizationInfo objects
  • dataSourceName
    Optional, default data source location for all GeneralizationInfo objects
  • dataSourceNameSpace
    Optional, name space for the default data source location

GeneralizationInfo

A GeneralizationInfo holds the configuration information for one feature type and its generalized geometries.

Properties:

  • featureName
    Name of the generalized feature ("GenStreams")
  • baseFeatureName
    Name of the base feature ("streams")
  • geomPropertyName
    Name of the geometry attribute in the base feature ("the_geom")
  • generalizations
    Collection of org.geotools.data.gen.Generalization objects
  • dataSourceName
    Location of the data source for the base feature (e.g. the URL of a shape file)
    If no location is given, the location from the GeneralizationInfos parent object is used.
  • dataSourceNameSpace
    Optional, name space for the dataSourceName

Generalization

A Generalization object belongs to a GeneralizationInfo object and holds information for geometries generalized to a given distance.

Properties:

  • distance
    The generalization distance
  • featureName
    The name of the feature containing these geometries ("streams_5_10")
  • geomPropertyName
    The name of the geometry property ("the_geom_5")
  • dataSourceName
    Optional. If not specified, the dataSourceName from the GeneralizationInfo parent object is used.
  • dataSourceNameSpace
    Optional, name space for the dataSourceName
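
The dataSourceName fallback described in the three property lists (Generalization, then GeneralizationInfo, then GeneralizationInfos) can be pictured as a simple first-non-null lookup. The following plain Java sketch only illustrates that rule; the class DataSourceFallback and its method are hypothetical and not part of the plugin's API.

```java
/** Hypothetical helper illustrating the dataSourceName fallback chain. */
final class DataSourceFallback {

    /**
     * Returns the most specific configured name: the one on the
     * Generalization if present, otherwise the one on its
     * GeneralizationInfo parent, otherwise the default on the
     * GeneralizationInfos root (which may itself be null).
     */
    static String resolve(String generalizationDs, String infoDs, String infosDs) {
        if (generalizationDs != null) {
            return generalizationDs;
        }
        if (infoDs != null) {
            return infoDs;
        }
        return infosDs;
    }

    public static void main(String[] args) {
        // Generalization without its own dataSourceName: the parent's is used
        System.out.println(resolve(null, "dsStreams", "dsDefault")); // prints dsStreams
    }
}
```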

Reading GeneralizationInfos from XML

Part of this Java package is the class org.geotools.data.gen.info.GeneralizationInfosProviderImpl, which makes it possible to configure a GeneralizationInfos object in XML syntax.
Code example:

GeneralizationInfosProvider provider = new GeneralizationInfosProviderImpl();
GeneralizationInfos infos = null;
try {
  infos = provider.getGeneralizationInfos("src/test/resources/geninfo1.xml");
} catch (IOException e) {
  e.printStackTrace();
}

XML Configuration for vertical layout

<?xml version="1.0" encoding="UTF-8"?>
<GeneralizationInfos version="1.0">
	<GeneralizationInfo dataSourceName="dsStreams"  featureName="GenStreams" baseFeatureName="streams" geomPropertyName="the_geom">
		<Generalization dataSourceName="dsStreams_5"  distance="5" featureName="streams_5" geomPropertyName="the_geom"/>
		<Generalization dataSourceName="dsStreams_10"  distance="10" featureName="streams_10" geomPropertyName="the_geom"/>
		<Generalization dataSourceName="dsStreams_20"  distance="20" featureName="streams_20" geomPropertyName="the_geom"/>
		<Generalization dataSourceName="dsStreams_50"  distance="50" featureName="streams_50" geomPropertyName="the_geom"/>
	</GeneralizationInfo>
</GeneralizationInfos>

XML Configuration for horizontal layout

<?xml version="1.0" encoding="UTF-8"?>
<GeneralizationInfos version="1.0">
	<GeneralizationInfo dataSourceName="dsStreams_5_10_20_50"  featureName="GenStreams" baseFeatureName="streams_5_10_20_50" geomPropertyName="the_geom">
		<Generalization dataSourceName="dsStreams_5_10_20_50"  distance="5" featureName="streams_5_10_20_50" geomPropertyName="the_geom5"/>
		<Generalization dataSourceName="dsStreams_5_10_20_50"  distance="10" featureName="streams_5_10_20_50" geomPropertyName="the_geom10"/>
		<Generalization dataSourceName="dsStreams_5_10_20_50"  distance="20" featureName="streams_5_10_20_50" geomPropertyName="the_geom20"/>
		<Generalization dataSourceName="dsStreams_5_10_20_50"  distance="50" featureName="streams_5_10_20_50" geomPropertyName="the_geom50"/>
	</GeneralizationInfo>
</GeneralizationInfos>

XML Configuration for mixed layout

<?xml version="1.0" encoding="UTF-8"?>
<GeneralizationInfos version="1.0">
	<GeneralizationInfo dataSourceName="dsStreams"  featureName="GenStreams" baseFeatureName="streams" geomPropertyName="the_geom">
		<Generalization dataSourceName="dsStreams_5_10"  distance="5" featureName="streams_5_10" geomPropertyName="the_geom"/>
		<Generalization dataSourceName="dsStreams_5_10"  distance="10" featureName="streams_5_10" geomPropertyName="the_geom2"/>
		<Generalization dataSourceName="dsStreams_20_50"  distance="20" featureName="streams_20_50" geomPropertyName="the_geom"/>
		<Generalization dataSourceName="dsStreams_20_50"  distance="50" featureName="streams_20_50" geomPropertyName="the_geom2"/>
	</GeneralizationInfo>
</GeneralizationInfos>

Data Store Repository

Prior to creating a pregeneralized data store, an object implementing the interface org.geotools.data.Repository is needed. This interface has one important method:

public DataStore dataStore(Name name)

A Name object has a local name and a namespace (which could be null).
The local name corresponds to the dataSourceName property in the GeneralizationInfos, GeneralizationInfo and Generalization objects. The same holds true for the namespace parameter and the dataSourceNameSpace property.

The content of the dataSourceName could be a registered name for a data store, a URL to a shape file, a URL to a property file containing connect parameters for a database, or anything else. It depends on the implementation of the Repository interface.

Included in this package is an implementation, org.geotools.data.gen.DSFinderRepository, which interprets a dataSourceName ending with ".shp" or ".SHP" as the location of a shape file and anything else as a property file. This implementation uses the GeoTools DataStoreFinder.getDataStore(Map params) method to find the needed data store.
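
The naming rule stated above for the bundled repository implementation can be sketched in plain Java as follows. This mirrors only the rule as described in the text; the class DataSourceNameKind and its enum are hypothetical and do not show the actual implementation.

```java
/** Hypothetical helper mirroring the described dataSourceName rule. */
final class DataSourceNameKind {

    enum Kind { SHAPE_FILE, PROPERTY_FILE }

    /** Names ending with ".shp" or ".SHP" are shape files, all others property files. */
    static Kind classify(String dataSourceName) {
        if (dataSourceName.endsWith(".shp") || dataSourceName.endsWith(".SHP")) {
            return Kind.SHAPE_FILE;
        }
        return Kind.PROPERTY_FILE;
    }

    public static void main(String[] args) {
        System.out.println(classify("data/streams.shp"));    // prints SHAPE_FILE
        System.out.println(classify("data/db.properties"));  // prints PROPERTY_FILE
    }
}
```

Note that under the rule as written, a mixed-case ending such as ".Shp" would be treated as a property file.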

Another existing implementation is org.geotools.data.DefaultRepository, which is useful for creating the data stores in the application and registering them under the corresponding names.

Creating a PreGeneralizedDataStore object

Using the DSFinderRepository class

// repository that looks up data stores by dataSourceName
Repository repo = new DSFinderRepository();
GeneralizationInfosProvider provider = new GeneralizationInfosProviderImpl();
GeneralizationInfos infos = null;
try {
  // read the generalization configuration from XML
  infos = provider.getGeneralizationInfos("src/test/resources/geninfo1.xml");
} catch (IOException e) {
  e.printStackTrace();
}
DataStore ds = new PreGeneralizedDataStore(infos,repo);
FeatureSource<SimpleFeatureType, SimpleFeature> fs = ds.getFeatureSource("GenStreams");

Using the DefaultRepository class

Repository repo = new DefaultRepository();
//
// register your datastores in the repository
//
GeneralizationInfosProvider provider = new GeneralizationInfosProviderImpl();
GeneralizationInfos infos = null;
try {
  infos = provider.getGeneralizationInfos("src/test/resources/geninfo1.xml");
} catch (IOException e) {
  e.printStackTrace();
}
DataStore ds = new PreGeneralizedDataStore(infos,repo);
FeatureSource<SimpleFeatureType, SimpleFeature> fs = ds.getFeatureSource("GenStreams");

Using the DataStoreFinder class

Part of the implementation is a factory class org.geotools.data.gen.PreGeneralizedDataStoreFactory.
This factory recognizes 4 parameters

  1. PreGeneralizedDataStoreFactory.REPOSITORY_CLASS
    Mandatory, the class name for a Repository implementation, must have a default Constructor
  2. PreGeneralizedDataStoreFactory.GENERALIZATION_INFOS_PROVIDER_CLASS
    Mandatory, the class name for a org.geotools.data.gen.info.GeneralizationInfosProvider implementation, must have a default Constructor
  3. PreGeneralizedDataStoreFactory.GENERALIZATION_INFOS_PROVIDER_PARAM
    Optional, a parameter which is passed to getGeneralizationInfos(obj) method of the GeneralizationInfosProvider object
  4. PreGeneralizedDataStoreFactory.NAMESPACEP
    Optional, a URI for an optional name space

Map<String,Serializable> paramMap = new HashMap<String,Serializable>();
DataStore ds = null;
try {
   paramMap.put(PreGeneralizedDataStoreFactory.REPOSITORY_CLASS.key,
              "org.geotools.data.DefaultRepository");
   paramMap.put(PreGeneralizedDataStoreFactory.GENERALIZATION_INFOS_PROVIDER_CLASS.key,
              "org.geotools.data.gen.info.GeneralizationInfosProviderImpl");
   paramMap.put(PreGeneralizedDataStoreFactory.GENERALIZATION_INFOS_PROVIDER_PARAM.key,
              "src/test/resources/geninfo1.xml");

   ds = DataStoreFinder.getDataStore(paramMap);
} catch (IOException ex) {
   ex.printStackTrace();
}
FeatureSource<SimpleFeatureType, SimpleFeature> fs = ds.getFeatureSource("GenStreams");

Using a pregeneralized data store

To benefit from lower memory usage and lower CPU consumption, Hints.GEOMETRY_DISTANCE must be used. Hints can be passed to a query object. The following example reads all features with geometries suitable for a generalization distance of 22 meters:

FeatureSource<SimpleFeatureType, SimpleFeature> fs = ds.getFeatureSource("GenStreams");

// fs.getSupportedHints().contains(Hints.GEOMETRY_DISTANCE) must be true
Query q = new DefaultQuery("GenStreams");
q.getHints().put(Hints.GEOMETRY_DISTANCE, 22.0);
FeatureCollection<SimpleFeatureType, SimpleFeature> fCollection = fs.getFeatures(q);

//
// business as usual, but with generalized geometries
//

Toolbox

Included in this package is a command line utility. Locate the jar file; its name is
"gt-feature-pregeneralized-<version>.jar". Assuming the version is "2.5-SNAPSHOT", call it with

java -jar gt-feature-pregeneralized-2.5-SNAPSHOT.jar 

Validating the xml config file

You can validate your xml config file with

java -jar gt-feature-pregeneralized-2.5-SNAPSHOT.jar validate myconfig.xml

Create pregeneralized geometries for shape files

The following command creates generalized versions of a shape file, which can be used for a vertical layout.

java -jar gt-feature-pregeneralized-2.5-SNAPSHOT.jar generalize streams.shp targetDir 15.0,30

The parameters are

  • streams.shp
    The source shape file
  • targetDir
    The directory in which to store the generalized shape files
  • 15.0,30
    A comma separated list of distances (integers or doubles, decimal separator is "."). This example would generate a sub directory "15.0" and a sub directory "30" in the target directory. The sub directories hold the generalized shape files. No white space is allowed within the distance list.
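
The format of the distance list can be illustrated with a small plain Java sketch that splits the list, rejects white space and maps each token (which doubles as the sub directory name) to its numeric value. This is not the utility's own parsing code; the class DistanceListParser is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper illustrating the distance list format. */
final class DistanceListParser {

    /** Maps each token (used as sub directory name) to its distance, in input order. */
    static Map<String, Double> parse(String list) {
        if (list.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException(
                    "No white space allowed in distance list: " + list);
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (String token : list.split(",")) {
            // "." is the decimal separator; integers are accepted as well
            result.put(token, Double.parseDouble(token));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("15.0,30").keySet()); // prints [15.0, 30]
    }
}
```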

Conclusion

The focus of this module is to support the transparent handling of generalized geometries. There is no restriction on how the GeneralizationInfos object is built or how the needed data stores are found.

The 2 interfaces

  1. org.geotools.data.Repository
  2. org.geotools.data.gen.info.GeneralizationInfosProvider

allow special implementations to be plugged in. As an example, a GeneralizationInfosProvider implementation can build the configuration data from a JDBC database, taking a JNDI name as its parameter.

Note that modifications of pregeneralized features are NOT possible.