This is an R&D page as we explore design ideas for a GeoGit DataStore.
This page is backed by an implementation on a GitHub branch "markles":
The design of the GeoGit DataStore is pretty straightforward:
- It extends ContentDataStore
- It maintains an internal "delegate" DataStore which is used as the "workspace" to hold the current checkout
- It maintains a GeoGit class and uses it to access the BLOB structures maintained in Berkeley DB (or another key-value database). To make common tasks easier, a GeoGitFacade class provides utility methods for common blobbing tasks (such as accessing an index BLOB describing the contents).
This section will be filled in with notes on the GeoGitFacade class; currently the design is straightforward.
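As a rough illustration of the kind of utility method such a facade might offer, the sketch below keeps an index BLOB listing the stored type names. Everything in it is an assumption made for illustration: the class name `BlobFacadeSketch`, the `index` key, the one-name-per-line encoding, and the `HashMap` standing in for the key-value database. It is not the GeoGitFacade from the branch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical facade over a key-value BLOB store.
// A HashMap stands in for the real key-value database.
class BlobFacadeSketch {
    static final String INDEX_KEY = "index"; // assumed key of the index BLOB

    private final Map<String, byte[]> blobs = new HashMap<>();

    /** Store a text BLOB under the given key. */
    void put(String key, String content) {
        blobs.put(key, content.getBytes(StandardCharsets.UTF_8));
    }

    /** Read a text BLOB, or null when the key is unknown. */
    String get(String key) {
        byte[] bytes = blobs.get(key);
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }

    /** Write the index BLOB: one type name per line (assumed encoding). */
    void writeIndex(List<String> typeNames) {
        put(INDEX_KEY, String.join("\n", typeNames));
    }

    /** Read the index BLOB back into a list of type names. */
    List<String> readIndex() {
        String index = get(INDEX_KEY);
        if (index == null || index.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(index.split("\n"));
    }
}
```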
Directory or Index BLOB
GeoGit content is pretty unstructured (as expected for a pile of BLOBs). We should look into supporting:
- A spatial index? This is a spatial
DataStore Delegate Design
- The original "PostGIS-versioned" datastore was implemented as a direct wrapper around PostGISDataStore offering an example of how this can be performed.
- The JDBCDataStore classes are marked final specifically to prevent subclassing; instead, a strategy object, SQLDialect, is used to configure JDBCDataStore, teaching it the abilities of each database. The dialect handles all the responsibilities of SQL encoding, as well as the mapping of attributes and feature ids.
The above shows *JDBCDataStore* and the "support classes" that actually make up the implementation. These classes are considered part of the JDBCDataStore implementation and are carefully tested to work together. Central to this design is marking the classes final to prevent subclasses from locking down the API contract (allowing JDBCDataStore and support classes to be fixed and improved over time without the burden of subclasses to lug about).
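The combination of a final class and a strategy object can be sketched in a few lines. The classes below are simplified stand-ins invented for illustration (`DialectSketch`, `StoreSketch` and the two example dialects); they are not the real GeoTools classes and their methods are assumptions, chosen only to show how per-database behaviour is supplied without subclassing the store.

```java
// Simplified stand-ins; none of these are the real GeoTools classes.
abstract class DialectSketch {
    /** Quote an identifier the way this database expects. */
    abstract String escapeName(String name);

    /** Default paging clause; a dialect overrides it when the syntax differs. */
    String limitOffset(int limit, int offset) {
        return " LIMIT " + limit + " OFFSET " + offset;
    }
}

class PostgisLikeDialect extends DialectSketch {
    @Override
    String escapeName(String name) {
        return "\"" + name + "\"";
    }
}

class MySqlLikeDialect extends DialectSketch {
    @Override
    String escapeName(String name) {
        return "`" + name + "`";
    }
}

// Final: behaviour varies only through the dialect handed to the constructor.
final class StoreSketch {
    private final DialectSketch dialect;

    StoreSketch(DialectSketch dialect) {
        this.dialect = dialect;
    }

    String selectSql(String table, int limit, int offset) {
        return "SELECT * FROM " + dialect.escapeName(table)
                + dialect.limitOffset(limit, offset);
    }
}
```

Because `StoreSketch` cannot be subclassed, its internals can change freely as long as the dialect contract is kept; that is the property a GeoGit integration has to work around.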
The central challenge here is how to reuse the above work, as GeoGitDataStore would also like to make SQL queries, take part in the generation of featureIds, splice in ResourceIds and so on.
Justin Deoliveira has asked that we work in a fork of GeoTools in order to explore options (and then report back on what needs to be changed when merging in the work).
Pure DataStore Wrapper
The goal here would be to define a pure DataStore wrapper that would encode revision+featureid information into the attributes provided by the delegate DataStore (which would be treated as a simple storage of "row sets").
So for this approach we would have *GeoGitFeatureWriter* (similar to FilteringFeatureWriter):
- Able to be used on any DataStore (would recommend property DataStore as it supports multi geometry)
- Lack of tight integration with SQL Generation may result in inefficient code
- Need to write delegating implementations for ContentDataStore implementation (and all support classes)
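A delegating writer of this kind might look roughly like the sketch below. It does not use the GeoTools FeatureWriter interface; `RowWriterSketch` is a cut-down hypothetical stand-in in which a feature is just a map of attributes, and the column names `geogit_fid` and `geogit_revision` are assumptions. The point is only that the wrapper adds revision and feature id attributes and leaves all storage to the delegate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a feature writer: rows are attribute maps.
interface RowWriterSketch {
    Map<String, Object> next(); // hand out a fresh row to fill in
    void write();               // commit the row returned by the last next()
}

// Stand-in for the delegate DataStore's writer: keeps rows in memory.
class InMemoryRowWriter implements RowWriterSketch {
    final List<Map<String, Object>> rows = new ArrayList<>();
    private Map<String, Object> current;

    @Override
    public Map<String, Object> next() {
        current = new LinkedHashMap<>();
        return current;
    }

    @Override
    public void write() {
        rows.add(current);
    }
}

// Wrapper in the spirit of the proposed GeoGitFeatureWriter: it delegates all
// storage and stamps revision + feature id into extra attributes on write.
class RevisionStampingWriter implements RowWriterSketch {
    private final RowWriterSketch delegate;
    private final String revision;
    private int nextId = 1;
    private Map<String, Object> current;

    RevisionStampingWriter(RowWriterSketch delegate, String revision) {
        this.delegate = delegate;
        this.revision = revision;
    }

    @Override
    public Map<String, Object> next() {
        current = delegate.next();
        return current;
    }

    @Override
    public void write() {
        current.put("geogit_fid", "fid." + nextId++); // assumed column name
        current.put("geogit_revision", revision);     // assumed column name
        delegate.write();
    }
}
```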
Following the example of postgis-versioned, a wrapper DataStore is defined to add versioning against a JDBCDataStore.
Definition of an IJDBCDataStore which is implemented by JDBCDataStore and GeoGitWrappingDataStore. This allows classes that previously expected direct access to work with either.
This has now been implemented and all jdbc-ng plugin tests pass. Tests for dependent modules also pass, as do a few tests that have been written using a GeoGitWrapperDataStore.
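The shape of the shared-interface idea can be sketched as follows. The single `buildSelect` method and all class names ending in `Sketch` are hypothetical; the real IJDBCDataStore on the branch exposes far more, and the revision predicate shown is an invented placeholder.

```java
// Hypothetical shape only: the method is illustrative, not the real contract.
interface IJdbcStoreSketch {
    String buildSelect(String typeName);
}

// Stand-in for the existing final store.
final class PlainJdbcStoreSketch implements IJdbcStoreSketch {
    @Override
    public String buildSelect(String typeName) {
        return "SELECT * FROM " + typeName;
    }
}

// Stand-in for the wrapping store: adds a revision predicate, otherwise delegates.
final class WrappingJdbcStoreSketch implements IJdbcStoreSketch {
    private final IJdbcStoreSketch delegate;
    private final long revision;

    WrappingJdbcStoreSketch(IJdbcStoreSketch delegate, long revision) {
        this.delegate = delegate;
        this.revision = revision;
    }

    @Override
    public String buildSelect(String typeName) {
        return delegate.buildSelect(typeName) + " WHERE revision <= " + revision;
    }
}

// A support class that used to require the concrete store now accepts either.
final class ReaderSketch {
    private final IJdbcStoreSketch store;

    ReaderSketch(IJdbcStoreSketch store) {
        this.store = store;
    }

    String sql(String typeName) {
        return store.buildSelect(typeName);
    }
}
```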
- The rationale behind adding the interface was so that the GeoGitWrappingDatastore or a separate implementation or direct versioning could still use the existing class structure of JDBC classes (e.g. SqlDialect).
- The only other option would have been to reimplement the entire module for *gt-jdbc*. This would have resulted in a lot of duplicated classes with only package name changes or with only a redeclaration of a field type. This would have caused a lot more maintenance effort in the long run and was counter to DRY principles.
- Our intent was to develop a solution that allowed the API to maintain final implementation of existing classes without blowing out the classes.
In the above example the type narrowing where a JDBCSimpleFeatureWriter getDataStore() returns a JDBCDataStore is causing the use of IJDBCDataStore to spread throughout all JDBCDataStore implementation classes, and the various implementations for PostGIS, DB2, Oracle and so forth.
This approach has been more complex than originally thought. The implementation of an interface was not difficult; however, a lot of the JDBCDataStore methods had to be made public rather than private because of the number of calls made from other classes in the package. This exposes a much larger public API for JDBCDataStore than the community would probably like.
Though many classes had to be changed, most of the changes were trivial. The main issues encountered stemmed from the type narrowing of overridden methods. Superclass methods that returned ContentDataStore instead of DataStore had been narrowed to JDBCDataStore in subclasses. These overrides broke as we moved to interface return types. A few changes were required up the tree to allow type narrowing with interfaces instead.
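The type narrowing problem comes from Java's covariant return rule: an override may only narrow its return type to a subtype of the type declared in the superclass. The sketch below uses invented stand-in types to show one possible arrangement in which the superclass declares the interface type so that the override can narrow to another interface. This is an assumption about the kind of change needed "up the tree", not a record of what the branch actually does.

```java
// Invented stand-ins for DataStore (interface) and ContentDataStore (class).
interface DataStoreSketch {
    String name();
}

abstract class ContentStoreSketch implements DataStoreSketch {
}

// Stand-in for IJDBCDataStore: an interface, so NOT a subtype of ContentStoreSketch.
interface IJdbcNarrowSketch extends DataStoreSketch {
    String dialect();
}

final class JdbcNarrowStore extends ContentStoreSketch implements IJdbcNarrowSketch {
    @Override public String name() { return "jdbc"; }
    @Override public String dialect() { return "postgis"; }
}

// The wrapper implements the interface but (in this sketch) is not a ContentStoreSketch.
final class WrappingNarrowStore implements IJdbcNarrowSketch {
    private final IJdbcNarrowSketch delegate;

    WrappingNarrowStore(IJdbcNarrowSketch delegate) {
        this.delegate = delegate;
    }

    @Override public String name() { return "versioned:" + delegate.name(); }
    @Override public String dialect() { return delegate.dialect(); }
}

class ContentWriterSketch {
    protected final DataStoreSketch store;

    ContentWriterSketch(DataStoreSketch store) {
        this.store = store;
    }

    // Declared with the interface type. Had this returned ContentStoreSketch,
    // the override below could not narrow to IJdbcNarrowSketch, because that
    // interface is not a subtype of the class; the compiler rejects it.
    DataStoreSketch getDataStore() {
        return store;
    }
}

class JdbcWriterSketch extends ContentWriterSketch {
    JdbcWriterSketch(IJdbcNarrowSketch store) {
        super(store);
    }

    // Covariant return: legal because IJdbcNarrowSketch extends DataStoreSketch.
    @Override
    IJdbcNarrowSketch getDataStore() {
        return (IJdbcNarrowSketch) store; // safe: the constructor only accepts this type
    }
}
```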
Changes were also required to introduce IJDBCFeatureStore and IJDBCFeatureSource. These interfaces were needed to provide similar wrapping functionality at low levels.
- It is implemented and all existing tests pass!
- Makes more JDBCDataStore methods public
- Introduction of IJDBCDataStore interfaces gives the API more surface area to maintain
- The IJDBCDataStore ends up being a grab bag of methods at different levels of abstraction (basically a side effect of an implementation issue rather than a clear concept)
SQLDataStore Abstract Class
This is a follow-up idea, which should result in a cleaner implementation of the above approach without introducing an interface (which is a bit scary and runs counter to the design goals of JDBCDataStore).
Rather than introduce an interface, we could also pull up the concept into an abstract class (suggested name "SQLDataStore"). This has the advantage of still extending ContentDataStore, and the name matches up with *SQLDialect*, giving it a nice clear set of responsibilities.
Direction: Jody has suggested this idea in response to IJDBCDataStore, as we really don't want an interface (an abstract class offers greater stability).
In this scenario, JDBCFeatureStore and JDBCFeatureSource would also undergo a similar abstraction parenting with Version-aware wrapper classes.
Classes like VersionedJDBCInsertFeatureWriter et al. can simply be subclassed, as they are not marked final.
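A minimal sketch of the abstract-class idea, assuming the shared concept is SQL generation with a hook for an extra predicate. The names `SqlStoreSketch`, `extraPredicate` and both subclasses are hypothetical, and the revision predicate is an invented placeholder. Package visibility is used for the hook to show how it can stay out of the public API.

```java
// Hypothetical abstract parent holding the SQL concepts common to both stores.
abstract class SqlStoreSketch {
    /** Shared template method; final so subclasses cannot change the overall shape. */
    final String selectSql(String table) {
        String where = extraPredicate();
        return "SELECT * FROM " + table + (where.isEmpty() ? "" : " WHERE " + where);
    }

    /** Hook for subclasses; package-visible, so it is not part of the public API. */
    abstract String extraPredicate();
}

// Stand-in for JDBCDataStore: no extra predicate.
final class JdbcSqlStoreSketch extends SqlStoreSketch {
    @Override
    String extraPredicate() {
        return "";
    }
}

// Stand-in for a GeoGit-aware store: restricts rows to one revision.
final class GeoGitSqlStoreSketch extends SqlStoreSketch {
    private final long revision;

    GeoGitSqlStoreSketch(long revision) {
        this.revision = revision;
    }

    @Override
    String extraPredicate() {
        return "revision = " + revision; // assumed column name
    }
}
```

Compared with an interface, callers typed against `SqlStoreSketch` still see a class in the ContentDataStore-style hierarchy, which keeps covariant return types workable.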
The difficulty that has arisen with type narrowing and interface/class usage has made the Interface solution feel overly complex.
The solution also exposes more of the interface than the community will probably feel comfortable with. At this point, with the DataStoreWrapper approach, it makes sense to restart using an AbstractSQLDatastore that will have a similar purpose to the interface; however, an interface representing the public API might still be useful for exposing to external modules. Type narrowing will be easier to manage, and issues of protected properties on objects that were hidden from the wrapping class would also be easier to overcome (the versioning datastore was in a separate package, so a few of the properties of the wrapped datastore were unreachable without altering access).
- Clear migration from existing IJDBCDataStore implementation
- Clear definition of SQLDataStore as an abstract class
- Allows JDBCDataStore support classes to maintain their package visibility relationship with SQLDataStore and JDBCDataStore
- Concepts pulled up into SQLDataStore would be those common to both GeoGitDataStore and JDBCDataStore
- Can use package visibility to avoid exposing too much to subclasses
- Large number of classes introduced
- Still gives more surface area to JDBCDataStore concepts (at least it is still controlled and locked down to GeoGit and JDBCDataStore)
RevisionSupport Strategy Object
Another design alternative, suggested by Jody Garnett, was to extend JDBCDataStore from the inside with a strategy object to sort out how revision information is handled. This could be defined as a RevisionSupport, with a default implementation that encodes revision information into a 'revision' attribute which can be combined with the normal FeatureID.
If an implementation of SQLDialect supported RevisionSupport it would be used directly, allowing a datastore implementation such as Oracle to directly use native functionality.
This design is only a suggestion; it probably provides too much internal access to JDBCDataStore classes beyond what SQLDialect already provides.
We would add a second strategy object (called RevisionSupport) and use it wherever we use sqlDialect:
Code Example VersionedJDBCUpdateWriter:
So most of the JDBCSupport classes are unchanged; the core JDBCDataStore would make use of both SQLDialect and RevisionSupport to do its job:
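The interplay of the two strategy objects might be sketched like this. All names are hypothetical stand-ins, the `fid@revision` encoding is an assumption, and `native_revision(...)` is a made-up placeholder rather than real syntax for any database. The sketch only shows the store picking the dialect itself as its revision strategy when the dialect supports it, and falling back to the attribute-based default otherwise.

```java
// Hypothetical strategy interface for revision handling.
interface RevisionSupportSketch {
    String encodeId(String featureId, long revision);
    String revisionPredicate(long revision);
}

// Default: revision kept in a plain 'revision' attribute and combined with the feature id.
class AttributeRevisionSupport implements RevisionSupportSketch {
    @Override
    public String encodeId(String featureId, long revision) {
        return featureId + "@" + revision; // assumed encoding
    }

    @Override
    public String revisionPredicate(long revision) {
        return "revision = " + revision;
    }
}

// Stand-in for SQLDialect.
class RevisionDialectSketch {
    String escapeName(String name) {
        return "\"" + name + "\"";
    }
}

// A dialect that also implements the revision strategy, e.g. for a database
// with native versioning. The predicate text is a placeholder, not real SQL.
class NativeVersioningDialect extends RevisionDialectSketch implements RevisionSupportSketch {
    @Override
    public String encodeId(String featureId, long revision) {
        return featureId; // the database tracks versions itself
    }

    @Override
    public String revisionPredicate(long revision) {
        return "native_revision(" + revision + ")";
    }
}

// Stand-in for the core store using both strategy objects.
final class RevisionAwareStore {
    private final RevisionDialectSketch dialect;
    private final RevisionSupportSketch revisions;

    RevisionAwareStore(RevisionDialectSketch dialect) {
        this.dialect = dialect;
        // Use the dialect directly when it supports revisions, else the default.
        this.revisions = dialect instanceof RevisionSupportSketch
                ? (RevisionSupportSketch) dialect
                : new AttributeRevisionSupport();
    }

    String selectSql(String table, long revision) {
        return "SELECT * FROM " + dialect.escapeName(table)
                + " WHERE " + revisions.revisionPredicate(revision);
    }

    String featureId(String featureId, long revision) {
        return revisions.encodeId(featureId, revision);
    }
}
```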
- Keeps the relationships internal to JDBCDataStore and controlled by a strategy object
- Similar to the design of LockingManager; allows internal access to SQLDialect and participation in the generation of select statements etc.
- Nobody has been interested in or understood this suggestion
One of the smallest changes would be to copy the design of java.sql.Wrapper:
This could be used to quickly allow classes that expect access to their JDBCDataStore to unwrap the delegate as in the following example:
Any code (such as a reader) that needed access to JDBCDataStore could now do so with a single line of code:
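A sketch of what this could look like, modelled on the two methods of java.sql.Wrapper (`unwrap` and `isWrapperFor`). The interface `StoreWrapperSketch` and both store classes are hypothetical stand-ins, and unlike java.sql.Wrapper, which declares SQLException, this version throws an unchecked IllegalArgumentException to keep the example short.

```java
// Hypothetical interface copying the two methods of java.sql.Wrapper.
interface StoreWrapperSketch {
    boolean isWrapperFor(Class<?> type);
    <T> T unwrap(Class<T> type);
}

// Stand-in for JDBCDataStore.
final class JdbcStoreStandIn {
    String dialectName() {
        return "postgis";
    }
}

// Stand-in for the versioning store that wraps a JDBC store.
final class VersioningStoreSketch implements StoreWrapperSketch {
    private final JdbcStoreStandIn delegate;

    VersioningStoreSketch(JdbcStoreStandIn delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean isWrapperFor(Class<?> type) {
        return type.isInstance(delegate);
    }

    @Override
    public <T> T unwrap(Class<T> type) {
        if (!type.isInstance(delegate)) {
            throw new IllegalArgumentException("Not a wrapper for " + type.getName());
        }
        return type.cast(delegate);
    }
}
```

With this in place, the single line a reader would need is along the lines of `JdbcStoreStandIn jdbc = store.unwrap(JdbcStoreStandIn.class);`.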
- Allows JDBCDataStore to continue its package visibility contracts with JDBCDataStore support classes.
- Updating references to getDataStore() in JDBCDataStore support classes.
- Not sure if it solves the problem