| Motivation: | Allow concurrent access to a CRS authority with caching support | |
|---|---|---|
| Contact: | | |
| Tracker: | | |
| Tagline: | | |
This page represents the current plan; for discussion please check the tracker link above.
Currently, GeoTools is tuned for use by a very small number of concurrent users. When this subsystem is deployed in a highly threaded environment with thousands of users, some shortcomings are revealed:
pool and findPool) are able to function in the face of multiple threads (both reading and writing).
CRSAuthority implementation making use of an EPSG Database, we need to ensure that supporting multiple threads does not result in Connections being leaked from the java.sql.DataSource provided.

This proposal has been accepted and implemented, and is awaiting the attention of the Module Maintainer for referencing. We are currently running two implementations of the various base classes.
Status legend: no progress, done, impeded, lack mandate/funds/time, volunteer needed
2.4-M4:
2.4-RC0:
2.4-RC1:
This results in no change to client code - all client code should be making use of one of AuthorityFactory sub-interfaces or the CRS facade as documented here:
So what is this change about then? This change request covers the internals of the referencing module, and the relationship between the super classes defined therein and the plug-in modules providing implementations.
Requests:
CoordinateReferenceSystem, CoordinateSystem, Datum, CoordinateOperation, etc.

| BEFORE | Proposed | AFTER | Role and Responsibility |
|---|---|---|---|
| | | | Class diagrams, additional diagrams follow. |
| | unchanged | unchanged | |
| | unchanged | unchanged | |
| | | | Authority providing the definition of referencing objects for a code |

| BEFORE | Proposed | AFTER | Role and Responsibility |
|---|---|---|---|
| | | | Manages a shared cache and worker creation/dispatch/reuse for multiple threads |
| | | | Decorator used to wrap cache support around an existing authority |
| | | | Acts as a super class for authority factories making use of a cache |
| | | | Shared cache implementation |
| | | | An off the shelf component from commons-pool |
| | | | remove |
| | | | Implementation backed by official EPSG database |
| | | | When the EPSG database is loaded into Oracle |

| BEFORE | Proposed | AFTER | Role and Responsibility |
|---|---|---|---|
| | | | Direct access to an authority |
| | | | NullObject used to maintain stand-alone functionality |
| | | | Consult an official EPSG database for CRS definition |
| | | | Dialect for Access SQL |
| | | | Dialect for ANSI SQL |
| | | | Dialect for Oracle SQL |

| BEFORE | Proposed | AFTER | Role and Responsibility |
|---|---|---|---|
| | | | Holds all the factories needed to create stuff. |
| | | | Creates the default implementations provided by the GeoTools library |
| | | | Uses weak references to store set contents |
| | | | Isolate toUnique method in order to implement "intern" functionality |

We are documenting this as a refactoring with BEFORE and AFTER pictures. For design alternatives please review the comments of GEOT-1286.
[Figures: Runtime Overview, Class Diagram, Sequence Diagram]
For background reading on the design of the GeoTools referencing system:
FactoryOnOracle (ie a BufferedAuthorityFactory) allows multiple threads, making use of an internal pool as a cache for objects already constructed. In the event of a cache miss the backingStore is used to create the required object.
FactoryUsingOracleSQL (ie a DirectAuthorityFactory acting as a "backingStore") has synchronized each and every public method call (internally it makes use of a Thread lock check to ensure that subclasses do not confuse matters).
When creating compound objects, the backingStore will make a recursive call to its parent buffered FactoryOnOracle. This recursive relationship is captured in the sequence diagram above.
A Timer is used to dispose of the backingStore when no longer in use.
FactoryOnOracle (ie a BufferedAuthorityFactory) makes use of a pool (a HashMap of strong and weak references) in order to track referencing objects created for use by client code. By default, the 20 most recently used objects are held by strong references, and the remaining ones are held by weak references. A second cache, findPool, makes use of a HashMap of WeakReferences in order to keep temporary referencing objects created during the use of the find method.
The garbage collector is used to clean out weak references as needed.
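The pool behaviour described above can be sketched as follows. This is a minimal illustration and not the BufferedAuthorityFactory source: the class name `StrongWeakPool` and its methods are hypothetical, and the strong-reference limit is passed in as a constructor argument rather than fixed at 20.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of a pool mixing strong and weak references. */
class StrongWeakPool<K, V> {
    private final int maxStrong;
    /** Access-ordered map: strong references to the most recently used values. */
    private final LinkedHashMap<K, V> strong = new LinkedHashMap<>(16, 0.75f, true);
    /** Older entries; the garbage collector may clear these at any time. */
    private final Map<K, WeakReference<V>> weak = new HashMap<>();

    StrongWeakPool(int maxStrong) {
        this.maxStrong = maxStrong;
    }

    synchronized V get(K key) {
        V value = strong.get(key); // a hit also marks the entry as most recently used
        if (value != null) {
            return value;
        }
        WeakReference<V> ref = weak.get(key);
        if (ref == null) {
            return null;
        }
        value = ref.get();
        if (value == null) {
            weak.remove(key); // referent was collected, drop the stale entry
        } else {
            put(key, value); // promote back to a strong reference
        }
        return value;
    }

    synchronized void put(K key, V value) {
        weak.remove(key);
        strong.put(key, value);
        if (strong.size() > maxStrong) {
            // Demote the least recently used entry to a weak reference.
            Iterator<Map.Entry<K, V>> it = strong.entrySet().iterator();
            Map.Entry<K, V> eldest = it.next();
            it.remove();
            weak.put(eldest.getKey(), new WeakReference<>(eldest.getValue()));
        }
    }

    synchronized int strongCount() {
        return strong.size();
    }
}
```

Every method is synchronized on the pool itself, which mirrors the single-lock style of the BEFORE design and is the contention point this proposal sets out to relieve.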
A single connection is opened by FactoryOnOracle, and handed over to the backingStore (ie FactoryUsingSQL) on construction. This connection is closed after a 20 minute idle period (at which point the entire backingStore is shut down). This work is performed by a timer task in DeferredAuthorityFactory, not to be confused with the thread shutdown in DefaultFactory, which is a shutdown hook used to ensure the connection is closed at JVM shutdown time.
The referencing module functions as normal, classes have been renamed according to function:
[Figure: Runtime Overview]
| Dispatch and Adapters | |
|---|---|
| | The default CRSAuthorityFactory used by client code such as the CRS facade |
| | Decorator often used to reorder axes to agree with the expectations of simple web software |
| | Acts as an "Adapter", making all known CRS authorities available for one stop shopping |
| | Uses a FactoryRegistry to manage as singletons all the following .... |
| Authority Implementations | |
| | A "Builder" is able to convert from EPSG code into full |
| | A "Builder" that takes hard coded definitions of "AUTO" and "AUTO2" codes and makes them available |
| | Used to hoist "extra" EPSG code definitions in common use |
| Internals | |
| | A cache used for storing referencing objects |
| | From the Apache Commons library, used to manage worker lifecycle |
| | A "Builder" that uses the definitions provided by the EPSG database loaded into Oracle tables |

EPSGOracleThreadedAuthority allows multiple threads, making use of ReferencingObjectCache in order to return objects previously constructed and an ObjectPool of workers to create new content in the event of a cache miss.
[Figure: Class Diagram]
To build compound objects the workers will need to share the cache with the parent.
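The dispatch described above (consult the cache first, borrow a worker only on a miss) can be sketched as follows. This is an illustration under stated assumptions: `ThreadedAuthoritySketch` and its nested `Worker` are hypothetical names, a `BlockingQueue` stands in for the commons-pool ObjectPool, and plain strings stand in for referencing objects.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a threaded authority; not the GeoTools implementation. */
class ThreadedAuthoritySketch {
    /** Hypothetical worker standing in for a direct, database backed authority. */
    static final class Worker {
        String createDatum(String code) {
            return "Datum[" + code + "]";
        }
    }

    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
    private final BlockingQueue<Worker> workers;
    private final AtomicInteger workerCalls = new AtomicInteger();

    ThreadedAuthoritySketch(int maxWorkers) {
        workers = new ArrayBlockingQueue<>(maxWorkers);
        for (int i = 0; i < maxWorkers; i++) {
            workers.add(new Worker());
        }
    }

    String createDatum(String code) {
        String cached = cache.get(code);
        if (cached != null) {
            return cached; // cache hit: no worker needed
        }
        Worker worker;
        try {
            worker = workers.take(); // blocks while every worker is busy
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a worker", e);
        }
        try {
            workerCalls.incrementAndGet();
            String created = worker.createDatum(code);
            // If another thread stored a value first, return that one so that
            // all callers share a single instance per code.
            String raced = cache.putIfAbsent(code, created);
            return raced != null ? raced : created;
        } finally {
            workers.add(worker); // always hand the worker back
        }
    }

    int workerCalls() {
        return workerCalls.get();
    }
}
```

This simplified version can still build the same object twice when two threads miss at the same moment; it only guarantees that one instance wins. Preventing the duplicate work is the job of the shared cache.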
| Class | Threadsafe | |
|---|---|---|
| | yes | Allows multiple threads |
| | yes | All public methods are synchronized (allowing class to be used in a standalone fashion) |
| | yes | Allows multiple readers, read/write lock used on individual cache entries |

The following sequence diagram shows the behaviour of EPSGOracleThreadedAuthority when responding to a createDatum request. Initially the requested datum is not in the cache, a worker is retrieved from the ObjectPool and used to perform the work. Of interest is the use of the shared ReferencingObjectCache to block subsequent workers from duplicating this activity.
[Figure: Sequence Diagram]
The cache has been isolated into a single class - ReferencingObjectCache. This class is responsible for storing strong references to objects already created and released to code outside of the referencing module.
The ReferencingObjectCache stores an internal Map<Obj,Reference> as described in the following table.
| Reference | Use |
|---|---|
| weak | Used by default to store cache contents |
| strong | Used for frequently used objects up until a configured threshold (default of 50) |
| placeholder | Placed into the cache to block readers, used to indicate work in progress during object construction |

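The placeholder entry in the table can be illustrated with a small sketch. It is not the ReferencingObjectCache source: the class name `PlaceholderCacheSketch` and the method `getOrCreate` are hypothetical, the weak and strong reference handling is left out, and a single monitor is used instead of per-entry read/write locks to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a cache that blocks readers while an entry is being built. */
class PlaceholderCacheSketch {
    /** Marks an entry whose value is still under construction. */
    private static final class Placeholder {
    }

    private final Map<String, Object> entries = new HashMap<>();

    /** Returns the cached value for code, building it at most once on a miss. */
    Object getOrCreate(String code, Function<String, Object> factory) {
        synchronized (entries) {
            // Wait while another thread is constructing this entry.
            while (entries.get(code) instanceof Placeholder) {
                try {
                    entries.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for " + code, e);
                }
            }
            Object cached = entries.get(code);
            if (cached != null) {
                return cached;
            }
            entries.put(code, new Placeholder()); // claim the entry
        }
        Object value = null;
        try {
            value = factory.apply(code); // built outside the lock
            return value;
        } finally {
            synchronized (entries) {
                if (value != null) {
                    entries.put(code, value); // publish the finished object
                } else {
                    entries.remove(code); // construction failed: release the claim
                }
                entries.notifyAll(); // wake any blocked readers
            }
        }
    }
}
```

Construction runs outside the lock, so lookups for other codes are held up only for the brief moments the map itself is being updated.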
[Figure: Sequence Diagram]
As noted above the ReferencingObjectCache class is thread safe.
The find method makes use of fully created referencing objects (like Datum and CoordinateReferenceSystem) in order to make comparisons using all available meta data. This workflow involves creating (and throwing away) lots of objects; and falls outside of our normal usage patterns. To facilitate this work flow:
EPSGOracleThreadedAuthority is the keeper of a DataSource which is provided to EPSGOracleDirectAuthority workers on construction. The EPSGOracleDirectAuthority workers use their dataSource to create a connection as needed; they also keep a cache of PreparedStatements opened against that connection.
The ObjectPool lifecycle methods are implemented, allowing EPSGOracleDirectAuthority objects to be notified when they are no longer in active use. At this point their PreparedStatements and Connection can be closed and reclaimed by the DataSource.
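A minimal sketch of this lifecycle follows, assuming the worker releases its connection when it is handed back to the pool. The text does not say whether the real workers close on passivation or only on eviction, so treat that timing as an assumption. All names here (`FakeConnection`, `DirectWorkerSketch`, `WorkerPoolSketch`) are hypothetical, and a `Supplier` stands in for the DataSource so the example runs without a database.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical stand-in for a JDBC connection. */
class FakeConnection implements AutoCloseable {
    boolean closed;

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical worker that connects lazily and releases the connection when idle. */
class DirectWorkerSketch {
    private final Supplier<FakeConnection> dataSource;
    private FakeConnection connection;

    DirectWorkerSketch(Supplier<FakeConnection> dataSource) {
        this.dataSource = dataSource;
    }

    String create(String code) {
        if (connection == null) {
            connection = dataSource.get(); // opened only when first needed
        }
        return "Object[" + code + "]";
    }

    /** Lifecycle callback: invoked by the pool when the worker becomes idle. */
    void passivate() {
        if (connection != null) {
            connection.close(); // hand the connection back to its source
            connection = null;
        }
    }

    boolean isConnected() {
        return connection != null;
    }
}

/** Hypothetical pool that notifies workers as they are returned. */
class WorkerPoolSketch {
    private final Deque<DirectWorkerSketch> idle = new ArrayDeque<>();
    private final Supplier<FakeConnection> dataSource;

    WorkerPoolSketch(Supplier<FakeConnection> dataSource) {
        this.dataSource = dataSource;
    }

    synchronized DirectWorkerSketch borrow() {
        DirectWorkerSketch worker = idle.poll();
        return worker != null ? worker : new DirectWorkerSketch(dataSource);
    }

    synchronized void giveBack(DirectWorkerSketch worker) {
        worker.passivate(); // release resources before the worker sits idle
        idle.push(worker);
    }
}
```

Because the worker never holds a connection while idle, the number of open connections is bounded by the number of workers in active use, which addresses the leak concern raised in the motivation.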
We will need to make use of a single worker (and use it to satisfy multiple definitions) when implementing the find method.
By providing hints to tune the ObjectPool we can allow an application to:
Update Module matrix pages
Update User Guide:
Issue Tracker: