Comment: Migrated to Confluence 5.3

...

There are a couple of things wrong with this picture:

  • it prescribes an order (sequential or otherwise) through the data
  • it gives us no chance to turn that lovely "count" into naked SQL for speed and profit

In a sense it depends on the internal structure of the FeatureCollection, and by showing us too much of the guts it takes away the FeatureCollection implementor's freedom to do magic stuff.

...

What kind of magic stuff?

  • like splitting the collection in two and processing half on each processor
  • picking up the visitor and running it over on the server side
  • abusing the internal structure of the collection (perhaps there is an index?)

The amount of magic you can do really depends on the abilities of your FeatureCollection: different magic is available for an indexed shapefile than for a local HSQL database. You may need to balance the amount of data collected against any distribution that may be in play (consider a geometric "buffer" operation performed on a remote PostGIS).
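
As a rough illustration of the first kind of magic, here is a minimal sketch of a collection that splits itself in two and visits each half on its own thread. SketchVisitor, ConcurrentCount, SplittingCollection and SplitDemo are made-up names for this sketch, not GeoTools types, and the visitor is assumed to be safe to call from more than one thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a visitor interface; not the real GeoTools type.
interface SketchVisitor<T> {
    void visit(T feature);
}

// Counting visitor that tolerates being called from several threads at once.
class ConcurrentCount<T> implements SketchVisitor<T> {
    private final AtomicLong count = new AtomicLong();

    @Override
    public void visit(T feature) {
        count.incrementAndGet();
    }

    long getCount() {
        return count.get();
    }
}

// Hypothetical collection that hides its traversal strategy behind accepts().
class SplittingCollection<T> {
    private final List<T> features;

    SplittingCollection(List<T> features) {
        this.features = new ArrayList<>(features);
    }

    // Splits the data in two and visits each half on its own thread.
    void accepts(SketchVisitor<T> visitor) {
        int middle = features.size() / 2;
        Thread firstHalf = new Thread(
                () -> features.subList(0, middle).forEach(visitor::visit));
        Thread secondHalf = new Thread(
                () -> features.subList(middle, features.size()).forEach(visitor::visit));
        firstHalf.start();
        secondHalf.start();
        try {
            firstHalf.join();
            secondHalf.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while visiting", e);
        }
    }
}

class SplitDemo {
    static long countAll(List<String> names) {
        ConcurrentCount<String> count = new ConcurrentCount<>();
        new SplittingCollection<>(names).accepts(count);
        return count.getCount();
    }

    public static void main(String[] args) {
        System.out.println("Count: " + countAll(Arrays.asList("a", "b", "c", "d", "e")));
    }
}
```

The caller never sees an iterator, so nothing in the calling code changes if the collection later switches to a single thread, four threads, or an index scan.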

...

Code Block
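// Two visitors, each accumulating one statistic
Sum length = new Sum( "length" );
Count count = new Count();
// One pass over the collection feeds both visitors
aFeatureCollection.accepts( new FeatureVisitor[]{ length, count } );
System.out.println( "Average length: " + length.getTotal() / count.getCount() );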
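
To make the idea concrete, here is a self-contained sketch of what Sum and Count could look like. Everything in it is an assumption made for illustration: features are modelled as plain Map<String, Object> values, and Visitor, Sum, Count, InMemoryCollection and AverageDemo are local stand-ins rather than the proposed GeoTools classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical visitor over features modelled as attribute maps.
interface Visitor {
    void visit(Map<String, Object> feature);
}

// Adds up one numeric attribute; features without a numeric value are skipped.
class Sum implements Visitor {
    private final String attribute;
    private double total;

    Sum(String attribute) {
        this.attribute = attribute;
    }

    @Override
    public void visit(Map<String, Object> feature) {
        Object value = feature.get(attribute);
        if (value instanceof Number) {
            total += ((Number) value).doubleValue();
        }
    }

    double getTotal() {
        return total;
    }
}

// Counts every feature it is shown.
class Count implements Visitor {
    private long count;

    @Override
    public void visit(Map<String, Object> feature) {
        count++;
    }

    long getCount() {
        return count;
    }
}

// Hypothetical in-memory collection; a real one could traverse differently.
class InMemoryCollection {
    private final List<Map<String, Object>> features = new ArrayList<>();

    void add(Map<String, Object> feature) {
        features.add(feature);
    }

    // A single pass over the data feeds every visitor.
    void accepts(Visitor... visitors) {
        for (Map<String, Object> feature : features) {
            for (Visitor visitor : visitors) {
                visitor.visit(feature);
            }
        }
    }
}

class AverageDemo {
    static double averageLength(double... lengths) {
        InMemoryCollection roads = new InMemoryCollection();
        for (double value : lengths) {
            roads.add(Collections.<String, Object>singletonMap("length", value));
        }
        Sum length = new Sum("length");
        Count count = new Count();
        roads.accepts(length, count);
        return length.getTotal() / count.getCount();
    }

    public static void main(String[] args) {
        System.out.println("Average length: " + averageLength(10.0, 20.0, 30.0));
    }
}
```

The total is kept as a double here so the average is not truncated by integer division; whether the real getTotal() and getCount() would return floating point values is not stated above.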

Pros:

  • replaces FeatureResults getCount() with aFeatureSource.features( filter ).visit( new Count() )
  • getBounds() can be replaced in the same kind of way
  • Visitor is also easier to optimize if you have multiple processors ...

Cons:

  • hard to optimize into SQL
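
The difficulty is that a visitor is arbitrary code, so the collection cannot translate it in general. One possible workaround, sketched below under loud assumptions, is for the collection to recognise a few well-known visitor classes and answer those with an aggregate query. Nothing here talks to a real database: RowVisitor, RowCount, TableCollection and SqlShortcutDemo are hypothetical names, the rows list stands in for a remote table, and the SQL strings are only recorded to show which query would be issued.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical visitor over rows; not a GeoTools type.
interface RowVisitor {
    void visit(Object row);
}

// A count visitor the collection knows about and can answer directly.
class RowCount implements RowVisitor {
    private long count;

    @Override
    public void visit(Object row) {
        count++;
    }

    // Lets a collection hand over a precomputed answer instead of iterating.
    void setCount(long count) {
        this.count = count;
    }

    long getCount() {
        return count;
    }
}

// Hypothetical table-backed collection; the list stands in for the database.
class TableCollection {
    private final String table;
    private final List<Object> rows;
    final List<String> issuedSql = new ArrayList<>();

    TableCollection(String table, List<?> rows) {
        this.table = table;
        this.rows = new ArrayList<Object>(rows);
    }

    void accepts(RowVisitor visitor) {
        if (visitor instanceof RowCount) {
            // Known visitor: record the aggregate query and skip the iteration.
            issuedSql.add("SELECT COUNT(*) FROM " + table);
            ((RowCount) visitor).setCount(rows.size()); // pretend result of the query
            return;
        }
        // Unknown visitor: nothing to translate, so fetch every row and visit it.
        issuedSql.add("SELECT * FROM " + table);
        for (Object row : rows) {
            visitor.visit(row);
        }
    }
}

class SqlShortcutDemo {
    public static void main(String[] args) {
        TableCollection roads = new TableCollection("roads", Arrays.asList("a", "b", "c"));
        RowCount count = new RowCount();
        roads.accepts(count);
        System.out.println(count.getCount() + " via " + roads.issuedSql);
    }
}
```

The cost of this design is that every optimisable visitor has to be known to the collection in advance; anything else falls back to plain iteration.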