Data model revisions

Data Model Review

Currently, Continuum uses at least one bidirectional M:N mapping in the database. While this may be conceptually correct from a pure object standpoint, it puts a lot of stress on the ORM layer. I'd like to review the DB schema as a whole and see if we can reduce that strain a little. Demanding features like bidirectional many-to-many mappings will always expose Continuum to the risk of ORM bugs, and they can usually be replaced by mapping tables plus a service layer that handles object graph updates.

Moving to a service layer along with a simpler DB schema will reduce our risk of locking errors and poor performance.
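
To make the proposal concrete, here is a minimal sketch of a mapping table plus a small service that owns the association. It is an illustration under assumptions, not Continuum code: the names `ProjectGroupLink` and `ProjectGroupService`, and the choice of projects and groups as the two sides, are hypothetical, and an in-memory set stands in for the persisted mapping table.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical names throughout: none of these classes exist in Continuum.

/** One row of the mapping table: a (projectId, groupId) pair. */
final class ProjectGroupLink {
    final long projectId;
    final long groupId;

    ProjectGroupLink(long projectId, long groupId) {
        this.projectId = projectId;
        this.groupId = groupId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ProjectGroupLink)) {
            return false;
        }
        ProjectGroupLink other = (ProjectGroupLink) o;
        return projectId == other.projectId && groupId == other.groupId;
    }

    @Override
    public int hashCode() {
        return Objects.hash(projectId, groupId);
    }
}

/** Owns the association, so neither entity holds a collection of the other. */
final class ProjectGroupService {
    // Stand-in for the mapping table; a real implementation would persist these rows.
    private final Set<ProjectGroupLink> links = new LinkedHashSet<>();

    /** Adds a link; returns false if it already existed. */
    synchronized boolean link(long projectId, long groupId) {
        return links.add(new ProjectGroupLink(projectId, groupId));
    }

    /** Removes a link; returns false if it was not present. */
    synchronized boolean unlink(long projectId, long groupId) {
        return links.remove(new ProjectGroupLink(projectId, groupId));
    }

    /** Navigates the association from the project side. */
    synchronized Set<Long> groupsOf(long projectId) {
        Set<Long> result = new HashSet<>();
        for (ProjectGroupLink l : links) {
            if (l.projectId == projectId) {
                result.add(l.groupId);
            }
        }
        return result;
    }

    /** Navigates the association from the group side. */
    synchronized Set<Long> projectsIn(long groupId) {
        Set<Long> result = new HashSet<>();
        for (ProjectGroupLink l : links) {
            if (l.groupId == groupId) {
                result.add(l.projectId);
            }
        }
        return result;
    }

    /** Cascading cleanup done by the service rather than by the ORM. */
    synchronized void removeProject(long projectId) {
        links.removeIf(l -> l.projectId == projectId);
    }
}
```

Both directions of the relationship are answered by queries against the link rows, so the ORM only ever has to manage a flat table of pairs.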

  1. Nov 20, 2005

    Brett Porter says:

    I'm not certain about this. I thought I removed the M:N relationships, firstly, but certainly am using bidirectional ones. There are plenty of tests and we can build them out to test concurrency too.

    We've really put our stake in with JDO at this point and should probably work with JPOX/Apache JDO to improve the stability of specific issues rather than write our own service layer, depending on what investment that requires.

    I think some simple improvements to the fetch groups (e.g., by storing the information shown on the front page with the object itself) and basic object caching could help dramatically without jumping off the design deep end.
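
As a rough illustration of storing front-page information with the object itself, the sketch below keeps a small summary of the latest build on the project, so a listing page never has to load the build history. The `Project` and `BuildResult` classes and all their fields are hypothetical stand-ins, not Continuum's actual model, and plain Java objects are used in place of JDO-managed ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration; not Continuum's actual model.

final class BuildResult {
    final int buildNumber;
    final boolean success;

    BuildResult(int buildNumber, boolean success) {
        this.buildNumber = buildNumber;
        this.success = success;
    }
}

final class Project {
    private final String name;

    // Summary fields stored on the project itself, so a listing page
    // can be rendered without touching the build-result collection.
    private int latestBuildNumber;
    private Boolean latestBuildSuccess; // null until the first build is recorded

    // Full history; in a persistent model this would be left out of the
    // default fetch group and loaded only on request.
    private final List<BuildResult> buildResults = new ArrayList<>();

    Project(String name) {
        this.name = name;
    }

    /** Appends a result and refreshes the summary fields in the same step. */
    void recordBuild(BuildResult result) {
        buildResults.add(result);
        latestBuildNumber = result.buildNumber;
        latestBuildSuccess = result.success;
    }

    /** Uses only the summary fields, never the build history. */
    String frontPageLine() {
        if (latestBuildSuccess == null) {
            return name + ": never built";
        }
        return name + ": build #" + latestBuildNumber
                + (latestBuildSuccess ? " OK" : " FAILED");
    }
}
```

The trade-off is a small amount of duplicated state that must be kept in sync whenever a build is recorded.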

    Please identify specific pain points and work to address them from that angle.

  2. Nov 20, 2005

    Brett Porter says:

    I will add another comment here as I had meant to note it earlier. One way this could be improved is in being clearer about what fetch groups are present when an object is used. I think that the model implementation leaks a few too many details in its present form. We need to find the balance between fetching too much information and not fetching enough, and make the API clear on that. Decoupling the build results seems the most obvious to me again, as I think the data impact of the rest is negligible.
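
One way to make the API clear about what was fetched is sketched below: the caller picks a factory method that states what is loaded, and asking for data that was not fetched fails loudly instead of silently returning nothing. The `ProjectView` class and its method names are hypothetical, and build results are reduced to strings to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical API sketch; names are not taken from Continuum.

final class ProjectView {
    private final String name;
    private final List<String> buildResults; // null means "not fetched"

    private ProjectView(String name, List<String> buildResults) {
        this.name = name;
        this.buildResults = buildResults;
    }

    /** Lightweight view: build results are deliberately not loaded. */
    static ProjectView summary(String name) {
        return new ProjectView(name, null);
    }

    /** Full view: build results are loaded and exposed read-only. */
    static ProjectView withBuildResults(String name, List<String> results) {
        List<String> copy = Collections.unmodifiableList(new ArrayList<>(results));
        return new ProjectView(name, copy);
    }

    String getName() {
        return name;
    }

    /** Lets callers check what this view carries before using it. */
    boolean hasBuildResults() {
        return buildResults != null;
    }

    /** Fails fast when the caller asks for data this view never fetched. */
    List<String> getBuildResults() {
        if (buildResults == null) {
            throw new IllegalStateException(
                    "Build results were not fetched for project '" + name + "'");
        }
        return buildResults;
    }
}
```

With this shape, code that only needs a project name uses `summary(...)`, and code that needs history has to say so at the call site.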

    Also keep in mind the potential impact of a distributed set of Continuum instances needing to work off one data model.

  3. Nov 29, 2005

    John Casey says:

    I agree that we shouldn't be jumping ship on JDO, but I think we'd be much better served to identify the particular points where we need READ_UNCOMMITTED, and try to eliminate the bidirectional mappings that are probably causing this problem. A minuscule service utility could provide the same sort of cascading update without resorting to READ_UNCOMMITTED to avoid database locking issues.

    WRT distributed Continuum deployments, I think we just need to work on minimizing caching in the Continuum application itself. If we can let the ORM layer handle caching, then make sure our ORM layer can use a distributed object cache, we shouldn't run into big problems with stale data coming from the DB.

    Not sure what you mean by fetch groups, but prefetching is something we need to be really careful about IMO, as it can lead to bloating of the memory footprint, propagation delays in a hypothetical shared cache, and possibly stale data. It's an optimization step, and may require some sort of view on the object cache to get really drilled down...I don't know, that's just a guess.