When most people think of XML data transforms, they probably think of XSLT. XSLT is great, but as you'll know if you've used it, it has its issues too. XSLT can be great for making structural transformations on XML data. However, life can sometimes get more difficult:
- When you need to process or transform data buried within the XML structure, e.g. a complex attribute value. A classic example of this is a date field. Processing and transforming data like this can be a real headache with XSLT.
- When the XML data being transformed is normalized. This can result in your stylesheets becoming a lot more complex. It can also result in a drop in performance of your stylesheets.
- When transformations become detailed, stylesheet maintenance can become an issue.
XSL attempts to solve the first issue above through XSL Extensions. This, however, can open up another can of worms in terms of portability across different XSL Processors, and even between versions of the same XSL Processor. Just take a look at the mailing lists for some of the more popular XSL Processors: stylesheet portability problems caused by XSL Extensions are a popular topic.
All that said, XSLT is a very powerful and popular XML templating solution. There are other transformation/templating solutions out there, and they too suffer under the wrong conditions. We're not bashing XSLT; we're simply using its popularity to make the basic point that there's no single "perfect" solution.
Mix and Match Transformation Technologies
What we set out to do when creating Smooks was to create a framework that could be used to "mix and match" different technologies within the context of a single message transform. In this way, users of Smooks could select the transformation technology most appropriate to a particular aspect of a message transform. For example, they could use XSLT for the main transformation and use Java (or Groovy) for performing date field transformations into a format that's consumable by the main XSL transform, while at the same time keeping the XSL stylesheets portable. This last point is important.
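As a rough illustration of the kind of date pre-processing that could be handed to Java rather than to an XSL extension, the sketch below rewrites a date into ISO-8601 form using only the JDK. The class name `DateFieldNormalizer`, the method `toIsoDate` and the `dd/MM/yyyy` source format are assumptions made for this example. They are not part of Smooks.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not a Smooks class.
final class DateFieldNormalizer {

    // Assumed source format, e.g. "25/12/2006".
    private static final DateTimeFormatter SOURCE =
            DateTimeFormatter.ofPattern("dd/MM/uuuu");

    private DateFieldNormalizer() {
    }

    // Rewrites a dd/MM/yyyy date as ISO-8601 (yyyy-MM-dd).
    // Throws java.time.format.DateTimeParseException on malformed input.
    static String toIsoDate(String raw) {
        LocalDate date = LocalDate.parse(raw.trim(), SOURCE);
        return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }
}
```

Because the stylesheet only ever sees the normalized value, it needs no processor-specific date extension.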
Isolate Transformation Technologies
We wanted to create a level of isolation between the different technologies supported by Smooks, reducing the chances of a change in one technology affecting the others used in a message transform. For example, a change of XSL Processor should be less likely to force modifications or rewrites to your XSL stylesheets (or to your Java/Groovy code for processing the date fields). We felt this could be achieved by providing an environment in which the different technologies can be used in their purest form, e.g. where a shortcoming in the XSL language is not solved by bolting something foreign onto the side of the XSLT.
Support Fragment Based Processing
In order to support a "Mix and Match" model in a useful way, we felt that Smooks transformation resources had to be targetable at message fragments. Take the use case outlined above, where we use Groovy to pre-process date field values into an XML format that's readily consumable by XSLT. It would be useless if the Groovy resource logic had to traverse the message structure in order to find the date fields. This should be something supported by Smooks. Smooks should be (and is) able to apply the Groovy logic to the date fields as it "visits" them, leaving the Groovy logic with nothing to do other than transform the date field as required.
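The idea of targeting logic at fragments can be sketched with the JDK's own SAX parser. This is a minimal illustration, not the Smooks API: the class `FragmentDispatcher` and its methods `target` and `process` are hypothetical names, and the sketch only handles the text of leaf elements matched by element name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical dispatcher for illustration only; not a Smooks class.
final class FragmentDispatcher extends DefaultHandler {

    // Element name -> logic to apply to that element's text.
    private final Map<String, UnaryOperator<String>> visitors = new HashMap<>();
    private final List<String> results = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    // Registers visitor logic against an element name.
    void target(String elementName, UnaryOperator<String> visitor) {
        visitors.put(elementName, visitor);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The visitor never walks the document; it only sees its own fragment.
        UnaryOperator<String> visitor = visitors.get(qName);
        if (visitor != null) {
            results.add(visitor.apply(text.toString().trim()));
        }
        text.setLength(0);
    }

    // Parses the message and returns the output of every visitor that fired.
    List<String> process(String xml) throws Exception {
        results.clear();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), this);
        return new ArrayList<>(results);
    }
}
```

The visitor registered for `date` receives only the date value, so it has nothing to do other than transform it.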
Message Profiling
We wanted to create a framework in which transformation and processing resources (e.g. an XSLT) could be targeted at message fragments based on message profiles. A group of messages sharing a message profile can then be targeted through that one profile, rather than message by message.
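One way to picture profile-based targeting is as two lookups: from a message type to the profiles it belongs to, and from a profile to the resources targeted at it. The sketch below assumes exactly that model. `ProfileRegistry`, its method names, and the use of plain strings for message types, profiles and resources are all hypothetical and do not reflect how Smooks stores its configuration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical registry for illustration only; not a Smooks class.
final class ProfileRegistry {

    // Message type -> profiles that the message type belongs to.
    private final Map<String, Set<String>> profilesByMessageType = new HashMap<>();
    // Profile -> resources targeted at that profile.
    private final Map<String, List<String>> resourcesByProfile = new HashMap<>();

    void addProfile(String messageType, String profile) {
        profilesByMessageType
                .computeIfAbsent(messageType, k -> new LinkedHashSet<>())
                .add(profile);
    }

    void targetResource(String profile, String resource) {
        resourcesByProfile
                .computeIfAbsent(profile, k -> new ArrayList<>())
                .add(resource);
    }

    // Collects every resource targeted at any profile of the message type.
    List<String> resourcesFor(String messageType) {
        List<String> resources = new ArrayList<>();
        for (String profile : profilesByMessageType.getOrDefault(messageType, Set.of())) {
            resources.addAll(resourcesByProfile.getOrDefault(profile, List.of()));
        }
        return resources;
    }
}
```

Targeting a resource at a shared profile once makes it available to every message type that carries the profile.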
Smooks - The Independent Framework
In order to allow technologies to be "Mixed and Matched" in a way that offers isolation between technologies, we need an independent framework responsible for managing and applying these technologies. This is the role filled by Smooks.
To fill this role, we've purposely tried to keep Smooks' range of responsibilities as narrow as possible. We've made sure Smooks Core itself does not become a transformation technology, i.e. an alternative to XSLT, StringTemplate etc. In fact, Smooks Core makes no assumptions about what a particular piece of visitor logic is doing. The visitor logic may be performing a transformation on the fragment being visited, or it may simply be analyzing it in some way without transforming it.
So the responsibilities of Smooks Core are basically around:
- Loading, instantiating and caching resources on a per message profile basis. If profiling is not in use, then all resources are targeted at all messages.
- Parsing (SAX Parser) and filtering messages, applying the resources to the message fragments where appropriate.
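The first responsibility, loading resources once per message profile and reusing them afterwards, can be sketched with a small cache. This is an assumption-laden illustration rather than Smooks code: `ProfileResourceCache`, the `loader` function and `loadCount` are hypothetical, and resources are represented as plain strings.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache for illustration only; not a Smooks class.
final class ProfileResourceCache {

    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    // Assumed loader: resolves a profile to its resources (must not return null).
    private final Function<String, List<String>> loader;
    private final AtomicInteger loads = new AtomicInteger();

    ProfileResourceCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    // Loads the resources for a profile on first use, then serves the cached list.
    List<String> resourcesFor(String profile) {
        return cache.computeIfAbsent(profile, p -> {
            loads.incrementAndGet();
            return loader.apply(p);
        });
    }

    // Number of times the loader actually ran; useful for checking cache hits.
    int loadCount() {
        return loads.get();
    }
}
```

Repeated messages with the same profile then reuse the loaded resources instead of triggering another load.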