Note: This guide is a draft

Overview 

This guide is for Smooks developers and for developers who create add-ons for Smooks.

Smooks Core

The basic architecture of Smooks is split into 2 parts:
  1. Smooks Core: Provides the basic Smooks infrastructure/framework. Smooks Core is responsible for processing a set of Smooks Resource configurations. From these configurations, Smooks Core constructs and manages "Execution Context", "Content Delivery Configuration" and "Filter" (DOM/SAX) components. Smooks Core is the subject of this section.
  2. Smooks Cartridges: Provide the real functionality on top of Smooks Core e.g. Javabean/POJO population, Templating support (XSLT, FreeMarker, StringTemplate), Scripting Support (Groovy) etc. These are the resources that will be applied by Smooks Core during filtering (see below).


The data filtering process is at the core of how Smooks functions. It is driven by a set of "Smooks Resource Configurations". Each resource configuration specifies a single resource that's used by Smooks (or another resource) during the data filtering process.

There are two main classifications of Smooks Resource:

  1. Java based resources. These are more generally referred to as "Content Handlers". They implement the ContentHandler interface.
  2. Non Java based resources. These can be any type of resource. Basically anything that's not a Content Handler. These would typically be resources (configurations etc.) that support one or more Content Handlers.


The following are examples of the Content Handler types currently in use:

  1. Stream Readers: Smooks supports filtering of both XML and non-XML data because it allows you to configure a "Stream Reader" for each filter process. If no reader is configured, it defaults to XML. So, the Stream Reader resource is responsible for generating a stream of SAX events from a hierarchical data stream (e.g. XML, CSV, EDI etc.). This stream of SAX events can then be processed by XML Element Visitors (via Smooks). More on this later.
  2. Element Visitors: After Smooks hooks a Stream Parser to the data Stream, it starts receiving a stream of SAX events i.e. "startElement", "endElement" etc. These events are then used by Smooks to select an "ElementVisitor" implementation, which processes the event in some way. This is the primary extension point in Smooks, as well as being the mechanism through which Smooks supports a fragment based processing model. See the list of Smooks Cartridges for examples of ElementVisitor implementations already available with Smooks. More on this later.
  3. Element Serializers: DOM based processing supports implementation of DOM Element "Serialization Units", allowing you to implement custom serialization at a fragment level.


These topics are covered in more detail in later sections.

Smooks Execution

Smooks is executed via the Smooks class. The basic usage pattern is as follows:

  1. Smooks Construction: Create the Smooks instance using a Smooks Resource configuration stream. This instance should then be cached.
  2. ExecutionContext Creation: Create an ExecutionContext via the Smooks.createExecutionContext() method.
  3. Stream Filtering: Filter the data stream "Source" to a data stream "Result" via the Smooks.filter() method, using the ExecutionContext created in step #2.


And in code:

    // Instantiate Smooks with the config...
(1) Smooks smooks = new Smooks("smooks-config.xml");
    // Create an exec context - no profiles....
(2) ExecutionContext executionContext = smooks.createExecutionContext();

    // Filter the input message source to the output message result, using the execution context...
(3) smooks.filter(new StreamSource(messageInStream), new StreamResult(messageOutStream), executionContext);


Steps #2 and #3 are repeated for each message processed by the Smooks instance. Note that at step #3, different types of Source and Result objects can be used:

  1. Source: DOMSource, StreamSource, JavaSource
  2. Result: DOMResult, StreamResult, JavaResult


These different Source and Result types can be combined to perform e.g. DOM to DOM/Stream/Java transforms, Stream (XML/EDI/CSV etc) to DOM/Stream/Java transforms and Java to DOM/Stream/Java transforms (yes... Java to Java transforms!).

Steps #1, #2 and #3 are described in more detail in the following sections. It's not necessary to know the information presented in these sections, but it will help you understand how Smooks works.

Step 1: Smooks Construction

Smooks construction initializes an associated SmooksResourceConfigurationStore. This store will feed construction of the ExecutionContext instances created through the Smooks.createExecutionContext() method.

Step 2: Execution Context Creation

An ExecutionContext must be created in order to perform a filtering operation. The ExecutionContext fulfills the following purposes:

  1. Contains the ContentDeliveryConfig used during the filtering operation i.e. the list of Resource Configurations to be used during the filtering process.
  2. Provides a context through which ContentHandler implementations can interact.
  3. Provides a context through which the "Caller" can interact with ContentHandler implementations, before and after the filter process.


An ExecutionContext can be created based on a "profile" by calling Smooks.createExecutionContext(String). Most use cases do not require this, so we have not shown it in the Sequence Diagram illustrated here. What the Sequence Diagram does illustrate, however, is that "under the hood", Smooks always works based on profiles. By not specifying a profile you effectively request an ExecutionContext for the "Open Profile", which means that the Execution Context will be associated with the content delivery configuration containing Resource Configurations that were not targeted at any profile.

Step 3: Filtering

Once the ExecutionContext has been created it can be used to execute filtering operations on the data input stream. What this Sequence Diagram does not show is the process of creating the SAX Parser, which is deferred to the Filter implementation (DOM/SAX). Nor does it illustrate details of the Filter process itself (DOM/SAX). DOM and SAX Filtering will be dealt with in the following sections.

Checking the Execution Process

As Smooks performs the filtering process (processing the Event Stream generated from the Source), it publishes events that can be captured and programmatically analyzed during/after execution.

The easiest way to generate an execution report out of Smooks is to configure the ExecutionContext to generate a report. Smooks supports generation of an HTML report via the HtmlReportGenerator.

The following is an example of how to configure Smooks to generate an HTML report.

Smooks smooks = new Smooks("/smooks/smooks-transform-x.xml");
ExecutionContext execContext = smooks.createExecutionContext();

execContext.setEventListener(new HtmlReportGenerator("/tmp/smooks-report.html"));
smooks.filter(new StreamSource(inputStream), new StreamResult(outputStream), execContext);

The HtmlReportGenerator is a very useful tool during development with Smooks.  It's the nearest thing Smooks has to an IDE based Debugger (which we hope to have in a future release).  It can be very useful for diagnosing issues, or simply as a tool for comprehending a Smooks transformation.

An example HtmlReportGenerator report can be seen online here

Of course you can also write and use your own ExecutionEventListener implementations.

DOM Filtering

DOM Filtering in Smooks is implemented through the SmooksDOMFilter class. This class provides a DOM based processing model on top of SAX. It uses the SAX events generated from the input data stream to generate a DOM (see the Stream Parsers section for how to configure the Stream Parser).

After creating the DOM representation of the input message, the SmooksDOMFilter applies a 2 phase filter: the configured DOMElementVisitor implementations visit the DOM elements during the "Visit" phase (phase 1), and the configured SerializationUnit implementations are applied during the "Serialization" phase (phase 2). Read the SmooksDOMFilter javadocs for more details.

Taking the following input XML as an example:

<a>
    <b>
        <c name="first" />
        <c name="second" />
    </b>
</a>

We create 2 DOMElementVisitor implementations. The first implementation is targeted at the <b/> elements and is called "BVisitor". The second is targeted at the <c/> elements and is called "CVisitor". We also create a SerializationUnit implementation to track how serialization is sequenced. We will call this SerializationUnit "AcmeSerializer" and target it at all elements:

public class BVisitor implements DOMElementVisitor {
    public void visitBefore(Element element, ExecutionContext executionContext) {
        System.out.println("Visit Before: <b>");
    }

    public void visitAfter(Element element, ExecutionContext executionContext) {
        System.out.println("Visit After: </b>");
    }
}

public class CVisitor implements DOMElementVisitor {
    public void visitBefore(Element element, ExecutionContext executionContext) {
        System.out.println("Visit Before: <c> - " + element.getAttribute("name"));
    }

    public void visitAfter(Element element, ExecutionContext executionContext) {
        System.out.println("Visit After: </c> - " + element.getAttribute("name"));
    }
}

public class AcmeSerializer extends DefaultSerializationUnit {
    public void writeElementStart(Element element, Writer writer, ExecutionContext executionContext) throws IOException {
        System.out.println("Serialize Start: <" + DomUtils.getName(element) + "> - " + element.getAttribute("name"));
    }

    public void writeElementEnd(Element element, Writer writer, ExecutionContext executionContext) throws IOException {
        System.out.println("Serialize End: </" + DomUtils.getName(element) + "> - " + element.getAttribute("name"));
    }
}

We configure these Content Handlers in Smooks as follows (see Smooks Resources for configuration details):

<?xml version='1.0'?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.0.xsd">

    <resource-config selector="b">
        <resource>com.acme.BVisitor</resource>
    </resource-config>

    <resource-config selector="c">
        <resource>com.acme.CVisitor</resource>
    </resource-config>

    <resource-config selector="*">
        <resource>com.acme.AcmeSerializer</resource>
    </resource-config>

</smooks-resource-list>

Running this configuration through Smooks will show the order in which the visitBefore, visitAfter and serialization methods are called by the SmooksDOMFilter:

Visit Before: <b>
Visit Before: <c> - first
Visit After: </c> - first
Visit Before: <c> - second
Visit After: </c> - second
Visit After: </b>
Serialize Start: <a> -
Serialize Start: <b> -
Serialize Start: <c> - first
Serialize End: </c> - first
Serialize Start: <c> - second
Serialize End: </c> - second
Serialize End: </b> -
Serialize End: </a> -
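
The ordering above (all visits first, then all serialization) can be mimicked with plain JDK DOM code. The following is only a sketch of the two-phase idea; it does not use or reproduce the SmooksDOMFilter implementation, and the class name TwoPhaseDomSketch is hypothetical. Attribute values are left out of the log lines to keep it short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustration only: plain JDK DOM code, not the SmooksDOMFilter.
class TwoPhaseDomSketch {

    // Builds the DOM, then runs the "visit" pass followed by the "serialize" pass.
    static List<String> run(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            List<String> log = new ArrayList<>();
            visit(root, log);      // phase 1
            serialize(root, log);  // phase 2
            return log;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    // Phase 1: only <b> and <c> have visitors targeted at them.
    private static void visit(Element element, List<String> log) {
        String name = element.getTagName();
        boolean targeted = name.equals("b") || name.equals("c");
        if (targeted) log.add("Visit Before: <" + name + ">");
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) visit((Element) n, log);
        }
        if (targeted) log.add("Visit After: </" + name + ">");
    }

    // Phase 2: the serializer is targeted at every element ("*").
    private static void serialize(Element element, List<String> log) {
        log.add("Serialize Start: <" + element.getTagName() + ">");
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) serialize((Element) n, log);
        }
        log.add("Serialize End: </" + element.getTagName() + ">");
    }

    public static void main(String[] args) {
        run("<a><b><c name=\"first\"/><c name=\"second\"/></b></a>")
                .forEach(System.out::println);
    }
}
```

The point of the sketch is that the whole tree exists in memory before either pass starts, which is what makes the second pass possible and also what makes DOM filtering memory bound.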

SAX Filtering

SAX support has been added to Smooks v1.0 (SNAPSHOT available). This processing model eliminates a lot of the overhead associated with the DOM Processing model described above.

The SAX Processing model allows you to hook SAXElementVisitor implementations into the SAX Event stream associated with a message input stream (XML, EDI, CSV etc). Some of the Smooks Cartridges have been updated to leverage this SAX processing model. Initial basic benchmarking tests suggest a performance boost of at least one order of magnitude over the DOM Processing model.

People may also find the SAX Processing model easier to conceptualize (than its DOM counterpart) simply because it follows the normal SAX processing model, which is based on startElement, child-content and endElement events.

Taking the following input XML as an example (same as with the DOM example above):

<a>
    <b>
        <c name="first" />
        <c name="second" />
    </b>
</a>

We create 2 SAXElementVisitor implementations. The first implementation is targeted at the <b/> elements and is called "BVisitor". The second is targeted at the <c/> elements and is called "CVisitor":

public class BVisitor implements SAXElementVisitor {
    public void visitBefore(SAXElement element, ExecutionContext executionContext) {
        System.out.println("Visit Before: <b>");
    }

    public void onChildText(SAXElement element, SAXText childText, ExecutionContext executionContext) {
        // Ignoring child text
    }

    public void onChildElement(SAXElement element, SAXElement childElement, ExecutionContext executionContext) {
        // Ignoring child elements - they will be handled by their own visitors
    }

    public void visitAfter(SAXElement element, ExecutionContext executionContext) {
        System.out.println("Visit After: </b>");
    }
}

public class CVisitor implements SAXElementVisitor {
    public void visitBefore(SAXElement element, ExecutionContext executionContext) {
        System.out.println("Visit Before: <c> - " + SAXUtil.getAttribute("name", element.getAttributes()));
    }

    public void onChildText(SAXElement element, SAXText childText, ExecutionContext executionContext) {
        // Ignoring child text
    }

    public void onChildElement(SAXElement element, SAXElement childElement, ExecutionContext executionContext) {
        // Ignoring child elements - they will be handled by their own visitors
    }

    public void visitAfter(SAXElement element, ExecutionContext executionContext) {
        System.out.println("Visit After: </c> - " + SAXUtil.getAttribute("name", element.getAttributes()));
    }
}

We configure these Content Handlers in Smooks as follows (see Smooks Resources for configuration details):

<?xml version='1.0'?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.0.xsd">

    <resource-config selector="b">
        <resource>com.acme.BVisitor</resource>
    </resource-config>

    <resource-config selector="c">
        <resource>com.acme.CVisitor</resource>
    </resource-config>

</smooks-resource-list>

Running this configuration through Smooks will show the order in which the visitBefore and visitAfter methods are called by the SmooksSAXFilter:

Visit Before: <b>
Visit Before: <c> - first
Visit After: </c> - first
Visit Before: <c> - second
Visit After: </c> - second
Visit After: </b>

Notice the difference between the SAX and DOM processing models in this example. The DOM processing model explicitly defines SerializationUnit implementations for message serialization. The SAX processing model rolls serialization up into the actual SAXElementVisitor implementation by providing a java.io.Writer instance in the SAXElement supplied in each of the visit methods.

Something to note about the java.io.Writer instance supplied in SAXElement is that it can be changed (cached) and reset on the SAXElement as the SAX stream is being processed. If you change the Writer instance on an element, all child element visitors will be passed the new Writer instance in the SAXElement instances supplied to them. This provides the capability to do all sorts of things with the SAX event stream.
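
To make the Writer inheritance idea concrete, here is a minimal sketch built on the JDK SAX API only. It is not Smooks code: WriterSwapSketch, mainOut and capturedOut are hypothetical names, and attributes and escaping are ignored. A handler swaps the writer on one element, and every descendant of that element then writes to the swapped-in writer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: a JDK SAX handler, not a Smooks SAXElementVisitor.
class WriterSwapSketch extends DefaultHandler {
    final StringWriter mainOut = new StringWriter();      // default output
    final StringWriter capturedOut = new StringWriter();  // swapped-in writer
    private final Deque<StringWriter> writers = new ArrayDeque<>();
    private final String captureName;

    WriterSwapSketch(String captureName) {
        this.captureName = captureName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Children inherit the writer of their parent unless it is swapped here.
        StringWriter inherited = writers.isEmpty() ? mainOut : writers.peek();
        StringWriter writer = qName.equals(captureName) ? capturedOut : inherited;
        writers.push(writer);
        writer.write("<" + qName + ">");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        writers.peek().write(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Popping restores the parent's writer for the following siblings.
        writers.pop().write("</" + qName + ">");
    }

    static WriterSwapSketch run(String xml, String captureName) {
        try {
            WriterSwapSketch handler = new WriterSwapSketch(captureName);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    public static void main(String[] args) {
        WriterSwapSketch result = run("<a><b><c>x</c></b><d>y</d></a>", "b");
        System.out.println("main:     " + result.mainOut);
        System.out.println("captured: " + result.capturedOut);
    }
}
```

In this sketch the <b> fragment, including its <c> child, ends up in the captured writer, while <a> and <d> stay on the main writer.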

Another point to note is that if you don't target a visitor at a particular element in the SAX event stream, the SmooksSAXFilter will automatically apply the DefaultSAXElementVisitor, which will simply serialize that element (and its child text) to the writer supplied in the SAXElement. This means that if you wish to transform only a small number of elements in a message, you only need to implement and target SAXElementVisitors for that element subset.
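
The "transform only a few elements, serialize the rest by default" pattern can be sketched with the JDK SAX API alone. This is an illustration under simplifying assumptions, not the DefaultSAXElementVisitor: the class name SelectiveSaxSketch and the choice to rewrite <c name="..."/> as <item>...</item> are made up for the example, and character escaping is omitted.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: a JDK SAX handler, not Smooks code.
class SelectiveSaxSketch extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("c")) {
            // "Targeted" element: custom transformation.
            out.append("<item>").append(atts.getValue("name"));
            return;
        }
        // Every other element: default serialization, attributes included.
        out.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            out.append(' ').append(atts.getQName(i))
               .append("=\"").append(atts.getValue(i)).append('"');
        }
        out.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.append(ch, start, length); // escaping omitted in this sketch
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName.equals("c") ? "item" : qName).append('>');
    }

    static String transform(String xml) {
        try {
            SelectiveSaxSketch handler = new SelectiveSaxSketch();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(
                "<a id=\"1\"><b><c name=\"first\"/><c name=\"second\"/></b></a>"));
    }
}
```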

Filtering Process Selection (DOM or SAX?)

This is done by Smooks based on the following criteria:
  1. If all visitor resources (i.e. not including non element visitor resources) implement only the DOM visitor interfaces (DOMElementVisitor or SerializationUnit), then the DOM processing model is selected.
  2. If all visitor resources (i.e. not including non element visitor resources) implement only the SAX visitor interface (SAXElementVisitor), then the SAX processing model is selected.
  3. If all visitor resources (i.e. not including non element visitor resources) implement both the DOM and SAX visitor interfaces, then the DOM processing model is selected, unless the Smooks resource configuration contains the stream.filter.type global configuration parameter (see below).

The stream.filter.type global configuration parameter is configured ("DOM"/"SAX") as follows:

<params>
    <param name="stream.filter.type">SAX</param>
</params>
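
The three selection rules can be written down as a small decision function. This is a sketch of the rules as stated in this section, not the actual Smooks selection code; FilterSelectionSketch, VisitorInfo and FilterType are hypothetical names. Combinations that the three rules do not mention (for example a mix of DOM-only and SAX-only visitors) are rejected here rather than guessed at.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: encodes the three rules listed above, nothing more.
class FilterSelectionSketch {
    enum FilterType { DOM, SAX }

    // Which visitor interfaces a configured visitor resource implements.
    static final class VisitorInfo {
        final boolean dom;
        final boolean sax;

        VisitorInfo(boolean dom, boolean sax) {
            this.dom = dom;
            this.sax = sax;
        }
    }

    // streamFilterType is the value of stream.filter.type, or null when not set.
    static FilterType select(List<VisitorInfo> visitors, String streamFilterType) {
        if (visitors.isEmpty()) {
            throw new IllegalArgumentException("No visitors: not covered by the rules");
        }
        boolean allDomOnly = visitors.stream().allMatch(v -> v.dom && !v.sax);
        boolean allSaxOnly = visitors.stream().allMatch(v -> v.sax && !v.dom);
        boolean allBoth = visitors.stream().allMatch(v -> v.dom && v.sax);

        if (allDomOnly) return FilterType.DOM;   // rule 1
        if (allSaxOnly) return FilterType.SAX;   // rule 2
        if (allBoth) {                           // rule 3
            return streamFilterType == null
                    ? FilterType.DOM
                    : FilterType.valueOf(streamFilterType);
        }
        throw new IllegalArgumentException("Combination not covered by the rules");
    }

    public static void main(String[] args) {
        VisitorInfo both = new VisitorInfo(true, true);
        System.out.println(select(Arrays.asList(both, both), null));
        System.out.println(select(Arrays.asList(both, both), "SAX"));
    }
}
```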

Mixing DOM and SAX

The DOM processing model has the obvious:
  • Advantage of being easier to work with on a code level, allowing node traversal etc. It also makes it a lot easier to take advantage of Scripting and Templating engines that have built in support for utilizing DOM structures (e.g. FreeMarker and Groovy).
  • Disadvantage of being constrained by memory i.e. if you have huge messages, then you typically cannot use a DOM processing model.


Smooks v1.1 adds support for mixing these 2 models through the DomModelCreator class. When used with SAX filtering, this visitor will construct a DOM Fragment of the visited element. This allows DOM utilities to be used in a Streaming environment.

When two or more models are nested inside each other, outer models will never contain data from the inner models i.e. the same fragments will never coexist inside two models.

Take the following message as an example:

<order id='332'>
    <header>
        <customer number="123">Joe</customer>
    </header>
    <order-items>
        <order-item id='1'>
            <product>1</product>
            <quantity>2</quantity>
            <price>8.80</price>
        </order-item>
        <order-item id='2'>
            <product>2</product>
            <quantity>2</quantity>
            <price>8.80</price>
        </order-item>
        <order-item id='3'>
            <product>3</product>
            <quantity>2</quantity>
            <price>8.80</price>
        </order-item>
   </order-items>
</order>

The DomModelCreator can be configured in Smooks to create models for the "order" and "order-item" message fragments:

<resource-config selector="order,order-item">
    <resource>org.milyn.delivery.DomModelCreator</resource>
</resource-config>

In this case, the "order" model will never contain "order-item" model data (order-item elements are nested inside the order element). The in memory model for the "order" will simply be:

<order id='332'>
    <header>
        <customer number="123">Joe</customer>
    </header>
    <order-items />
</order>

Added to this is the fact that there will only ever be 0 or 1 "order-item" models in memory at any given time, with each new "order-item" model overwriting the previous "order-item" model. All this ensures that the memory footprint is kept to a minimum.
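
The general idea of building a small DOM model per fragment while the rest of the message streams past can be sketched with the JDK alone. This is not the DomModelCreator: FragmentModelSketch is a hypothetical class, it handles a single fragment name only, and it does not implement the nested-model exclusion described above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: JDK SAX + DOM, not the Smooks DomModelCreator.
class FragmentModelSketch extends DefaultHandler {
    private final String fragmentName;
    private final Consumer<Element> onFragment;
    private final Document document;
    private Node current; // null while outside a fragment

    FragmentModelSketch(String fragmentName, Consumer<Element> onFragment, Document document) {
        this.fragmentName = fragmentName;
        this.onFragment = onFragment;
        this.document = document;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (current == null && !qName.equals(fragmentName)) return; // just stream past
        Element element = document.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        if (current != null) current.appendChild(element);
        current = element;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.appendChild(document.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) return;
        Node parent = current.getParentNode();
        // A null parent means the fragment root just closed: hand the model
        // over, then drop the reference so only one model is held at a time.
        if (parent == null) onFragment.accept((Element) current);
        current = parent;
    }

    // Returns "id=price" for each order-item fragment in the message.
    static List<String> itemSummaries(String xml) {
        try {
            List<String> summaries = new ArrayList<>();
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Consumer<Element> collector = item -> summaries.add(item.getAttribute("id")
                    + "=" + item.getElementsByTagName("price").item(0).getTextContent());
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new FragmentModelSketch("order-item", collector, document));
            return summaries;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }
}
```

Inside the callback, ordinary DOM utilities (here getElementsByTagName) work on the fragment even though the message as a whole is never loaded into memory.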

Because the Smooks processing model is event driven via the message content (i.e. you can hook in Visitor logic to be applied at different points while Smooks filters/streams the message), you can take advantage of this mixed DOM and SAX processing model.

See the following examples that utilize this mixed DOM + SAX approach:

Content Handlers

Content Handlers are the cornerstone of the Smooks component model. The following Content Handler types are currently in existence:
  1. Stream Readers: Smooks supports filtering of both XML and non-XML data because it allows you to configure a "Stream Reader" for each filter process. If no reader is configured, it defaults to XML. So, the Stream Reader resource is responsible for generating a stream of SAX events from a hierarchical data stream (e.g. XML, CSV, EDI etc.). This stream of SAX events can then be processed by XML Element Visitors (via Smooks). More on this later.
  2. Element Visitors: After Smooks hooks a Stream Parser to the data Stream, it starts receiving a stream of SAX events i.e. "startElement", "endElement" etc. These events are then used by Smooks to select an "ElementVisitor" implementation, which processes the event in some way. This is the primary extension point in Smooks, as well as being the mechanism through which Smooks supports a fragment based processing model. See the list of Smooks Cartridges for examples of ElementVisitor implementations already available with Smooks. More on this later.
  3. Element Serializers: DOM based processing supports implementation of DOM Element "Serialization Units", allowing you to implement custom serialization at a fragment level.

Configuration

All Content Handler types share the same configuration model. This is based on the SmooksResourceConfiguration class (see Smooks Resources). Version 1.0 of Smooks also introduces an Annotation driven approach to Content Handler configuration.

SmooksResourceConfiguration

Content Handlers can receive their configuration by implementing the setConfiguration method:

public void setConfiguration(SmooksResourceConfiguration resourceConfig) throws SmooksConfigurationException;

Annotations

Version 1.0 of Smooks introduces support for annotation driven Content Handler configuration. This means that as well as supporting the old "setConfiguration(SmooksResourceConfiguration resourceConfig)" style Content Handler configuration, Smooks v1.0 also supports configuration injection via annotations. Two annotations are supported for this purpose:

@Config for injecting the SmooksResourceConfiguration directly onto the Content Handler e.g.

public class MyVisitor implements SAXElementVisitor {
    @Config
    private SmooksResourceConfiguration config;

    ... etc...
}

@ConfigParam for injecting specific <param> values from the SmooksResourceConfiguration directly onto the Content Handler e.g.

public class MyVisitor implements SAXElementVisitor {
    
    /**
     * Inject the "encoding" <param> from the SmooksResourceConfiguration.  Do some automatic type conversion on the value (to Charset).  Also default the value to "UTF-8".
     */
    @ConfigParam(defaultVal = "UTF-8")
    private Charset encoding;

    /**
     * Inject the "action" <param> from the SmooksResourceConfiguration.  Do some defaulting and validation on the value.
     */
    @ConfigParam(defaultVal = "replace", choice = {"replace", "addto", "insertbefore", "insertafter"})
    private String action;

    ... etc...
}

Annotation based <param> injection has a number of advantages:

  1. Cleaner code.
  2. Automatic type conversion.
  3. Automatic validation.
  4. Automatic "choice" support.
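
To show roughly how annotation driven injection can deliver these advantages, here is a reflection-based sketch using only the JDK. It is not the Smooks implementation: the @Param annotation, ParamInjectionSketch and ExampleHandler are hypothetical stand-ins for @ConfigParam and a real Content Handler, and only String and Charset fields are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;

// Illustration only: a hypothetical stand-in for @ConfigParam style injection.
class ParamInjectionSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Param {
        String defaultVal();
        String[] choice() default {};
    }

    static class ExampleHandler {
        @Param(defaultVal = "UTF-8")
        Charset encoding;

        @Param(defaultVal = "replace", choice = {"replace", "addto", "insertbefore", "insertafter"})
        String action;
    }

    // Injects <param> values (keyed by field name) into the annotated fields.
    static void inject(Object handler, Map<String, String> params) {
        for (Field field : handler.getClass().getDeclaredFields()) {
            Param param = field.getAnnotation(Param.class);
            if (param == null) continue;

            // Defaulting.
            String raw = params.getOrDefault(field.getName(), param.defaultVal());

            // "choice" validation.
            if (param.choice().length > 0 && !Arrays.asList(param.choice()).contains(raw)) {
                throw new IllegalArgumentException(
                        "Invalid value '" + raw + "' for param '" + field.getName() + "'");
            }

            // Type conversion (this sketch only knows Charset and String).
            Object value = field.getType() == Charset.class ? Charset.forName(raw) : raw;
            try {
                field.setAccessible(true);
                field.set(handler, value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Could not inject " + field.getName(), e);
            }
        }
    }

    public static void main(String[] args) {
        ExampleHandler handler = new ExampleHandler();
        inject(handler, Collections.singletonMap("action", "addto"));
        System.out.println(handler.encoding + " / " + handler.action);
    }
}
```

The handler itself contains no parsing, defaulting or validation code, which is the "cleaner code" advantage listed above.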

Lifecycle

Content Handler lifecycle is supported through the method annotations @Initialize and @Uninitialize.

Content Handler methods annotated with @Initialize will be automatically called by Smooks after handler creation and configuration. Content Handler methods annotated with @Uninitialize will be automatically called on VM shutdown (via a shutdown hook), or after calling Smooks.close() (useful where multiple Smooks instances are running in the same VM, with container managed lifecycle).

Stream Readers

Smooks relies on a "Stream Reader" for generating a stream of SAX events from the Source message data stream. A Stream Reader is a class that implements the XMLReader interface (or the SmooksXMLReader interface).

By default, Smooks uses the default XMLReader (XMLReaderFactory.createXMLReader()), but can be easily configured to read non-XML data Sources by configuring a specialized XMLReader:

<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd">

    <reader class="com.acme.ZZZZReader" />

    <!-- 
         Other Smooks resources, e.g. <jb:bindings> configs for 
         binding data from the ZZZZ data stream into Java Objects....
    -->

</smooks-resource-list>
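
As a rough idea of what a specialized reader such as the hypothetical com.acme.ZZZZReader has to do, the sketch below turns pipe-delimited text into SAX events using only the JDK. It is not the Smooks CSVReader: PipeDelimitedReaderSketch and the "records"/"record" element names are made up, the class extends XMLFilterImpl only to inherit the XMLReader plumbing, and it assumes the InputSource carries a character stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Illustration only: an XMLReader that emits SAX events for non-XML input.
class PipeDelimitedReaderSketch extends XMLFilterImpl {
    private final String[] fields;

    PipeDelimitedReaderSketch(String... fields) {
        this.fields = fields;
    }

    @Override
    public void parse(InputSource input) throws SAXException, IOException {
        ContentHandler handler = getContentHandler();
        Attributes none = new AttributesImpl();
        // Assumption: the source was created from a Reader, not a byte stream.
        BufferedReader reader = new BufferedReader(input.getCharacterStream());

        handler.startDocument();
        handler.startElement("", "records", "records", none);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) continue;
            String[] values = line.split("\\|", -1);
            handler.startElement("", "record", "record", none);
            for (int i = 0; i < fields.length && i < values.length; i++) {
                handler.startElement("", fields[i], fields[i], none);
                handler.characters(values[i].toCharArray(), 0, values[i].length());
                handler.endElement("", fields[i], fields[i]);
            }
            handler.endElement("", "record", "record");
        }
        handler.endElement("", "records", "records");
        handler.endDocument();
    }

    // Runs the reader over the data and records the events it generates.
    static List<String> events(String data, String... fields) {
        List<String> events = new ArrayList<>();
        PipeDelimitedReaderSketch reader = new PipeDelimitedReaderSketch(fields);
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("text:" + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(data)));
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("Could not read input", e);
        }
        return events;
    }
}
```

Anything downstream that consumes SAX events cannot tell that the source was not XML, which is why element visitors can be reused across XML and non-XML data.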

The reader can also be configured with a set of handlers, features and parameters. Here is a full example configuration.

<reader class="com.acme.ZZZZReader">
    <handlers>
        <handler class="com.X" />
        <handler class="com.Y" />
    </handlers>
    <features>
        <setOn feature="http://a" />
        <setOn feature="http://b" />
        <setOff feature="http://c" />
        <setOff feature="http://d" />
    </features>
    <params>
        <param name="param1">val1</param>
        <param name="param2">val2</param>
    </params>
</reader>

A number of non-XML Readers are available with Smooks out of the box:

  1. CSVReader
  2. SmooksEDIReader
  3. JSONReader
  4. XStreamXMLReader


Any of the above XMLReaders can be configured as outlined above, but some of them have specialized configuration namespaces that simplify configuration.

Example - CSVReader Configuration

<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.1.xsd">

    <!--
    Configure the CSV reader to parse the message into a stream of SAX events.
    -->
    <csv:reader fields="firstname,lastname,gender,age,country" separator="|" quote="'" skipLines="1" />

</smooks-resource-list>

Example - SmooksEDIReader Configuration

<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:edi="http://www.milyn.org/xsd/smooks/edi-1.1.xsd">

    <edi:reader mappingModel="/org/milyn/smooks/edi/edi-to-xml-mapping.xml" />

</smooks-resource-list>

Example - JSONReader Configurations

<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:json="http://www.milyn.org/xsd/smooks/json-1.1.xsd">

    <!--
    Basic configuration...
    -->
    <json:reader/>

</smooks-resource-list>

<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:json="http://www.milyn.org/xsd/smooks/json-1.1.xsd">

    <!--
    Key replacement...
    -->
    <json:reader>
        <json:keyMap>
            <json:key from="some key">someKey</json:key>
            <json:key from="some&amp;key" to="someAndKey" />
        </json:keyMap>
    </json:reader>

</smooks-resource-list>

<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:json="http://www.milyn.org/xsd/smooks/json-1.1.xsd">

    <!--
    Other configurations...
    -->
    <json:reader keyWhitspaceReplacement="_" keyPrefixOnNumeric="n" illegalElementNameCharReplacement="." nullValueReplacement="##NULL##" />

</smooks-resource-list>

To set features on the default reader, simply omit the class name from the configuration:

<reader>
    <features>
        <setOn feature="http://a" />
        <setOn feature="http://b" />
        <setOff feature="http://c" />
        <setOff feature="http://d" />
    </features>
</reader>

DOM Element Visitors & Serializers

DOMElementVisitors and SerializationUnits are Content Handler specializations that are applied during the DOM Filtering process.

DOMElementVisitors implement a very simple 2 method interface:

public interface DOMElementVisitor extends ContentHandler {

    public abstract void visitBefore(Element element, ExecutionContext executionContext) throws SmooksException;

    public abstract void visitAfter(Element element, ExecutionContext executionContext) throws SmooksException;
}

See the DOM Filtering section for details on when the SmooksDOMFilter calls the visitBefore and visitAfter methods.

Points to note:

  1. The best way to get a hands on feel for how to write your own DOMElementVisitor, or reuse some of the implementations already available in the Smooks Cartridges, is to check out the tutorials.
  2. It's not likely that you'll ever need to worry about SerializationUnits. For the most part, the default serialization fits the bill.

SAX Element Visitors

SAXElementVisitors are Content Handler specializations that are applied during the SAX Filtering process.

SAXElementVisitors implement a very simple 4 method interface:

public interface SAXElementVisitor extends ContentHandler {

    public abstract void visitBefore(SAXElement element, ExecutionContext executionContext) throws SmooksException, IOException;

    public abstract void onChildText(SAXElement element, SAXText childText, ExecutionContext executionContext) throws SmooksException, IOException;

    public abstract void onChildElement(SAXElement element, SAXElement childElement, ExecutionContext executionContext) throws SmooksException, IOException;

    public abstract void visitAfter(SAXElement element, ExecutionContext executionContext) throws SmooksException, IOException;
}

See the SAX Filtering section for details on when the SmooksSAXFilter calls the visitor methods.

Smooks Cartridges

The basic functionality of Smooks Core can be extended through the creation of what we call a "Smooks Cartridge". A Cartridge is simply a Java archive (jar) containing reusable Content Handlers (Visitor Logic). A Smooks Cartridge should provide "ready to use" support for a specific type of XML analysis or transformation.

Using Maven?

Name | DOM Support | SAX Support | Description
-----|-------------|-------------|------------
JavaBean | Yes | Yes | Enables population of a Java Object Model from data embedded in a data stream (XML, non XML, Java etc). See Tutorials. Download.
Templating | FreeMarker: Yes, XSL: Yes, StringTemplate: Yes | FreeMarker: Yes, XSL: No, StringTemplate: No | Enables fragment-level templating using different templating solutions e.g. FreeMarker, StringTemplate and XSLT. See Tutorials. Download.
Routing | File: Yes, JMS: Yes, Database: Yes | File: Yes, JMS: Yes, Database: Yes | Enables routing of message fragments (including populated object models) to a range of different destination types. See Tutorials. Download.
Scripting | Groovy: Yes | Groovy: Yes | Enables fragment-level Transformation/Analysis using different scripting languages. Currently supports Groovy. See Tutorials. Download.
EDI | Yes | Yes | Smooks Cartridge that converts an EDI message data stream into a stream of SAX events. Download.
CSV | Yes | Yes | Smooks Cartridge that converts a Comma Separated Value (CSV) data stream into a stream of SAX events. Download.
JSON | Yes | Yes | Smooks Cartridge that converts a JSON formatted data stream into a stream of SAX events. (Since v1.1).
Misc | Yes | No | Contains miscellaneous resources for performing common analysis/transformation tasks on an XML stream e.g. rename an element, delete an element, delete an attribute etc. Download.
Servlet | Yes | No | Plugs Smooks into the J2EE Servlet Container. This allows Smooks to be used for Servlet Response Analysis and Transformation e.g. to optimise the Servlet Response for the requesting browser make/model. See Tutorials. Download.
CSS | Yes | No | Makes Cascading Style Sheet (CSS) information easily available to web content analysis or transformation logic. Supports linked or inline CSS. Download.
Calc | Yes | Yes | Smooks Cartridge that can do simple calculation tasks. At the moment it only contains a Counter visitor. (Since v1.1).

Smooks Cartridges

Javabean Cartridge

Javabean wiring internals

The topics in this section discuss some of the internal workings of the Javabean Cartridge. You don't have to read this section if you just want to use the cartridge, but you should read it if you want to know more about what happens inside the cartridge.

Wiring step by step

This topic will explain when and how the methods of the javabeans get called.

In the following example two javabeans get wired together:

This is, step by step, what happens.

  1. On the order before visit:
    1. The bean example.model.Order is created
    2. The bean is put in the bean map under the beanId order. If a bean already exists under the beanId order, then the following things happen:
      1. All the child life cycle associated objects get removed from the bean map. Further on in this chapter we will explain more about the bean life cycle association.
      2. The existing bean gets overwritten by the new Order bean.
    3. The cartridge searches in the bean map for a bean with the beanId header.
    4. Because no bean is found, the cartridge adds a bean life cycle event observer for the begin life cycle event of the beanId header. This is called late wiring. The observer will be notified when a bean is added with the beanId header. Further on in this chapter we will explain more about the bean life cycle events.
  2. On the header before visit:
    1. The bean example.model.Header is created. If a bean already exists under the beanId header, then the same thing happens as in step 1.2.
    2. The bean is put in the beanMap under the beanId header.
      1. Because there is a bean life cycle observer for the begin life cycle event, this observer is notified that a bean has been added with the beanId header. The observer of the order configuration sets the header object on the Order object, using the setter method Order#setHeader(Header). An important thing to know is that the life cycle of the header bean is associated with the life cycle of the order bean.
    3. The cartridge searches for a bean with the beanId order in the beanMap. It directly finds the bean and sets it on the Header object using the setter method Header#setOrder(Order). This is called direct wiring.
  3. Nothing happens in the after visit of the two beans, so we are done.
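The late and direct wiring steps above can be sketched in plain Java. This is a minimal illustration and not the cartridge's implementation: WiringBeanMap, its onBegin and put methods, and the WiringSketch class are hypothetical stand-ins, and only the Order#setHeader(Header) and Header#setOrder(Order) setters come from the walkthrough. Replacement of existing beans and life cycle association are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Model classes named after the beans in the walkthrough.
class Order {
    private Header header;
    void setHeader(Header header) { this.header = header; }
    Header getHeader() { return header; }
}

class Header {
    private Order order;
    void setOrder(Order order) { this.order = order; }
    Order getOrder() { return order; }
}

// Hypothetical stand-in for the bean map; not the Smooks API.
class WiringBeanMap {
    private final Map<String, Object> beans = new HashMap<>();
    private final Map<String, List<Consumer<Object>>> beginObservers = new HashMap<>();

    Object get(String beanId) { return beans.get(beanId); }

    // Register an observer for the begin life cycle event of a beanId.
    void onBegin(String beanId, Consumer<Object> observer) {
        beginObservers.computeIfAbsent(beanId, id -> new ArrayList<>()).add(observer);
    }

    // Add a bean and notify the begin observers registered for its beanId.
    void put(String beanId, Object bean) {
        beans.put(beanId, bean);
        for (Consumer<Object> observer
                : beginObservers.getOrDefault(beanId, Collections.emptyList())) {
            observer.accept(bean);
        }
    }
}

class WiringSketch {
    // Replays the two "before visit" steps and returns a trace of the wiring.
    static List<String> run(WiringBeanMap beanMap) {
        List<String> trace = new ArrayList<>();

        // 1. Before visit of the order element.
        Order order = new Order();
        beanMap.put("order", order);
        if (beanMap.get("header") == null) {
            // Late wiring: wait for the header bean to begin its life cycle.
            beanMap.onBegin("header", bean -> {
                order.setHeader((Header) bean);
                trace.add("late wiring: Order#setHeader(Header)");
            });
        }

        // 2. Before visit of the header element.
        Header header = new Header();
        beanMap.put("header", header); // notifies the observer registered in step 1
        Object existingOrder = beanMap.get("order");
        if (existingOrder != null) {
            // Direct wiring: the order bean is already in the bean map.
            header.setOrder((Order) existingOrder);
            trace.add("direct wiring: Header#setOrder(Order)");
        }
        return trace;
    }

    public static void main(String[] args) {
        run(new WiringBeanMap()).forEach(System.out::println);
    }
}
```

Running main prints the late wiring line first and the direct wiring line second, which matches the order of the steps in the walkthrough.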

Bean life cycle

Every bean that is created by the Javabean Cartridge has a life cycle within Smooks. It is important to know that this bean life cycle is internal to the Javabean Cartridge and has nothing to do with the bean object itself. The bean has three life cycle events: begin, change and end.

Life cycle event

Description

begin

When the bean is added to the bean map, the life cycle of the bean begins. As you saw, it is possible to add observers to the BeanAccessor (which controls the bean map) that get notified when a bean begins its life cycle.

change

The change event occurs when the bean object gets replaced in the bean map, but the context (data) of the bean stays the same. This happens when a list that was already in the bean map gets converted to an array and that array is placed in the bean map as a replacement for the list. The object is different but the context stays the same. It is possible to add observers to the BeanAccessor that get notified when a bean life cycle change happens.

end

The life cycle of the bean ends when Smooks stops using the bean. That doesn't mean that the bean is no longer in the bean map or that you can't use it outside of Smooks. The life cycle end can happen in three situations:

  1. Smooks is done filtering the document. In this case the beans stay available in the bean map, so that you can retrieve them from the BeanResult object.
  2. The bean is being replaced in the beanMap by a different bean object, with the same beanId.
  3. The bean is being removed from the beanMap. This happens only when a parent life cycle associated bean gets removed or replaced. More about that later in this chapter.
    At the moment it isn't possible to register an observer for this life cycle event.
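The begin and change events can be illustrated with a small, self-contained sketch. It assumes a simplified bean map that notifies observers directly. LifecycleBeanMap, its Observer interface, and the addBean and changeBean methods are hypothetical names chosen for this illustration and are not the BeanAccessor API. The end event is left out because, as noted above, it cannot be observed at the moment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the bean map and its observers; not the Smooks API.
class LifecycleBeanMap {
    enum Event { BEGIN, CHANGE }

    interface Observer {
        void onEvent(Event event, String beanId, Object bean);
    }

    private final Map<String, Object> beans = new HashMap<>();
    private final List<Observer> observers = new ArrayList<>();

    void addObserver(Observer observer) { observers.add(observer); }

    Object get(String beanId) { return beans.get(beanId); }

    // A new bean enters the bean map: its life cycle begins.
    void addBean(String beanId, Object bean) {
        beans.put(beanId, bean);
        notifyObservers(Event.BEGIN, beanId, bean);
    }

    // The object is swapped, but the data it represents stays the same.
    void changeBean(String beanId, Object bean) {
        if (!beans.containsKey(beanId)) {
            throw new IllegalStateException("No bean under beanId " + beanId);
        }
        beans.put(beanId, bean);
        notifyObservers(Event.CHANGE, beanId, bean);
    }

    private void notifyObservers(Event event, String beanId, Object bean) {
        for (Observer observer : observers) {
            observer.onEvent(event, beanId, bean);
        }
    }
}

class LifecycleSketch {
    // Adds a list bean, then replaces it with the equivalent array.
    static List<String> run() {
        LifecycleBeanMap beanMap = new LifecycleBeanMap();
        List<String> trace = new ArrayList<>();
        beanMap.addObserver((event, beanId, bean) -> trace.add(event + " " + beanId));

        List<String> items = new ArrayList<>();
        items.add("item-1");
        beanMap.addBean("items", items);                           // begin
        beanMap.changeBean("items", items.toArray(new String[0])); // change
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [BEGIN items, CHANGE items]
    }
}
```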

Bean life cycle association

The life cycles of two beans can be associated with each other in a parent-child structure. This can create a tree of associated beans. This association comes into play when the parent bean gets removed from the bean map and thus its life cycle ends. When the parent is removed, all associated child beans are also recursively removed from the bean map.

The life cycles of two beans are associated when the parent bean configuration has a bean wiring on the child bean and the child bean is created and added to the bean map after the parent. In that situation the parent bean uses a begin life cycle event observer for the child beanId. The life cycle association makes sure that no child beans are left behind in the bean map when the parent bean gets replaced by a new bean. This is very useful and necessary in the following situation:

xml data
smooks configuration

After the first <a> node, the bean map contains the first example.mode.A object and the first example.mode.B object. The cartridge used an observer to catch the first B bean and set it on the A object. It also associated the B object as a child of the A object. When the cartridge proceeds to the next <a> node and adds the second A object to the bean map, it first removes the previous A object under the beanId a, together with all of its associated children. This prevents the second A object from being polluted with the first B object: if the first B object were still present in the bean map, the cartridge would set it on the second A bean instead of the second B object.
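The effect of life cycle association can be shown with a small sketch. It assumes a simplified bean map in which replacing a parent bean first removes its associated children, recursively. AssociatedBeanMap, its associate and put methods, and the use of plain strings as beans are all hypothetical and only illustrate the behaviour described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a bean map with life cycle association; not the Smooks API.
class AssociatedBeanMap {
    private final Map<String, Object> beans = new HashMap<>();
    // Parent beanId -> beanIds of the children whose life cycle is tied to the parent.
    private final Map<String, List<String>> children = new HashMap<>();

    Object get(String beanId) { return beans.get(beanId); }

    // Tie the life cycle of the child bean to the life cycle of the parent bean.
    void associate(String parentId, String childId) {
        children.computeIfAbsent(parentId, id -> new ArrayList<>()).add(childId);
    }

    // Replacing a bean ends the old bean's life cycle, so its children go first.
    void put(String beanId, Object bean) {
        removeChildren(beanId);
        beans.put(beanId, bean);
    }

    private void removeChildren(String parentId) {
        List<String> childIds = children.remove(parentId);
        if (childIds == null) {
            return;
        }
        for (String childId : childIds) {
            removeChildren(childId); // recurse into the tree of associated beans
            beans.remove(childId);
        }
    }
}

class AssociationSketch {
    // Returns true if the first B bean is still visible after the second A is added.
    static boolean staleChildVisible() {
        AssociatedBeanMap beanMap = new AssociatedBeanMap();

        // First <a> node: A is added, then B is added and associated as its child.
        beanMap.put("a", "first A");
        beanMap.put("b", "first B");
        beanMap.associate("a", "b");

        // Second <a> node: the new A replaces the old one, which removes the first B.
        beanMap.put("a", "second A");
        return beanMap.get("b") != null;
    }

    public static void main(String[] args) {
        System.out.println("stale B visible: " + staleChildVisible()); // prints "stale B visible: false"
    }
}
```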

Smooks Application Integration 

This section will show how Smooks can be integrated into an Enterprise Service Bus (ESB).

Smooks contains a class named org.milyn.container.plugin.PayloadProcessor that is intended to simplify the integration process. PayloadProcessor is a processor class for an abstract payload. It works out how to filter the supplied Object payload through Smooks to produce a result of the desired ResultType.

Implementing PayloadProcessor 

Construction of the PayloadProcessor instance can be performed in any initialization method available to the ESB. The constructor is shown below:

The resultType parameter is one of the enum values in org.milyn.container.plugin.ResultType and specifies the required type of the transformation output (note that this can be overridden, per transform, using a SourceResult object, as we will see later). These are the types currently available:

ResultType | Description
ResultType.STRING | String result, used for example when transforming XML.
ResultType.BYTES | Byte array result.
ResultType.JAVA | Java result, as would be expected when transforming to a Java Object.
ResultType.NORESULT | No result is generated. When transforming and routing, it is sometimes desirable to tell Smooks not to generate a result, to keep the memory footprint as small as possible.

After the PayloadProcessor instance has been created, you need to add a call to the PayloadProcessor's process method to invoke the Smooks filtering. This call should be added in the ESB's process method and looks like this:

The payload parameter can be any of the following:

Type | Description
java.lang.String | String input, such as XML.
byte[] | Byte array input.
java.io.Reader | Any Reader instance.
java.io.InputStream | Any InputStream instance.
javax.xml.transform.Source | Any Source instance.
org.milyn.container.plugin.SourceResult | Allows users to explicitly specify both the Source and Result payload carrier types. This can be used in situations where the required Source or Result is not supported among the default payload types.
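To make the payload types above more concrete, the following sketch shows one way a processor could map each supported payload type to a javax.xml.transform.Source before filtering. It is an illustration under stated assumptions and not the PayloadProcessor implementation: the PayloadSourceSketch class and its toSource method are hypothetical, and the SourceResult case is left out because it depends on Smooks classes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; it only mirrors the payload types listed in the table.
class PayloadSourceSketch {

    // Wraps a supported payload in a Source, or rejects it with an exception.
    static Source toSource(Object payload) {
        if (payload instanceof String) {
            return new StreamSource(new StringReader((String) payload));
        } else if (payload instanceof byte[]) {
            return new StreamSource(new ByteArrayInputStream((byte[]) payload));
        } else if (payload instanceof Reader) {
            return new StreamSource((Reader) payload);
        } else if (payload instanceof InputStream) {
            return new StreamSource((InputStream) payload);
        } else if (payload instanceof Source) {
            return (Source) payload; // already a Source, pass it through
        }
        String type = (payload == null) ? "null" : payload.getClass().getName();
        throw new IllegalArgumentException("Unsupported payload type: " + type);
    }

    public static void main(String[] args) {
        Source source = toSource("<order/>");
        System.out.println(source.getClass().getName());
    }
}
```

Handling the dispatch in one place like this keeps the ESB integration code free of type checks: it can hand over whatever payload it receives and let the processor decide how to read it.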

For an example of how the PayloadProcessor can be used, take a look at org.jboss.soa.esb.actions.smooks.SmooksAction in JBoss ESB.


