The scripting-versus-programming debate is not over; in fact, the definitions of the two are still blurry. In the context of Java, a program consists of a class definition and its structure. For example, the simplest program may contain just the class definition.
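The original listing did not survive on this page, so the following is only a minimal sketch of such a class. The name `Greeter` is an assumption, chosen to match the "greeter program" mentioned later.

```java
// Simplest possible Java program: a class definition with no members.
public class Greeter {
}
```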

Obviously, this particular code does nothing other than define the class. A minimally useful Java program needs a couple more lines of code. Let's try to write a program that prints "hello world".
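A sketch of what such a program might look like, again using the assumed class name `Greeter`:

```java
// A minimal Java greeter: the class now has an entry point.
public class Greeter {
    public static void main(String[] args) {
        // Print the greeting to standard output.
        System.out.println("hello world");
    }
}
```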

The same greeter program can be rewritten using a static block or even the constructor. So, programs follow strict guidelines in terms of structure.

On the other hand, scripts mix free-hand statements with semi- to fully structured method elements. Let's write the greeter program in Groovy.
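A one-line sketch of the Groovy version, which can be saved in a file (for instance a hypothetical `greeter.groovy`) and run with the `groovy` command:

```groovy
println "hello world"
```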

That's it! No class definitions, no curly brackets, no strict syntax requirements such as semicolons. Just a simple, readable one-line statement. All the other overhead has been "encapsulated". After all, encapsulation is one of the qualities of any object-oriented program (oops, script), isn't it?

OK, let's not delve too deeply into the contradiction between script and program, or into whether this is true 'encapsulation'. This article shows the power of scripting with Groovy by walking you through a business example. Enough has been said about Groovy's short-hand structure and how it shrinks coding elements to a fraction of their size while achieving the same result as its counterpart, Java. So, we will take a business problem and try to solve it with a Groovy script. In this context, think of Groovy as a replacement for your Perl or shell scripting. Groovy is not a competitor to Java; it complements it.

Feed files are a common scenario in business applications. An application receives or fetches feed files from another application or location, then processes and imports them into its own stream. In many situations we need to create a diff, or delta, between two successive feeds so that feed processing performs better. For our example, let's assume a feed with the XML structure given below.
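The original feed listing is not available on this page. As a stand-in, the sketch below writes two hypothetical sample feeds to a temporary folder so that there is something to experiment with. Only `workrequests`, `workrequest` and the `id` attribute are named in the article; every other element name, every value and both file names are assumptions.

```groovy
import java.nio.file.Files

// Hypothetical record layout: only workrequests, workrequest and the id
// attribute come from the article; all other names and values are assumed.
def record = { String id, String status ->
    """  <workrequest id="${id}">
    <client>Acme Corp</client>
    <requesttype>NEW_ORDER</requesttype>
    <requestdate>2010-03-15</requestdate>
    <product>Widget</product>
    <orderprocessing>${status}</orderprocessing>
  </workrequest>
""".toString()
}

// Wrap a list of records in the root element.
def feed = { List<String> records ->
    "<workrequests>\n${records.join('')}</workrequests>\n".toString()
}

// Last run: two records. Today: one unchanged, one modified, one new.
def lastRunXml = feed([record('1001', 'PENDING'), record('1002', 'PENDING')])
def todaysXml  = feed([record('1001', 'PENDING'), record('1002', 'SHIPPED'), record('1003', 'PENDING')])

def dir = Files.createTempDirectory('feeds').toFile()
new File(dir, 'lastrun.xml').text = lastRunXml
new File(dir, 'today.xml').text = todaysXml
println "Sample feeds written to ${dir}"
```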

Briefly, on the structure: external entities (customer applications) send work requests to the enterprise application through feed files. The application processes all the work request records by navigating the tree-like structure of the work request elements. Basically, each request contains client information, request type, date, supporting product information and order processing information. The real functionality of the application is out of scope for this example.

Our script needs exactly two inputs to create the delta file: the current feed file and the last run feed file. The script has to validate these inputs and respond to the caller accordingly. The idea is to build the script in three sections: the first verifies the input data, the second defines reusable blocks of code that do the actual check and find the differences, and the third is a small block that writes the output to a new file. Along the way we will also handle some exception cases.

Let's start building the script by validating the input data.

If the number of arguments is not equal to 2, we will not proceed. We can print the usage and exit.
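A minimal sketch of this check, assuming the script is saved as a hypothetical `FeedDelta.groovy` and that both arguments are paths to existing files. The `validate` closure, the usage text and the file-existence check are illustrative additions, not part of the original listing.

```groovy
// Returns null when the arguments are acceptable, otherwise a message for the caller.
def validate = { List<String> arguments ->
    if (arguments.size() != 2) {
        // Hypothetical script name, used only in the usage text.
        return 'Usage: groovy FeedDelta.groovy <todaysFile> <fileToCompare>'
    }
    def missing = arguments.find { !new File(it).isFile() }
    return missing ? "File not found: ${missing}".toString() : null
}

// In a Groovy script, 'args' holds the command-line arguments.
def problem = validate(args as List)
if (problem) {
    println problem
    return  // ends the script; System.exit(1) would also signal failure to the caller
}

def todaysFile = args[0]
def fileToCompare = args[1]
println "Comparing ${todaysFile} with ${fileToCompare}"
```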

So, the two arguments are the string values of the input file and the file to compare. The output folder name is also needed later, when the delta file is written.

Let's skip some of the sanity checks, such as whether the input file is XML and whether it follows the XSD structure. Let's assume well-structured, valid XML files, ready for processing. Groovy provides a special parser API for processing XML files using dotted notation. Let's load the input file using the API.
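The article does not name the parser here; `XmlSlurper` is one Groovy API that fits the description, so the sketch below assumes it. To keep the example runnable on its own, it parses an inline hypothetical feed with `parseText`; in the script itself the equivalent single statement would be `new XmlSlurper().parse(new File(todaysFile))`.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x the class is groovy.util.XmlSlurper and needs no import

// Stand-in for: def todaysFeed = new XmlSlurper().parse(new File(todaysFile))
// The element names below the workrequest level are assumptions.
def todaysFeed = new XmlSlurper().parseText('''<workrequests>
  <workrequest id="1001"><requesttype>NEW_ORDER</requesttype></workrequest>
  <workrequest id="1002"><requesttype>CANCEL</requesttype></workrequest>
</workrequests>''')

// Elements are reachable with dotted notation, attributes with the @ prefix.
println todaysFeed.workrequest.size()
println todaysFeed.workrequest[0].@id
println todaysFeed.workrequest[1].requesttype.text()
```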

With this single statement we have loaded the XML file into memory, and all its elements are available through dotted notation and XPath-like (GPath) queries. All we are interested in is checking whether each workrequest record in today's file is new or modified in comparison with the "corresponding record" in the last run feed file. Another assumption here is that the id attribute of the work request acts like a primary key and never changes.

Just as we loaded todaysFile into memory, we will load fileToCompare as well.

Now we need to write a simple closure that fetches the WorkRequest record from the last run file by searching for the workrequest id.
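One possible shape for this closure, assuming the feed was loaded with `XmlSlurper`. The name `findWorkRequest` and the inline sample data are hypothetical.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x no import is needed

// Hypothetical last run feed, parsed inline so the sketch runs on its own.
def lastRunFeed = new XmlSlurper().parseText('''<workrequests>
  <workrequest id="1001"><requesttype>NEW_ORDER</requesttype></workrequest>
  <workrequest id="1002"><requesttype>CANCEL</requesttype></workrequest>
</workrequests>''')

// Returns the work request with the given id, or null when there is none.
def findWorkRequest = { feed, String id ->
    // find() yields an empty result, not null, when nothing matches.
    def match = feed.workrequest.find { it.@id.text() == id }
    match.isEmpty() ? null : match
}

println findWorkRequest(lastRunFeed, '1002').requesttype.text()
println findWorkRequest(lastRunFeed, '9999')
```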

The above routine searches the entire last run file, looking for a specific work request using the XPath-like query. If it finds a matching record, the entire work request is returned; otherwise null is returned. Now we need to write another small routine that checks whether two elements (from today's file and the last run file) differ.
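The article does not say how the two elements are compared. The sketch below takes one simple approach, which is an assumption: serialize both elements with `groovy.xml.XmlUtil` and compare the resulting strings. The name `isChanged` and the sample records are hypothetical.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x no import is needed
import groovy.xml.XmlUtil

// True when the two work request elements serialize to different XML text.
def isChanged = { current, previous ->
    XmlUtil.serialize(current) != XmlUtil.serialize(previous)
}

def parse = { String xml -> new XmlSlurper().parseText(xml) }
def previous  = parse('<workrequest id="1002"><orderprocessing>PENDING</orderprocessing></workrequest>')
def unchanged = parse('<workrequest id="1002"><orderprocessing>PENDING</orderprocessing></workrequest>')
def modified  = parse('<workrequest id="1002"><orderprocessing>SHIPPED</orderprocessing></workrequest>')

println isChanged(unchanged, previous)
println isChanged(modified, previous)
```

A string comparison treats any difference in content or attributes as a change, which is usually what a delta needs; a field-by-field comparison would be the alternative if some elements should be ignored.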

Simple enough, isn't it?

With the two helper routines ready to fire, we need to loop through the feed file to see which records have changed and which have not. Changed or new records will be included in the delta file, and unchanged records will be left out.
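A self-contained sketch of this loop. It repeats hypothetical versions of the two helpers (`findWorkRequest`, `isChanged`) and uses inline sample feeds; only the list name `deltaRequests` comes from the article.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x no import is needed
import groovy.xml.XmlUtil

def slurp = { String xml -> new XmlSlurper().parseText(xml) }

// Hypothetical feeds: 1001 is unchanged, 1002 is modified, 1003 is new.
def lastRunFeed = slurp('''<workrequests>
  <workrequest id="1001"><orderprocessing>PENDING</orderprocessing></workrequest>
  <workrequest id="1002"><orderprocessing>PENDING</orderprocessing></workrequest>
</workrequests>''')
def todaysFeed = slurp('''<workrequests>
  <workrequest id="1001"><orderprocessing>PENDING</orderprocessing></workrequest>
  <workrequest id="1002"><orderprocessing>SHIPPED</orderprocessing></workrequest>
  <workrequest id="1003"><orderprocessing>PENDING</orderprocessing></workrequest>
</workrequests>''')

// Helper sketches: look up a record by id, and compare two records as XML text.
def findWorkRequest = { feed, String id ->
    def match = feed.workrequest.find { it.@id.text() == id }
    match.isEmpty() ? null : match
}
def isChanged = { current, previous ->
    XmlUtil.serialize(current) != XmlUtil.serialize(previous)
}

// Keep every record that is new (no previous version) or changed.
def deltaRequests = []
todaysFeed.workrequest.each { request ->
    def previous = findWorkRequest(lastRunFeed, request.@id.text())
    if (previous == null || isChanged(request, previous)) {
        deltaRequests << request
    }
}

println deltaRequests.collect { it.@id.text() }
```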

Now we have collected all the records to be processed in the deltaRequests list. The only thing pending is to attach this list under the workrequests element of the XML and write the delta file. To write the XML output there is an API named groovy.xml.StreamingMarkupBuilder. We will use it to write the file.
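A sketch of the writing step under a few assumptions: the output file name `delta.xml` is hypothetical, a temporary directory stands in for the output folder, and `deltaRequests` is filled from an inline sample so the example runs on its own.

```groovy
import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x no import is needed
import java.nio.file.Files

// Stand-in for the list collected by the comparison loop.
def sample = new XmlSlurper().parseText('''<workrequests>
  <workrequest id="1002"><orderprocessing>SHIPPED</orderprocessing></workrequest>
  <workrequest id="1003"><orderprocessing>PENDING</orderprocessing></workrequest>
</workrequests>''')
def deltaRequests = sample.workrequest.collect { it }

// Hypothetical output location and file name.
def outputFolder = Files.createTempDirectory('delta').toFile()
def deltaFile = new File(outputFolder, 'delta.xml')

def builder = new StreamingMarkupBuilder()
builder.encoding = 'UTF-8'
def delta = builder.bind {
    mkp.xmlDeclaration()
    // Re-create the root element and copy each collected record under it.
    workrequests {
        deltaRequests.each { request -> mkp.yield(request) }
    }
}

// Nothing is generated until the markup is written out.
deltaFile.withWriter('UTF-8') { writer -> writer << delta }
println deltaFile.text
```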

With a few very simple processing steps, we have written a clean script that processes two similarly structured XML files and creates the delta file. When this script is plugged into an application that processes redundant feeds daily, we can significantly reduce the processing time by creating the delta file and using it as the input to the application.
