How Groovy works behind the scenes
With your Groovy scripts, you have several options. You can:
- run groovyc on them to create Java *.class files
- evaluate the script at runtime (either from a file or from a Java String)
- make your script available as Java Class-objects through the GroovyClassLoader (no *.class files generated!)
No *.java source files are ever generated.
All of the above may seem like magic, and in a way it is: the magic of imaginative, mindful software engineering.
Everything starts with the Groovy grammar.
There is the notion of the 'new' and the 'old' parser. Only the new one (as of February 2005) is described here.
The Groovy grammar is written as an ANTLR (= ANother Tool for Language Recognition)
grammar. ANTLR can handle grammars of type LL(k); the Java grammar is of type LL(2)
and the Groovy grammar is of type LL(3).
The difference is the amount of lookahead the parser needs in order to recognize,
for example, "==" (2 for Java) or "===" (3 for Groovy).
More precisely, the problem lies in recognizing
the first "=" character: the parser has to "look ahead" to determine its meaning.
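The lookahead idea can be illustrated with a minimal hand-written sketch. It is not the code ANTLR generates for Groovy; the class name LookaheadSketch and the method tokenize are hypothetical and exist only for this example. On seeing a "=", the scanner peeks at up to two further characters before deciding which operator it has found.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the lexer ANTLR generates for Groovy.
final class LookaheadSketch {

    // Splits the input into "=", "==", "===" operators and other words.
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '=') {
                // Look ahead up to two characters to learn what the first '=' means.
                boolean second = i + 1 < src.length() && src.charAt(i + 1) == '=';
                boolean third = second && i + 2 < src.length() && src.charAt(i + 2) == '=';
                String op = third ? "===" : second ? "==" : "=";
                tokens.add(op);
                i += op.length();
            } else if (Character.isWhitespace(c)) {
                i++; // skip whitespace between tokens
            } else {
                // Consume a run of characters up to the next '=' or whitespace.
                int start = i;
                while (i < src.length()
                        && src.charAt(i) != '='
                        && !Character.isWhitespace(src.charAt(i))) {
                    i++;
                }
                tokens.add(src.substring(start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a = b"));   // [a, =, b]
        System.out.println(tokenize("a == b"));  // [a, ==, b]
        System.out.println(tokenize("a === b")); // [a, ===, b]
    }
}
```

With a lookahead of only one extra character, this scanner could not tell "==" from the start of "===".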
ANTLR formulates the grammar in terms of "rules" that
fully implement EBNF (Extended Backus-Naur Form), enriched with Java code
blocks, special functions and some other things.
When a rule is recognized,
it can trigger actions written as ordinary Java
code. ANTLR generates Java code for a parser capable of
recognizing the given grammar, and this parser executes the code
blocks embedded in the grammar: the "actions".
Parser Generation and AST
ANTLR takes the Groovy grammar file "Groovy.g" to create the Groovy parser.
When the parser is fed the source code of a Groovy script, it
produces the AST (= Abstract Syntax Tree), which
represents that code as a run-time structure.
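To make the step from rules to a tree more concrete, here is a small sketch of a hand-written parser for a toy arithmetic grammar. It is not generated by ANTLR and has nothing to do with Groovy.g; the grammar, the class AstSketch and its Node type are assumptions made up for this illustration. Each method corresponds to one rule, and the line that creates a Node plays the role of an action that runs when the rule matches.

```java
// Hypothetical toy grammar (not Groovy.g):
//   expr : term ('+' term)* ;
//   term : INT  ('*' INT)*  ;
final class AstSketch {

    /** A tree node: either a number leaf or a binary operation. */
    static final class Node {
        final String label;     // operator symbol or number text
        final Node left, right; // both null for leaves

        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }

        @Override public String toString() {
            return left == null ? label : "(" + label + " " + left + " " + right + ")";
        }
    }

    private final String src;
    private int pos = 0;

    private AstSketch(String src) { this.src = src.replace(" ", ""); }

    /** Parses the whole input and returns the root of the tree. */
    static Node parse(String src) {
        AstSketch p = new AstSketch(src);
        Node root = p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return root;
    }

    // expr : term ('+' term)* ;
    private Node expr() {
        Node left = term();
        while (peek() == '+') {
            pos++;
            left = new Node("+", left, term()); // "action": build a tree node
        }
        return left;
    }

    // term : INT ('*' INT)* ;
    private Node term() {
        Node left = number();
        while (peek() == '*') {
            pos++;
            left = new Node("*", left, number()); // "action": build a tree node
        }
        return left;
    }

    private Node number() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("number expected at " + start);
        return new Node(src.substring(start, pos), null, null);
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    public static void main(String[] args) {
        System.out.println(parse("1 + 2 * 3")); // (+ 1 (* 2 3))
    }
}
```

The printed form shows the tree shape: the multiplication sits below the addition because the term rule is nested inside the expr rule.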
Byte Code Generation
From the AST, it is possible to create Java bytecode,
either to make it persistent as *.class files or to make it directly
available as Class objects through the GroovyClassLoader.
This class generation is done with the help of ObjectWeb's ASM tool.
(The ASM name does not mean anything: it is just a reference to the
"asm" keyword in C, which allows some functions to be implemented in assembly language.)
ASM provides a Java API to construct or modify bytecode; the class generator walks the AST and calls this API to emit the bytecode.
The API for bytecode generation relies heavily on the Visitor pattern.
The main entry point for the class generation is
It is a large class with many visitXYExpression
methods, which are called while converting the AST to bytecode. For example,
visitArrayExpression is called when creating arrays in bytecode.
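The way a visitor turns a tree into instructions can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use ASM or any Groovy classes, the types in VisitorSketch are hypothetical, the visit method names merely follow the visitXYExpression naming pattern, and the PUSH/ADD/MUL mnemonics are invented text rather than real JVM opcodes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the Visitor pattern; not Groovy's class generator.
final class VisitorSketch {

    interface Expr {
        void accept(ExprVisitor v);
    }

    interface ExprVisitor {
        void visitConstantExpression(Constant c);
        void visitBinaryExpression(Binary b);
    }

    static final class Constant implements Expr {
        final int value;
        Constant(int value) { this.value = value; }
        public void accept(ExprVisitor v) { v.visitConstantExpression(this); }
    }

    static final class Binary implements Expr {
        final char op;
        final Expr left, right;
        Binary(char op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        public void accept(ExprVisitor v) { v.visitBinaryExpression(this); }
    }

    /** Emits made-up stack-machine instructions as text. */
    static final class InstructionEmitter implements ExprVisitor {
        final List<String> out = new ArrayList<>();

        public void visitConstantExpression(Constant c) {
            out.add("PUSH " + c.value);
        }

        public void visitBinaryExpression(Binary b) {
            // Operands first, then the operator: post-order traversal.
            b.left.accept(this);
            b.right.accept(this);
            switch (b.op) {
                case '+': out.add("ADD"); break;
                case '*': out.add("MUL"); break;
                default: throw new IllegalArgumentException("unknown operator " + b.op);
            }
        }
    }

    /** Walks the tree once and returns the emitted instructions. */
    static List<String> compile(Expr e) {
        InstructionEmitter emitter = new InstructionEmitter();
        e.accept(emitter);
        return emitter.out;
    }

    public static void main(String[] args) {
        Expr tree = new Binary('+', new Constant(1),
                new Binary('*', new Constant(2), new Constant(3)));
        System.out.println(compile(tree)); // [PUSH 1, PUSH 2, PUSH 3, MUL, ADD]
    }
}
```

Each node type dispatches to its own visit method, so supporting a new kind of expression means adding one node class and one visit method; this is presumably why a class generator built this way grows large.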
More Links on the topic