Since Groovy 2.0, you can use the optional @TypeChecked annotation to activate type checking. In this mode, the compiler becomes stricter and reports errors for typos, calls to non-existent methods, and similar mistakes. This comes with a few limitations though, most of them stemming from the fact that Groovy remains inherently a dynamic language. For example, you wouldn't be able to use type checking on a markup builder:
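A minimal sketch of such a builder, assuming groovy.xml.MarkupBuilder and an illustrative StringWriter named writer. It runs fine dynamically, but annotating it with @TypeChecked would make compilation fail:

```groovy
import groovy.xml.MarkupBuilder

// 'writer' is an illustrative target for the generated markup
def writer = new StringWriter()
def builder = new MarkupBuilder(writer)

// html, head, body and p are not real methods of MarkupBuilder
builder.html {
    head {
        // further tags would go here
    }
    body {
        p 'Hello, world!'
    }
}
println writer
```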
In the previous example, none of the html, head, body or p methods exist. This works because Groovy uses dynamic dispatch and resolves those method calls at runtime. There is no limit on the number of tags that you can use, nor on their attributes.
On the other hand, Groovy is also a platform of choice when it comes to implementing internal DSLs. The flexible syntax, combined with runtime and compile-time metaprogramming capabilities, makes Groovy an interesting choice because it allows the programmer to focus on the DSL rather than on tooling or implementation. Since Groovy DSLs are Groovy code, it is easy to get IDE support without having to write a dedicated plugin, for example.
In a lot of cases, DSL engines are written in Groovy (or Java) and user code is then executed as scripts, meaning that you have some kind of wrapper on top of user logic. The wrapper may consist, for example, of a GroovyShell or GroovyScriptEngine that performs some tasks transparently before running the script (adding imports, applying AST transforms, extending a base script, ...). Often, user-written scripts reach production without testing, because the DSL is simple enough that any user may write code using the DSL syntax. In the end, users may not even realize that what they write is actually code. This adds some challenges for the DSL implementor, such as securing the execution of user code or, in this case, reporting errors early.
For example, imagine a DSL whose goal is to drive a rover on Mars remotely. Sending a message to the rover takes around 15 minutes. If the rover executes the script and fails with an error (say a typo), you have two problems:
- first, feedback comes only after 30 minutes (the time needed for the rover to get the script and the time needed to receive the error)
- second, some portion of the script has been executed and you may have to change the fixed script significantly (implying that you need to know the current state of the rover...)
Type checking extensions are a new mechanism in Groovy 2.1 that allows the developer of a DSL engine to make those scripts safer by applying the same kind of checks that static type checking performs on regular Groovy classes. The principle here is to fail early, that is to say, to fail compilation of scripts as soon as possible and, if possible, to provide feedback to the user (including nice error messages).
How does it work?
Since Groovy 2.1.0, the @TypeChecked annotation supports an attribute called extensions. This parameter takes an array of strings corresponding to a list of type checking extension scripts. Those scripts are looked up on the classpath at compile time. For example, you would write:
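A minimal sketch, assuming the extension script is named myextension.groovy and sits at the root of the compile-time classpath (the location is an assumption):

```groovy
import groovy.transform.TypeChecked

// The extension script is looked up on the classpath when foo is compiled
@TypeChecked(extensions = ['myextension.groovy'])
void foo() {
    // body checked by the standard rules plus those of the extension
}
```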
In that case, the foo method would be type checked with the rules of the normal type checker, completed by those found in the myextension.groovy script. Note that while the type checker internally supports multiple mechanisms to implement type checking extensions (including plain old Java code), the recommended way is to use those type checking extension scripts.
A DSL for type checking
The idea behind type checking extensions is to use a DSL to extend the type checker capabilities. This DSL allows you to hook into the compilation process, more specifically the type checking phase, using an "event-driven" API. For example, when the type checker enters a method body, it fires a beforeVisitMethod event that you can react to:
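As a sketch, a handler for this event inside an extension script could look like the following; the closure parameter name methodNode is illustrative:

```groovy
// Inside a type checking extension script
beforeVisitMethod { methodNode ->
    // methodNode is the MethodNode about to be type checked
    println "Entering ${methodNode.name}"
}
```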
Imagine that you have this rover DSL at hand. A user would write:
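A one-line script in such a DSL might be the following; robot is assumed to be a variable injected by the DSL engine, and the distance 100 is illustrative:

```groovy
// User-written DSL script: no declaration of 'robot' anywhere
robot.move 100
```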
If you have a class defined as such:
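A minimal sketch of such a class, assuming a single move method; the parameter name qt is illustrative:

```groovy
class Robot {
    // Returns the robot itself so that commands can be chained
    Robot move(int qt) { this }
}
```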
The script can be type checked before being executed using the following script:
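A sketch of such a host script, assuming the Robot class and the one-line user script from above. The try/catch is only there so that the example runs to completion and shows where compilation errors surface:

```groovy
import groovy.transform.TypeChecked
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.MultipleCompilationErrorsException
import org.codehaus.groovy.control.customizers.ASTTransformationCustomizer

class Robot {
    Robot move(int qt) { this }
}

def script = 'robot.move 100'   // the user-written DSL script

// Apply @TypeChecked to everything compiled by the shell
def config = new CompilerConfiguration()
config.addCompilationCustomizers(new ASTTransformationCustomizer(TypeChecked))

def shell = new GroovyShell(config)
shell.setVariable('robot', new Robot())

String failure = null
try {
    shell.evaluate(script)
} catch (MultipleCompilationErrorsException e) {
    // Type checking errors are raised at compile time, before any command runs
    failure = e.message
}
println failure
```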
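Using the compiler configuration above, @TypeChecked is applied transparently to the script. In that case, compilation fails because the robot variable is not declared anywhere in the script: the type checker reports an error of the form "[Static type checking] - The variable [robot] is undeclared."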
Now, we will slightly update the configuration to include the "extensions" parameter:
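A sketch of the updated configuration, assuming the extension script is named myextension.groovy. Only the customizer changes; the rest of the host script stays the same:

```groovy
import groovy.transform.TypeChecked
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ASTTransformationCustomizer

def config = new CompilerConfiguration()
config.addCompilationCustomizers(
    // Named arguments become the parameters of the @TypeChecked annotation
    new ASTTransformationCustomizer(TypeChecked, extensions: ['myextension.groovy'])
)
```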
Then add the following to the myextension.groovy file:
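A minimal sketch of the extension, assuming the Robot class is visible on the classpath used to compile the extension script:

```groovy
// myextension.groovy
unresolvedVariable { var ->
    if ('robot' == var.name) {
        // Tell the type checker which type the injected variable has
        storeType(var, classNodeFor(Robot))
        // Mark the error as handled so that it is not reported
        handled = true
    }
}
```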
Here, we're telling the compiler that if an unresolved variable is found and its name is robot, then the type of this variable is Robot.
The type checking API is a low level API, dealing with the Abstract Syntax Tree. You will have to know your AST well to develop extensions, even if the DSL makes it much easier than just dealing with AST code from plain Java or Groovy.
The type checker sends the following events, to which an extension script can react:
|Event name||Called when||Arguments||Usage|
|setup||Called after the type checker finished initialization||none||Can be used to perform setup of your extension.|
|finish||Called after the type checker completed type checking||none||Can be used to perform additional checks after the type checker has finished its job.|
|unresolvedVariable||Called when the type checker finds an unresolved variable||VariableExpression var||Allows the developer to help the type checker with user-injected variables.|
|unresolvedProperty||Called when the type checker cannot find a property on the receiver||PropertyExpression pexp||Allows the developer to handle "dynamic" properties.|
|unresolvedAttribute||Called when the type checker cannot find an attribute on the receiver||AttributeExpression aex||Allows the developer to handle missing attributes.|
|beforeMethodCall||Called before the type checker starts type checking a method call expression||MethodCall call||Allows you to intercept method calls before the type checker performs its own checks. This is useful if you want to replace the default type checking with a custom one for a limited scope. In that case, you must set the handled flag to true, so that the type checker skips its own checks.|
|afterMethodCall||Called once the type checker has finished type checking a method call||MethodCall call||Allows you to perform additional checks after the type checker has done its own checks. This is particularly useful if you want to perform the standard type checking tests but also want to ensure additional type safety, for example by checking the arguments against each other. Note that afterMethodCall is called even if you handled beforeMethodCall and set the handled flag to true.|
|onMethodSelection||Called by the type checker when it finds a method appropriate for a method call||Expression expr, MethodNode node||The type checker works by inferring the argument types of a method call, then choosing a target method. If it finds one that corresponds, it triggers this event. This is useful, for example, if you want to react to a specific method call, such as entering the scope of a method that takes a closure as argument (as in builders). Note that this event may be fired for various types of expressions, not only method calls (binary expressions, for example).|
|methodNotFound||Called by the type checker when it fails to find an appropriate method for a method call||ClassNode receiver, String name, ArgumentListExpression argList, ClassNode[] argTypes, MethodCall call||Unlike onMethodSelection, this event is sent when the type checker cannot find a target method for a method call (instance or static). It gives you the chance to intercept the error before it is sent to the user, and also to set the target method. For this, you need to return a list of MethodNode. In most situations, you would return either an empty list, meaning that no corresponding method was found, or a list with exactly one element, meaning that there is no doubt about the target method. If you return more than one MethodNode, the compiler reports an error stating that the method call is ambiguous and lists the possible methods. For convenience, if you want to return only one method, you may return it directly instead of wrapping it in a list.|
|beforeVisitMethod||Called by the type checker before type checking a method body||MethodNode node||The type checker calls this handler before starting to type check a method body. If you want, for example, to perform type checking by yourself instead of letting the type checker do it, you have to set the handled flag to true. This event can also be used to help define the scope of your extension (for example, applying it only if you are inside method foo).|
|afterVisitMethod||Called by the type checker after type checking a method body||MethodNode node||Gives you the opportunity to perform additional checks after a method body is visited by the type checker. This is useful if you collect information, for example, and want to perform additional checks once everything has been collected.|
|beforeVisitClass||Called by the type checker before type checking a class||ClassNode node||If a class is annotated with @TypeChecked, then before visiting the class, this event will be sent. It is also the case for inner classes defined inside a class annotated with @TypeChecked. It can help you define the scope of your extension, or you can even totally replace the visit of the type checker with a custom type checking implementation. For that, you would have to set the handled flag to true.|
|afterVisitClass||Called by the type checker after having finished the visit of a type checked class||ClassNode node||Called for every class being type checked after the type checker finished its work. This includes classes annotated with @TypeChecked and any inner/anonymous class defined in the same class which is not skipped.|
|incompatibleAssignment||Called when the type checker thinks that an assignment is incorrect, meaning that the right hand side of an assignment is incompatible with the left hand side||ClassNode lhsType, ClassNode rhsType, Expression assignment||Gives the developer the ability to handle incorrect assignments. This is useful, for example, if a class overrides setProperty, because in that case it is possible that assigning a variable of one type to a property of another type is handled through that runtime mechanism. In that case, you can help the type checker just by telling it that the assignment is valid (by setting handled to true).|
Of course, an extension script may consist of several blocks, and you can have multiple blocks responding to the same event. This makes the DSL look nicer and easier to write. However, reacting to events is not sufficient on its own: once you can react to events, you also need to deal with errors, which is why the DSL provides several "helper" methods that make things easier.
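As a sketch, an extension script with two blocks reacting to the same event could look like this. The Robot class is assumed to be on the classpath, and the misspelled name robto is a hypothetical typo used for illustration; addStaticTypeError is one of the helper methods inherited from TypeCheckingExtension:

```groovy
// First block: declare the type of the injected 'robot' variable
unresolvedVariable { var ->
    if (var.name == 'robot') {
        storeType(var, classNodeFor(Robot))
        handled = true
    }
}

// Second block, same event: report a friendlier error instead of the default one
unresolvedVariable { var ->
    if (var.name == 'robto') {
        addStaticTypeError("Unknown variable 'robto'. Did you mean 'robot'?", var)
        handled = true
    }
}
```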
Working with extensions
The DSL relies on a support class called org.codehaus.groovy.transform.stc.GroovyTypeCheckingExtensionSupport. This class itself extends org.codehaus.groovy.transform.stc.TypeCheckingExtension. Those two classes define a number of "helper" methods that will make working with the AST easier, especially regarding type checking. One interesting thing to know is that you have access to the type checker. This means that you can programmatically call methods of the type checker, including those that allow you to throw compilation errors.
The extension script delegates to the GroovyTypeCheckingExtensionSupport class, meaning that you have direct access to the following variables:
- context: the type checker context, of type org.codehaus.groovy.transform.stc.TypeCheckingContext
- typeCheckingVisitor: the type checker itself, an org.codehaus.groovy.transform.stc.StaticTypeCheckingVisitor instance
- generatedMethods: a list of "generated methods", which is in fact the list of "dummy" methods that you can create inside a type checking extension using the newMethod methods
The type checking context contains a lot of contextual information that is useful to the type checker, for example the current stack of enclosing method calls, binary expressions, closures, ... This information is particularly important if you need to know "where" you are when an error occurs and you want to handle it.
Handling class nodes is something that needs particular attention when you work with a type checking extension. Compilation works with an abstract syntax tree (AST) and the tree may not be complete when you are type checking a class. This also means that when you refer to types, you must not use class literals such as String or HashSet, but class nodes representing those types. This requires a certain level of abstraction and an understanding of how Groovy deals with class nodes. To make things easier, Groovy supplies several helper methods to deal with class nodes. For example, if you want to say "the type for String", you can write:
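A short sketch, meant to be placed inside an extension script, where classNodeFor is available as a helper:

```groovy
import org.codehaus.groovy.ast.ClassNode

// Obtain the resolved class node for java.lang.String
ClassNode stringType = classNodeFor(String)
assert stringType.name == 'java.lang.String'
```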
Note also that there is a variant of classNodeFor that takes a String as an argument, instead of a Class. In general, you should not use that one, because it would create a class node for which the name is "String", but without any method, any property, ... defined on it. The first version returns a class node that is resolved but the second one returns one that is not. So the latter should be reserved for very special cases.
The second problem that you might encounter is referencing a type which is not yet compiled. This may happen more often than you think, for example when you compile a set of files together. In that case, if you want to say "that variable is of type Foo" but Foo is not yet compiled, you can still refer to the "Foo" class node using lookupClassNodeFor.
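A short sketch, again inside an extension script, assuming a class named Foo is part of the same compilation unit:

```groovy
import org.codehaus.groovy.ast.ClassNode

// Foo may not be compiled yet, so it is looked up by name
// among the classes of the source unit being compiled
def fooType = lookupClassNodeFor('Foo')
assert fooType instanceof ClassNode
```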
Helping the type checker
Say that you know that variable foo is of type Foo and you want to tell the type checker about it. You can then use the storeType method, which takes two arguments: the node for which you want to store the type, and the type of that node. If you look at the implementation of storeType, you will see that it delegates to the type checker's equivalent method, which itself does a lot of work to store node metadata. You will also see that storing the type is not limited to variables: you can set the type of any expression.
Likewise, getting the type of an AST node is just a matter of calling getType on that node. This would in general be what you want, but there's something that you must understand:
- getType returns the inferred type of an expression. For a variable declared with type Object, it does not return the class node for Object but the type inferred for this variable at this point in the code (flow typing)
- if you want to access the origin type of a variable (or field/parameter), you must call the appropriate method on the AST node
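The difference can be sketched as follows, assuming a handler that inspects the receiver of a method call. The choice of afterMethodCall and the printed message are illustrative:

```groovy
afterMethodCall { call ->
    if (isMethodCallExpression(call) && isVariableExpression(call.objectExpression)) {
        def receiver = call.objectExpression
        // Type inferred at this point of the code (flow typing)
        def inferred = getType(receiver)
        // Type the variable was declared with
        def declared = receiver.originType
        println "${receiver.name}: declared ${declared.name}, inferred ${inferred.name}"
    }
}
```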
It is often required to know the type of an AST node. For readability, the DSL provides a special isXXXExpression method that will delegate to "x instanceof XXXExpression". For example, instead of writing:
which requires you to import the BinaryExpression class, you can just write:
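Both forms side by side, as a sketch inside an onMethodSelection handler (an event that may also fire for binary expressions); the closure parameter names are illustrative:

```groovy
import org.codehaus.groovy.ast.expr.BinaryExpression

onMethodSelection { expr, method ->
    // Explicit check, which relies on the import above
    if (expr instanceof BinaryExpression) {
        println 'binary expression (instanceof)'
    }
    // Equivalent check using the DSL shortcut
    if (isBinaryExpression(expr)) {
        println 'binary expression (shortcut)'
    }
}
```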
When you perform type checking of dynamic code, you may often face the case where you know that a method call is valid but there is no "real" method behind it. As an example, take the Grails dynamic finders. You can have a method call consisting of a method named findByName(...). As there's no findByName method defined in the bean, the type checker would complain. Yet, you know that this method wouldn't fail at runtime, and you can even tell what the return type of this method is. For this case, the DSL supports special constructs called "virtual methods". This means that you will return a method node that doesn't really exist but is defined in the context of type checking. Three methods exist:
- newMethod(String name, Class returnType)
- newMethod(String name, ClassNode returnType)
- newMethod(String name, Callable<ClassNode> returnType)
Should you need more than the name and return type, you can always create a new MethodNode by yourself.
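A sketch for the dynamic finder case, assuming that any method whose name starts with findBy should be accepted and that Object is an acceptable return type (both are assumptions made for illustration):

```groovy
methodNotFound { receiver, name, argList, argTypes, call ->
    // Accept any findByXXX(...) call, as dynamic finders would at runtime
    if (name.startsWith('findBy')) {
        handled = true
        // Virtual method: it exists only for the type checker
        return newMethod(name, classNodeFor(Object))
    }
}
```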
Scoping is very important in DSL type checking and is one of the reasons why we couldn't use a pointcut based approach to DSL type checking. Basically, you must be able to define very precisely when your extension applies and when it does not. Moreover, you must be able to handle situations that a regular type checker would not be able to handle, such as forward references:
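A forward reference could look like this in a hypothetical geometry DSL; the point and line keywords are invented for illustration:

```groovy
// Hypothetical geometry DSL: 'b' is used before it is defined
point a(1, 1)
line a, b
point b(5, 2)
```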
Say for example that you want to handle a builder:
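A sketch of such a builder call; builder, bar and baz are hypothetical names, and only foo is taken from the text:

```groovy
// bar and baz are only meaningful inside the foo closure
builder.foo {
    bar
    baz(bar)
}
```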
Your extension, then, should only be active once you've entered the foo method, and inactive outside of this scope. But you could have complex situations like multiple builders in the same file or embedded builders (builders in builders). While you should not try to fix all this from the start (you must accept limitations to type checking), the type checker does offer a nice mechanism to handle this: a scoping stack, using the scopeEnter and scopeExit methods.
- scopeEnter creates a new scope and puts it on top of the stack
- scopeExit pops a scope from the stack
A scope consists of:
- a parent scope
- a map of custom data
The custom data is what makes deferred checks possible: if at some point you are not able to determine the type of an expression, or you cannot yet check whether an assignment is valid, you can still perform the check later. This is a very powerful feature. Now, scopeEnter and scopeExit provide some interesting syntactic sugar:
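A sketch of the closure-based forms. It assumes that the helper opening a scope is exposed as newScope on GroovyTypeCheckingExtensionSupport, which is the name used in released Groovy versions for what the text calls scopeEnter. The variable secondPassChecks is illustrative, and the calls are shown out of context; in practice they belong inside event handlers:

```groovy
// Open a scope and initialize a custom variable in its data map
newScope {
    secondPassChecks = []
}

// Later, from any event handler: register a deferred check
currentScope.secondPassChecks << { println 'executed later' }

// Run whatever was deferred, then pop the scope
scopeExit {
    secondPassChecks*.call()
}
```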
At any time in the DSL, you can access the current scope using getCurrentScope() or more simply currentScope. The general schema would be, then:
- determine a "pointcut" where you push a new scope onto the stack and initialize custom variables within this scope
- using the various events, you can use the information stored in your custom scope to perform checks, defer checks,...
- determine a "pointcut" where you exit the scope, call scopeExit and optionally perform additional checks
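The three steps can be sketched as follows for a builder method named foo. The sketch assumes the scope-opening helper is named newScope (the name used in released Groovy versions), and the deferredChecks variable and the printed message are illustrative:

```groovy
// 1. Pointcut: a call to foo opens a scope with its own custom data
beforeMethodCall { call ->
    if (isMethodCallExpression(call) && call.methodAsString == 'foo') {
        newScope {
            deferredChecks = []
        }
    }
}

// 2. Inside the scope: record checks that cannot be performed yet
unresolvedVariable { var ->
    if (currentScope?.deferredChecks != null) {
        currentScope.deferredChecks << { println "checking ${var.name} later" }
        // Accept the variable for now; the real check is deferred
        handled = true
    }
}

// 3. Pointcut: leaving foo runs the deferred checks and pops the scope
afterMethodCall { call ->
    if (isMethodCallExpression(call) && call.methodAsString == 'foo') {
        scopeExit {
            deferredChecks*.call()
        }
    }
}
```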
Other useful methods
For the complete list of helper methods, please refer to the org.codehaus.groovy.transform.stc.GroovyTypeCheckingExtensionSupport and org.codehaus.groovy.transform.stc.TypeCheckingExtension classes. However, pay special attention to these methods:
- isDynamic: takes a VariableExpression as argument and returns true if the variable is backed by a DynamicVariable, which means, in a script, that it wasn't defined using a type or def.
- isGenerated: takes a MethodNode as an argument and tells if the method is one that was generated by the type checker extension using the newMethod method
- isAnnotatedBy: takes an AST node and a Class (or ClassNode), and tells if the node is annotated with this class. For example: isAnnotatedBy(node, NotNull)
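A sketch combining the three helpers. Rejecting calls to methods annotated with @Deprecated is an illustrative policy, not something the type checker does by default:

```groovy
unresolvedVariable { var ->
    // True for script variables that were not declared with a type or def
    if (isDynamic(var)) {
        println "Dynamic variable: ${var.name}"
    }
}

onMethodSelection { expr, method ->
    // Virtual methods created with newMethod are flagged as generated
    if (isGenerated(method)) {
        println "Selected a generated method: ${method.name}"
    }
    // Annotation check on the selected method
    if (isAnnotatedBy(method, Deprecated)) {
        addStaticTypeError("Call to deprecated method ${method.name}", expr)
    }
}
```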