Metadata

Number: GEP-8
Title: Static Type Checking
Version: 8
Type: Feature
Status: Draft
Leader: Cédric Champeau
Created: 2011-10-08
Last modification: 2012-02-21

Abstract: Static Type Checking

This GEP introduces a new language feature known as static type checking. Developers coming from a statically typed language (say Java) are often surprised to discover that the Groovy compiler will not complain at compile time:

  • when a value is assigned to a variable of an incompatible type
  • when a method doesn't exist
  • when a property or variable doesn't exist
  • when the returned object type doesn't match the method signature
  • ...

All of these pass silently because the dynamic nature of the Groovy language makes such code perfectly valid. However, in some situations a developer may want Groovy to behave like a statically typed language and have the compiler give hints about such "errors". To support this, Groovy must introduce static type checking.

Rationale: Static Type Checking vs Static compilation

It is important to distinguish static type checking from static compilation. The goal of this GEP is to provide an option to turn static type checking (STC) on. If STC is activated, the compiler will be more verbose (you will also see the term "grumpy"), but in the end the generated bytecode and runtime behaviour will be exactly the same as if you had not activated this mode. This is a major difference from an alternate compiler like Groovy++, which performs STC and then produces different bytecode, and therefore different runtime semantics. The scope of this GEP is only a static type checker. It should therefore be considered only as a feature that allows developers to write statically checked code, for example as an elegant way to leverage the Groovy syntax to reduce the verbosity of Java code while still getting strongly checked code. IDEs could also support the STC mode and surface this information to the developer.

Implementation details

Development branch

Since Groovy 2.0-beta-2, the code has been merged into the master branch. However, if heavy development is done on the type checker, it is advisable to work on the grumpy branch. The feature adds an AST transformation named TypeChecked. When it is applied, the transformation performs type inference and stores type information in AST node metadata. If errors are found, it reports them to the compiler through a dedicated addStaticTypeError method, which does the same as the traditional addError method but prefixes the messages with a "Static type checking" message. This helps developers determine whether the error they are seeing is a "plain Groovy" error or an error raised by the STC mode.
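The prefixing described above can be observed with a small script. This is only a sketch: it assumes a Groovy 2.0 or later runtime, compiles the checked source at runtime through GroovyShell so that the compilation error can be caught, and uses a made-up method name (missingMethod) to provoke the error. The exact message wording may differ between versions.

```groovy
import org.codehaus.groovy.control.MultipleCompilationErrorsException

// The checked code is kept in a string so that its compilation error
// can be caught and inspected by the enclosing script.
String source = '''
    import groovy.transform.TypeChecked

    @TypeChecked
    void greet() {
        String name = 'Groovy'
        name.missingMethod()   // hypothetical method, not defined on java.lang.String
    }
    greet()
'''

String message = null
try {
    new GroovyShell().evaluate(source)
} catch (MultipleCompilationErrorsException e) {
    // Raised while compiling, before greet() is ever executed
    message = e.message
}

assert message != null
assert message.contains('[Static type checking]')
println message
```

Removing the @TypeChecked annotation from the source lets the script compile, and the failure then only shows up at runtime as a MissingMethodException.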

The StaticTypeCheckingTestCase class

Static type checking behaviour must be tested. As there are a great many possible checks, a base test class provides a framework for testing this mode. Unit tests for static type checking should extend this class.

Decisions made

About this section

The goal of this section is to provide code samples which demonstrate in which cases the STC transformation will actually complain and what the expected error message is. It also serves as a basis for future STC documentation. This section may not be up to date, and one should always take a look at the STC unit tests found in the src/test/groovy/transform/stc directory.

Features

| Feature | Behaviour | Status |
|---|---|---|
| Method does not exist | Complains about undefined method | Implemented |
| Property does not exist | Complains about undefined property "y" | Implemented |
| Assignment type checking | Assigning a String to an int is forbidden | Implemented |
| Incompatible binary expressions | Checks that arguments of a binary expression are compatible (here, no 'plus' method is available) | Implemented |
| Possible loss of precision (1/2) | Complains about possible loss of precision | Implemented |
| Possible loss of precision (2/2) | Will not complain because '2' can be represented as an int | Implemented |
| Arrays components | Cannot assign an int value in an array of type String[] | Implemented |
| Method return type check | Ensures that assignments are compatible with method return type | Implemented |
| Explicit return type checking | Ensures that returned value is compatible with declared return type | Implemented |
| Implicit return type checking | Ensures that returned value is compatible with declared return type | Implemented |
| Implicit toString() | Implicit call to toString() | Implemented |
| Basic type inference | Method calls as well as property access are checked against inferred type | Implemented |
| Basic flow analysis | Last method call will not complain because type of 'o' at this point can be inferred | Implemented |
| Instance of | Casts should not be necessary when type can be inferred from a previous instanceof check | Implemented |
| DefaultGroovyMethods support | Method calls can be resolved against Groovy extension methods | Implemented |
| with | Static type checking should be aware of the "with" structure | Implemented |
| Categories | Compiler should be aware that extension method is found in a category | N/A (support will be limited as category support is inherently dynamic) |
| Groovy list constructor | Type checks the arguments and the number of arguments | Implemented |
| Groovy map constructor | Type checks the properties and checks for wrong property names | Implemented |
| Closure parameter types | Type checking the arguments when calling a closure | Implemented |
| Closure return type inference | Closure return type can be inferred from block | Implemented |
| Method return type inference | Return type can be inferred from a method if the method is itself annotated with @TypeChecked (or class is annotated with @TypeChecked) | Implemented |
| Multiple assignments | In case of inline declaration, type check arguments. | Implemented |
| Multiple assignments from a variable | In case of inline declaration, type check arguments. | Unsupported |
| Generics | Type checking of generic parameters | Implemented |
| Spread operator | Type checking against component type | Implemented |
| Closure shared variables | Type check assignments of closure shared variables. The type checker is required to perform a two-pass verification, in order to check that method calls on a closure shared variables belong to the lowest upper bound of all assignment types. | Implemented |
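A few of the rows above can be exercised with a short script. This is a minimal sketch rather than the project's own test framework: the check helper is hypothetical, it simply wraps a snippet in a @TypeChecked method and compiles it with GroovyShell, and the snippets are illustrative stand-ins for the original examples. It assumes a Groovy 2.0 or later runtime.

```groovy
import org.codehaus.groovy.control.MultipleCompilationErrorsException

// Hypothetical helper: wraps a snippet in a @TypeChecked method, compiles it,
// and returns the compiler's error text, or null when the snippet is accepted.
String check(String body) {
    String source = """
        import groovy.transform.TypeChecked

        @TypeChecked
        void snippet() {
            ${body}
        }
    """
    try {
        new GroovyShell().parse(source)   // compile only, nothing is executed
        return null
    } catch (MultipleCompilationErrorsException e) {
        return e.message
    }
}

// Assignment type checking: a String cannot be assigned to an int
assert check('int x = "a string"')?.contains('[Static type checking]')

// Method does not exist (missingMethod is a made-up name)
assert check('"abc".missingMethod()')?.contains('[Static type checking]')

// Basic flow analysis: the type of 'o' is inferred from its initial value
assert check('def o = "text"; o.toUpperCase()') == null

// Instance of: no cast is needed inside the guarded block
assert check('Object o = "text"; if (o instanceof String) { o.toUpperCase() }') == null

// DefaultGroovyMethods support: reverse() is a Groovy extension method
assert check('"abc".reverse()') == null
```

Because the helper only compiles the snippet, a null result means the checker accepted the code, not that the code would succeed at runtime.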

Open discussions

Closure parameter type inference

With the current version of the checker, idiomatic constructs like :

Are not properly recognized. You have to explicitly set the type of the "it" parameter inside the closure. This is because the expected parameter types of closures are unknown at compile time. There is a discussion about how to add this type information to source code so that the inference engine can deal with it properly. Implementing closure parameter type inference requires a change to method signatures, so it will probably not be part of the initial release of the type checker.
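The workaround of declaring the closure parameter type explicitly might look like the following sketch. The method name shout and the parameter name word are illustrative only; the same applies when the parameter is explicitly declared as "String it".

```groovy
import groovy.transform.TypeChecked

@TypeChecked
List<String> shout(List<String> words) {
    // The explicit parameter type gives the checker what it needs
    // to validate the toUpperCase() call inside the closure.
    words.collect { String word -> word.toUpperCase() }
}

assert shout(['a', 'b']) == ['A', 'B']
```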

Unification Types

In cases such as "x instanceof A || x instanceof B", with A and B being unrelated, we could still make an artificial union type that contains everything present in A and B, to allow those kinds of method calls. The alternative is to allow only methods from Object here, which is less interesting. This typing can also be used for multicatch, ensuring that a method call is only valid if it exists on each of the exceptions for the multicatch. In the current implementation (Oct-14-2011) the multicatch is already expanded at the point where @TypeChecked performs its checks. Effectively, this already represents a kind of union type: the same code is in each catch block, so the method call fails if the method is not available on each type. The proposed behaviour is therefore to align the instanceof case with multicatch.
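The multicatch case can be sketched as follows. The method name describe and the choice of exception types are assumptions made for illustration. The call inside the catch block is accepted because getMessage() is available on both exception types.

```groovy
import groovy.transform.TypeChecked

@TypeChecked
String describe(Closure<?> action) {
    try {
        action.call()
        return 'ok'
    } catch (IllegalArgumentException | IllegalStateException e) {
        // getMessage() exists on both exception types, so the checker accepts it
        return e.getMessage()
    }
}

assert describe({ throw new IllegalStateException('bad state') }) == 'bad state'
assert describe({ 42 }) == 'ok'
```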

References

Mailing-list discussions

JIRA issues

16 Comments

  1. An annotation to turn static type checking on for a class/method would be nice. Although now that I think about it some more, it would be better to make the code that *uses* the annotated code ignore static type checking when using the class/method. Not sure this is feasible, though.

    Also, was there a discussion if this should really be a language thing or a new set of codenarc rules? I am all for the groovy compiler option myself (I would actually even create a new name for the .bat/.sh, to reflect the very different usage. something like "groosta" or "javvy" :-)

  2. Currently we concentrate on an annotation. That is an approach we often take to implement a new feature. If it makes sense to integrate the feature in the language directly, then this may happen later. Anyway... for now you annotate a class or method and it will then check the types in the annotated area only. The code calling an annotated method will not be checked, unless it is itself in an annotated area. That is much easier to do and much more recognizable for the programmer, which is why we want to take this approach for now.

    Since it is an annotation atm, there is no need for special starters or command-line options. This kind of thing is for much later, if people really ask for it. As for the codenarc rules, they actually test different things. The type checker is about checking the types only, while codenarc goes much beyond that and tries to show bad usage patterns. For example, in the static type checker returning null in a method that is declared to return boolean will be allowed, but there is a codenarc rule against that. I think those two complement each other, but are not really the same.

  3. It is unclear to me how the STC reads the type information from an expression. If MyClass is defined in a Jar on the compile-time classpath, then will STC be able to check calls to objects of type MyClass? It would be nice if you added information to this GEP specifying how this works.

    What happens in the case of this class: 

    With this usage:

    Versus this definition:

    where the same example getter/setter call produces errors. The getters and setters are generated in a later phase than AST Transforms run, so does this new annotation duplicate the logic of deciding when a getter and setter is present?

    Also, it would be nice if the GEP lists specifically more things that are not type checked. It's hard to see where the edge cases are when the list is mostly passing scenarios. 

  4. Thanks for your comment Hamlet. If you can check out the "grumpy" branch, don't hesitate to take a look, it's already perfectly usable. The type checker will (should) check the type of collaborators too, so it should not let you perform a setFoo on a final property. However, I have not implemented that yet, just because I didn't think about it. So basically, what is implemented and listed here is what I already thought about, and feel free to add more cases to discuss.

  5. @Cedric I'm not going to have time to check out the codebase soon... but I want to stay up to date with this document. 

    So how does it know that setFoo is available? Does it reuse the getter/setter generation logic already in the compiler? When does STC occur, in what compiler phase?

    How do AST Transformations intersect with it? Is there any way for an AST transformation to provide input to the type checker other than wire up the correct ClassNode hierarchy? 

    Lastly, how is the case of different return types on overloaded methods handled? Here is some sample code to consider: 

    Also, is there a way to turn the STC on globally? I would like there to be, even it is some sort of development backdoor. For instance, we could test this GEP against the CodeNarc codebase fairly easily if there were a way to turn this on globally. 

  6. STC is triggered at the canonicalization phase, so it makes use of the same level of information as the compiler provides at that phase. Basically, it performs checks using getProperty() on redirected class nodes, which gives sufficient information.

    The case of overloaded method should already work properly. There is one case which should not be possible in static type checking though : covariant return types with methods for which arguments are in the same type hierarchy. Imagine this :

    As the type of a is potentially only known at runtime, we do not have enough information at compile time to choose the right method...

  7. Hi,

    I really like this feature, but I added a comment to Alex Tkachman article http://groovy.dzone.com/articles/groovy-action-statically-typed 

    Groovy++ in action: statically typed dynamic dispatch

    The main idea behind that is that we often want a new type that is one of a set of types, for example

    convert(x), where x can be one of Integer, Double, BigInteger, or String

    But the compiler is not given that information and cannot check or enforce it in any way. My proposal is to define an annotation that tells the compiler that only the above types are allowed:

    @Typeset(INTEGER, Integer, Double, BigInteger, String, SOME_OTHER_TYPESET)

    This says that the meta type INTEGER can be any one of the types given above, and that it can also include other typesets.

    For Groovy this does not change the generated code but gives the compiler sufficient information to do more aggressive type checking.

    On the other hand, for Groovy++ it would require several methods, convert(INTEGER<Integer> x), convert(INTEGER<Double> x), and convert(INTEGER<String> x), and the compiler would warn you when convert does not cover all cases. An alternative notation to INTEGER<String> could be INTEGER.String

    Just my thoughts

    The switch on type issue

    But you have nailed the issue: there are some places where dynamic behavior is absolutely required. The solution in Groovy++ is really the same strategy taken in Java: the ugly switch/if-else statement listing "if object instanceof Class1 then perform-1, Class2 perform-2...". This is ugly and, worse, error prone if there is more than one place where this has to be done. In that case the compiler is of no help in making sure that you have changed all the places.

    We need some way of telling the compiler that "Object x" represents one of the following set of classes "Class1, Class2 ...", so that for "method1(x)" it can check at compile time whether there are "method1" variants for Class1, Class2... and essentially write the switch statement for you. The compiler can also warn you that there are no typed variants of "method1" for some type that 'x' may be, and it can also throw an exception if 'x' is not one of the required types.

    I have not fully thought this through, but under the hood the compiler could actually create a hidden method called, say, '$dynamic_convert' that encapsulates the switch statement, so that the same code can be used in multiple places and, secondly, supports inheritance.

    Example (note: below, the meaning of the annotation @Typeset is that NUMERIC is a pseudo type that may be of type Integer, Double, BigDecimal or any of the types listed in BASE_TYPESET). The new NUMERIC can be used anywhere a Class is used, and the effective result is that the compiler now knows how to generate the switch.

    Extended.groovy

    in the case above, the compiler generates a hidden method called

    Now for the base class

    Base.groovy

  8. I'm working on a new Groovy Editor for the NetBeans API which must support Groovy versions higher than 1.6.4. With STC I encountered a problem. Almost every important method in the StaticTypeCheckingVisitor class is declared as private. If they were protected I could implement such methods as findMethod(), getTypes() and others not only for Groovy source files but for Java sources too. Is it possible to make the methods protected?

  9. @Valery: It's interesting. Could you tell me more about what you are trying to do ? Implement this feature for versions of Groovy < 2.0 ? I am planning to make the transformation pluggable, with the ability to delegate to another class for things like findMethod when the default visitor fails resolving a class, but that's not quite exactly what you need.

  10. Hi, Cédric Champeau

    Thanks for the answer. My starting goal was to improve the performance of the NetBeans Groovy Editor, especially for code completion. But with Groovy 2.0 the editor must be aware of STC and work more intelligently. The editor parsing API is based on subclasses of the Groovy CompilationUnit and CompileUnit in order to resolve not only Groovy sources but Java sources too, because a project may contain .groovy and .java files. The editor cannot rely on the .java classes having already been compiled. Thus, StaticTypeCheckingVisitor cannot (I realize that it must not) get methods from a .java class. So I have a question: if I provide a subclassed ClassNode for a .java class (when resolving in the SEMANTIC_ANALYSIS phase) and override getMethod(String name) to return a MethodNode list, would that be enough for StaticTypeCheckingVisitor?

  11. Actually, if you use ClassHelper.make(a java class), static type checker will behave correctly. This is how the visitor works now. If I understand correctly what you do, it should work.

  12. Thank you for your reply. I've tried it and it works well. By the way, type checking has very good performance and virtually no impact on the work of the editor.

    Incidentally, it is a good idea to separate type checking from code generation. I tried to build an editor for Groovy++, but realized that it was an almost hopeless undertaking, because type checking and code generation take place simultaneously there, and only in the INSTRUCTION_SELECTION phase. For me it was enough to see the editor for Groovy++ in IntelliJ IDEA: errors associated with type checking are not displayed in the editor, and are visible only after compilation.

  13. Where can I report bugs ?

  14. Please report them on JIRA (http://jira.codehaus.org/browse/GROOVY) using the "static type checker" component. If you can, give current master a try, as some bugs are already fixed.

  15. Hi Cedric.

    Has the idea of some form of "mixed mode" been considered?  I think many groovy programmers would appreciate an additional mode (or alternative) to the "all or nothing" approach to static checking at the method/class level, that would allow static and dynamic calls to coexist.  This would be particularly helpful for those developing idiomatic groovy, such as code including builders, or grails dev, etc...

    How about this interpretation...  What if a mixed mode meant that undeclared (dynamic typed) method overloading was not allowed at compile time for variables whose type can be determined at compile time (type explicitly declared or inferred)?  Overloading in all cases would still be allowed at runtime as well as all other current runtime semantics would be preserved.

    In that case if someone declares the type of a variable (or it can be inferred), then any use of a declared method name for that type would be statically checked?  It is as if the use of declared method name (for a typed variable) would be a signal of intent to statically type check (no need for any additional keyword/annotation).   

    eg.

    Thank you.

  16. Hello,

    Is there any consideration of static type checking wrt primitives?

    In STC mode, will any distinction be made between:

    int x

    Integer x

    I'm assuming not, but with all of the focus on improving "primitive" operation performance, I get a little lost