Should the following be a property, field or local variable?
I guess we can know whether it's a field or not. If it isn't a field, should a local variable be used? And how should we set a property?
We could consider a special syntax for field access...
When navigating arbitrary objects using . we'd use property access.
Plus we'd follow Java-like rules.
Here 'x' refers to the local variable and this.x refers to the field.
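The snippet this remark points at is not shown. As a rough illustration of the Java rule being borrowed, here is a minimal Java sketch; the class Counter and its method names are made up for this example.

```java
// Hypothetical class, for illustration only.
class Counter {
    private int x = 1;              // field

    int bareName() {
        int x = 10;                 // local variable hides the field
        return x;                   // bare name resolves to the local
    }

    int qualifiedName() {
        int x = 10;                 // same local as above
        return this.x + x;          // this.x is the field (1), x is the local (10)
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        System.out.println(c.bareName());       // prints 10
        System.out.println(c.qualifiedName());  // prints 11
    }
}
```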
We could support .@ notation to refer to a field in some arbitrary object.
When inside a closure, the same rules should apply as being outside a closure.
Fields etc
One interesting mechanism we could employ from Ruby is the use of a special method to change scope. e.g.
where the passed in closure has access to the internal fields of o.

11 Comments
May 11, 2004
Chris Poirier
After much discussion, James and I decided to make even this.x references use accessors, if available.
Consider:
It would be very strange if o.x used accessors and this.x didn't. So, all offsets will go through standard dispatch.
There is one exception to this policy, and that is in accessor definitions. The following code will invariably crash the JVM:
and yet this is habit for a lot of people. So, in this one case, we will assume that @x was meant for this.x, and issue only a warning that the syntax is not groovy.
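The code referred to above is not shown. The underlying hazard can be sketched in plain Java, assuming that a read of this.x is routed to the accessor as the policy proposes. The class Account and the helper readThisX are hypothetical stand-ins, and in plain Java the failure surfaces as a StackOverflowError.

```java
// Hypothetical model, not Groovy's implementation.
class Account {
    private int x = 42;

    // Stand-in for the proposed policy: a read of this.x is dispatched to the accessor.
    private int readThisX() { return getX(); }

    // The habitual accessor body "return this.x" under that policy:
    // the accessor re-enters itself and never reaches the field.
    int getX() { return readThisX(); }

    // What the @x exception amounts to: read the field directly.
    int getXDirect() { return x; }

    public static void main(String[] args) {
        Account a = new Account();
        System.out.println(a.getXDirect());
        try {
            a.getX();
        } catch (StackOverflowError e) {
            System.out.println("unbounded recursion");
        }
    }
}
```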
May 11, 2004
Chris Poirier
For the record, here are some of the problems:
In the times() closure, with existing rules, z is always a variable access, and x and y might use accessors (if any were defined) or might directly access the fields in MyClass and Base. If they do use accessors, times() might intercept them and produce very unintuitive results. This is not a minor problem. If they don't use accessors, and a "getY()" method is added to MyClass at runtime, what should happen to the code?
We are planning to allow methods to be added at runtime with the "use" statement. Let's say that Base didn't have an "x", but a use statement adds setX() to Base. Would that code in the closure use a local variable, or the accessor? If the accessor, would it still be interceptible by times()?
The closure delegate stuff effectively adds a new name scope to the language. It inserts that name scope between the local variable scope and the class scope, and it is a name scope that has arbitrary rules about the mapping of names. It is one thing to have this for bare method calls (and GroovyMarkup is certainly reason enough to allow it), but it is something else again to have it affect what appear to be simple variable accesses.
There are four things interacting here, and the results are a tangle of special cases (and bugs in code generation):
If bare identifiers use accessors (when available), it means that all accessors must be known at compile time in order to determine when an assignment creates a new local variable, and when it uses an existing variable or accessor (both accessors in outer classes and accessors in base classes must be known). But accessors can be added at runtime. Paradox.
If bare identifiers use accessors (when available), then the classgen has to write dispatch code for all field accesses, so that runtime accessor additions can be considered before the field itself is used. This means that all field accesses inside a closure are subject to interception by the closure delegate. This is not intuitive. If this.field is used to disambiguate field accesses from inside the closure, what does "this" refer to – the closure object or the outer object? Let's say it is the closure, and the outer class has a field "z" and the closure's scope has a local variable "z" that hides it. What does this.z point to when inside the closure? So instead we require Outer.this.z to access the field in the outer class. Except that we never declared an inner class, so why is Outer required?
If bare identifiers use accessors, and a closure intercepts those method dispatches, it enables the closure to create variables out of nothing. It means that things that appear to be plain old variables can be magically created fully formed on read, despite never being declared, despite never being initialized, and contrary to our undeclared-variable creation policy (that it is only done on assignment). I'm not sure why, but there is a difference between a method call being intercepted and handled at runtime, and a variable receiving the same treatment. The former is a little disconcerting, but quickly grasped in a dynamic system. The latter is an order of magnitude more difficult to comprehend. Where are those variables coming from? Who is creating them? What do they hold? These questions cannot be answered by the casual reader, and it makes the code very hard to understand.
In the end, I think that bare identifiers have to go directly to variables and fields. It is the way they have always worked, and people will expect them to do that. It is one of those fundamental assumptions we make when reading code. And as fields can't be added at runtime, it means we know at compile time exactly which assignments set existing objects, and which create new untyped local variables. It means that you can promote a local variable to a field or property and not have to change the code that uses it.
But how, then, do you use your own accessors? It has to be the same way you use them anywhere else – by doing the access through a reference.
These two pieces of code have the same form and so should have the same effect. If there is a setX(), it will be used. If there isn't, the field will be used directly.
Yes, this does bork the standard practice in constructors (for instance) of using parameter names that match the fields that will be set. With the new policy, if there is an accessor, it will be used. That is the reason for the @field syntax – when direct field access is needed, the @ disambiguates the local and the class scopes:
There are a couple of outstanding issues:
If "this" points to the closure, then it reveals structure not visible in the code (the local variable cache, for instance), and breaks the illusion (and useful conceptual model) that a closure is just a block of code you can pass around. If it points to the "this" in the enclosing scope, it means you cannot access properties on the closure.
The second issue may only be an issue if "this" in a closure doesn't point to the closure, as it will no longer be possible to get the closure's delegate. This might require a keyword variable (closure.delegate.x = 10 or delegate.x = 10), or might require explicit this selection (Closure.this.delegate.x = 10). I think that any of these choices (and possibly a better one someone here will suggest) are more intuitive than the alternative (discussed above).
May 11, 2004
John Wilson
Chris, you say "The closure delegate stuff effectively adds a new name scope to the language. It inserts that name scope between the local variable scope and the class scope". I don't think this is true. The delegate scope is the scope of last resort. If a name is statically resolvable from the closure body then that is the name that is used. Any names which cannot be statically resolved are dynamically resolved against the delegate.
prints
Which seems perfectly reasonable to me. The rule for the compiler and the person reading the code is very simple: if you can resolve the reference against a name in scope then that's the thing to use. Otherwise it's resolved dynamically against the delegate and that may fail at run time.
Closure delegates behave exactly like script bindings in this respect.
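The rule described in this comment can be modelled in a few lines of Java. This is only a sketch of the lookup order (names in scope first, delegate as last resort, failure at run time otherwise); ClosureScope and its methods are invented for illustration and are not Groovy API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical model of the lookup order; none of these names are Groovy API.
class ClosureScope {
    private final Map<String, Object> inScope = new HashMap<>();   // names resolvable from the source
    private final Map<String, Object> delegate = new HashMap<>();  // names only known at run time

    ClosureScope declare(String name, Object value) { inScope.put(name, value); return this; }
    ClosureScope delegateProvides(String name, Object value) { delegate.put(name, value); return this; }

    Object resolve(String name) {
        if (inScope.containsKey(name)) return inScope.get(name);    // a name in scope always wins
        if (delegate.containsKey(name)) return delegate.get(name);  // scope of last resort
        throw new NoSuchElementException("unresolved name: " + name); // may fail at run time
    }

    public static void main(String[] args) {
        ClosureScope s = new ClosureScope()
                .declare("x", 1)
                .delegateProvides("x", 2)
                .delegateProvides("y", 3);
        System.out.println(s.resolve("x")); // prints 1
        System.out.println(s.resolve("y")); // prints 3
    }
}
```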
May 11, 2004
John Wilson
To address the excellent question "what does "this" point to inside a closure?"
My view is that the use of "this" should mean "not local to the closure". So the resolution process is to first statically resolve against the enclosing class and, if there is no name resolution, then resolve dynamically against the delegate.
Note that "this" has a value: it should be a reference to the enclosing class. If "this" is assigned to a variable and the variable is used, then the normal dynamic resolution will be performed, but only against the enclosing class, not against the delegate. A compiler could optimise this.
May 11, 2004
Chris Poirier
Hey John,
The behaviour you note on "x" is a very recent change to the language. I made it to resolve some VerifyErrors, and I'm not entirely sure I didn't break stuff by doing it. I considered those changes temporary.
That said, it does simplify things, which is good. But there are still some inconsistencies – presumably, if the delegate can handle gets, it can handle sets, yet we assume all unreferenced assignments to be undeclared local variables, not sets in need of delegation. This means you can never set a property on the delegate without getting an explicit reference to it (or by using setX(), which will work anyway).
And it doesn't resolve the other fuzziness involved in using an accessor supplied at runtime.
Making things that appear to be plain old variables be anything else is really risky business. Method calls are still method calls, even with delegation – they will be executed somewhere or an error will be raised – only one thing about them is changed (the target). But overloading the meaning of a bare variable name means that you not only don't know where the value is coming from/going to, you also don't know how it will be done (and the compiler is in the same boat). Accessing fields externally is something different, because when you go through an interface, you really have no business knowing what goes on behind it.... Plus, when you go through an interface, the compiler knows that you aren't creating anything relevant to its code generation.
May 11, 2004
John Wilson
Chris,
"The behaviour you note on "x" is a very recent change to the language." That's odd. I had an email discussion with James at the beginning of March where we agreed on the behaviour that happens now. I think you fixed a bug.
I think sets should only be passed to the delegate if they are of the form this.x = 10 and x is not a property or field of the enclosing class. There exists a pathological case where a property or field exists on the delegate and on the enclosing class and you want to use the delegate's one. I'm not sure that this is a problem worth solving at the moment. I suppose something like super.super.x could be used to indicate that the delegate should be used.
Your problem about methods being added at run time is lovely! My inclination is to ignore the added methods; anything else leads to madness. DefaultGroovyMethods should be ignored as well.
I'm not sure I agree with your distinction between methods and variables. Groovy already has Bindings which provide magic variables to scripts. I don't really see a fundamental difference between bindings and delegates. The Groovy runtime currently handles disambiguating access to delegated properties and fields. There is, of course, a performance hit but it's up to the programmer to decide if this is acceptable.
May 12, 2004
John Wilson
Thinking further about name resolution against the enclosing class - I don't think that it can, in general, be static. The enclosing class or one of its superclasses could implement get/setProperty and/or invokeMethod. So the names are resolved statically inside the closure and then dynamically against the enclosing class instance; if this fails, they are resolved dynamically against the delegate.
This means that methods added at run time are used so adding a getX to the enclosing class with a "use" statement adds a property to that instance.
May 19, 2004
Chris Poirier
Property gets for bare names are back in, when strict mode is not in effect. Bare names will always resolve to local variables first, and field names second. If there is no match, the name will be left for invokeMethod() to handle at run-time. In essence, the compiler won't complain about undeclared variables in normal mode. It is still impossible to write to a property using a bare identifier (the compiler will always interpret it as the creation of an undeclared local variable).
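One way to read this rule is as two compile-time classifications, one for reads and one for writes. The Java sketch below models that reading; Target and BareNameRules are hypothetical names, and letting a write reach an existing local or field is an assumption taken from the earlier remark that fields are known at compile time.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the rule as described above; not the Groovy compiler's code.
enum Target { LOCAL, FIELD, RUNTIME_LOOKUP, NEW_LOCAL }

class BareNameRules {
    // Reads: local first, field second, otherwise left for run-time handling (non-strict mode).
    static Target classifyRead(String name, Set<String> locals, Set<String> fields) {
        if (locals.contains(name)) return Target.LOCAL;
        if (fields.contains(name)) return Target.FIELD;
        return Target.RUNTIME_LOOKUP;
    }

    // Writes: an unknown bare name always creates a new local, never a property set.
    // Assumption: known locals and fields are still assigned in place.
    static Target classifyWrite(String name, Set<String> locals, Set<String> fields) {
        if (locals.contains(name)) return Target.LOCAL;
        if (fields.contains(name)) return Target.FIELD;
        return Target.NEW_LOCAL;
    }

    public static void main(String[] args) {
        Set<String> locals = new HashSet<>(Arrays.asList("z"));
        Set<String> fields = new HashSet<>(Arrays.asList("x"));
        System.out.println(classifyRead("title", locals, fields));   // prints RUNTIME_LOOKUP
        System.out.println(classifyWrite("title", locals, fields));  // prints NEW_LOCAL
    }
}
```

The asymmetry between the two results for the same unknown name is the limitation the comment points out: a read may reach the delegate, a write cannot.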
John: does this put things back the way you expected?
May 19, 2004
John Wilson
This lets StreamingMarkupBuilder stay as it is, which is nice from my point of view.
However, I have been thinking about name resolution and it's damned tricky. The current compiler assumes that if there is a property on the enclosing class with the same name as a variable used in a closure then it can call getProperty on the enclosing method (i.e. the owner of the closure). This is broken in two ways:
The first issue has been ameliorated by the change I just made to Closure: now the only properties on Closure visible to the closure code are owner, delegate, method, parameterTypes, class and metaClass. The compiler can now tell at compile time if a reference is being made to a closure property.
The second problem is pretty intractable as far as I can see. I think the compiler must always call getProperty on the closure object and let it try on the owner and then on the delegate if the owner call fails.
May 20, 2004
jrose
I think you are on the way to proving that GroovyMarkup, as designed, is mortally ambiguous. See below, but first my main points.
I have turned the question of scoping round and round in my head for several days, and I think the cleanest principles to base name lookup on are as follows:
I think these principles are clean, powerful, reasonably well-proven, and self-consistent. I'm pretty sure they accommodate all the desired groovy use-cases, except GroovyMarkup.
Saving GroovyMarkup
Here's an example of what I fear is wrong with GroovyMarkup as it stands today. What happens if I (the user) stick a println statement into a nest of Groovy Markup? Or what happens if I use a local method to guide an iteration (since "for" and "while" loops are an advertised feature of this idiom)? It seems to me that there is no conceivable clean principle by which some method calls are handed to the builder's interceptor while others are scoped either locally into surrounding scopes or passed all the way out to the global scope.
Therefore, I think you need some extra syntax (at least one character like '@') to mark those method calls which are in the name space of the builder's tag schema, rather than regularly scoped calls. I suggest prefix '<' analogously with '@': You can still boast of getting rid of most of the angle brackets.
Here's a wild guess at an acceptable modified markup idiom. Perhaps "<x" is short for "here.x" or something more powerful.
It may also be that GroovyMarkup really wants to be implemented as a macro package (kind of like backquote-comma in Lisp). Can we keep that open as a possibility until we've talked more about syntax extension possibilities?
Finally, consider the possibility of learning to love angle brackets, and including a backquote-comma facility in Groovy directly aimed at XML:
The foregoing suggestions are a little wild; toss them if you like, but please take my initial comments about scoping seriously.
May 24, 2004
Chris Poirier
Hi John,
I played around with the "here" variable idea a fair bit over the weekend, and here are some of the ideas I came up with:
In the end, I'm not sure there is any real need for a "here" variable. In cases where you choose to reuse an existing variable name, chances are you don't need to access the outer variable – if you did, you wouldn't have overridden the name.
However, we do have some significant issues with the "extra" scope added to the language by runtime name-lookup via the closure delegate. This extra scope is the "scope of last resort", against which unresolved names are given a last attempt at resolution. And it is what makes GroovyMarkup and other similar features both possible, and impossibly ambiguous.
What I'd like to suggest is a generalization of your "<" suggestion – a way to make the use of this alternate name-scope explicit. What we need is an operator that forces a name to be directed to the alternate scope first, instead of last. Consider this example of "with" implemented as a closure:
In this example, println() would be evaluated against System.out regardless of any println() defined in the closure's normal scope (for instance).
This operator would allow GroovyMarkup to stand without significant changes, except that when there is ambiguity, it can be overcome (the local scope wins unless * is used, in which case the delegate scope wins). Further, it enables properties to be assigned to on the delegate, which we can't do without it.
I haven't yet worked out if the * scope should stack, though my initial experiments with GroovyMarkup and with() suggest that it shouldn't – if the name fails on the first delegate, the search should be abandoned.
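The effect of the proposed operator can be modelled in Java as a second lookup path that consults only the delegate. This is a sketch under assumptions: DelegatingScope and its methods are invented names, a single delegate is modelled, and a miss on the delegate simply ends the search, following the tentative conclusion above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical model of the proposed operator; class and method names are invented here.
class DelegatingScope {
    private final Map<String, Object> local = new HashMap<>();
    private final Map<String, Object> delegate = new HashMap<>();

    DelegatingScope local(String name, Object value) { local.put(name, value); return this; }
    DelegatingScope delegate(String name, Object value) { delegate.put(name, value); return this; }

    // Plain name: the local scope wins, the delegate is the scope of last resort.
    Object plain(String name) {
        if (local.containsKey(name)) return local.get(name);
        return starred(name);
    }

    // Starred name (the * operator in the proposal): only the delegate is consulted.
    // A miss is an error rather than a fall-back to any other scope.
    Object starred(String name) {
        if (!delegate.containsKey(name)) {
            throw new NoSuchElementException("delegate has no: " + name);
        }
        return delegate.get(name);
    }

    // Assignment through the star reaches the delegate, which a plain assignment cannot.
    void assignStarred(String name, Object value) { delegate.put(name, value); }

    public static void main(String[] args) {
        DelegatingScope s = new DelegatingScope()
                .local("println", "local println")
                .delegate("println", "System.out println");
        System.out.println(s.plain("println"));   // prints local println
        System.out.println(s.starred("println")); // prints System.out println
    }
}
```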