Metadata

Number: GEP-3
Title: Command Expression based DSL
Version: 1
Type: Feature
Target: 1.8 or 2.0
Status: Draft
Leader: Jochen "blackdrag" Theodorou
Created: 2009-06-30
Last modification: 2010-03-23
Last update by: Guillaume Laforge

Abstract

Groovy has supported command expressions since version 1.0. These are method calls whose arguments are not enclosed in parentheses. In theory this would be a nice base for DSLs, but our command expressions are too limited, because we were not able to find easy rules for handling multiple arguments. This proposal tries to close the gap by defining the evaluation order and meaning of those arguments. The concept is very close to what Scala allows, but is not identical, for historic reasons.

Rationale

Current Command Expression

Examples of current command expressions are:

command expression | meaning
foo a1 | foo (a1)
foo {c} | foo ({c})
foo m {c} | foo (m({c}))
foo a1, a2 | foo (a1, a2)
foo k1:v1, a2, k2:v2 | foo ([k1:v1, k2:v2], a2)
foo k1:m{c} | foo ([k1:m({c})])
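
The forms in this table can be tried out directly. The following is a minimal sketch, assuming a hypothetical varargs method foo that only records its arguments and a hypothetical helper m that takes a closure; neither name is prescribed by the proposal.

```groovy
// Script-level variable (no 'def'), so the methods below can see it.
calls = []

// Hypothetical method: records the arguments it was called with.
def foo(Object... args) { calls << args.toList() }

// Hypothetical helper taking a closure, used for the "foo m {c}" form.
def m(Closure c) { 'm(' + c() + ')' }

foo 1                        // foo(1)
foo 1, 2                     // foo(1, 2)
foo k1: 'v1', 2, k2: 'v2'    // foo([k1: 'v1', k2: 'v2'], 2)
foo m { 'c' }                // foo(m({ 'c' }))

assert calls == [[1], [1, 2], [[k1: 'v1', k2: 'v2'], 2], ['m(c)']]
```

Note how the named arguments are collected into a single map that is passed in front of the positional argument.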

Examples of command expressions that are currently not allowed:

command expression | possible meanings
foo a1 a2 | foo(a1,a2) or foo(a1(a2)) or foo(a1).a2
foo a1 a2 a3 | foo(a1,a2,a3) or foo(a1(a2(a3))) or foo(a1).a2(a3) or foo(a1,a2(a3))

This list is not intended to be complete.

Constraints

  • existing valid usages must be kept as much as possible (for obvious backwards compatibility reasons)
  • the evaluation must be easily explainable
  • the grammar should support it

Details

What I want to allow are expressions such as

expression | possible meanings | allowed in old syntax
foo {c} | foo({c}) | yes (same meaning)
foo a1 | foo(a1) | yes (same meaning)
foo a1() | foo(a1()) | yes (same meaning)
foo a1 {c} | foo(a1({c})) | yes (same meaning)
foo a1 a2 | error | error
foo a1() a2 | error | error
foo a1 a2() | error | error
foo a1 a2 {c} | error | error
foo a1 {c} a2 | error | error
foo a1 {c} a2 {c} | error | error
foo a1 a2 a3 | foo(a1).a2(a3) | error
foo a1() a2 a3() | foo(a1()).a2(a3()) | error
foo a1 a2() a3 | error | error
foo a1 a2 a3 {c} | foo(a1).a2(a3({c})) | error
foo a1 a2 a3 a4 | error | error
foo a1 a2 a3 a4 {c} | error | error
foo a1 a2 a3 a4 a5 | foo(a1).a2(a3).a4(a5) | error
foo a1() a2 a3() a4 a5() | foo(a1()).a2(a3()).a4(a5()) | error
foo a1 a2 a3 a4 a5 {c} | foo(a1).a2(a3).a4(a5({c})) | error

The table shows enough to recognize the pattern. The attached block has a special role: it does not count directly as an argument on its own. Instead, the block is always bound to the identifier before it and forms a method call, which is itself not a command expression but a normal method call expression. As can also be seen, this syntax nicely extends the existing Groovy syntax. Of course, this also means that it will not be possible to omit commas when multiple arguments are used, a case that is not supported today anyway. For a DSL that is not really a problem, though.
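
The pairing rule behind the table can be written down as a small string rewriter. This is only an illustrative sketch: desugar is a hypothetical helper, not part of Groovy or of the proposal, and it treats an identifier together with its attached block as a single element (written here as an already formed call such as a3({c})).

```groovy
// Hypothetical helper: pairs up the elements of a command expression
// and renders the equivalent chain of ordinary method calls.
String desugar(List<String> elements) {
    if (elements.size() < 2 || elements.size() % 2 != 0) {
        throw new IllegalArgumentException('need an even number of elements: ' + elements)
    }
    // collate(2) splits the list into consecutive [name, argument] pairs
    elements.collate(2).collect { pair -> pair[0] + '(' + pair[1] + ')' }.join('.')
}

assert desugar(['foo', 'a1']) == 'foo(a1)'
assert desugar(['foo', 'a1', 'a2', 'a3']) == 'foo(a1).a2(a3)'
assert desugar(['foo', 'a1', 'a2', 'a3({c})']) == 'foo(a1).a2(a3({c}))'
```

An odd number of elements, such as foo a1 a2, is rejected, which matches the rows marked as errors.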

Summary of the pattern

  • A command expression is composed of an even number of elements
  • The elements alternate between a method name and its parameters (which can be named and non-named parameters)
  • A parameter element can be any kind of expression (i.e. a method call foo(), foo{}, or some expression like x+y)
  • All those pairs of method name and parameters are actually chained method calls (i.e. send "hello" to "Guillaume" is two methods chained one after the other, as in send("hello").to("Guillaume"))
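
The send/to example from the last bullet can be made runnable on a Groovy version that implements these chained command expressions (1.8 or later). This is a minimal sketch: the send method and the map-with-closure trick used as its return value are assumptions made for illustration only.

```groovy
log = []   // script-level variable, visible inside the closure below

// Hypothetical DSL entry point: send(text) returns something with a to(name) method.
// A map holding a closure is enough for a sketch.
def send(String text) {
    [to: { String name -> log << ('send(' + text + ').to(' + name + ')') }]
}

send "hello" to "Guillaume"          // command expression with four elements
send("hello").to("Guillaume")        // the equivalent chained method calls

assert log == ['send(hello).to(Guillaume)', 'send(hello).to(Guillaume)']
```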

Interesting benefit of the enhanced command expressions

More and more, we see Java fluent APIs that chain method calls, each returning this, so as to "build" a new object.
For instance, you can imagine a fluent API for building an Email message, that would look something like this in Java:

In Groovy, with the extended command expressions, this could become:

Notice the absence of parentheses and dots.
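
As a hedged illustration of the comparison, the sketch below defines a hypothetical Email builder (the class and all of its method names are assumptions) and uses it once with explicit dots and parentheses and once as an extended command expression. It assumes a Groovy version with chained command expressions (1.8 or later); the with { } block is only there so that the chain has an Email to resolve its method names against.

```groovy
// Hypothetical fluent builder; every method returns this so calls can be chained.
class Email {
    String sender, recipient, subjectLine, text
    Email from(String address) { sender = address; this }
    Email to(String address) { recipient = address; this }
    Email subject(String s) { subjectLine = s; this }
    Email body(String b) { text = b; this }
}

// Fluent style: dots and parentheses everywhere.
def fluent = new Email().from("foo@example.com").to("bar@example.com").subject("Hi").body("Hello")

// The same chain written as an extended command expression.
def chained = new Email().with {
    from "foo@example.com" to "bar@example.com" subject "Hi" body "Hello"
}

assert chained.sender == fluent.sender
assert chained.recipient == fluent.recipient
assert chained.subjectLine == 'Hi'
assert chained.text == 'Hello'
```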

Example: A DSL for SQL

SELECT "column_name"
FROM "table_name"
WHERE "column_name" IN ('value1', 'value2', ...)

In current Groovy this could maybe be expressed by

With this new command DSL you could also do
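
A hedged sketch of both variants follows. The Query class, the two select methods and the name among (standing in for SQL's IN, since in is a Groovy keyword) are assumptions made for illustration. The first call uses the map style that already works as a command expression today; the second uses the chained style and needs a Groovy version with chained command expressions (1.8 or later).

```groovy
queries = []   // script-level variable collecting every query that gets built

// Hypothetical query object; each clause method returns this so calls can be chained.
class Query {
    String column, table, filter
    List<String> values = []
    Query from(String t) { table = t; this }
    Query where(String c) { filter = c; this }
    Query among(String... v) { values = v.toList(); this }   // stands in for SQL IN
    String toString() {
        'SELECT ' + column + ' FROM ' + table + ' WHERE ' + filter +
            ' IN (' + values.collect { "'" + it + "'" }.join(', ') + ')'
    }
}

// Chain style: the entry point only takes the selected column.
def select(String column) {
    def query = new Query(column: column)
    queries << query
    query
}

// Map style: the named arguments arrive as one map in front of the positional argument.
def select(Map clauses, String column) {
    select(column).from(clauses.from).where(clauses.where).among(clauses.among as String[])
}

select "column_name", from: "table_name", where: "column_name", among: ["value1", "value2"]
select "column_name" from "table_name" where "column_name" among "value1", "value2"

assert queries[0].toString() == queries[1].toString()
```

In the map style every clause is just a key in one argument map, whereas in the chained style every clause is a separate method call on the object returned by the previous one.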

It should be noted that both cases have quite different semantics. In the second case the writer saves a lot of commas, but of course not all of them. Also, the lack of any kind of operator like the comma makes it difficult to span the DSL across multiple lines. A more extended example would be

SELECT COUNT("column_name")
FROM "table_name"

Expressing this in map style is a bit difficult, because it is unclear where to place count... a possible version is maybe
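
In the chained style, by contrast, count can simply be an ordinary method call used as the argument of select. A minimal sketch, assuming hypothetical select and count methods and a Groovy version with chained command expressions (1.8 or later):

```groovy
statements = []   // script-level variable, visible inside the closure below

// Hypothetical helper rendering the aggregate.
def count(String column) { 'COUNT(' + column + ')' }

// Hypothetical entry point; the returned map provides the from(...) step.
def select(String expression) {
    [from: { String table -> statements << ('SELECT ' + expression + ' FROM ' + table) }]
}

select count("column_name") from "table_name"   // select(count("column_name")).from("table_name")

assert statements == ['SELECT COUNT(column_name) FROM table_name']
```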

More example ideas

Here are some additional examples from various domains, which may help to make the idea more concrete.
These examples also mix named and non-named arguments, with and without closures.
In comments alongside each example, you'll see the equivalent non-command-expression interpretation.
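
Two such examples are sketched below, one with a named argument and one that uses only closures. The methods check and given, and the maps of closures they return, are assumptions made for illustration; the code assumes a Groovy version with chained command expressions (1.8 or later).

```groovy
log = []   // script-level variable, visible inside the closures below

// Named argument followed by a further call.
def check(Map args) {
    [tastes: { String quality -> log << ('check(that: ' + args.that + ').tastes(' + quality + ')') }]
}

// Closures as arguments: each step hands its result to the next one.
def given(Closure setup) {
    [when: { Closure action ->
        [then: { Closure verify -> log << verify(action(setup())) }]
    }]
}

check that: "margarita" tastes "good"          // check(that: "margarita").tastes("good")
given { 2 } when { it * 3 } then { it == 6 }   // given({ 2 }).when({ it * 3 }).then({ it == 6 })

assert log == ['check(that: margarita).tastes(good)', true]
```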

Extension to command expressions in the case of assignments

Currently, command expressions are allowed as standalone top-level statements or expressions, but you can't assign such an expression to a variable while keeping that nice DSL syntax. For instance, while you can do:

If you wanted to assign the result of that command (which could return a Position instance), you would like to do

But you still have to do

So the GEP-3 proposal also suggests we extend command expressions to be allowed on the RHS of assignments.
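
A minimal sketch of what this looks like, assuming a hypothetical move method that returns a hypothetical Position object, and a Groovy version that accepts chained command expressions on the right-hand side of an assignment (1.8 or later):

```groovy
// Hypothetical result type of the DSL.
class Position {
    String direction
    int distance
    Position by(int amount) { distance = amount; this }
}

// Hypothetical DSL entry point.
def move(String direction) { new Position(direction: direction) }

def left = 'left'

def position = move left by 30        // command expression on the right-hand side
def explicit = move(left).by(30)      // the form with explicit dots and parentheses

assert position.direction == explicit.direction
assert position.distance == 30
```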

Differences to Scala

For historic reasons

println foo

has to be supported. This does not seem to be valid in Scala, since it would be interpreted as

and not as

. On the other hand

is interpreted as

in Scala and is invalid in current Groovy as well as after this proposal. So it could be said that this proposal is less object-oriented than Scala, because the DSL usually starts with the method, not the object. On the other hand, it is possible to write

So the Groovy notation would be a bit more verbose, but not much.

To be evaluated: Mixed case with explicit parentheses

Another case that could be supported is mixing method calls with explicit parentheses into the extended command expression.
The benefit would be the ability to call methods that take no parameters, as well as allowing an odd number of "elements" (i.e. a method name or a parameter).

would be respectively equivalent to:

Note that the method calls with explicit parentheses could also take a number of arguments.
For instance, this is also a valid mixed command expression:
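
As a hedged sketch of the mixed form, the example below chains a parameterless call with explicit parentheses between two ordinary name/argument pairs, giving five elements in total. The NameQuery class and the select method are assumptions made for illustration, and the code assumes a Groovy version with chained command expressions (1.8 or later).

```groovy
// Hypothetical query object for the sketch.
class NameQuery {
    boolean distinct = false
    NameQuery unique() { distinct = true; this }   // takes no parameters
    List from(List names) { distinct ? new ArrayList(new LinkedHashSet(names)) : names }
}

// Hypothetical entry point; the argument is ignored in this sketch.
def select(what) { new NameQuery() }

def all = 'all'
def names = ['Ann', 'Bob', 'Ann']

def mixed = select all unique() from names         // mixed command expression
def explicit = select(all).unique().from(names)    // fully parenthesized form

assert mixed == ['Ann', 'Bob']
assert mixed == explicit
```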

Mailing-list discussions

JIRA issues

15 Comments

  1. One potential source of ambiguity with optional parentheses in many languages is foo(x+y)*z. It could be interpreted as (foo(x+y))*z, or as foo((x+y)*z). Ruby solves this by looking for whitespace between "foo" and "(". If there isn't any whitespace, we get the first interpretation, and if there is, we get the second. Groovy already has an interpretation, which it inherits from Java, so you might want to add this to your examples. You might also want to consider how your proposal interacts with infix operators in general.

  2. Another source of inspiration is Ruby, which also allows the last argument to be outside the parens if it's a closure literal. It allows parentheses to be left off wherever they're not ambiguous. There's only one interpretation of "foo bar 23", which is foo(bar(23)), so this is how that's interpreted. It also allows "foo bar a, b" even though this is ambiguous; it means foo(bar(a, b)), not foo(bar(a), b). Ruby used to warn about this, but removed the warning in 1.9, presumably because people don't find it confusing. Ruby gives up (i.e. you get a parse error) for "foo a, bar b, c". It also gives up for "[foo a, b]", since it's not clear whether this is a list of length two with first element foo(a), or a list of length one with element foo(a, b).

  3. It seems you're proposing to make "." optional in certain circumstances, as Scala does, but because of backward compatibility with existing Groovy syntax, you can't always make it optional. Right? Here's an alternative, in which "." is always explicit but parens are optional, which may or may not be more regular:

    Example | Interpretation
    foo a | foo(a)
    a.foo b | a.foo(b)
    foo a, b, c | foo(a, b, c)
    a.foo b, c, d | a.foo(b, c, d)
    a + foo b | a + foo(b)
    a - b | a - b
    a(-b) | a(-b)
    foo a + b | foo(a + b)
    foo a = b | foo(a = b)
    foo bar a | foo(bar(a))
    foo(a+b)*c | (foo(a+b))*c
    foo bar a, b | foo(bar(a, b))

     Which do you think is more regular and easier to learn? Which allows more useful DSLs? Ruby has brought DSLs into the mainstream with a syntax similar to the latter, but perhaps the Scala-inspired one is even better. To be honest, with the Scala one I'm worried that the rules would be hard to remember, especially with the exceptions needed for backward compatibility with Groovy's syntax. But maybe with a little experience it would be straightforward.

  4. I am definitely keen for some extension in this space - but not sure whether the Scala or Ruby or some other way is best.
    I know whenever I hear about the subtle rules for leaving off the brackets for the last term when using ScalaTest, it just feels like something is wrong, e.g. "result should not be (null)" can omit brackets depending on whether there is an odd or even number of preceding terms.

    I wonder if one form can be converted to the other using an AST macro?

  5. Hey Paul,

     I don't think this can be done as an AST macro.  Macros change one AST into another, but whether or not you leave the parens off, you get the same AST.  Also, this involves taking something that doesn't parse and making it parse, which can't be done by a macro.

  6. x y z

    is an interesting example in general. There have been ideas to make this into x(y,z) in the past. According to mcspanky, it seems Ruby sees that as x(y(z)), and Scala sees this as x.y(z). In my proposal this is not valid. What I very much like about the Scala version is that you can express x+y with it, using the normal method call convention. Of course x+y*z is then x.+(y).*(z) and not x.+(y.*(z)) as operator precedence rules would force. Currently I am thinking that maybe this style can be added. Of course

    a1 a2 a3 a4

    would then be a1(a2).a3(a4), while

    a1 a2 a3

    would be a1.a2(a3). Let me call the former the even style, which is what this proposal is about, and the latter the odd style, which is more like Scala. As you can see, the choice is then determined by the number of participants.

    foo a1

    then follows the even rule,

    foo

    alone would follow the odd rule... only that we here require foo(), so it does not count.

    foo x+y

    would normally follow the even rule, since it really is foo(x+y). But if I see + as a participant as well, then the even rule has to be applied as well, but this time as foo(x).+(y). This I see as a problem if we want to allow both even and odd. Also

    foo x+y z

    is that in Scala foo.x(+).y(z) or is that foo(x+y).z()? I fear the first is the case. If "+" is not a participant, then it would still not be valid in Scala, since it would be something like foo.(x+y)(z), which might make some sense in Groovy, but I guess it does not in Scala.

    The Ruby way with nested method calls would mean x+y is still x.+(y), so that goal would be reached. But x*y+z is x.*(y.+(z)), which might be a little problematic regarding the intermediate types, but well. Still my example

    sql.select count("column_name") from "table_name"

    is a bit of a problem, since I see no way to express that in Ruby style.

  7. I must say I have general concerns with this proposal. Especially, I think it fails the "easy to explain" test ;)

    I can hardly remember what

    a b c

    shall mean in contrast to

    a b c d

    Now throw infix operators into the mix and I'm totally lost.

    I feel that making dereferencing with "." and argument passing with "()" optional at the same time leads to confusion.

    I see the DSL use case but I think that such a DSL should be an external one (easy to parse with Groovy anyway).

    The maximum that I would support atm is making () more lenient in terms of right-to-left grouping:

    println foo bar == println(foo(bar))

  8. The concept is very similar to what Scala has, but because of compatibility we use even parts, while they use odd parts. As a result "a b c" is not allowed, while "a b c d" is, and GEP-3 makes it a(b).c(d). One of the big problems of both versions is that you cannot simply write a println in front of it and expect it to work. But that is no different from what we have now when a command expression is used. If we do the grouping you suggest, then "a b c d" becomes a(b(c(d))), which might be easy to remember, but is there much value in it? The only parameter you can pass in here is "d". Other information would have to be encoded in method names, which more or less limits it to strings, even if GStrings are used to produce more than just vanilla names.

    We already have some simple command expressions; this GEP is just trying to extend what we already have. The syntax is not really all that difficult: it is name expression (name expression)+, meaning a name is followed by an expression (which cannot be a command expression), and then we start from the beginning. Even as a kind of macro you could say that name expression (name expression)+ becomes name "(" expression ")" ("." name "(" expression ")")+

    The only possibly confusing part here is the expression part, which the DSL user should keep simple. But still I can say that a ++b c d-- is a(++b).c(d--), since I know I start with a name, and then an expression has to follow. Since no command expressions are allowed here, it must end after ++b. That makes c the next name and d-- the final expression.

  9. @mcspanky: so what I was thinking about was opening up the grammar to support any number of terms but not have a particular interpretation. Then have a subsequent verifier that checked that no more than two terms in a row occur after all AST steps have occurred. Then you could have:

    @AsNestedFunctions
    def calculate() {
        foo bar baz
    }

    would be changed to foo(bar(baz)) whereas:

    @AsTupledArguments
    def calculate() {
        fee fi fo fum
    }

    would be changed to fee(fi).fo(fum)

    And obviously you could have others too, e.g. if we need to support foo.bar(baz) or foo(bar(baz())) or baz(bar(foo)).
    Note, these are just tentative names but you will get the idea.

  10. Hey Paul,

    Interesting idea.  How would this interact with dots and commas?  How would "fee . fi fo, fum blood , englishman" be parsed?  Or with operators, e.g. "fee fi + fo fum blood"?  Should that be parsed as (fee fi) + (fo fum blood)?  fee (fi + fo) fum blood?  Something else?

  11. @mcspanky: yes, it will be tricky to allow ultimate flexibility - maybe not even feasible with the current compiler phases - worthy of some exploration though.

  12. Another problem would arise when you want to mix and match everything, like chained method calls, GEP-3, and whatever else.
    In that case, you'd need to differentiate, on a line-by-line basis, the meaning you want to give to a given statement.
    So this sounds a bit problematic.
    Having to annotate the code is also a problem for DSLs, as it's some mysterious stuff end-users would have to learn (like static void main string args blah blah).

  13. @Dierk König

    I agree. Whatever we end up choosing needs to be well understood in a short space of time. Subtle differences in behaviour depending on the number of elements in the expression is a recipe for disaster.

  14. I haven't decided whether it helps or not but one approach would be to require an even number of expressions.
    So "a b c d" would translate to "a(b).c(d)".
    Whereas "a b c" would need to be written "using a b c" or "with a b c" and would translate to "with(a).b(c)" where "with(a)" just returned "a".

  15. Paul, that is an interesting idea. A single "a" becomes "with(a)" and returns a, which is compatible with what we have. "a b" becomes "a(b)", as per GEP-3. "a b c" then becomes "with(a).b(c)", which means "a == b" could be seen as "with(a).comparedEquals(b)". I very much like that idea.