Groovy String Handling

Introduction

Groovy builds upon Java String handling capabilities in several ways:

  • support for single quote and double quote String terminators reduces escaping of the other kind of quote
  • GStrings to allow variables and expressions to be embedded within Strings
  • slashy Strings (for scenarios where normal Java backslash style escaping is not desirable, e.g. regexes)
  • Python-like multi-line Strings
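A minimal sketch of the four capabilities listed above, using only syntax that already exists. The variable names are illustrative and not taken from the proposal.

```groovy
def name = 'Groovy'

// Either quote style can hold the other kind of quote without escaping.
def single = 'say "hi"'
def dbl = "it's"
assert single == "say \"hi\""
assert dbl == 'it\'s'

// GStrings expand embedded variables and expressions.
def greeting = "Hello, $name"
def sum = "1 + 2 = ${1 + 2}"
assert greeting == 'Hello, Groovy'
assert sum == '1 + 2 = 3'

// Slashy Strings leave backslashes alone, which suits regexes.
def pattern = /\d+\.\d+/
assert pattern == '\\d+\\.\\d+'
assert '3.14' ==~ pattern

// Triple-quoted Strings may span several lines.
def text = '''first
second'''
assert text == 'first\nsecond'
```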

Current Forms

single quotes
  • Sample: 'hello'
  • Character escaping: normal Java escaping rules, e.g. \t, \n, \uXXXX; ' must be escaped using \
  • GString expansion: no expansion
  • EOL handling: EOL only allowed after '\' in which case it is swallowed
  • First/last line handling: n/a

double quotes
  • Sample: "hello"
  • Character escaping: normal Java escaping rules, e.g. \t, \n, \uXXXX; " must be escaped using \
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL only allowed after '\' in which case it is swallowed
  • First/last line handling: n/a

slashy quotes
  • Sample: /hello/
  • Character escaping: \ used to escape EOL; / must be escaped using \ (change / escaping to '//'?)
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL only allowed after '\' in which case it is swallowed
  • First/last line handling: n/a

multi-line single quotes
  • Sample: '''hello'''
  • Character escaping: normal Java escaping rules, e.g. \t, \n, \uXXXX
  • GString expansion: no expansion
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: all EOL and whitespace significant

multi-line double quotes
  • Sample: """hello"""
  • Character escaping: normal Java escaping rules, e.g. \t, \n, \uXXXX
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: all EOL and whitespace significant
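The escaping, expansion and EOL rules listed above can be checked with a few assertions. This is only a sketch of the existing forms; the variable names are illustrative.

```groovy
// The delimiter itself is escaped with a backslash.
def quoted = 'it\'s'
def slashy = /a\/b/
assert quoted.length() == 4
assert slashy == 'a/b'

// Java escapes such as \t and \uXXXX are honoured in quoted Strings.
def tabbed = 'a\tb'
assert tabbed.length() == 3
assert '\u0041' == 'A'

// Single quotes do not expand; double quotes expand $IDENT and ${...}.
def x = 1
def plain = 'x = $x'
def expanded = "x = $x, next = ${x + 1}"
assert plain.contains('$')
assert expanded == 'x = 1, next = 2'

// In a multi-line String, '\' followed by EOL is swallowed;
// every other EOL is kept as '\n'.
def joined = '''one \
two
three'''
assert joined == 'one two\nthree'
```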

The Requirement for enhanced capabilities

It is sometimes desirable to have slashy style strings that span multiple lines, e.g.:

def usphone = ///(?x)
  \(?     # optional parentheses
    \d{3} # area code required
  \)?     # optional parentheses
  [-\s.]? # separator is either a dash, a space, or a period.
    \d{3} # 3-digit prefix
  [-.]    # another separator
    \d{4} # 4-digit line number
///

def numbers = '''
314-555-4000
800-555-4400
(314)555-4000
314.555.4000
'''.trim().split('\n')

assert numbers.every{ it ==~ usphone } 
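For comparison, here is a sketch of the same check written with the existing multi-line single quotes form. It assumes, as described under Current Forms, that a slashy String cannot span lines, so every regex backslash has to be doubled under the normal Java escaping rules.

```groovy
// Same pattern as above, but in a ''' String: each \ must be written as \\.
def usphone = '''(?x)
  \\(?     # optional parentheses
    \\d{3} # area code required
  \\)?     # optional parentheses
  [-\\s.]? # separator is either a dash, a space, or a period.
    \\d{3} # 3-digit prefix
  [-.]    # another separator
    \\d{4} # 4-digit line number
'''

def numbers = '''
314-555-4000
800-555-4400
(314)555-4000
314.555.4000
'''.trim().split('\n')

assert numbers.every { it ==~ usphone }
```

The doubled backslashes are exactly the noise a multi-line slashy form is meant to remove.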

It is also useful to be able to embed snippets of code (including ones containing slashy strings) in Strings:

def script = ///
  'foo' ==~ /f.o/
///
assert new GroovyShell().evaluate(script)

Or

assertScriptProducesOutput(///
['red', 'green', 'blue'].each { color ->
    println color ==~ /.*e\S/
}
///, ///\
true
true
false
///)
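With the existing forms, a script like this can be embedded in a multi-line single quotes String, at the cost of doubling the regex backslash. The sketch below collects the results instead of printing them so that they can be asserted; `script` and `results` are illustrative names.

```groovy
// The embedded script contains a slashy String; inside ''' the \S must be written as \\S.
def script = '''
['red', 'green', 'blue'].collect { color ->
    color ==~ /.*e\\S/
}
'''

// GroovyShell.evaluate returns the value of the last expression in the script.
def results = new GroovyShell().evaluate(script)
assert results == [true, true, false]
```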

See also GROOVY-2701

Proposal 1

Introduce a multi-line slashy String.

multi-line slashy quotes
  • Sample: ///hello/// or $/hello/$
  • Character escaping: \ used to escape EOL
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: all EOL and whitespace significant

Choosing this option restricts comments that have / as the first letter from appearing in certain places. Most places where banner style comments are normally used would not be affected.

Other options for the quote delimiters include: |/|, $|, ###, %%%, $$$

Proposal 2

Have here doc style strings for these combinations:

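  • ' here doc: GString support no, Java escaping yes
  • " here doc: GString support yes, Java escaping yes
  • / here doc: GString support yes, Java escaping no
  • > here doc: GString support no, Java escaping no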

Or, for the full details:

In every form below, IDENT can be empty.

multi-line here docs single quotes
  • Sample:
      $IDENT'
      hello
      'IDENT$
  • Character escaping: normal Java escaping rules, e.g. \t, \n, \uXXXX
  • GString expansion: no expansion
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: opening delimiter should be followed by EOL (any whitespace will be ignored); the closing delimiter must be on a line by itself (any preceding whitespace and the preceding EOL are removed)

multi-line here docs double quotes
  • Sample:
      $IDENT"
      hello
      "IDENT$
  • Character escaping: normal Java escaping rules, e.g. \t, \n, \uXXXX
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: opening delimiter should be followed by EOL (any whitespace will be ignored); the closing delimiter must be on a line by itself (any preceding whitespace and the preceding EOL are removed)

multi-line here docs slashy quotes
  • Sample:
      $IDENT/
      hello
      /IDENT$
  • Character escaping: \ used to escape EOL
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: opening delimiter should be followed by EOL (any whitespace will be ignored); the closing delimiter must be on a line by itself (any preceding whitespace and the preceding EOL are removed)

multi-line here docs tag quotes
  • Sample:
      $IDENT>
      hello
      <IDENT$
  • Character escaping: \ used to escape EOL
  • GString expansion: no expansion
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: opening delimiter should be followed by EOL (any whitespace will be ignored); the closing delimiter must be on a line by itself (any preceding whitespace and the preceding EOL are removed)
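The proposed first/last line handling can be modelled as a transformation of the raw text between the delimiters. The helper below is a hypothetical sketch, not part of any Groovy API: `hereDocBody` is an invented name, and it assumes "whitespace" means spaces and tabs.

```groovy
// Hypothetical model of the proposed rules, applied to the raw text
// found between the opening and closing delimiters.
String hereDocBody(String raw) {
    // EOL handling: normalize every EOL to '\n'.
    def text = raw.replaceAll(/\r\n?/, '\n')
    // Opening delimiter: ignore trailing whitespace and the EOL that follows it.
    text = text.replaceFirst(/\A[ \t]*\n/, '')
    // Closing delimiter: remove the preceding EOL and any indentation before it.
    text = text.replaceFirst(/\n[ \t]*\z/, '')
    return text
}

// Windows line endings and an indented closing delimiter.
def body = hereDocBody(" \r\nhello\r\n  world\r\n   ")
assert body == 'hello\n  world'
```

Under this reading, indentation inside the body is preserved and only the two delimiter lines are trimmed.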

Proposal 3

A cut-down version of Proposal 2 that supports only the styles with no current equivalent among the ''' and """ Strings.

here docs string
  • Sample:
      $'IDENT
      hello
      IDENT
    or
      <<'IDENT
      hello
      IDENT
  • Character escaping: \ used to escape EOL
  • GString expansion: no expansion
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: opening delimiter should be followed by EOL (any whitespace will be ignored); the closing delimiter must be on a line by itself (any preceding whitespace and the preceding EOL are removed)

here docs gstring
  • Sample:
      $"IDENT
      hello
      IDENT
    or
      <<"IDENT
      hello
      IDENT
  • Character escaping: \ used to escape EOL
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: opening delimiter should be followed by EOL (any whitespace will be ignored); the closing delimiter must be on a line by itself (any preceding whitespace and the preceding EOL are removed)

Proposal 4

  • leave '', "", '''s''', """s""" as they are today because they are consistent
  • remove the slashy notation
  • the rest of the cases are about configuring how the escaping should work and how the EOL should be handled. To solve this I am suggesting one of the following:

    r(single-quoted-string, flags) and r(double-quoted-string, flags)

    or

    '''s'''<FLAGS> and """s"""<FLAGS>

Proposal 5

This is roughly a combination of Proposals 1 and 3.

multi-line slashy quotes
  • Sample: $/hello/$
  • Character escaping: \ used to escape EOL
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL allowed and normalized to '\n'; '\' followed by EOL swallowed
  • First/last line handling: all EOL and whitespace significant

here docs string
  • Sample:
      $'IDENT
      hello
      IDENT
  • Character escaping: no escaping
  • GString expansion: no expansion
  • EOL handling: EOL allowed and normalized to '\n'
  • First/last line handling: opening delimiter (up to and including IDENT) should be followed by EOL (any whitespace will be ignored); the closing delimiter must be on a line by itself (any preceding whitespace will be ignored)

here docs gstring
  • Sample:
      $"IDENT
      hello
      IDENT
  • Character escaping: no escaping
  • GString expansion: $IDENT and $CLOSURE expanded
  • EOL handling: EOL allowed and normalized to '\n'
  • First/last line handling: opening delimiter (up to and including IDENT) should be followed by EOL (any whitespace will be ignored); the closing delimiter must be on a line by itself (any preceding whitespace is removed)

Slashy String improvement Proposal

Currently, you can't have \ as the last character of a slashy string, because a \ directly before the closing / escapes it. There are several workarounds. One is to embed an empty single-quoted string, ${''}, at the end of the slashy string. It would be nice to not need to do this.
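Another workaround, shown here only as an illustration and not the ${''} one named above, is to append the trailing backslash from an ordinary quoted String. The path is a made-up example.

```groovy
// The slashy part keeps its backslashes literally; the final one comes from '\\'.
def dir = /C:\temp/ + '\\'

assert dir == 'C:\\temp\\'
assert dir.endsWith('\\')
```

This works, but it splits one literal into two pieces, which is the kind of noise the proposal below tries to avoid.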

status quo
  • Current behavior:
      \EOL -> swallowed
      \/ -> /
  • Groovy 1.6: unchanged
  • Groovy 2.0: unchanged

Option 1
  • Current behavior: as above
  • Groovy 1.6:
      \EOL -> swallowed
      // -> /
      \/ -> /
      \\\\/ -> \\\
      (in general, n \'s before a / -> n-1 \'s and string is terminated)
  • Groovy 2.0:
      \EOL -> swallowed
      // -> /
      (this is the minimal change but // still conflicts with comment delimiter but that can't appear inside a string)

Option 2
  • Current behavior: as above
  • Groovy 1.6:
      \EOL -> swallowed
      $$ -> $
      $/ -> /
      \/ -> /
      $\ -> \
  • Groovy 2.0:
      \EOL -> swallowed
      $$ -> $
      $/ -> /