Add new proposal for lexical scoping
Showing 1 changed file with 286 additions and 0 deletions.
# Lexical Scoping

- JEP: (leave blank)
- Author: @jamesls
- Created: 2023-03-21

## Abstract
[abstract]: #abstract

This JEP proposes the introduction of lexical scoping through a new
`let` expression. You can now bind variables that are evaluated in the
context of a given lexical scope. This enables queries that can refer to
elements defined outside of their current scope, which is not currently
possible. This JEP supersedes JEP 11, which proposed similar functionality
through a `let()` function.

## Motivation
[motivation]: #motivation

A JMESPath expression is always evaluated in the context of a current
element, which can be explicitly referred to via the `@` token. The
current element changes as expressions are evaluated. For example,
suppose we have the expression `foo.bar[0]` that we want to evaluate against
an input document of:

```json
{"foo": {"bar": ["hello", "world"]}, "baz": "baz"}
```

The expression and the associated current element are evaluated as follows:

```
# Start
expression = foo.bar[0]
@ = {"foo": {"bar": ["hello", "world"]}, "baz": "baz"}
# Step 1
expression = foo
@ = {"foo": {"bar": ["hello", "world"]}, "baz": "baz"}
result = {"bar": ["hello", "world"]}
# Step 2
expression = bar
@ = {"bar": ["hello", "world"]}
result = ["hello", "world"]
# Step 3
expression = [0]
@ = ["hello", "world"]
result = "hello"
```
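
The trace above can be mimicked with a short Python sketch. This is not how any JMESPath implementation is structured: each sub-expression is modeled as a hypothetical plain function of the current element, and the names `steps` and `evaluate_steps` are made up for illustration. The only point is that the result of one step replaces the current element for the next, so sibling keys such as `"baz"` drop out of reach.

```python
# Illustrative sketch only: each sub-expression of foo.bar[0] is modeled as a
# plain function of the current element (@). All names here are hypothetical.
document = {"foo": {"bar": ["hello", "world"]}, "baz": "baz"}

steps = [
    ("foo", lambda current: current["foo"]),
    ("bar", lambda current: current["bar"]),
    ("[0]", lambda current: current[0]),
]


def evaluate_steps(steps, document):
    """Apply each step in turn, recording (expression, @, result) triples."""
    current = document
    trace = []
    for expression, step in steps:
        result = step(current)
        trace.append((expression, current, result))
        # The result of one step becomes the current element of the next,
        # so anything outside of it (such as "baz") is no longer reachable.
        current = result
    return current, trace


final_result, trace = evaluate_steps(steps, document)
for expression, current, result in trace:
    print(f"expression = {expression}")
    print(f"@          = {current!r}")
    print(f"result     = {result!r}")
```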

The end result of evaluating this expression is `"hello"`. Note that each
step changes the values that are accessible to the current expression being
evaluated. In "Step 2", it is not possible for the expression to reference
the value of `"baz"` in the current element of the previous step, "Step 1".

The inability to reference variables in a parent scope is a serious limitation
of JMESPath, and anecdotally, lifting it is one of the most commonly requested
features of the language. Below are examples of input documents and the desired
output documents that aren't possible to create with the current version of
JMESPath:

```
Input:
[
  {"home_state": "WA",
   "states": [
     {"name": "WA", "cities": ["Seattle", "Bellevue", "Olympia"]},
     {"name": "CA", "cities": ["Los Angeles", "San Francisco"]},
     {"name": "NY", "cities": ["New York City", "Albany"]}
   ]
  },
  {"home_state": "NY",
   "states": [
     {"name": "WA", "cities": ["Seattle", "Bellevue", "Olympia"]},
     {"name": "CA", "cities": ["Los Angeles", "San Francisco"]},
     {"name": "NY", "cities": ["New York City", "Albany"]}
   ]
  }
]
(for each list in "states", select the list of cities associated
 with the state defined in the "home_state" key)
Output:
[
  ["Seattle", "Bellevue", "Olympia"],
  ["New York City", "Albany"]
]
```

```
Input:
{"imageDetails": [
  {
    "repositoryName": "org/first-repo",
    "imageTags": ["latest", "v1.0", "v1.2"],
    "imageDigest": "sha256:abcd"
  },
  {
    "repositoryName": "org/second-repo",
    "imageTags": ["v2.0", "v2.2"],
    "imageDigest": "sha256:efgh"
  }
]}
(create a list of pairs containing an image tag and its associated repo name)
Output:
[
  ["latest", "org/first-repo"],
  ["v1.0", "org/first-repo"],
  ["v1.2", "org/first-repo"],
  ["v2.0", "org/second-repo"],
  ["v2.2", "org/second-repo"]
]
```

In order to support these queries we need some way for an expression to
reference values that exist outside of its implicit current element.

## Specification
[specification]: #specification

A new "let expression" is added to the language. The expression has the
format: `let <bindings> in <expr>`. The updated grammar rules in ABNF are:

```
let-expression = "let" bindings "in" expression
bindings = variable-binding *( "," variable-binding )
variable-binding = variable-ref "=" expression
variable-ref = "$" unquoted-string
```

The `let-expression` and `variable-ref` rules are also added as new expression
types:

```
expression =/ let-expression / variable-ref
```
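
To make the grammar rules concrete, here is a minimal recursive-descent sketch in Python. It handles only bare identifiers, variable references, and let expressions, not the full JMESPath grammar. The tuple-shaped AST nodes and the function names are hypothetical, and treating `let` and `in` as reserved words is an assumption of this sketch that the grammar above does not state.

```python
import re

# Illustrative sketch only: covers bare identifiers, variable references and
# let expressions. Treating "let" and "in" as reserved words is an assumption.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<variable>\$[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<name>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<punct>[=,]))"
)


def tokenize(text):
    """Split text into (kind, value) pairs."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def peek(tokens, pos):
    """Return the token at pos, or an end marker when input is exhausted."""
    return tokens[pos] if pos < len(tokens) else ("end", "")


def parse_expression(tokens, pos=0):
    """expression =/ let-expression / variable-ref (plus bare identifiers)."""
    kind, value = peek(tokens, pos)
    if (kind, value) == ("name", "let"):
        return parse_let(tokens, pos + 1)
    if kind == "variable":
        return ("variable_ref", value[1:]), pos + 1
    if kind == "name":
        return ("field", value), pos + 1
    raise ValueError(f"unexpected token {value!r}")


def parse_let(tokens, pos):
    """let-expression = "let" bindings "in" expression"""
    bindings = []
    while True:
        # variable-binding = variable-ref "=" expression
        kind, value = peek(tokens, pos)
        if kind != "variable" or peek(tokens, pos + 1) != ("punct", "="):
            raise ValueError("expected '$name =' in bindings")
        expression, pos = parse_expression(tokens, pos + 2)
        bindings.append((value[1:], expression))
        if peek(tokens, pos) != ("punct", ","):
            break
        pos += 1  # skip the comma and read the next binding
    if peek(tokens, pos) != ("name", "in"):
        raise ValueError("expected 'in' after bindings")
    body, pos = parse_expression(tokens, pos + 1)
    return ("let", bindings, body), pos


def parse(text):
    """Parse a complete expression, rejecting trailing input."""
    tokens = tokenize(text)
    node, pos = parse_expression(tokens)
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return node


print(parse("let $a = b, $c = d in $a"))
```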

Examples of this new syntax:

* `let $foo = bar in {a: myvar, b: $foo}`
* `let $foo = baz[0] in bar[? baz == $foo ] | [0]`
* `let $a = b, $c = d in bar[*].[$a, $c, foo, bar]`

### New evaluation rules

Let expressions are evaluated as follows.

Given the rule `"let" bindings "in" expression`, the `bindings` rule is
processed first. Each `variable-binding` within the `bindings` rule defines
the name of a variable and an expression. Each expression is evaluated, and the
result of this evaluation is then bound to the associated variable name.

Once all the `variable-binding` rules have been processed, the associated
`expression` clause of the let expression is then evaluated. During the
evaluation of the expression, any references, via the `variable-ref` rule, to a
variable name will evaluate to the value bound to the variable. Once the
associated expression has been evaluated, the let expression itself evaluates
to the result of this expression. After the let expression has been evaluated,
the variable bindings associated with the let expression are no longer valid.
This is also referred to as the visibility of a binding; the bindings of a
let expression are only visible during the evaluation of the `expression`
clause of the let expression.

When evaluating the `bindings` rule, a `variable-binding` for a variable name
that is already visible in the current scope will replace the existing binding
when evaluating the `expression` clause of the let expression. This means in
the context of nested let expressions (and consequently nested scopes), a
variable in an inner scope can shadow a variable defined in an outer scope.

If a `variable-ref` references a variable that has not been defined, the
evaluation of that `variable-ref` will trigger an `undefined-variable` error.
This error MUST occur when the expression is evaluated and not at compile
time. This is to enable implementations to define an implementation-specific
mechanism for defining an initial or "global" scope. Implementations are free
to offer a "strict" compilation mode that a user can opt into, but MUST support
triggering an `undefined-variable` error only when the `variable-ref` is
evaluated.

### Examples

Basic examples demonstrating core functionality.

```
search(let $foo = foo in $foo, {"foo": "bar"}) -> "bar"
search(let $foo = foo.bar in $foo, {"foo": {"bar": "baz"}}) -> "baz"
search(let $foo = foo in [$foo, $foo], {"foo": "bar"}) -> ["bar", "bar"]
```

Nested bindings.

```
search(
  let $a = a
  in
    b[*].[a, $a, let $a = 'shadow' in $a],
  {"a": "topval", "b": [{"a": "inner1"}, {"a": "inner2"}]}
) -> [["inner1", "topval", "shadow"], ["inner2", "topval", "shadow"]]
```

Error cases.

```
search($foo, {}) -> <error: undefined-variable>
search([let $foo = 'bar' in $foo, $foo], {}) -> <error: undefined-variable>
```
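
The scoping rules behind these examples can be sketched as a small Python evaluator over hand-built AST tuples. The node shapes, the `UndefinedVariableError` class, and the optional `scope` argument are all hypothetical names chosen for this sketch. Binding expressions are evaluated against the enclosing scope, which is inferred from the "Bindings only visible within expression clause" test case rather than stated outright in the rules. Passing an initial `scope` is one possible way an implementation might seed a "global" scope, and the error is raised only when a `variable-ref` is actually evaluated.

```python
class UndefinedVariableError(Exception):
    """Hypothetical error type standing in for `undefined-variable`."""


def evaluate(node, current, scope=None):
    """Evaluate a tuple-shaped AST node against `current` with `scope`."""
    scope = {} if scope is None else scope
    kind = node[0]
    if kind == "field":
        return current.get(node[1]) if isinstance(current, dict) else None
    if kind == "literal":
        return node[1]
    if kind == "list":  # multi-select list, e.g. [$foo, $foo]
        return [evaluate(item, current, scope) for item in node[1]]
    if kind == "variable_ref":
        # Checked lazily, at evaluation time, never at parse time.
        if node[1] not in scope:
            raise UndefinedVariableError(node[1])
        return scope[node[1]]
    if kind == "let":
        _, bindings, body = node
        # Copy the enclosing scope so the new bindings shadow outer ones and
        # vanish once the let expression has been evaluated.
        inner = dict(scope)
        for name, expression in bindings:
            # Assumption: bindings are evaluated in the enclosing scope, so a
            # binding cannot see its siblings in the same let expression.
            inner[name] = evaluate(expression, current, scope)
        return evaluate(body, current, inner)
    raise ValueError(f"unknown node type: {kind}")


# let $foo = foo in [$foo, $foo]
repeated = ("let", [("foo", ("field", "foo"))],
            ("list", [("variable_ref", "foo"), ("variable_ref", "foo")]))
print(evaluate(repeated, {"foo": "bar"}))

# let $a = 'top-a' in let $a = 'in-a', $b = $a in $b
shadowed = ("let", [("a", ("literal", "top-a"))],
            ("let", [("a", ("literal", "in-a")), ("b", ("variable_ref", "a"))],
             ("variable_ref", "b")))
print(evaluate(shadowed, {}))

# [let $foo = 'bar' in $foo, $foo] fails on the second, out-of-scope $foo
leaked = ("list", [("let", [("foo", ("literal", "bar"))], ("variable_ref", "foo")),
                   ("variable_ref", "foo")])
try:
    evaluate(leaked, {})
except UndefinedVariableError as error:
    print(f"undefined-variable: {error}")
```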

## Rationale
[rationale]: #rationale

The let expression proposed in this JEP is based on similar constructs
in existing programming languages:

* Haskell: http://learnyouahaskell.com/syntax-in-functions#let-it-be
* Clojure: https://clojuredocs.org/clojure.core/let
* OCaml: https://v2.ocaml.org/manual/expr.html#sss:expr-localdef

It's important to use syntax and semantics that are already familiar to
developers. We are introducing lexical scoping, which is not a novel
concept, into the language, so care was taken to be consistent with
the mental model that developers already have.

## Testcases
[testcases]: #testcases

Basic expressions

```yaml
# Basic expressions
- given:
    foo:
      bar: baz
  cases:
    - expression: "let $foo = foo in $foo"
      result:
        bar: baz
    - expression: "let $foo = foo.bar in $foo"
      result: "baz"
    - expression: "let $foo = foo.bar in [$foo, $foo]"
      result: ["baz", "baz"]
    - comment: "Multiple assignments"
      expression: "let $foo = 'foo', $bar = 'bar' in [$foo, $bar]"
      result: ["foo", "bar"]
# Nested expressions
- given:
    a: topval
    b:
      - a: inner1
      - a: inner2
  cases:
    - expression: "let $a = a in b[*].[a, $a, let $a = 'shadow' in $a]"
      result:
        - ["inner1", "topval", "shadow"]
        - ["inner2", "topval", "shadow"]
    - comment: Bindings only visible within expression clause
      expression: "let $a = 'top-a' in let $a = 'in-a', $b = $a in $b"
      result: "top-a"
# Examples from Motivation section
- given:
    - home_state: WA
      states:
        - name: WA
          cities: ["Seattle", "Bellevue", "Olympia"]
        - name: CA
          cities: ["Los Angeles", "San Francisco"]
        - name: NY
          cities: ["New York City", "Albany"]
    - home_state: NY
      states:
        - name: WA
          cities: ["Seattle", "Bellevue", "Olympia"]
        - name: CA
          cities: ["Los Angeles", "San Francisco"]
        - name: NY
          cities: ["New York City", "Albany"]
  cases:
    - expression: "[*].[let $home_state = home_state in states[? name == $home_state].cities[]][]"
      result:
        - ["Seattle", "Bellevue", "Olympia"]
        - ["New York City", "Albany"]
```