Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter expression selectors should require parentheses. #183

Closed
gregsdennis opened this issue Apr 24, 2022 · 21 comments
Closed

Filter expression selectors should require parentheses. #183

gregsdennis opened this issue Apr 24, 2022 · 21 comments

Comments

@gregsdennis
Copy link
Collaborator

gregsdennis commented Apr 24, 2022

From this comment it was stated that the current ABNF doesn't require parentheses around the expression in a filter expression. The spec thus contains examples without them.

Previous discussions had occurred around making them optional, but no decision came out of them that I can recall. The closest documented conversation I could find was this comment by @goessner and the following comments which indicate that the parentheses are currently expected.

To be clear, we should currently require [?(<expr>)], and disallow [?<expr>].

The original blog post consistently uses them and never shows a filter expression selector without them.

@gregsdennis gregsdennis changed the title Filet expressions should require parentheses. Filter expressions should require parentheses. Apr 24, 2022
@gregsdennis gregsdennis changed the title Filter expressions should require parentheses. Filter expression selectors should require parentheses. Apr 24, 2022
@cabo
Copy link
Member

cabo commented Apr 25, 2022

I don't have a strong opinion on this simplification, but one data point:
The simplified grammar has been in the document since we added ABNF grammar, i.e., in draft-ietf-jsonpath-base-01 (2021-07-08). The discussion was started in #64, 2021-04-30.

For me, the current ABNF is the status quo..
If we need to have a discussion here, let's have it, but I'm not interested in the "we haven't discussed this enough" angle -- this has been out in the document for 9 months.

So if there are any technical arguments for requiring the parentheses, please add them here.

@goessner
Copy link
Collaborator

I am strongly voting for the simplified, backwards compatible notation with optional parentheses.

@cabo
Copy link
Member

cabo commented Apr 25, 2022

The PR that put in the ABNF (which has never changed in substance with regard to requiring parentheses):

https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-base/pull/98/files

There is even a comment from you (Greg) on that ABNF snippet in that PR conversation, so you apparently reviewed it.
(Of course, you might have missed the simplification and there may be technical arguments against it -- let's hear them.)

@gregsdennis
Copy link
Collaborator Author

gregsdennis commented Apr 25, 2022

Yes, looking back, I see point 4.1 in #64 raised the topic, and the fact that it wasn't discussed at all (even though the original post said it should be discussed) could have been interpreted as acceptance and agreement.

Also, that PR you reference is massive. It's easy to miss this one line in the multitude of changes contained therein.

I suspect this was something that was snuck in rather than agreed upon. Agreement requires discussion, and I see none.

Personally, when I see [[email protected]==1], the ? looks as though it's part of the expression. Having the parentheses, [?(@.foo==1)], better isolates the expression visually.

This is not a problem of parsing, but one of the human element.

@glyn
Copy link
Collaborator

glyn commented Apr 25, 2022

For the record, I was originally uncomfortable with the parentheses being optional, but I remember this being discussed at a meeting quite a while ago and I was persuaded at that time that the current approach is reasonable. Perhaps that discussion wasn't highlighted in the minutes. I think we should stay as we are.

@gregsdennis
Copy link
Collaborator Author

gregsdennis commented Apr 26, 2022

@goessner I know you already commented to the contrary, but your original post has this to say:

Filter expressions are supported via the syntax ?(<boolean expr>) as in

$.store.book[?(@.price < 10)].title

I think this is pretty indicative that parentheses should be included as part of the required syntax. Taking this stance also aligns with our preference to accommodate existing implementations. While we say that this is a backward compatible change, it may not be. For instance, my parser won't recognize a filter selector without the parentheses. I expect many other existing implementations will be the same.

@glyn
Copy link
Collaborator

glyn commented Apr 26, 2022

At today's meeting I took an action to write a comparison project test for a filter without parentheses. It turns out there is one already -- https://cburgmer.github.io/json-path-comparison/results/filter_expression_without_parens.html -- and the consensus among implementations is "Not supported".

@danielaparker
Copy link

danielaparker commented Apr 26, 2022

@goessner I know you already commented to the contrary, but your original post has this to say:

Filter expressions are supported via the syntax ?(<boolean expr>) as in

$.store.book[?(@.price < 10)].title

I think this is pretty indicative that parentheses should be included as part of the required syntax.

Hardly indicative. Original Goessner JsonPath used JavaScript for evaluating filter expressions, I imagine that requiring the outer parentheses was just to provide a pragmatic way to delimit the JavaScript part, original Goessner JsonPath was not responsible for parsing that part. That consideration no longer applies when the syntax of filter expressions is defined in the JSONPath specification, mandatory outer parentheses no longer serve any purpose.

Taking this stance also aligns with our preference to accommodate existing implementations. While we say that this is a backward compatible change, it may not be. For instance, my parser won't recognize a filter selector without the parentheses. I expect many other existing implementations will be the same.

Be real :-) The spec in its current form is incompatible to a lessor or greater extent with all existing JSONPath implementations.

@goessner
Copy link
Collaborator

goessner commented Apr 28, 2022 via email

@gregsdennis
Copy link
Collaborator Author

gregsdennis commented Apr 28, 2022

My issue isn't that existing paths can be handled by new implementations (what we're calling "backward comparability"), but rather the inverse: new paths won't be handled by existing implementations ("forward compatibility").

Users will complain that their paths, which don't have parentheses, don't work. Implementors will respond with, "You need parentheses," and then users will be trained to use parentheses anyway.

As it stands, most users are already trained to use parentheses because existing implementations require them.

@cabo
Copy link
Member

cabo commented Apr 28, 2022

My issue isn't that existing paths can be handled by new implementations (what we're calling "backward compatibility"), but rather the inverse: new paths won't be handled by existing implementations ("forward compatibility").

Right. For new features we can't expect forward compatibility anyway. The discussion here was whether it is worth to reduce forward compatibility by admitting new syntax for existing, widely implemented features. And I grudgingly have to acknowledge that that is a different question.

Users will complain that their paths, which don't have parentheses, don't work. Implementors will respond with, "You need parentheses," and then users will be trained to use parentheses anyway.

Actually, implementers will quickly fix that.
Between competing implementations, there is a race to accept the largest amount of inputs (within reason, i.e., balancing it with the cost of implementing them).

But, yes, query creators that care about widest compatibility will be trained to use the parentheses.

(For me, actually, having a feature that only new implementations can understand is a feature. So the expression will break on implementations that haven't been updated. Darwin. And a nice canary.)

As it stands, most users are already trained to use parentheses because existing implementations require them.

Yes, which is part of the reason why I still think we would better give up on forward compatibility here in favor of cleaning up the query language. But that is now a matter of taste, and I have to accept that there is a valid argument to the contrary.

@danielaparker
Copy link

danielaparker commented Apr 28, 2022

My issue isn't that existing paths can be handled by new implementations (what we're calling "backward comparability"), but rather the inverse: new paths won't be handled by existing implementations ("forward compatibility").

Are you only concerned with new syntax? What about new semantics? For example, the draft is proposing new order of precedence rules that are different from all implementations and languages that I know of, including those of Python, JavaScript and C that some other implementations are following, so there are bound to be expressions that are syntactically the same but would now give the "wrong" result in existing implementations.

I just don't think it makes sense to go down this road. Users will understand that IETF JsonPath is just another take on the idea of JSONPath, an idea that took major forks from very early on. Any Stack Overflow question on JSONPath quickly establishes that the answers are generally implementation specific.

@gregsdennis
Copy link
Collaborator Author

I'm going to stop pushing this, but I would like to add this last comment.

JSON Schema has had several forward-incompatible changes over the years as well that I've been fine with. But the difference between this and those is that JSON Schema is already published and versioned. It's easy to tell people, "As of version X, this feature works differently."

We're not technically versioned as yet because we don't really have a proper publication from which a full implementation can be built (because we're not done). It feels different to make this change for a 1.0.

But I'll follow the team's consensus.

@cabo
Copy link
Member

cabo commented Apr 28, 2022

the draft is proposing new order of precedence rules that are different from all implementations and languages that I know of, including those of Python, JavaScript and C that some other implementations are following,

Have you submitted an issue?

@danielaparker
Copy link

danielaparker commented Apr 28, 2022

the draft is proposing new order of precedence rules that are different from all implementations and languages that I know of, including those of Python, JavaScript and C that some other implementations are following,

Have you submitted an issue?

No. I would add that the fact that they're different doesn't mean that they're wrong, because languages like Python, JavaScript and C use different precedence orderings, particularly Python and JavaScript, see e.g. Python (shown high to low) and JavaScript (shown high to low), XPath 3.1 (shown low to high), and C (shown high to low, 1 is high). My comment to @gregsdennis was only about them being different.

@cabo
Copy link
Member

cabo commented Apr 28, 2022

@danielaparker please do; I wasn't aware they are different.
(Thanks for the pointers, very useful! -- the C one is a copy of the XPath one, though, and I can't resolve the Python one.)

@danielaparker
Copy link

@danielaparker please do; I wasn't aware they are different. (Thanks for the pointers, very useful! -- the C one is a copy of the XPath one, though, and I can't resolve the Python one.)

All right, here it is.

@gregsdennis
Copy link
Collaborator Author

I'm not sure what the big fuss about the precedences being different is. For the operations we have defined, all of these are largely equivalent.

Some have variation in that comparisons are higher than equality, but I'm not even sure under what circumstances that'd be useful. Contrivingly, I could say

a < b == c < d

as a shorthand for

(a < b && c < d) || (a >= b && c >= d)

but that's stretching things, IMO. I can't see that such a thing would be so useful that they'd add it into the language.

@danielaparker do you know why that split in precedence is defined?

@danielaparker
Copy link

danielaparker commented Apr 29, 2022

@gregsdennis,

My own, personal view, is that it's unrewarding to think about precedences, and uninteresting to spend any time at all on thinking about the implications of trying something different. That's why my personal preference is to blindly follow what somebody else has already done, and not attempt anything new. I'd like to assume that the authors of C and Python have thought about this, I don't want to.

So, when I implemented this, I followed C, except where C didn't have =~, I followed Perl, and ended up with this. If like @glyn I had a strong preference that the comparison and equality operators belonged at the same level, I would have chosen another language to follow that did that, Python qualifies, and used that language as a model for all operators, except =~, which Python doesn't have, and gone with Perl again for that. What I wouldn't do is is help myself to Python's comparison and equality operators, but C's logical not operator. As much as possible, I would want one model to base my implementation on.

Daniel

@goessner
Copy link
Collaborator

goessner commented Apr 29, 2022

very helpfull indeed.

IIRC ... I took my first proposal of the precedence table from JavaScript (MDN), which should follow C. So I cannot explain the fact now, why ==, != has the same precedence as <, >, <=, >=.

Maybe we also should follow C ... and also include an associativity column.

oh, and insert =~ following Perl.

thanks

@gregsdennis
Copy link
Collaborator Author

why ==, != has the same precedence as <, >, <=, >=

Why is there a difference?

I don't see how the precedence chain we've declared is any different than any of these languages. They're semantically identical!

  • groups
  • not
  • comparison/equality (what benefit is there to separating these?)
  • logical and
  • logical or

All of the listed languages have this ordering. What is being discussed here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants