EBV 4.0 about qtspecs HOT 11 CLOSED

ChristianGruen commented on September 25, 2024
EBV 4.0

from qtspecs.

Comments (11)

ChristianGruen commented on September 25, 2024

Thanks for the discussion. I think it helps to look at two aspects of the EBV computation separately:

  1. processing the input sequence and
  2. processing single items.

For 1., the current rules are:

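declare function ebv($input as item()*) as xs:boolean {
  if (empty($input)) then false()
  else if (head($input) instance of node()) then true()
  (: more than one item, and the first is not a node: error :)
  else if (count($input) > 1) then error(xs:QName('err:FORG0006'))
  else single-ebv($input)
};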

I think it would be more intuitive to get rid of any special-casing and use existential semantics instead, so I would propose:

declare function ebv($input as item()*) as xs:boolean {
  some $item in $input satisfies single-ebv($item)
};

This way, the result won’t change if the input sequence is reordered. More importantly, all item types would have “equal rights”. This feels important to me, as the language has evolved a lot since XPath 1.0, which was very node-centric. I really can’t find a good reason today for treating node sequences differently to sequences of other types.
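
The two sequence-level rules can be compared in a small model. This is a rough Python sketch, not XQuery: the `Node` and `EbvError` classes and all function names are hypothetical, and the item-level rule only covers strings, numbers and booleans.

```python
class Node:
    """Hypothetical stand-in for an XDM node."""

class EbvError(Exception):
    """Hypothetical stand-in for err:FORG0006."""

def single_ebv(item):
    # bool is tested first because bool is an int subtype in Python.
    if isinstance(item, bool):
        return item
    if isinstance(item, str):
        return item != ""
    if isinstance(item, (int, float)):
        # item == item is False only for NaN, which fn:boolean treats as false.
        return item != 0 and item == item
    raise EbvError("FORG0006")

def ebv_current(seq):
    # Current rule: empty -> false, first item is a node -> true,
    # otherwise the sequence must consist of exactly one item.
    if not seq:
        return False
    if isinstance(seq[0], Node):
        return True
    if len(seq) > 1:
        raise EbvError("FORG0006")
    return single_ebv(seq[0])

def ebv_existential(seq):
    # Proposed rule: true if at least one item is true on its own.
    # Nodes are assumed to be true at the item level.
    return any(isinstance(item, Node) or single_ebv(item) for item in seq)

a = Node()
print(ebv_current([a, 1]))        # True
print(ebv_existential([a, 1]))    # True
print(ebv_existential([1, a]))    # True
try:
    ebv_current([1, a])
except EbvError as err:
    print("error:", err)          # error: FORG0006
```

In this model, reordering the input changes the outcome only under the current rule.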

For 2., we currently have…

declare function single-ebv($item as item()) as xs:boolean {
  typeswitch($item) {
    case xs:untypedAtomic | xs:string | xs:anyURI  return $item != ''
    case xs:numeric                                return $item != 0
    case xs:boolean                                return $item
    default                                        return error(xs:QName('err:FORG0006'))
  }
};

It could possibly be:

declare function single-ebv($item as item()) as xs:boolean {
  typeswitch($item) {
    case xs:untypedAtomic | xs:string | xs:anyURI  return $item != ''
    case xs:numeric                                return $item != 0
    case xs:boolean                                return $item
    (: to be discussed... :)
    case xs:base64Binary                           return $item != xs:base64Binary('')
    case xs:hexBinary                              return $item != xs:hexBinary('')
    case array(*)                                  return array:size($item) != 0
    case map(*)                                    return map:size($item) != 0
    default                                        return true()
  }
};

We should get rid of the raised error. I can see it was reasonable in the past, but I don’t believe it’s suitable today. If we want to align sequences and arrays, it just makes no sense to me that boolean(()) returns false and boolean([]) raises an error. If we think it does – apart from doing what we did in the past – we should find good arguments for it. Same for boolean(xs:QName('x'))… well, I’m repeating myself.
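
As an illustration of the item-level proposal, here is a hedged Python model (not XQuery). The mapping is an assumption made for the sketch: Python lists stand in for arrays, dicts for maps, bytes for binary values, and the `QName` class is a hypothetical placeholder.

```python
class QName:
    """Hypothetical stand-in for an xs:QName value."""
    def __init__(self, local):
        self.local = local

def single_ebv_proposed(item):
    # bool is tested first because bool is an int subtype in Python.
    if isinstance(item, bool):
        return item
    if isinstance(item, str):
        return item != ""
    if isinstance(item, (int, float)):
        # item == item is False only for NaN, which fn:boolean treats as false.
        return item != 0 and item == item
    if isinstance(item, (bytes, list, dict)):
        # The "to be discussed" branch: binary values, arrays and maps
        # are true if they are non-empty.
        return len(item) != 0
    # Everything else (nodes, QNames, function items, ...) is true
    # instead of raising an error.
    return True

print(single_ebv_proposed([]))           # False
print(single_ebv_proposed({"k": "v"}))   # True
print(single_ebv_proposed(QName("x")))   # True
```

Removing the `bytes`/`list`/`dict` branch would make every array and map true regardless of its contents.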

PS: I wondered why “…and algebra” slipped into my sentence. Should probably have been “…and alignment”.

michaelhkay commented on September 25, 2024

Many of our users will have spent many frustrated hours learning the JavaScript rules, and I think it's important we remain consistent with them. At present we are well aligned - except that in JS, if it's not one of a small number of falsy things, then it's truthy, whereas with our rules things like empty arrays and maps are errors rather than truthy. I'm really not keen on making the rules more complicated, especially if it leads to outcomes that are different from JS.

ChristianGruen commented on September 25, 2024

Many of our users will have spent many frustrated hours learning the JavaScript rules, and I think it's important we remain consistent with them. At present we are well aligned - except that in JS, if it's not one of a small number of falsy things, then it's truthy, whereas with our rules things like empty arrays and maps are errors rather than truthy. I'm really not keen on making the rules more complicated, especially if it leads to outcomes that are different from JS.

I don’t see those similarities between JavaScript and XPath. The only thing that is close is the treatment of strings, numbers and booleans, and we would keep this anyway.

The main difference, and the one that regularly causes confusion, is the varying treatment of node sequences and other sequences, and it’s hard for me to grasp why this seems necessary today. The confusing examples that I’ve stated in the initial comment have no counterpart in JavaScript, and I’m convinced we could simplify the rules here by treating all items of a sequence identically, and achieving a more intuitive result. In addition, we can also sort out different behavior across implementations for heterogeneous sequences:

  • Both boolean((<_>x</_>, <_>y</_>)) and boolean(('x', 'y')) would return true.
  • Both boolean(xs:NCName('x')) and boolean(xs:QName('x')) would return true. It makes sense to always return true for xs:QName, because a QName can never be empty.
  • Both boolean((<a/>, 1)) and boolean((1, <a/>)) will return true – no matter which implementation is used.

For function items and arrays (positional, associative), my proposal would bring XPath and JS even closer together, by getting rid of the error message, which I believe serves no one in practice. It would be much easier to use if($array) then ... instead of if(exists($array)) then .... For arrays, we could certainly choose the JS way and return true without checking the contents (provided that we believe that sequences and arrays are different enough). Same for maps.

michaelhkay commented on September 25, 2024

Both boolean((<a/>, 1)) and boolean((1, <a/>)) will return true – no matter which implementation is used.

I'm not sure why you draw out this case as being implementation-dependent. Currently the first case is unambiguously true, the second case is unambiguously an error.

I don’t see those similarities between JavaScript and XPath.

In JavaScript any array or object is truthy, regardless of its contents. I'm not sure what you're proposing for arrays and maps, but for sequences you're proposing something very different, and I'm still not sure exactly what. Or what the use cases are.

Read JavaScript tutorials, and you find people advising everyone to steer clear of this minefield. With XPath too, a lot of people suggest using functions like exists() to avoid relying on the complex EBV rules. If we make them even more complex, there will be even more advice telling users not to go there.

ChristianGruen commented on September 25, 2024

I'm not sure why you draw out this case as being implementation-dependent. Currently the first case is unambiguously true, the second case is unambiguously an error.

Sorry for that. Indeed the specification states clearly that it's the first item that’s responsible for the result. I got misled by one implementation (well, not ours) that behaves differently.

I think/hope we can agree that it's at least strange that the order of the input defines here what is going to happen. I cannot think of any good reason for the current behavior for sequences of mixed type (apart from maybe historical reasons and algebra with XPath 1.0).

In JavaScript any array or object is truthy, regardless of its contents. I'm not sure what you're proposing for arrays and maps,

In my initial proposal, I suggested checking the map/array size and returning true or false. I’d be open to the decision to always return true, in alignment with JS. The EBV of function items would always be true (similar to JS).

but for sequences you're proposing something very different, and I'm still not sure exactly what.

I hoped that the equivalent XQuery code was self-explanatory. I think it's questionable to base the result on the first item (which can easily change if data is reordered), and to raise errors for sequences… unless the first item is a node. I don't know any other language that behaves similarly.

I really don't believe that the proposed rules would make EBV more complex. Quite the contrary, I think that the new rules would be more consistent and easier to explain and teach: For each item in the input sequence, there's a well-defined rule to get true or false. If at least one item matches, the EBV is true.

michaelhkay commented on September 25, 2024

I cannot think of any good reason for the current behavior for sequences of mixed type (apart from maybe historical reasons and algebra with XPath 1.0).

In XPath 1.0 there were essentially four types: string, number, boolean, and node-set, and EBV was defined for each of them. When the data model was extended in 2.0, the rules had to be compatible with the 1.0 rules, but also to handle mixed sequences, and there was a significant amount of debate on the best way of doing this. One of the concerns, if I remember rightly, was that the revised rules should not make it necessary to read an entire sequence before making a decision (so if(//x) could still be decided on finding the first x element). But I think there was also a strong view that the rules should not become too unwieldy, and it was better to make most cases (other than 1.0-compatible cases) into errors than to have very complex rules that people would have trouble remembering.
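
The concern about not having to read the entire sequence can be made concrete with lazy iterators. The following Python sketch is only a model under stated assumptions: `Node`, `counting` and both functions are hypothetical names, Python's `bool()` stands in for the item-level rule, and the comparison says nothing about how a real processor evaluates `//x`.

```python
class Node:
    """Hypothetical stand-in for an XDM node."""

def counting(items, log):
    # Yield items one by one and record every item that was actually pulled.
    for item in items:
        log.append(item)
        yield item

def ebv_current_lazy(items):
    # Current rule: the first item decides. A second item is pulled only
    # to detect the error case (first item is not a node, more items follow).
    it = iter(items)
    try:
        first = next(it)
    except StopIteration:
        return False
    if isinstance(first, Node):
        return True
    for _ in it:
        raise ValueError("FORG0006")
    return bool(first)  # simplification of the item-level rule

def ebv_existential_lazy(items):
    # Existential rule: any() stops at the first true item, but reads
    # to the end of the sequence if every item is false.
    return any(isinstance(item, Node) or bool(item) for item in items)

log = []
ebv_existential_lazy(counting([Node(), Node(), Node()], log))
print(len(log))   # 1

log = []
ebv_existential_lazy(counting([0, "", 0, ""], log))
print(len(log))   # 4
```

In this model, a node sequence is still decided on its first item under the existential rule; only a sequence in which every item is false has to be read completely.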

michaelhkay commented on September 25, 2024

Starting from first principles, I can certainly see why you want boolean([]) and boolean(map{}) to be false, but the fact that both are true in JavaScript feels like we're just making life too hard for our users.

ChristianGruen commented on September 25, 2024

Starting from first principles, I can certainly see why you want boolean([]) and boolean(map{}) to be false, but the fact that both are true in JavaScript feels like we're just making life too hard for our users.

If we believe that JavaScript users are (one of) our main target groups today, we should at least return true() instead of an error (which is what I suggested in the first proposal in the first comment of this issue).

michaelhkay commented on September 25, 2024

Another point to bear in mind: in XSLT predicates, failure means no match. So in 3.0 match="person[*!string-length(.)]" will be a no-match (not an error) if person has more than one child element. Under your rules it would be a match if any child has a non-zero string length. It's unlikely anyone is doing this deliberately, but dormant template rules that never match anything often lie around in legacy code.
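
The compatibility issue can be sketched in a few lines. This Python model is an illustration only: `matches`, `ebv_current` and `ebv_existential` are hypothetical names, child elements are reduced to their string values, and the EBV rules are simplified to sequences of integers.

```python
def ebv_current(values):
    # Simplified current rule for a sequence of integers:
    # more than one atomic item is an error.
    if not values:
        return False
    if len(values) > 1:
        raise ValueError("FORG0006")
    return values[0] != 0

def ebv_existential(values):
    # Simplified proposed rule: true if any item is non-zero.
    return any(v != 0 for v in values)

def matches(child_texts, ebv):
    # Models match="person[*!string-length(.)]": the predicate value is
    # the sequence of string lengths of the child elements. An error raised
    # while evaluating the pattern counts as "no match".
    lengths = [len(text) for text in child_texts]
    try:
        return ebv(lengths)
    except ValueError:
        return False

print(matches(["Ann", ""], ebv_current))      # False
print(matches(["Ann", ""], ebv_existential))  # True
print(matches(["Ann"], ebv_current))          # True
```

A person with two children goes from "no match" to "match" when the rule is swapped, which is the silent behavior change for dormant template rules.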

ChristianGruen commented on September 25, 2024

Another point to bear in mind: in XSLT predicates, failure means no match. So in 3.0 match="person[*!string-length(.)]" will be a no-match (not an error) if person has more than one child element. Under your rules it would be a match if any child has a non-zero string length. It's unlikely anyone is doing this deliberately, but dormant template rules that never match anything often lie around in legacy code.

Oh dear; yes, that sounds like a hard nut to crack. If we think this through, it basically prevents us from turning any error in the language into a success (try/catch is regularly used when people are too overwhelmed to assess what exactly is supposed to go wrong in more complex code).

ChristianGruen commented on September 25, 2024

I’m grateful for the discussion! I’ll open another issue with a narrower focus → #829.
