map:get and array:get (qtspecs, open)

michaelhkay commented on August 17, 2024
map:get and array:get

from qtspecs.

Comments (34)

michaelhkay commented on August 17, 2024

@dnovatchev Thanks for your encouragement. Reopening.

ChristianGruen commented on August 17, 2024

It would be very interesting to have a practical example at hand. We could then compare the code that needs to be written to solve it…

  1. with the existing fallback function,
  2. with the proposed *:try-get, and
  3. with the Church Encoding.

The existing examples in the spec (things like array:get([1,2], 3, fn { -1 })) are probably too basic for a competing approach to beat them.

I’m thrilled by the Church and Scott Encodings as theoretical concepts, but in practice I tend to use straightforward solutions, as they are much simpler to decipher for the vast majority (including me).

michaelhkay commented on August 17, 2024

> I think
>
> record(?value)
>
> is preferable and the bool unnecessary.

Logically you're perfectly right (though I think you mean record(value?)). The problem is, the user is then left with using map:contains() and map:get() to access that record and see whether the value exists, so we haven't made their job any easier.

michaelhkay commented on August 17, 2024

I'm wondering how well this works with map destructuring variable bindings (see issue #37).

Something like:

let ${value, found} := try-get($map, $key)
return if ($found) then $value else "default"
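
As a rough analogy outside XQuery, the destructuring pattern above can be sketched in Python. The function name `try_get` and the `(value, found)` tuple shape are assumptions made for illustration only; an empty tuple stands in for the empty sequence.

```python
# Hypothetical Python analogue of the proposed try-get; not part of any spec.
def try_get(mapping, key):
    """Return (value, found); value is None when the key is absent."""
    if key in mapping:
        return mapping[key], True
    return None, False


# Destructure the pair, mirroring `let ${value, found} := try-get($map, $key)`.
value, found = try_get({"a": ()}, "a")
result = value if found else "default"   # key exists, so the empty value is kept

value, found = try_get({"a": ()}, "b")
missing = value if found else "default"  # key absent, so the default is used
```

The point of the flag is visible in the first call: a present key with an empty value is not confused with a missing key.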

ChristianGruen commented on August 17, 2024

I will be happy to see array:get and map:get reverted, but for a slightly different reason. I think the cases in which empty sequences and missing map entries need to be distinguished are rare enough to justify combined contains/get calls. For other cases, you can do map:get(MAP, KEY) otherwise VALUE.

Regarding map:try-get, can we present a convincing example that illustrates that it would be more readable than map:contains & map:get?

Talking about performance, standard map:contains/map:get calls could also be optimized to single internal lookups (similar to if(doc-available(...)) then doc(...) patterns), but I wonder if implementations would benefit from it (at least in our case, the speedup would hardly be measurable).

MarkNicholls commented on August 17, 2024

@michaelhkay

this seems to be #1349 rehashed? (it's an encoding of maybe)

I don't especially like the possibility of invalid states

e.g.

{"found":false(), "value":"Foo"}

and

{"found":true()}

I wasn't aware of the possibility to do

value?

in

record(found as xs:boolean, value? as item()*)

But I'm not really a fan, if you had a 1st class option type I think you could replace all instances of

key? : type

with

key : option(type)

i.e. extend the value space to model optional entries, rather than having optional constructs in the type space (which for me feels far more conceptually complex).

in this case it dissolves the whole use case to

map:try-get() as option(item()*)

@ChristianGruen's simple construction of maybe as a record (on #1349) seems preferable in terms of not modelling invalid states.

Both constructions, though, lack a simple way for the user to process such a data type in a single operation, i.e. fold.

michaelhkay commented on August 17, 2024

It's not a rehash of #1349 which attempts to add fundamental new concepts to the data model and type system, rather it's a simple attempt to define a couple of new functions using existing capabilities.

I don't feel strongly about whether the value field should be absent or empty when found is false. Both are perfectly valid states, they are equivalent for most purposes, and I think it's a pretty arbitrary choice. I've no idea what you mean by your reference to "invalid states" here.

MarkNicholls commented on August 17, 2024

The 'value' in the key is a tangent. I think having a first-class optional data type negates the requirement for optional keys in type constructors. First-class optional data types are common, everyday things in programming languages, taking a few lines of code. Optional fields in a record (i.e. the field itself may or may not exist) are quite esoteric; they would seem to make your type space potentially much more complex than it needs to be, and I suspect they require many more than a few lines of code to implement.

An invalid state is one that doesn't really make sense; your implementation is the cross product of bool and item()*.

e.g.

{ 'found': true(), 'value': 1 }

means the value 1 is found.

{ 'found': false() }

means nothing is found.

{ 'found': false(), 'value': 1 }

means? Nothing is found and its value is 1? That is a contradiction.

{ 'found': true() }

means? Something was found, but the value is? We don't know.

the latter two examples type check, but they encode either a contradiction or an insufficient state.

many people would call these last two "invalid states".
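
To make the distinction concrete, here is a minimal Python sketch contrasting the two shapes. The class names `FoundRecord`, `Some` and `Nothing` are hypothetical and chosen for illustration; nothing here is taken from the proposal itself.

```python
from dataclasses import dataclass
from typing import Any, Union


# Flag-plus-value record: every combination of the two fields can be built.
@dataclass(frozen=True)
class FoundRecord:
    found: bool
    value: Any = None


# Type-checks, yet says "not found" while carrying a value.
contradiction = FoundRecord(found=False, value=1)


# Tagged alternative: each case carries exactly the data it needs.
@dataclass(frozen=True)
class Some:
    value: Any


@dataclass(frozen=True)
class Nothing:
    pass


Option = Union[Some, Nothing]


def describe(option: Option) -> str:
    """Only two cases exist, so only two branches are needed."""
    if isinstance(option, Some):
        return f"found {option.value!r}"
    return "not found"
```

With the tagged form there is no way to write down "not found, value 1", which is the property being argued for here.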

ChatGPT does a better job than me of justifying why it's good practice (in my opinion) to try not to do this:

Making invalid states unrepresentable is a key principle in software design, particularly in strongly-typed programming languages. Here are some reasons why it's considered good practice:

  1. Error Prevention: By designing your types and data structures in a way that invalid states cannot be represented, you prevent many classes of errors at compile time rather than at runtime. This leads to more robust and reliable code.
  2. Simplified Logic: When invalid states are unrepresentable, the logic in your code becomes simpler. You don't need to write additional checks and validations to handle impossible states, which reduces complexity and potential bugs.
  3. Improved Readability: Code that makes invalid states unrepresentable is often easier to read and understand. It clearly communicates the constraints and rules of the domain, making it easier for other developers to grasp the intended behavior.
  4. Enhanced Maintainability: With fewer edge cases and special conditions to handle, maintaining and extending the codebase becomes easier. Changes to the code are less likely to introduce new bugs.
  5. Performance Benefits: Eliminating the need for runtime checks can lead to performance improvements. The compiler can optimize the code better when it knows certain states are impossible.

So I like your construction for its simplicity, but I don't think it's ideal. @ChristianGruen's construction on the other thread shares the simplicity but has no invalid states; the Church encoding may not be to everyone's taste, but it is "minimal" and doesn't represent invalid states.

neither of these add new concepts to the data model or type system (I use the church encoded version in v3 code)

MarkNicholls commented on August 17, 2024

It's unfortunate that this conversation is now spread over 2 issues... but anyway.

The Church encoding is an implementation, not a specification. The specification (for my current library) is as follows (let's not stress over labels yet):

declare function option:Some($value as item()*) as option(item()*)
declare function option:None() as option(item()*)
declare function option:Fold($option as option(item()*), $processNone as function() as item()*, $processSome as function(item()*) as item()*) as item()*

If I can write option(item()*), then I can hide the implementation, and no one (apart from the implementor, me) needs to understand it, any more than I understand how maps or sequences are implemented. I only need to understand the interface and, to be honest, the interface for option given above is, apart from naming, pretty standard across most languages and I think pretty straightforward.

BUT it seems it isn't possible to hide an implementation completely inside a record (and its type). I'm too ignorant to comment, but it seems not.

AND 2 years of conversation seems also not worth the effort to introduce this as a first class type.

I'm not a fan of

record(found as xs:boolean, value as item()*)
for the reasons given in the chatgpt summary.

and I think there is a danger of introducing a compromised construct into the language and then baking it in, only to tie the hands of some future specifiers wanting to introduce an explicit option type.

Just to reiterate, for me, this is a nice to have, I can and do write safe/simple/readable client code (i.e. ignoring the implementation) against these functions in v3, the only issue I have is that others don't like the (optional!) type declaration.

michaelhkay commented on August 17, 2024

> It's unfortunate that this conversation is now spread over 2 issues

Indeed it is. I proposed two simple functions that could be implemented with no changes to our current machinery, and you injected ideas from theoretical computer science and from other programming languages that require a change to our language fundamentals. This isn't the way to make progress.

michaelhkay commented on August 17, 2024

Closing this issue because it has been hijacked.

dnovatchev commented on August 17, 2024

I am not sure that the issue needs to be closed.

I like the idea of map:try-get and array:try-get, and it is obvious that these functions would be immediately useful, as shown by the decade-long practice of providing the Dictionary.TryGetValue method in C#, used by millions of developers.

I would welcome a PR for these two functions.

As an aside: @michaelhkay, you sometimes scold me for using too-targeted language (or overreacting), but I think that it is you who overreacted this time. Yes, @MarkNicholls is consistently using "foreign" terminology, but he is just one participant in this discussion.

This is my feedback - to summarize: very useful proposal. Go for it!

And please, reopen the issue - maybe we can hear more people and get useful feedback.

MarkNicholls commented on August 17, 2024

@michaelhkay

again apologies, I thought this looked like a confusing duplicate.

I think

record(?value)

is preferable and the bool unnecessary.

@dnovatchev

The terminology is public domain, freely available in a myriad of publications (including, more recently, Wikipedia), and some of it is almost 100 years old. I did make an effort to present "maybe" in the other thread based on an OO construction using the visitor pattern to make it more palatable, but record(?value) will suffice.

Much of XPath looks quite functional, so it seems natural to talk about functional concepts, I'll try to use more "local" terminology.

MarkNicholls commented on August 17, 2024

ah yes "value?"...

and this is where it crosses over with posts on the other thread.

on the other thread I tried to talk about maybe (in a foreign language), and that in other languages option/maybe would be part of the general environment, in which case, you want a way to create maybes and a way to eliminate them (and then you have the full set of foundational operations).

So in other languages you would have

some:make
none:make
maybe:fold

(fold is often called a myriad of different things - don't get tripped by the labels, or is part of the pattern matching infrastructure)

I accept your "2 year cost" argument, but I would hope that it wouldn't take 2 years to introduce 10 lines of code? (Again, you'll know better than me.)

So....for me....ideally you would introduce maybe/option as something a little less adhoc than a standalone map/record (or at least not one with a wrinkle in it).

Then we seem to get sidetracked into different implementations of it, which frankly are largely irrelevant as long as they work (and are either encapsulated or at least left open in the spec). I favour Church encodings, final tagless encodings and visitors, and others don't, but that is irrelevant if it's encapsulated.

so actually all I'm really suggesting is, rather than having an explicit 'found' key you have

maybe:fold($maybe as map(*),$processNone as function() as item()*,$processSome as function(item()*) as item()*) as item()*

This is 1 or 2 lines of code; different implementations of some/none would have slightly different 1 or 2 lines of code. If maybe is ever promoted to a first-class type, then you don't actually need to change anything; the signature becomes

maybe:fold($maybe as maybe(item()*),$processNone as function() as item()*,$processSome as function(item()*) as item()*) as item()*

The corresponding signatures of try-get change to return a first-class maybe(item()*), and you've dodged the backwards-compatibility bullet.

(I do note though that if maybe existed as a data type in the language optional attributes in records would effectively be redundant, which I would have thought was a simpler model).

michaelhkay commented on August 17, 2024

> I would hope that it wouldn't take 2 years to introduce 10 lines of code?

If it changes the data model and/or type system then it will be time-consuming. It's not the size of the addition as much as the amount of controversy that it generates, and changes to fundamentals always (rightly) generate a lot of controversy.

ChristianGruen commented on August 17, 2024

Thanks for reopening the issue (I was also hoping for more discussion)!

> let ${value, found} := try-get($map, $key)
> return if ($found) then $value else "default"

…this would work well indeed.

Currently, the example above could be written as map:get($map, $key, fn { 'default' }). In the initial comment, you mentioned that the existing use cases and examples for this syntax are tenuous – but in principle, they are fairly similar to this one. I wonder whether we can think of a use case for which the syntax we have would be too restrictive?

MarkNicholls commented on August 17, 2024

> I'm wondering how well this works with map destructuring variable bindings (see issue #37).
>
> Something like:
>
> let ${value, found} := try-get($map, $key)
> return if ($found) then $value else "default"

Ah, my suggestion of dropping 'found' doesn't (it is actually Christian's code); you would need discriminated unions to be native in the language to have this (to me) pattern-match style assignment.

Maybe your suggestion is the sensible compromise in the short term, though this doesn't work with value? (what goes in value if the key doesn't exist?).

The bodge to make the destructuring work would be an additional function that maps from a maybe to a record(found, value), but it's getting ugly.

MarkNicholls commented on August 17, 2024

@michaelhkay

> > I would hope that it wouldn't take 2 years to introduce 10 lines of code?
>
> If it changes the data model and/or type system then it will be time-consuming. It's not the size of the addition as much as the amount of controversy that it generates, and changes to fundamentals always (rightly) generate a lot of controversy.

There was a similar issue in C# many years ago, where people wanted a tuple, but the language people kicked the can down the road (i.e. having pattern matched assignment and 'nice' grammar in the language).

The short term compromise was to have a 'class' that captured tuple, and then when the language caught up, the 'class' was effectively deprecated.

I was hoping such a compromise was possible, where maybe could be introduced via records (with some encapsulation and a set of standard functions; as I say, it's 10 lines of code).

dnovatchev commented on August 17, 2024

> I'm wondering how well this works with map destructuring variable bindings (see issue #37).
>
> Something like:
>
> let ${value, found} := try-get($map, $key)
> return if ($found) then $value else "default"

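A typical example of how Dictionary.TryGetValue is used in C# is producing the frequencies of characters in a string:

// requires: using System.Collections.Generic;
public static Dictionary<char, int> GetCharFrequencies(string s)
{
    var result = new Dictionary<char, int>(26);
    foreach (char c in s)
    {
        // Normalize once so the lookup key and the stored key agree.
        char key = char.ToLower(c);
        // TryGetValue leaves count at 0 when the key is absent.
        result.TryGetValue(key, out int count);
        result[key] = count + 1;
    }
    return result;
}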

ChristianGruen commented on August 17, 2024

> A typical example of how Dictionary.TryGetValue is used in C# is producing the frequencies of characters in a string:

I would probably solve it this way or another:

let $string := 'helloworld'
let $cps := string-to-codepoints($string)
return map:build(
  0x61 to 0x7A,
  char#1,
  fn($cp) { count($cps[. = $cp]) }
)

@dnovatchev How would you do it with try-get?

dnovatchev commented on August 17, 2024

> > A typical example of how Dictionary.TryGetValue is used in C# is producing the frequencies of characters in a string:
>
> I would probably solve it this way or another:
>
> let $string := 'helloworld'
> let $cps := string-to-codepoints($string)
> return map:build(
>   0x61 to 0x7A,
>   char#1,
>   fn($cp) { count($cps[. = $cp]) }
> )
>
> @dnovatchev How would you do it with try-get?

@ChristianGruen Probably something like this:

let $s := "abracadabra",
    $chars := string-to-codepoints($s),
    $update-freq := function($m as map(*), $char as xs:integer)
    {
      (: map:try-get is the proposed function; it returns a record with found and value :)
      let $result := map:try-get($m, $char)
      return
        if ($result?found)
        then $m => map:put($char, $result?value + 1)
        else $m => map:put($char, 1)
    }
return
  fold-left($chars, map{}, $update-freq)

@ChristianGruen What would be your implementation for the more general problem where the value-space for the keys is not known in advance (for example the keys are not characters of the alphabet, but could be any atomic value) ?

michaelhkay commented on August 17, 2024

> A typical example of how Dictionary.TryGetValue is used in C# is producing the frequencies of characters in a string:

I would do

characters($str) => map:build(value := fn{1}, combine := op('+'))

dnovatchev commented on August 17, 2024

> > A typical example of how Dictionary.TryGetValue is used in C# is producing the frequencies of characters in a string:
>
> I would do
>
> characters($str) => map:build(value := fn{1}, combine := op('+'))

Wow ... , I am not even trying to understand this - find it completely incomprehensible.

Too many new concepts on the same line of code...

ChristianGruen commented on August 17, 2024

> characters($str) => map:build(value := fn{1}, combine := op('+'))

Can hardly be beaten!

Another solution with group by (my previous solution created an entry for each lower-case ASCII character):

map:merge(
  for $c-group in characters($str)
  group by $c := $c-group
  return map:entry($c, count($c-group))
)

Let’s see if we find some more potential use cases for try-get.

michaelhkay commented on August 17, 2024

> find it completely unreadable

I'm surprised, it seems completely intuitive to me. For each character ch in the string, create a map entry (ch -> 1), and if it's a duplicate then combine the two entries by addition.

It's interesting though that there are useful expressions that consist entirely of new 4.0 constructs. I had the same experience with 2.0 and 3.0/3.1.
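
The build-and-combine reading described above can be spelled out step by step in Python. The helper `build` below is a hypothetical stand-in, modelled only loosely on the quoted map:build call; its parameter names are assumptions and it makes no claim about the draft function's defaults.

```python
from operator import add


def build(items, value, combine, key=lambda item: item):
    """Create one entry per item; merge the values of duplicate keys with combine."""
    result = {}
    for item in items:
        k, v = key(item), value(item)
        # First sighting stores v; later sightings fold v into the stored value.
        result[k] = combine(result[k], v) if k in result else v
    return result


# Each character maps to 1, and duplicates are combined by addition.
freq = build("abracadabra", value=lambda _: 1, combine=add)
```

Here `freq` ends up holding one count per distinct character, which is the "(ch -> 1), then add on duplicates" behaviour described in this comment.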

dnovatchev commented on August 17, 2024

> > find it completely unreadable
>
> I'm surprised, it seems completely intuitive to me. For each character ch in the string, create a map entry (ch -> 1), and if it's a duplicate then combine the two entries by addition.

Yes, but it requires the reader (me) to squeeze my brain hard in order to "get there".

And yes, a very nice solution.

ChristianGruen commented on August 17, 2024

Finally (promised), a classical XPath 3.1 solution:

let $input := 'abracadabra'
let $chars := characters($input)
return map:merge(
  for $char in distinct-values($chars)
  return map:entry($char, count($chars[. = $char]))
)

dnovatchev commented on August 17, 2024

> Finally (promised), a classical XPath 3.1 solution:
>
> let $input := 'abracadabra'
> let $chars := characters($input)
> return map:merge(
>   for $char in distinct-values($chars)
>   return map:entry($char, count($chars[. = $char]))
> )

Maybe could be improved:

  1. characters is new in XPath 4.0

  2. Performing count($chars[. = $char]) for count($chars) times is O(N^2) - then why use maps at all?

A good example why we need the set datatype.
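
The cost difference raised in point 2 can be illustrated with a small Python sketch, assuming no optimizer rewrites the filter. Both function names are hypothetical. The first rescans the whole input once per distinct character; the second makes a single pass with hash lookups.

```python
def frequencies_rescan(chars):
    """One full scan of chars per distinct character (quadratic in the worst case)."""
    return {c: sum(1 for x in chars if x == c) for c in set(chars)}


def frequencies_single_pass(chars):
    """One scan in total; each update is a hash lookup on the running counts."""
    counts = {}
    for c in chars:
        counts[c] = counts.get(c, 0) + 1
    return counts
```

Both return the same mapping; they differ only in how often the input is traversed.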

dnovatchev commented on August 17, 2024

A true XPath 3.1 solution:

let $s := "abracadabra",
    $chars := string-to-codepoints($s),
    $update-freq := function($m as map(*), $char as xs:integer)
    {
      if($m => map:contains($char))
       then $m => map:put($char, $m($char) +1)
       else $m => map:put($char, 1)
    }
 return
   fold-left($chars, map{}, $update-freq)

ChristianGruen commented on August 17, 2024

>   1. characters is new in XPath 4.0

True, thanks, it should have been string-to-codepoints($input) ! codepoints-to-string(.).

>   2. Performing count($chars[. = $char]) for count($chars) times is O(N^2)

In BaseX, a hash join is applied at runtime. But I agree that this may not apply to all implementations. When it comes to performance, I guess that the XQuery group by clause will perform best.

ChristianGruen commented on August 17, 2024

I’ve moved future comments on the dictionary use case to Slack (https://app.slack.com/client/T011VK9115Z/C01GVC3JLHE …to be hidden to the public in ~90 days).

One more thought: If we think that the current map:get and array:get functions are too complicated, we could replace the function parameter with a plain fallback value (as Java did it with Map#getOrDefault):

  • For most use cases, this would probably be sufficient and easier to read. For example, array:get([], 234, ()) would return an empty sequence instead of an error if the index is out of bounds.
  • For other use cases, there are enough alternative solutions, as the comments above exemplify.
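
As an illustration of the plain-fallback idea, here is a Python sketch of a bounds-tolerant array lookup. The function `array_get` is hypothetical; it uses 1-based positions to match XPath arrays, and an empty tuple stands in for the empty sequence.

```python
def array_get(items, position, fallback=None):
    """1-based lookup that returns fallback instead of failing when out of bounds."""
    if 1 <= position <= len(items):
        return items[position - 1]
    return fallback


out_of_bounds = array_get([], 234, ())    # fallback is returned, no error
in_bounds = array_get([10, 20], 2, ())    # the member at position 2

# Python's built-in dict offers the same shape for maps.
map_fallback = {"a": 1}.get("b", "default")
```

The trade-off is the one discussed throughout this thread: a plain fallback is easy to read, but the caller cannot tell a stored value equal to the fallback from a missing entry.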

dnovatchev commented on August 17, 2024

> I’ve moved future comments on the dictionary use case to Slack (https://app.slack.com/client/T011VK9115Z/C01GVC3JLHE …to be hidden to the public in ~90 days).
>
> One more thought: If we think that the current map:get and array:get functions are too complicated, we could replace the function parameter with a plain fallback value (as Java did it with Map#getOrDefault):
>
>   • For most use cases, this would probably be sufficient and easier to read. For example, array:get([], 234, ()) would return an empty sequence instead of an error if the index is out of bounds.
>   • For other use cases, there are enough alternative solutions, as the comments above exemplify.

This is closely related to the proposal for total maps, which have a default value for unexpected keys. A total map is a simpler and more powerful tool:

  1. No need for additional function-argument when calling many map-related functions
  2. Saves all developers a great deal of time by eliminating the need to specify an additional argument, or even to think about it.
  3. Can represent any value as a map
  4. Gives more control to the map-creator and maintainer
  5. Allows us to represent any value as an object that has properties that are common to all objects, and also its own, specific properties.

I am not sure why it was labeled as "PRG-hard", while this is trivial to implement.
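
One way to picture a map with a built-in default is the following Python sketch. The class `TotalMap` is hypothetical and only approximates the idea; it is not the semantics of the total-maps proposal.

```python
class TotalMap(dict):
    """Dict that answers every key; absent keys yield a fixed default."""

    def __init__(self, default, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default = default

    def __missing__(self, key):
        # Called by dict lookup for absent keys; nothing is inserted.
        return self.default


counts = TotalMap(0, {"a": 5})
stored = counts["a"]     # value that was stored
implied = counts["z"]    # default supplied by the map itself
```

The default lives with the map rather than with each call, so lookups need no extra argument, while `"z" in counts` still reports that the key was never stored.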

MarkNicholls commented on August 17, 2024

actually (in the context of the original post)

record(value as item()*)?

seems quite reasonable too as a return type,

It doesn't model invalid states, and it naturally fits processing the found values (you wouldn't even need an if, unless you were interested in missing values). In fact it feels a bit more XPathy than a tuple: the '?' on the type naturally leads you to the conclusion that there may or may not be a result, and there's no conditionality on the structure of the record itself, i.e. it's always the same structure.

michaelhkay commented on August 17, 2024

> record(value as item()*)?

Nice idea, will think about it.
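
A Python sketch of the record(value as item()*)? shape quoted above: the result is either a one-field record or nothing at all. The function name `try_get` and the use of `None` for both "no record" and "empty value" are assumptions made for illustration.

```python
from typing import Any, Optional


def try_get(mapping: dict, key: Any) -> Optional[dict]:
    """Return {"value": v} when key is present, otherwise None."""
    if key in mapping:
        return {"value": mapping[key]}
    return None


data = {"a": None}         # key present, but its value is "empty"
hit = try_get(data, "a")   # a record: found, with an empty value
miss = try_get(data, "b")  # no record: not found
```

The record always has the same structure, and the found/not-found distinction is carried entirely by whether a record comes back.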
