String Interpolation about koka HOT 34 OPEN

TimWhiting avatar TimWhiting commented on August 23, 2024 1
String Interpolation

from koka.

Comments (34)

chtenb avatar chtenb commented on August 23, 2024 1

<> is not a bad idea, since their usage in type signatures is not likely to come up in string interpolation contexts, solving the visual problem that {} has with code blocks.
However, if you were to do html or xml generation (which is a pretty common thing to want to generate), the choice of both <> and & suddenly becomes very cumbersome.

What about making it configurable per interpolator? This would allow the user to optimize for whatever kind of strings they are generating, and make this feature very helpful for embedded template languages.

TimWhiting avatar TimWhiting commented on August 23, 2024 1

With configurable delimiters you could even do lisp/scheme style quoting. :)

scheme"(eval ,(list.map(do-something)))"

chtenb avatar chtenb commented on August 23, 2024 1

With configurable delimiters it would probably be wise to have the escape character \ be fixed for all interpolators? Intuitively I think that would keep things more sane than using repeated delimiters as a means of escaping.

TimWhiting avatar TimWhiting commented on August 23, 2024 1

@kuchta By the way #533 was just merged, and Daan is planning on releasing a new version of Koka soon, so you can add named arguments when using trailing lambdas.

chtenb avatar chtenb commented on August 23, 2024 1

Interesting. This makes me think of function application in Haskell and PureScript, as you pass a bunch of primitive values into a function without using parentheses and commas. In fact, PureScript does not have special string interpolation syntax, and the function i from https://pursuit.purescript.org/packages/purescript-interpolate/5.0.2 is commonly used instead.

In your example, html is a function that takes n parameters, except using whitespace to delimit arguments instead of commas like normal Koka functions. I wonder if this idea generalizes to an alternative function call syntax + variable length parameter lists.

TimWhiting avatar TimWhiting commented on August 23, 2024 1

Whitespace separation is hard to do for general arguments, especially prior to type checking and with operators:

For example

dosomething a > b

Does this mean dosomething(a,>,b) or dosomething(a > b)?
Of course you could put parentheses around the terms but does this look better:

dosomething (a > b) (c < d)
// or 
dosomething(a > b, c < d)

I guess we could allow both, but the error messages would have to be really good to help people find where to put the parentheses they forgot that they think they didn't need. And maybe a more extended discussion on this topic should go somewhere besides the string interpolation issue. Trailing lambdas don't have this problem because they must start with fn or {, and then have a clearly delimited block or indentation scope.

For string interpolation we can get around this issue by requiring a "" between non-string adjacent parts, or because there is a clearer expectation to delimit individual interpolated parts with (), or eagerly parsing as much as can be determined to be an expression - which if we allow whitespace separated function arguments might get really confusing quickly.

TimWhiting avatar TimWhiting commented on August 23, 2024

One advantage of this approach is it doesn't do any dynamic dispatch or anything special, it is just an extension of the current implicits and static overloading. We could even allow multiple parameters and have format specifiers for number precision etc.

chtenb avatar chtenb commented on August 23, 2024

In response to the starting character vs delimiter, when choosing { as delimiter you have a character that doesn't occur naturally in strings much. I agree that most of the starting character candidates are more common.
I personally have more experience with C# and Python languages, which both use { as delimiters and no starting character. Both these languages require you to double them like {{ when you want a literal brace. I have good experiences with this design, except when generating C# code, where it is a somewhat annoying choice, since C# syntax involves many braces.
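
A small Python sketch of the brace-doubling rule described above. The variable names and the generated C#-like snippet are made up for illustration; only the f-string behaviour itself is standard Python.

```python
# Python f-strings use { } as interpolation delimiters with no start character.
name = "koka"

# Plain interpolation.
greeting = f"hello {name}"

# A literal brace is written by doubling it.
literal = f"{{not interpolated}} but {name} is"

# Generating brace-heavy code (a C#-like snippet here) needs many doubled braces.
generated = f"class {name.capitalize()} {{ int Value {{ get; set; }} }}"

print(greeting)
print(literal)
print(generated)
```

The last line shows the annoyance mentioned for code generation: every brace of the target language has to be written twice.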

TimWhiting avatar TimWhiting commented on August 23, 2024

I like the idea of no starting character, and I do think braces are a good choice, but I wonder if we could still have a shorter ` for times where you are just using an identifier. \ doesn't make sense in those situations since \n could either mean a newline or interpolate the variable with the name n.

Here is what that might look like:

val err = Error("problem")
"Result: {match err { Ok() -> ""; Error(err) -> "Error! `err"}}"

With proper syntax highlighting or maybe indenting the whole match it might look good, but it seems a bit strange to me especially since blocks in Koka use } so many interpolated expressions might end in }} which almost seems like an escape rather than an end of block and then escape from interpolation. Simple expressions like abc + 1 might look better.

chtenb avatar chtenb commented on August 23, 2024

I see what you're getting at. In the context of string interpolation you are more likely to want to put everything on a single line, increasing the chance that braces are needed to delimit blocks. That indeed may conflict visually with the interpolation syntax. A small mitigation might be to escape { using a backslash \{ instead of {{, such that the end of your example }} does not look like an escaped closing brace anymore.

kuchta avatar kuchta commented on August 23, 2024

Using escaped braces like \{ and \} or prefix characters solves just the syntactical problem, but not the visual one. What about using < and > for that? People already somehow associate them with markup, and they are the only readily accessible symbols (on the keyboard) apart from those heavily used ones ({}, (), []) which have some "pairable" characteristics.

TimWhiting avatar TimWhiting commented on August 23, 2024

val err = Error("problem")
"Result: <match err { Ok() -> ""; Error(err) -> "Error! `err"}>"

val err = Error("problem")
"Result: |match err { Ok() -> ""; Error(err) -> "Error! `err"}|"

The problem with <> is that it looks like markup, but feels kind of reversed.

| maybe works better and is also not used a bunch in strings, but it might have issues with parsing, since it could appear in the middle of an expression as an or, and it is not pairable.

For simple identifiers an & might look good. Reminds me of taking a reference to something, which is kind of similar.

val err = Error("problem")
"Result: <match err { Ok() -> ""; Error(err) -> "Error! &err"}>"

And using it as a start character might not be too bad.

val err = Error("problem")
"Result: &(match err { Ok() -> ""; Error(err) -> "Error! &err"})"

kuchta avatar kuchta commented on August 23, 2024

Well, I think the fewer characters used, the better. () are also quite heavily used in such contexts. <> feels like some substitution, like the <body> and <expr> used in the documentation, so if only a value is used, it resembles the language used there. For expressions it could look outlandish at first, but I think that's just because we are not used to it...

TimWhiting avatar TimWhiting commented on August 23, 2024

@chtenb I like the idea, maybe we have a definition like:

// delimit-start, delimit-end, simple-identifier 
pub interpolator debug [<,>,$] 
// you can omit allowing simple identifiers, and you can have multiple characters for starting delimiter
pub interpolator html [${,}] 

Of course this means that these definitions need to be at the top of files like infix declarations since we might want to resolve this prior to parsing. Though for infix operators we transform into an intermediate representation and resolve after parsing.

@kuchta @chtenb
Curious what you think about inverting the angle brackets. I kind of think it might be pretty nice. (It's like saying insert >here<), and it stands out.

val err = Error("problem")
"Result: >match err { Ok() -> ""; Error(err) -> "Error! &err"}<"

Of course then what would a DSL for HTML look like in Koka?

html">div(inner=html">div(text="Hi")<")<"

or maybe a bit better formatted.

html">div(
  html">div(
    text="Hi"
   )<"
)<"

At that point we almost need to auto-infer which prefix tag to use for plain strings "" when a string is passed into a function needing a particular type:
e.g.

fun div(inner: maybe<html-builder> = Nothing, text:string = "")

html">div(
  ">div(
    text="Hi"
   )<"
)<"

// Non inverted
html"<div(
  "<div(
    text="Hi"
   )>"
)>"

Obviously this looks really nice the non-inverted way with html, but I personally think it looks really nice the other way for more general expressions.

Additionally you can argue that you really don't want to be doing this with strings anyways for HTML. (There is no static string in those examples, just nested div calls.) So you can still just omit the string interpolation and have the following API, which would work for generating strings or ASTs.

div(
  div(
    text="Hi"
  ))

The one difficulty about this API is that you really don't want to build up the subpieces of the tree and then have a bunch of string appends. You'd rather generate from the outside in and append directly to a string-builder. You could create an intermediate AST, but that wastes time.

Or the vector api in #527 sort of supports the above already via the spread api:
html["div", ...html["div", "Hi"]]
Where the add-item adds a tag and add-items could add children to the last tag.

kuchta avatar kuchta commented on August 23, 2024

I quite often think of koka as one of the best languages for the web, because it has compatible syntax that allows dashes in identifiers. I imagine a future where I can write JSX-like expressions in it. Writing html in a template language (even tagged strings) never felt very pleasant to me, and it would miss a lot of opportunities the React world already realized...

kuchta avatar kuchta commented on August 23, 2024

Yes, I think using configurable delimiters is probably the best way to go about it. I was also thinking about inverted parentheses 🙌🏻, but they could probably be even more visually distracting, since we are used to interpreting them in a certain way (I mean in a context where there would also be non-inverted ones).

kuchta avatar kuchta commented on August 23, 2024

Regarding current syntax, it should be even possible to write something like this, right?

fun some-component(attr1="default", attr2="default", attr3="default", children={})
   div(attr1=attr1, attr2=attr2)
      span(attr3=attr3)
          children()
      other-component()

With all nested blocks treated as trailing lambdas.

IMHO this is vastly superior syntax to something like JSX. It maps quite nicely to HTML, but doesn't suffer from having to close the tags...

I'm just not sure whether named parameters have to come last, and if so, what about the trailing lambda? The documentation doesn't show how to write a function consuming one...

kuchta avatar kuchta commented on August 23, 2024

@TimWhiting: "You could create an intermediate AST, but that wastes time." - Not if that's what you'll need to update the DOM on the client...

TimWhiting avatar TimWhiting commented on August 23, 2024

@kuchta Of course, I was thinking about string interpolation specific to this issue, not saying that AST is bad, but one heavily used use case for advanced string interpolation would be a server side renderer, and especially if you do not use the ast in any way and just plan to convert it to a string, it seems a bit wasteful.

Yes, your syntax with trailing lambdas would be a great way to build an AST; unfortunately we need #491. Currently named parameters have to come last, including after trailing lambdas (since they just get desugared, I think, to the last parameter). With the change in the PR you could make the trailing lambda be a positional argument. Alternatively we could adjust the desugarer to put trailing lambdas after all positional arguments, but before named ones.

kuchta avatar kuchta commented on August 23, 2024

@TimWhiting Exactly, and those advanced server-side renderers might want to have features like React Server Components (RSC), for which some form of (build-time) code transformation (compilation) would probably be needed anyway.

Aren't trailing lambdas always a positional argument if, as you say, they must come before named ones? But then they can't have a default value. That's unfortunate...

TimWhiting avatar TimWhiting commented on August 23, 2024

Back to the issue though: I realized a major flaw with allowing user configurable interpolation delimiters.

Due to nested strings / interpolation, you have to resolve this at lexing time, otherwise you cannot find the end of the string! This means we would not be able to lex & parse in parallel, and would need to lex just the imports, later lexing the rest of the body after we know the delimiters to use for any prefixed string. This is not only complex, but a lot more work than the original proposal which would be able to be desugared directly in the parser.
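
A toy Python sketch of why the delimiters must be known at lexing time. Everything here (the function names, the simplified grammar, the sample source line) is hypothetical and only meant to show that the same characters end the string at different positions depending on the configured delimiters.

```python
def find_string_end(src: str, start: int, open_d: str, close_d: str) -> int:
    """Return the index of the quote closing the string literal that opens at src[start].

    Inside a string, open_d starts an interpolated expression; inside an
    expression, a quote starts a nested string and close_d ends the expression.
    """
    assert src[start] == '"'
    i = start + 1
    while i < len(src):
        ch = src[i]
        if ch == "\\":                     # fixed escape character: skip next char
            i += 2
        elif ch == '"':                    # unescaped quote in string mode ends the literal
            return i
        elif src.startswith(open_d, i):    # enter an interpolated expression
            i = skip_expression(src, i + len(open_d), open_d, close_d)
        else:
            i += 1
    raise ValueError("unterminated string")


def skip_expression(src: str, i: int, open_d: str, close_d: str) -> int:
    """Skip an interpolated expression and return the index just past close_d."""
    while i < len(src):
        if src[i] == '"':                  # nested string literal inside the expression
            i = find_string_end(src, i, open_d, close_d) + 1
        elif src.startswith(close_d, i):
            return i + len(close_d)
        else:
            i += 1
    raise ValueError("unterminated interpolation")


src = '"a <f("b")> c" rest'
end_angle = find_string_end(src, 0, "<", ">")   # <...> is an interpolation
end_brace = find_string_end(src, 0, "{", "}")   # <...> is just text
print(end_angle, end_brace)
```

With < and > as delimiters the literal spans the whole quoted text; with { and } it ends at the first inner quote, so the lexer cannot tokenize the file before it knows which interpolator definition applies.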

As much as I would like to see user configurable delimiters, it seems like at least for now we would need to settle on what to use, though ultimately the decision rests with Daan. It seems like most of us are interested in trying out <> and maybe a ` for identifiers? Though with good colored highlighting we might be open to {}. And ` for escaping delimiters or the identifier interpolator.

Daan has more important issues he would like to work on first I think (specifically a robust async library, http/s tcp and other I/O).

chtenb avatar chtenb commented on August 23, 2024

Due to nested strings / interpolation, you have to resolve this at lexing time, otherwise you cannot find the end of the string!

Yeah, not surprising :) The grammar essentially becomes configurable using a language construct.

Perhaps instead of ` the & could also be considered. To me & makes the interpolation visually easier to read, and the reminiscence of taking a reference is indeed a nice coincidence. I don't think & comes up more often as a literal in interpolated strings than backticks (outside html).

kuchta avatar kuchta commented on August 23, 2024

I thought it wouldn't be so easy, since practically nobody is using it. I will leave here some prior art that led me to angle brackets. Unix man page syntax was probably the first place where I encountered them, and to this day most (not just) unix commands use them as a placeholder for substitution of required arguments.

kuchta avatar kuchta commented on August 23, 2024

@TimWhiting Wow, I'm really looking forward to it. 🤗 Yesterday I found out that my koka installation is quite outdated, since the homebrew channel is probably no longer maintained...

TimWhiting avatar TimWhiting commented on August 23, 2024

So here is a radical idea: Just don't use delimiters (if we have a delimiter, why not use the normal delimiter "). And then an interpolation is just a sequence of expressions beginning with a tagged string; you can have spaces or not.

f"Result: " match err { Ok() -> ""; Error(err) -> "Error! &err" } ""

html"<" div(
  html"<" div(
    text="Hi"
   ) ">"
) ">"

scheme"(eval " list.map(do-something) ")"

Maybe an auto-formatter with some basic rules could make this look nice.

I think it would still be good to have a rule for desugaring that nested strings in interpolation inherit the same tag as their parent, unless specified otherwise.

html"<" div(
  "<" div(
    text="Hi"
   ) ">"
) ">"

I realize the html example isn't necessarily the best example, especially since I just realized I messed up the syntax anyways, but it illustrates the point still, and gives something to consider when talking about indentation / formatting.

This is not a totally crazy idea. Dart has 'adjacent string literals', which allow you to split string literals onto multiple lines for better readability and to avoid super long lines; they are basically implicit concatenation. Dart differs in the fact that it also has 'normal' interpolation.
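
Python has the same adjacent-literal feature, so it can serve as a quick runnable illustration (the variable names below are made up). Note that Python, like Dart, only concatenates adjacent literals; it does not allow arbitrary expressions between them, which is where the idea above goes further.

```python
# Adjacent string literals are concatenated implicitly, with or without line breaks.
message = (
    "Result: "
    "everything "
    "is fine"
)

# An f-string can sit next to plain literals, so "normal" interpolation
# and adjacency coexist.
err = "problem"
report = "Result: " f"Error! {err}" "."

print(message)
print(report)
```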

I'll clarify that I'd still like to see "an identifier &ident is cool" for simple concatenation.

kuchta avatar kuchta commented on August 23, 2024

Why not use join right away? 🙂

[ "Result: " match err { Ok() -> ""; Error(err) -> "Error! &err" } ">" ].join/concat/...

But I like it. It's general and minimal, koka style 🕺🏼
BTW, making commas at least optional would also be great. IMHO they are superfluous most of the time, and if not, there are always parentheses to the rescue 🛟 Or maybe I'm missing something, but they are often a source of problems, probably due to their low visibility. One reason less to argue about whether there should be trailing comma variants or not 🙂

TimWhiting avatar TimWhiting commented on August 23, 2024

The main difference is that it is not an n parameter function, so you don't have to worry about unification with different sized lambdas; instead it desugars to n function calls with an interpolate-begin, interpolate-value, interpolate-string and interpolate-end, or whatever the names end up being. Ideally most of them get inlined due to being simple. This feels almost akin to C's varargs interface but less manual and desugared at the application site into n function calls (which can use overloading of functions for type determination) instead of reflecting inside the function in a type unsafe way. (For a varargs example see: https://stackoverflow.com/questions/15784729/an-example-of-use-of-varargs-in-c)
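
A rough Python sketch of what that desugaring could look like. The four function names follow the placeholder names above, the list-of-strings builder is an assumption, and Python has no static overloading, so the value step here simply calls str.

```python
# Hypothetical builder API; the names mirror the placeholders in this comment.
def interpolate_begin():
    return []                      # a list of parts stands in for a string builder

def interpolate_string(sb, s):
    sb.append(s)                   # static string segment, appended as-is
    return sb

def interpolate_value(sb, value):
    sb.append(str(value))          # in Koka this step would be resolved by overloading
    return sb

def interpolate_end(sb):
    return "".join(sb)             # finish the builder and produce the final string

# An expression like  f"Total: " price " for " count " items"
# would be rewritten at the call site into a chain of n calls:
price, count = 9.5, 2
sb = interpolate_begin()
sb = interpolate_string(sb, "Total: ")
sb = interpolate_value(sb, price)
sb = interpolate_string(sb, " for ")
sb = interpolate_value(sb, count)
sb = interpolate_string(sb, " items")
result = interpolate_end(sb)
print(result)
```

No call ever sees a variable-length argument list; each segment becomes its own call, which is the contrast with varargs drawn above.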

Koka already has parameters separated by whitespace (trailing lambda arguments). But they are clearly delimited by indentation and the fn keyword or braces for an anonymous function with no arguments {}.
For interpolation I don't mind omitting commas, but in general I find they make things more readable.

@kuchta
Part of my design was to introduce intermediate builders to make this more efficient than just building a list and then joining.

By the way:

[ "Result: " match err { Ok() -> ""; Error(err) -> "Error! &err" } ">" ].join/concat/...

Would not be possible for the general ?debug example at the top where you mix types, it would be weird for arrays to allow mixed types like this in just special situations.

This is how it would look with using " as our 'non-delimiter'.

debug"Result: " err "!"

fun general/debug/string-interp-value(sb: string-builder, value: a, ?debug: (a) -> string): string-builder
  sb ++ ?debug(value)
  
fun error/debug/string-interp-value(sb: string-builder, value: error<a>): string-builder
  match value
    Ok -> sb
    Error(err) -> sb ++ "Error! &err"

Of course if two overloads could match we might want some way of distinguishing which one to use. Either we need #531 or we could allow something strange.

debug"Result: " err>general "!"
debug"Result: " general>err "!"
debug"Result: " general/(err) "!"

kuchta avatar kuchta commented on August 23, 2024

@TimWhiting I don't know if I would call it argument separation by whitespace if there could be just one trailing argument, but making them optional would leave that decision to the author...

I see where you are heading, Tim. Yes, it definitely has its uses...

TimWhiting avatar TimWhiting commented on August 23, 2024

You can actually have multiple trailing arguments:

while { n > 1 } 
  ...

gets desugared to

while(fn() n > 1, fn() ...)

Or slightly elongated, and with an explicit fn:

fun main()
  var n := 0
  while { 
    n > 1
    } fn()
      print("Enter a number: ")
      n := n + 1

Or

fun main()
  var n := 0
  while { 
    n > 1
    } {
      print("Enter a number: ")
      n := n + 1
    }
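
The same desugaring idea can be mimicked in Python, where a loop construct becomes an ordinary function taking two zero-argument functions. The name while_ and the state dictionary are illustrative only.

```python
def while_(predicate, action):
    """Ordinary function standing in for a `while` that receives two lambdas."""
    while predicate():
        action()

state = {"n": 0}
seen = []

def step():
    seen.append(state["n"])        # record the current value
    state["n"] += 1                # then advance the counter

# Corresponds to: while(fn() n < 3, fn() ...)
while_(lambda: state["n"] < 3, step)
print(seen)
```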

kuchta avatar kuchta commented on August 23, 2024

Oh, true... You are right. But wouldn't it then be more consistent to allow even non-trailing arguments to be also separated by whitespace?

All separated by whitespace, just trailing arguments delimited by indentation, non-trailing by parentheses...

chtenb avatar chtenb commented on August 23, 2024

This conversation leads me to another possible way to think about it. We'd like interpolation to be flexible, both syntactically and semantically, but we also want to keep the language grammar simple and be able to desugar it early on in the compilation process.
This makes me think of a macro system. I'm not a fan of arbitrary textual rewrite macros, like the C preprocessor, but perhaps there is some middle ground where a macro would act on expression tokens instead of raw text or something.
I haven't fully thought this out, but it might be an interesting angle to investigate. Maybe this would unify a broader set of desugaring features under a single umbrella.

chtenb avatar chtenb commented on August 23, 2024

Something we haven't discussed in the context of string interpolation is formatting, where you specify the formatting of arguments via a format specifier. For an example, see https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated
For a description of the format-string minilanguage in C#, see https://learn.microsoft.com/en-us/dotnet/standard/base-types/formatting-types

TimWhiting avatar TimWhiting commented on August 23, 2024

I think formatting is the easy bit.

fmt"Result: " err "!"

value struct fmt-prec<a>
    v: a
    precision: int

fun fmt/string-interp-value(sb: string-builder, value: a, ?show: (a) -> string): string-builder
  sb ++ ?show(value)

fun prec/fmt/string-interp-value(sb: string-builder, value: fmt-prec<float64>): string-builder
  sb ++ value.v.show(precision=value.precision)

fun prec(v: float64, precision: int): fmt-prec<float64>
  Fmt-prec(v,precision)

"Here is a precise " d.prec(10) " floating point value"

Since you can overload the string-interp-value function and Koka picks the one that requires the fewest implicits, it will use the formatter for precision. I know, not as short as other solutions, but arguably more developer friendly and discoverable due to autocompletion and hovering / documentation.
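
For readers who want to run the idea, here is a loose Python analogue. It uses functools.singledispatch, which dispatches at runtime on the first argument, as a stand-in for Koka's static overload resolution; the Prec class and function names are assumptions modelled on the snippet above.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class Prec:
    """Wrapper that carries a float together with the requested precision."""
    v: float
    precision: int


@singledispatch
def string_interp_value(value, parts):
    parts.append(str(value))                            # general case: plain show
    return parts


@string_interp_value.register(Prec)
def _(value, parts):
    parts.append(f"{value.v:.{value.precision}f}")      # precision-aware formatter
    return parts


# Roughly:  fmt"Here is a precise " d.prec(3) " floating point value"
d = 3.14159
parts = ["Here is a precise "]
string_interp_value(Prec(d, 3), parts)
parts.append(" floating point value")
text = "".join(parts)
print(text)
```

Wrapping the value selects the more specific formatter, while unwrapped values fall through to the general one.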

As far as metaprogramming, let's discuss that in a new issue #536

kuchta avatar kuchta commented on August 23, 2024

dosomething (a > b) (c < d)
// or 
dosomething(a > b, c < d)

@TimWhiting I haven't commented on this, because everything is already said in the next paragraph, except maybe one thing. If you put it this way, the second example is definitely more natural, but being able to use both syntaxes would be great for a DSL like a shell, which I'm quite interested in....
