johnynek / bosatsu
A python-ish pure and total functional programming language
License: Apache License 2.0
forall f; f[a] -> f[b]
will never be inferred (currently, or possibly ever).
We also have no way to write that currently; we would need something like:
struct TypeEq(fn: forall f . f[a] -> f[b])
Then, if we have a TypeEq[a, Int], we can use it to convert any f[a] into an f[Int].
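The trick can be sketched in Python (illustrative only; none of these names exist in bosatsu): a Leibniz-style equality value carries the substitution function, and reflexivity is the only way to build one, so the substitution is always safe.

```python
class TypeEq:
    # Leibniz-style type equality: wraps a substitution from f[a] to f[b]
    def __init__(self, subst):
        self._subst = subst

    @staticmethod
    def refl():
        # reflexivity is the only constructor, so subst is always the identity
        return TypeEq(lambda fa: fa)

    def subst(self, fa):
        # convert any f[a] into an f[b]; safe because a equals b by construction
        return self._subst(fa)
```

Given a TypeEq witnessing a == Int, subst can then turn a list of a into a list of Int with no runtime cost.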
The parser needs better error messages. When we fail to parse, e.g. on a missing ), it does not give very good messages.
One interesting law: if we make a random package and then modify a random character in it, we should either continue to parse (it was a legal change) or we should point to the position where we made the change.
It is probably hard to satisfy this law, but it would be awesome if we could.
http://www.jucs.org/jucs_10_7/total_functional_programming/jucs_10_07_0751_0768_turner.pdf
This paper lays out three rules:
It seems by following this paper, we could add a lot of power to bosatsu. We recently removed ANY recursion in structs, but the example we found to be problematic was a contravariant recursion. We could (and probably should) re-add recursion in types in the covariant positions.
Part 3 is less of a priority to me, but might not be too hard; it is covered by #87.
[info] - we can parse quoted strings *** FAILED *** (200 milliseconds)
[info] TestFailedException was thrown during property evaluation.
[info] Message: failed to parse: : "\u0000" at 0 in region empty string, idx = 0 with trace: escapedString:1:1 / "\u0000":1:1 ...""
[info] Location: (ParserTest.scala:40)
[info] Occurred when passed generated values (
[info] arg0 = (,,) // 18 shrinks
[info] )
This may be a corner case of generation, and possibly a bad example due to bad (or no) shrink instances.
https://en.wikipedia.org/wiki/Structural_induction
We could implement both range and foldLeft if we allowed structural recursion: which is to say, we allow recursion in a certain case where we can prove it is total.
This may be tricky to implement and it adds considerable power to the language, but retains totality, which is very nice.
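As a sketch (in Python, not bosatsu), structural recursion restricts recursive calls to strict sub-structures of the input, which is what guarantees termination on finite data:

```python
def fold_left(items, acc, f):
    # the recursive call only sees items[1:], a strictly smaller structure,
    # so this terminates on every finite list
    if not items:
        return acc
    return fold_left(items[1:], f(acc, items[0]), f)

def range_list(n):
    # range is structural recursion on the natural number n
    return [] if n <= 0 else range_list(n - 1) + [n - 1]
```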
similar to #3 we want to give actionable feedback and hopefully point to the root cause of the issue.
One idea: during unification we could accumulate an evidence monoid, so when unification fails we can see all the points in the code that made us think the thing on the left was X but the thing on the right was Y. This should help users find the errors.
For package errors we can usually suggest remedies (add export Foo, etc.) which will solve the issue.
There are almost certainly many bugs currently.
It would be good to measure code coverage, work to get it higher, and add a lot more property check tests.
currently, def annotations are totally ignored by inference.
To use them, we need a new Expr which is a type-annotated lambda and a type-annotated expr.
Modern languages should have integrated formatting.
We are almost there, since we implement Document instances for values, but we don't have a command to spit back out cleaned code.
we have a typo and use the true branch twice here:
We are not actually translating syntax into if currently (only into matches) so we can't hit this bug now. This is more reason to set up code coverage (#67).
To fix this we should first write a failing test to prevent similar regressions. Such a test would have two different types in the if/else results; it should return the type of the true branch.
We might want something like:
struct TypeEq[a, b](subst: forall f. f[a] -> f[b])
struct Monad[f](pure: forall a. a -> f[a], bind: forall a, b. f[a] -> (a -> f[b]) -> f[b])
this is a pretty expert thing, so a low priority, but currently, we don't have syntax for this and inference wouldn't work.
https://microsoft.github.io/language-server-protocol/specification
might not be that hard if we build the compiler around the ability to do tooling. Good to think about this early rather than later.
Inside parens, newline chars should be considered whitespace in most (or all?) contexts.
Consider:
struct Foo(
  a: Bar,
  b: Quux,
  c)
The paper we implemented for type inference has no custom types.
It does have special handling for function types: I think the only special handling is that they are contravariant in input and covariant in output.
Our current defined types, I believe, as they are implemented, have every type invariant in all type parameters.
However, allowing recursive types #102 as discussed in #104 allows recursion only in covariant positions. So, to support that, we need to understand variance of types (or at least some useful lower bound on which parameters are covariant). But, once we understand variance, we want to use it in substitution in inference, and not require strict unification.
Currently we require indentation on all new blocks. It would be nice to find a parser combinator that we can reuse to allow code like:
def foo(x): x + 1
if x: 1
else: 2
match x:
  None: 0
  Some(y): y
It would be great to have:
[a, b, c, d]
be the same as:
NonEmpty(a, NonEmpty(b, NonEmpty(c, NonEmpty(d, Empty))))
It would be nice in both creation and pattern matching.
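The desugaring could look like this Python sketch, with tagged tuples standing in for the Empty and NonEmpty constructors above:

```python
def to_nonempty(items):
    # [a, b, c, d] becomes NonEmpty(a, NonEmpty(b, NonEmpty(c, NonEmpty(d, Empty))))
    result = ("Empty",)
    for item in reversed(items):
        result = ("NonEmpty", item, result)
    return result
```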
a syntax like:
operator def (a: Int) + (b: Int) -> Int:
...
could allow us to unify unary and binary operators. We might not want many custom operators, which have not been popular in most programming languages. That said, it is ugly to have exceptions to rules, and exceptions make things harder to learn.
One somewhat unspoken guiding principle of bosatsu is that implementing tooling, and understanding dependencies should be very easy to do.
This is why we have explicit imports and exports at the top of the file, and it should be possible with a very simple parser to read and correctly see the dependencies between bosatsu files. This should allow it to be very easy to integrate with build tooling, since all dependencies are declared.
This has some implications that currently, I do not think are fully fleshed out.
With a good predef, Bool is probably just a Predef enum, and Integer and String might just be external structs.
We might also consider a PositiveInt which is also closed under + and *.
for many DSLs we might want to have implicit package names, either based on directory name (a-la bazel's BUILD files) or by full file name.
This could be an option of a tool to automatically assign packages relative to a bosatsu root.
Currently the only kind of pattern match is a full list of all constructors, with each variable bound or replaced with _.
It would be nice to support:
enum List:
  Empty
  NonEmpty(head: a, tail: List[a])

x = NonEmpty(1, NonEmpty(2, Empty))

match x:
  NonEmpty(a, NonEmpty(b, _)):
    f(a, b)
  _:
    0
So, this shows two features: nested patterns, and the wildcard _ pattern. The only challenge here is totality checking, which may not be too much work actually.
An additional thing is to allow if statements:
match x:
  NonEmpty(x, _) if x > 0:
    foo(x)
  NonEmpty(x, _):
    bar(x)
  Empty:
    baz
would be nice to have:
{"a": 1, "b": c, ...}
to return a hash or sorted Dict type.
the bug in #82 was that we didn't name the type parameters, but we discovered them in the opposite order from what you would expect.
In that example, no type parameters were listed, but if we had done struct Pair(fst: a, snd: b)
the result would have been the same.
We should allow struct Pair[a, b](fst: a, snd: b)
and do a name check that all the types must be fully specified in such a case.
Should be able to do:
x = (1, 2, 3)
match x:
  (a, b, c):
    foo(a, b, c)
The two steps I've tried is sbt run
which results in No main class detected.
and sbt package
followed by java -jar core/target/scala-2.12/bosatsu-core_2.12-0.1.0-SNAPSHOT.jar
which results in
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: com/monovore/decline/Command
see the TODO here:
Some editor support for vim and emacs would be nice.
cc @non you have experience with this, right?
There are currently bazel rules that allow you to use bosatsu as a json generation system similar to https://github.com/google/jsonnet
See BUILD
in the root directory currently.
Parsing is done, but we don't yet actually support expanding for comprehensions into Expr.
It should be pretty easy to implement a for comprehension as a call to map/filter/etc...
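A minimal sketch of the target translation, using Python built-ins to stand in for the List map/filter methods:

```python
def desugar_comprehension(f, p, ys):
    # [f(x) for x in ys if p(x)]  ==  ys.filter(p).map(f)
    return list(map(f, filter(p, ys)))
```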
In a package import, Foo as Bar
gets you the ability to talk about Bar as a type, constructor function, and in a pattern match.
if you do foo = Bar
you only get Bar as a constructor. Should we allow as in the program as a kind of name binding:
struct LongNameIPlanToExport
LongNameIPlanToExport as F
This is related to #36
Similar to: https://github.com/tpolecat/tut we would compile and optionally evaluate code in markdown documents.
we could use https://github.com/atlassian/commonmark-java
It would be nice to use this to write a type-checked syntax guide.
A kind of dual of this is to support parsing markdown out of comments, and using that to produce markdown docs of the code.
to have a good implementation of a bazel rule (see #2), we need to emit the interface file for each bosatsu package.
This would include the package name, all exports with their full types, and the transitive set of external defs that need to be added.
Then a backend can error if any needed external calls are absent.
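A hypothetical shape for such an interface file (field names invented for illustration), together with the check a backend would run:

```python
interface = {
    "package": "Demo/Pkg",
    # exports with their full types
    "exports": {"inc": "Int -> Int"},
    # transitive set of external defs this package needs
    "externals": ["Demo/Pkg::nativeInc"],
}

def check_externals(iface, provided):
    # error if any needed external implementation is absent
    missing = [e for e in iface["externals"] if e not in provided]
    if missing:
        raise ValueError("missing external defs: " + ", ".join(missing))
```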
Currently, all names are top level in packages: the types in an enum, all structs, all defs.
Alternatively, we could put defs and enum values in namespaces:
enum Option:
  None
  Some(a)

  # in the Option namespace, we don't need Option#None...
  def isDefined(a):
    match a:
      None: False
      Some(_): True

some1 = Option#Some(1)
isSome = Option#isDefined(some1)

isDef = Option#isDefined
isSome2 = some1.isDef

struct Foo(a):
  def getA(f):
    Foo(a) = f
    a

myFoo = Foo("someValue")
myFoo.Foo#getA
Here we use #
as a character to access elements in an enum/struct namespace. This is different from some languages which use .
for this.
Comments welcome @snoble
see comment: #100 (comment)
Support for recursive structs like struct W(fn: W[a, b] -> a -> b) is enough to give rise to full-blown recursion via the Y combinator. I should have realized this before.
We need to add a recursion check to struct/enum definition that requires them to only reference things in scope above them excluding themselves. Currently, we do no such check, so as long as eventually the type is in scope before full type checking (after the entire package has been parsed), we can type check.
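The check itself could be as simple as this Python sketch: walk the definitions in order and require each to reference only names defined strictly before it:

```python
def check_no_recursion(defs):
    # defs: ordered list of (type name, set of type names it references)
    defined = set()
    for name, refs in defs:
        bad = refs - defined
        if bad:
            # referencing itself or a later definition is rejected
            raise ValueError(name + " references: " + ", ".join(sorted(bad)))
        defined.add(name)
```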
we aren't currently testing the bazel rules in travis. We should have a separate process for this.
we can use fastparse's ability to capture position and show a nice error message (just a window of the code pointing directly to the error).
Scheme is basically a top-level, non-nested forall type. It is a bit ugly that we have to separate this out. It may be related to #5, and we can't easily remove Scheme until we address that.
We currently have approximately three phases:
In the conversion from syntax to expression AST, (Declaration to Expr), we do this only using the package name. We should use a resolver rather than punt to a later phase. This way, type inference is pointing to fully resolved names and does not have to deal with unification issues of aliasing of the same types.
relates to #52
bazel makes it pretty easy to use its java rules, so if we compile bosatsu to java code, it should be fairly easy to get a bazel rule.
a common idiom in python is something like:
x = 0
for y in z:
  x += y
return x
This does not look so functional, but it could be. We could convert loops like this into folds, by considering x += y to be rebinding a new x to x + y, and looking at all such rebound variables in the loop. Then we could convert it to something like:
z.foldLeft(0, \y, x -> x + y)
This is not so dissimilar to the do notation in Haskell, which allows you to write imperative-looking code that still returns a value.
Similarly, we could do the same translation for map-like for comprehensions:
[f(x) for x in y]
to y.map(f)
There may be some subtlety with making these fully composable which we would want to support.
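The translation can be sketched with Python's functools.reduce standing in for foldLeft:

```python
from functools import reduce

def sum_loop(z):
    # imperative version: x += y rebinds x on every iteration
    x = 0
    for y in z:
        x = x + y
    return x

def sum_fold(z):
    # pure translation: the rebound variable becomes the fold accumulator
    return reduce(lambda x, y: x + y, z, 0)
```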
I thought about having an ADT in Evaluation, but then didn't mostly due to some ill-conceived anxiety about performance.
Instead I am just returning Any, but sometimes pattern matching on it. This is almost certainly a bad idea: we have given zero attention to performance, and without compilation it is likely to be very slow.
We could add something like:
sealed abstract class Value
object Value {
  sealed abstract class ProductValue extends Value
  case object UnitValue extends ProductValue
  case class ConsProduct(head: Value, tail: Eval[ProductValue]) extends ProductValue
  case class SumValue(variant: Int, data: Eval[Value]) extends Value
  // it would be nice to give FnValue a good equals method, which
  // we can do better than scala because we have the environment it captures and the ast.
  case class FnValue(toFunction: Value => Eval[Value]) extends Value
  case class ExternalValue(toAny: Any) extends Value
}
the approach is to do:
external def foo(a: Int, b: String) -> Int
Then at the interpretation phase we give a mapping file that is a list of (package, def, actual implementation).
This separates the ffi data from the code and allows us to have different ffi approaches in different use cases.
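A sketch of what that mapping might look like at interpretation time (Python; the package and def names are made up):

```python
# mapping from (package, def name) to an actual implementation
ffi_mapping = {
    ("Demo/Pkg", "foo"): lambda a, b: a + len(b),
}

def lookup_external(package, name):
    # the interpreter resolves external defs through this mapping,
    # keeping ffi data separate from the bosatsu source itself
    return ffi_mapping[(package, name)]
```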
related to #15
I believe that bosatsu, like the simply typed lambda calculus, is strongly normalizing (we forbid recursion, are total, but we can infer a few more types due to using the rank n algorithm of Peyton Jones et. al.)
I think this means that if we have some normalization rules no matter the order applied we always reach the same final state.
We could use this property to define equality on all identically typed pairs: if, after being fully normalized, they are structurally identical, we return True, else False.
external def eq(left: a, right: a) -> Bool
Having a free equality would be very nice. But there may be some issues I am missing with this.
One clear issue is that this would allow us to have equality on functions despite there being more than one structural way to express the same function (if by same we mean returns the same values for all inputs).
Apologies for the many claims above given without proof or reference. Some may be wrong and I'd be glad to be corrected.
currently, the only way to access items from a struct or enum is with a pattern match. This is a pain for big items.
For structs (single item enums) we should be able to do:
x = Foo(1, 2)
Foo(y, z) = x
and have that be the same as the match variant. This should only work for structs (or eventually tuples) to preserve totality.
similar to #8 we should be able to emit a fully typed AST that is reasonably easy to parse. The goal here is to make compilers easy to implement since we can fully separate type inference from code generation.
Implementation of evaluators is also much improved if we can fully know the types.
At least getting the internal representation for this seems high priority to finish the adoption of the rankn type inference #5
scala has a nice feature that you can use backticks to make any identifier legal. This is nice especially if we use bosatsu as a configuration language and we want to use structs to generate yaml/json keys. We might need a key that is not a legal name in bosatsu.
val `val` = 3
More precisely, we could have:
struct Foo(`def :` bar)
so we can support any possible needed identifier.