vrtbl / passerine

A small extensible programming language designed for concise expression with little code.

Home Page: https://passerine.io

License: MIT License

Rust 100.00%
programming-language passerine compiler interpreter vm macros

passerine's Introduction

Note: Isaac is taking a two-year break from technology (announcement and explanation coming soon), and will no longer be developing Passerine. The project has been handed over to Ben Siraphob, who will continue maintaining the project and its community while Isaac is away.

The Passerine Programming Language

Made with ♡ by Isaac Clayton and the Passerine Community – a cornerstone of the Veritable Computation Initiative.



Why Passerine?

Passerine is a small, concise, extensible functional scripting language, powered by a VM written in Rust. For a taste of the language, see the Overview linked below.

Passerine has roots in Scheme and ML-flavored languages: it's the culmination of everything I expect from a programming language, including the desire to keep everything as minimalistic and concise as possible. At its core, Passerine is lambda-calculus with pattern-matching, structural types, fiber-based concurrency, and syntactic extension.

Who started this?

Passerine was started by Isaac Clayton at the end of 2019. In August of 2022, Isaac handed over the maintenance of Passerine and its community to his friend Ben Siraphob.

A number of people have offered feedback and suggestions from time to time. Huge thanks to Raul, Hamza, Lúcás, Anton, Yasser, Shaw†, Ben, Plecra, IFcoltransG, Jack, Xal, Zack, Mahmoud and many others!

Shaw is the developer of MiniVM, a portable cross-platform runtime. The eventual goal is to retire Passerine's current virtual machine, and adopt MiniVM Assembly (VASM), then later LLVM IR, as first-class targets of the compiler.

An Overview

We've recently moved the Overview to a separate website, the Passerine Codex, which can be read online.

Note: Passerine is a work in progress: features mentioned in this overview may be unimplemented or subject to change.

FAQ

Q: Is Passerine ready for production use?

A: Not yet! Passerine is still in early stages of development, with frequent breaking changes.

Q: Wait, why is the compiler broken on master?

A: In July of 2021, Isaac started a small PR to refactor part of the language, paving the way to integrate a typechecker. The PR was originally aimed at cleaning up a portion of the codebase to make this integration easier.

Making this change required a small change to how the parser parsed expressions. This, in turn, required modifying the lexer. Passerine's old lexer was very bad, so instead of modifying it, I rewrote it from scratch. This changed the output of the lexer significantly, so I added an additional read pass to prepare the input for the macro system (which we had decided to modify as well). Because the output of the read pass differed greatly from what the parser expected, the parser had to be significantly rewritten as well.

Around the same time, it was decided to transition towards using MiniVM as a backend, given its high performance and ease of integration. This required modifying the existing VM to more closely match how MiniVM worked, which required the code generation pass to be modified as well.

Throughout this entire process, a little refactor that was meant to take a week turned into an entire rewrite of the codebase, taking over a year (and counting). Once all passes were finally complete, we put everything together. It was quickly discovered, however, that none of the passes matched, and nothing was compiling correctly anymore.

It was now March of 2022, and the product of 8 months of work was a broken compiler, each pass working individually (for the most part), but nothing working together. Around this time Zack and Mahmoud started working on the refactor, and we were able to write property-based tests to fuzz the front-half of the compiler. Writing property-based tests for the back-half of the compiler is ongoing work.

In July of 2022, a year after this quick PR had started, it was nowhere close to being ready to merge. Part of this was out of fear of merging something that wasn't ready yet; part was out of not having enough time to push the PR through to completion due to school and work.

At the end of July, we decided to bite the bullet, break the build on main, and merge big-refactor. This refactor is far from complete, but now that this large PR is no longer blocking prospective contributors, we invite them to submit patches to help finish building this next iteration of Passerine.

The largest takeaway, in my opinion, is to scope changes and build features incrementally. This big-refactor-turned-massive-rewrite is the exact opposite of a productive approach. Don't let refactors turn into rewrites!

Q: Is Passerine statically typed?

A: Currently Passerine is strongly and dynamically¹ typed (technically structurally typed). This is partially out of necessity – types are defined by patterns, and patterns can carry where predicates. However, I've been doing a lot of research into Hindley-Milner type systems, and the various extensions that can be applied to them.

I'm working towards making a compile-time type-checker for the language, based on Hindley-Milner type inference. With this system in place, I can make some assumptions to speed up the interpreter further and perhaps monomorphize/generate LLVM IR / WASM.

This type checker is actually the target of the next release, so stay tuned!

Q: What about algebraic effects and kind-based macros?

A: I'm interested in eventually adding both of these to the language, but first I need to implement a nice type-checker and do some more research. Algebraic effects would fill the design space of fibers, and kind-based macros would provide a more solid base for Passerine's macro system. Got any fresh language features you think would complement Passerine's design philosophy? Reach out!

Q: What is vaporization memory management?

A: When I was first designing Passerine, I was big into automatic compile-time memory management. Currently, there are a few ways to do this: from Rust's borrow-checker, to µ-Mitten's Proust ASAP, to Koka's Perceus, there are a lot of new and exciting ways to approach this problem.

Vaporization is an automatic memory management system that allows for Functional but in Place style programming. For vaporization to work, three invariants must hold:

  1. All function parameters are passed by value via a copy-on-write reference. This means that only the lifetimes of returned objects need to be preserved; all others are deleted when they go out of scope.
  2. A form of SSA is performed, where the last usage of any value is not a copy of that value.
  3. All closure references are immutable copies of a value. These copies may be reference-counted in an acyclical manner.

With these invariants in place, vaporization ensures two things:

  1. Values are only alive where they are still useful.
  2. Code may be written in a functional style, but all mutations occur in-place as per rule 2.

What's most interesting is that this system requires minimal interference from the compiler when used in conjunction with a VM. All the compiler has to do is annotate the last usage of each variable; the rest can be done automatically and very efficiently at runtime.
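
To make the copy-on-write half of this concrete, here is a tiny Rust sketch using Rc::make_mut — not Passerine's implementation, just an illustration of "mutate in place when this is the only live reference, otherwise clone first":

use std::rc::Rc;

// Rc::make_mut clones the vector only if other references are still alive;
// otherwise it mutates in place. Vaporization aims for the same effect, with
// the compiler marking last uses so the clone can usually be skipped.
fn push_42(mut list: Rc<Vec<i32>>) -> Rc<Vec<i32>> {
    Rc::make_mut(&mut list).push(42);
    list
}

fn main() {
    let unique = Rc::new(vec![1, 2, 3]);
    let updated = push_42(unique);         // sole owner: mutated in place
    println!("{:?}", updated);             // [1, 2, 3, 42]

    let shared = Rc::new(vec![1, 2, 3]);
    let keep = Rc::clone(&shared);
    let updated = push_42(shared);         // still shared: cloned before push
    println!("{:?} {:?}", keep, updated);  // [1, 2, 3] [1, 2, 3, 42]
}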

Why not use this? Mainly because of rule 3: 'closure references are immutable'. Passerine is pass-by-value, but currently allows mutation in the current scope a la let-style redefinition. But this is subject to change; and once it does, it's vaporization all the way, baby!

Q: Aren't there already enough programming languages?

A: Frankly, I think we've barely scratched the surface of programming language design. To say that programming language design is saturated and at a local maximum is to not understand the nature of software development. Passerine is largely a test as to whether I can build a modern compiler pipeline. But what I'm even more interested in is the tooling that surrounds development environments.

Case in point: text-based entry for programming languages has been around forever because it's fast. However, it's not always semantically correct. The number of correct programs is infinitely smaller than the number of possible text files. Yet it's still possible to make text-based entry systems that ensure semantic correctness while encouraging exploration. In the future, we need to develop new tools that more closely blur the line between language and environment. Pharo is a step in the right direction, as are Unison and similar efforts.

I'd like to focus more on this in the future. An interesting project would be an editor/environment like Pharo/Unison for a small minimal language, like Scheme, or perhaps even Passerine.

Installation

Passerine is still very much a work in progress. We've done a lot, but there's still so much more to do!

For you pioneers out there, the best way to get a feel for Passerine is to install Aspen¹, Passerine's package manager and CLI. If you use a *nix-style² system, run³⁴:

cargo install aspen

This will always install the latest stable version of Passerine. To get a feel for the current state of development hell, check out a local copy for your own enjoyment:

git clone https://github.com/vrtbl/passerine
  1. If you're having trouble getting started, reach out on the community Discord server.
  2. Tested on Arch (btw) and macOS.
  3. Now tested on Windows™!
  4. In the future, we plan to distribute prebuilt binaries, but for now, a Rust install is required.

Contributing

Contributions are welcome! Read our Contribution Guide and join the Discord server to get started!

passerine's People

Contributors

fuzzypixelz, hhhapz, julienbe, kethku, kozova1, mathis2003, nivpgir, plecra, shawsumma, slightknack, threecgreen, zack466


passerine's Issues

Abstract Graph Rewriting and Reduction as the Basis of the Compilation Pipeline

Can a graph/term rewriting/reduction system be implemented such that every stage of the pipeline is driven by rewriting/reduction rules? That is: code → transformed code → AST → lowered AST → optimised IL/IR → machine code, with graph/term rewriting/reduction rules applied at each step.

Vaporization vs Perceus?

I was in the middle of working on a reimplementation of Koka's Perceus when I stumbled upon this project. Could you speak a little bit about the differences between vaporization and Perceus? Specifically, what values are reference counted in vaporization? Are the function params that are "passed by value via a copy-on-write reference" reference counted (does that CoW reference include a refcount)? I think the gist of what I'm getting at is: does vaporization shift more of the work of memory management to the runtime, compared to Perceus, in terms of these CoW references?

Overall this project looks super awesome! Really cool to see such an awesome project at the crossroads between systems programming and functional language design. Also, thanks for pointing me toward the As Static As Possible thesis. I'll definitely be reading that soon.

Small error in the README?

Shaw pointed me to this language.
And when reading the README, I got puzzled by this:

linear = m b x -> b + m * x
linear 2 3 5
-- evaluates to 13

The explanation that follows makes sense, though — but it seems to use the arguments in a different order.

Maybe I'm mistaken.
If I'm not, I could do the PR to fix this.

WASM

Just submitting my interest in having the ability to embed Passerine in a WASM game written in Rust.

Commit Convention?

Right now, commit messages are a bit all over the place. I think we should instate something along the lines of conventional commits: https://www.conventionalcommits.org/en/v1.0.0/.

The downside is that I'm not sure conventional commits are the convention I want to adopt; I've never personally been a fan of them, and I'm only considering them after a suggestion from a friend.

Some questions regarding suitability for embedded systems.

Context

@slightknack I run an R&D studio based in the UK providing services for projects and clients involving Robotics, Automation, IoT and ML. Many of the projects that my company provides services for are hard realtime, i.e., "if a switch isn't toggled at the right moment, people could get hurt" kind of realtime. We use C++ for everything embedded, and Python and more recently Go for gui/application/user-facing code. I'm aware that most people in the industry would suggest that I stick to C/C++/Rust because of performance, closeness to the metal, control, safety, etc. While these are nevertheless important, in our experience they aren't even in the top 5 pain points. Instead it is lack of joy while programming, lack of support from the compilers, too much friction from our tools, requiring too much assistance from the developer, creature-comfort features that obscure runtime behavior predictability, etc. The immediate reply I can foresee is that we're probably doing something wrong, or that if we use X or do Y then we can work around these issues. There have been innumerable moments where I have thought about giving up and just changing careers. But I love robotics too much to ever be able to do that. In any case, I say all of this to point out that my frustrations have been making me feel that I might have to write a programming language and develop tools for what my studio needs that best serve our clients. In the process of research, I found Passerine and was struck by its simplicity. In fact, if I had to pick my favorite feature, I'd choose the macro system. It's just beautiful, and something I have never seen before. I'm sure other programming languages might have done it, but in my experience, I've never seen something this beautiful.

Preface

The aesthetic of a language is an important thing that many tend to brush aside as subjective and superficial. I'd counter that everyone would agree on certain qualities/characteristics of beautiful code. One of the reasons we picked Go was that it was small, in that we could hold the entire language and its specification in our head while working on solving problems. Having to think about which tool to use and micromanaging its details while implementing a solution is a lot of wasted cognitive energy which is better spent on the problem. Unfortunately, as much as we'd wish, we can't use Go on our embedded platforms with a few hundred kilobytes of memory, and we have no control or influence on the runtime behavior of programs written in Go or many times even in C++ (without going low-level, at which point we might as well use C). This is perhaps the most important aspect of the kind of work that we do - predictability of runtime characteristics of a program. I want to make it clear that these are some things I have been thinking about and wanted to ask you. In no way is this a criticism or a request of sorts to influence the design of your language. I would like to understand what and how you would approach an issue like the one I'm going to communicate if my company were to hypothetically use Passerine in a hard realtime context.

Issue

It's been my observation that, from a hard realtime standpoint, the most critical aspects of programs are the repeatability, responsiveness and worst-case execution times of critical sections. Most of the development energy is spent optimizing for these aspects so that runtime behavior analysis is possible and quantifiable before the program is even executed. The reason why C is still used is, primarily, not that it is low-level and provides direct access to the metal, but that it is transparent. Let me explain. Features, whether provided at compile time or runtime, encapsulate some behavior, because the code that demonstrates the effects of a feature needs to run at some point. Any code that's not in the text of the program contributes uncertainty about its runtime characteristics. This is why such systems take a long time and are difficult to develop: they need to be thoroughly tested and verified against a set of requirements on runtime behavior. In addition, specific patterns of language usage or idioms have been developed and passed down as tradition because it's been observed that they don't introduce unnecessary variability. One extreme example when using C are programs where all variables are global, and functions take no parameters, return nothing and don't even use local variables. I have seen programs like this controlling actuation systems in entertainment rides. Whatever or however one may feel about this, it cannot be argued that this code has been justified by the developers for some XYZ reason, and that it works as per the specifications. I am reminded of Dijkstra's "GOTO Considered Harmful" paper, where he essentially says that our static text must reflect dynamic runtime behavior. Given that Passerine — like Rust and many new languages — uses compile time to reason about the text to enable certain optimizations, whether for safety or performance, how can the text help us reason about runtime behavior in a predictable sense without ever executing it? Alternatively, how can we look at programs written in Passerine, with its interesting memory management strategy and its clever compile-time analysis, and provide an analysis of its runtime behavior that matches or guarantees this behavior once deployed? Is it possible? If not, can it be done in a hypothetical Passerine, and what would that look like? I ask this because Passerine is bloody beautiful (pardon my French), and I'm enamored by it, to say the least. I'd love to know your thoughts and pick your brain.

Sincerely,
CA

README logo displays poorly with GitHub Dark Mode

I'm pretty sure that 95% of users on GitHub are using the dark theme at this point. The Passerine logo was designed before this was a thing, and now displays black text on black. This needs to be fixed:

  • We could make the text white, but this is bad for the same reason: it would display white-on-white in light mode.
  • We could make the background white, but the background would no longer be transparent.
  • We could make the background black and the text white, but the background would no longer be transparent
  • We could make the text green like the logo; this works because green is between black and white in terms of value, but then the logo has green text.
  • We could make the text grey; see the above.
  • We could add a white drop-shadow: with light mode this would be invisible, but on dark mode this would outline the text. Likewise, we could just add a white outline to the text.
  • I'm not sure if GitHub supports it, but SVGs can be styled with CSS. If GitHub exposes a 'dark-mode' selector, we could change the SVG dynamically to make it work there. Cross-platform support would still be a concern, though.

All in all, a lot of choices. Will have to think about it; what are your thoughts?

Variable Hoisting

Currently variables are only available after they are defined, as Passerine has no late-bound 'global scope' so to speak. However, this makes things like mutual recursion a little annoying:

-- TODO: remove this explicit hoist
odd = ()

even = n -> if (n == 0.0) {
    true
} else {
    odd (n - 1.0)
}

odd = n -> if (n == 0.0) {
    false
} else {
    even (n - 1.0)
}

print even 35.0

We have to declare odd before using it in even. Ideally this is something that can be resolved at compile time. I've started work on this in the dev branch by adding a step that resolves the scope of all locals / captured variables before compilation, but there are still a few kinks to work out.

For those interested, here's how the system works:

Each time a lambda is introduced, a new scope is entered:

-- outer scope
x = true

capture_function = () -> {
    y = 3.14
    -- inner scope
    print x
    print y
}

If a variable (i.e. x) is defined in the outer scope, it's accessible in the inner scope. This works by 'capturing' x when capture_function is defined (basically just making it a shared reference on the heap).

While compiling, we can imagine this as a chain of scopes. Just after the expression print y, the scope stack looks like this:

[ 
    Scope { locals: ["x"], nonlocals: [] }, 
    Scope { locals: ["y"], nonlocals: ["x"] },
]

Whenever a variable is encountered, here's what currently happens:

  1. If the variable is local to the scope, add it to locals in the current Scope
  2. If the variable is not local to the scope, search backwards through all enclosing scopes. If it is found, add it to the nonlocals of all the scopes we've searched through.
  3. If the variable can not be found:
    a. If this is a declaration, i.e. x = true or x -> {}, define the local in the current scope
    b. If this is an access, i.e. print x, raise the error 'variable referenced before assignment'

This works well in most cases: it's able to hoist captured variables, declare new variables, etc. etc.

However, there are a few problems with this methodology. Consider this simple example:

recurse = x -> recurse x

At the point where the compiler reaches the inner call to recurse, the scope stack looks like:

[
    Scope { locals: [], nonlocals: [] }, // This is empty!
    Scope { locals: ["x"], nonlocals: [] }, // no reference to `recurse`
]

So this code will throw the error 'variable recurse referenced before assignment'.

How do we fix this? We do something a bit more complicated. Steps 1 - 3a are the same: let's focus on 3b:

b. If this is an access, i.e. print x, raise the error 'variable referenced before assignment'

x, for instance, can be defined after we parse the expression: consider the recurse example, or the first example with mutual recursion. What we need to do is somehow mark x as used, and when x is later defined, join it up with that later definition.

Let me introduce the solution: Hoisting chains! 😱 🎉

So, here's the skinny, 3b becomes:

b. If this is an access, i.e. print x, add x to a table called unresolved_captures, and mark x as captured a la rule 2 in all enclosing scopes.

This second bit, 'mark x as captured' builds up what I like to refer to as a 'hoist chain'. We need one more change to make this work, 3a becomes:

a. If this is a declaration, i.e. x = true or x -> {} check unresolved_captures for x.
i. If x is in unresolved captures, define it in the local scope, remove it from unresolved_captures and then remove it as captured in all enclosing scopes (a la rule 2).

And finally, we need an additional rule, rule 4, for dealing with cases where variables are actually undefined:

  4. After compilation, if there are still variables in unresolved_captures, raise the error 'variable(s) referenced before assignment' (obviously point out which variables and where this occurred).

What does this do in practice? Consider the following code:

flubber = () -> {
    foo = () -> {
        print (pi + e)
    }
    pi = 3.14
    foo ()
}
e = 2.72

flubber ()

Note that we have two variables that are 'referenced before assignment': e and pi.

After we reach pi in the expression print (pi + e), the scope stack looks like this:

[
    Scope { locals: [], nonlocals: ["pi"] }, 
    Scope { locals: [], nonlocals: ["pi"] },
    Scope { locals: [], nonlocals: ["pi"] },
]

Note how we built a hoisting chain of pis. pi has also been added to unresolved_captures. We repeat this when we reach e.

When we reach pi = 3.14 in the enclosing scope, we see that pi is in unresolved_captures. We then remove it from unresolved_captures and from being captured in all enclosing scopes:

[
    Scope { locals: [], nonlocals: ["e"] }, 
    Scope { locals: ["foo", "pi"], nonlocals: ["e"] },
    // this innermost scope has been popped at this point,
    // but it is correct as we hoisted `pi` and `e` in it
]

Likewise, when we reach e = 2.72:

[
    Scope { locals: ["flubber", "e"], nonlocals: [] }, 
    // scope has also been popped, but still correct
]

These scopes are packaged into the AST and then used during bytecode emission to resolve variables.

It's a tad bit more complicated than this, because we're also renaming variables as we do this, but more on that later. Anyway, this feature should be merged in 0.9.X or 0.10.0 depending.
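
For the curious, here is a rough Rust sketch of the bookkeeping described above. The names (Scope, unresolved_captures) mirror this write-up rather than the actual compiler, and rule 2's "scopes we searched through" is simplified to "all open scopes":

use std::collections::HashSet;

#[derive(Default)]
struct Scope {
    locals: HashSet<String>,
    nonlocals: HashSet<String>,
}

#[derive(Default)]
struct Resolver {
    scopes: Vec<Scope>,
    unresolved_captures: HashSet<String>,
}

impl Resolver {
    // Rule 3b: an access we can't resolve yet is marked as captured in every
    // open scope (the 'hoist chain') and remembered as unresolved.
    fn access(&mut self, name: &str) {
        let defined = self.scopes.iter().any(|s| s.locals.contains(name));
        if !defined {
            for scope in &mut self.scopes {
                scope.nonlocals.insert(name.to_string());
            }
            self.unresolved_captures.insert(name.to_string());
        }
    }

    // Rule 3a(i): a later definition resolves the chain and tears it down.
    fn declare(&mut self, name: &str) {
        if self.unresolved_captures.remove(name) {
            for scope in &mut self.scopes {
                scope.nonlocals.remove(name);
            }
        }
        if let Some(current) = self.scopes.last_mut() {
            current.locals.insert(name.to_string());
        }
    }
}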

Simplifying the Compilation pipeline with Typed Tagless Final Interpreters

For those who don't know what a TTFI is, there's an excellent paper here.

Because Passerine does a number of compilation passes, it has different syntax trees for each stage. This works well enough, but it means we have what feels like a lot of redundant overlap between the various layers.

Here's my proposal: We use Rust's trait system to define a TTFI representing Passerine's syntax tree. Note that this does not mean the language is tree-walk interpreted; we're not using the evaluation strategy present at the beginning of that paper. So what does this look like?

Basically, we'll have a trait called SyntaxTree or the like that implements a TTFI for the core language:

  • Symbols
  • Data
  • Blocks
  • Assignments
  • Lambdas
  • Function Calls
  • FFI

This is all that's really needed to compile Passerine atm. As time goes on, this could be extended (i.e. with type information) or reduced (Blocks might be redundant or something).

Then, for each step up the pipeline ladder, we implement traits for each feature. As an example, the desugar step removes macros and function application from the syntax tree. This could be defined as a trait called Desugar, with the following trait functions:

  • Macros
  • Function application

With Desugar in place, we can then define DesugaredSyntaxTree to just be a supertrait of SyntaxTree and Desugar:

trait DesugaredSyntaxTree: SyntaxTree + Desugar {}

And so on: for each layer on top of the core, we implement a TTFI that extends the base, and a reduction step to lower the syntax tree at that point. The paper goes into a lot more depth, and we'll have to figure out how each syntactic element maps to this representation, but I think it would significantly simplify the compiler implementation (and perhaps speed up the pipeline too).
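
To make that concrete, here is a minimal sketch of a tagless-final core and one "interpretation" of it. The trait and method names are illustrative, not the actual proposal:

// Sketch only: a tagless-final encoding of a tiny expression core.
// Each compiler stage (pretty-printing, desugaring, codegen, ...) becomes
// another implementation of the same trait rather than another tree type.
trait SyntaxTree {
    type Repr;
    fn symbol(name: &str) -> Self::Repr;
    fn data(value: f64) -> Self::Repr;
    fn lambda(param: &str, body: Self::Repr) -> Self::Repr;
    fn call(fun: Self::Repr, arg: Self::Repr) -> Self::Repr;
}

// One interpretation of the core: render expressions as strings.
struct Pretty;

impl SyntaxTree for Pretty {
    type Repr = String;
    fn symbol(name: &str) -> String { name.to_string() }
    fn data(value: f64) -> String { value.to_string() }
    fn lambda(param: &str, body: String) -> String {
        format!("{} -> {}", param, body)
    }
    fn call(fun: String, arg: String) -> String {
        format!("({} {})", fun, arg)
    }
}

fn main() {
    // `double = a -> 2 * a`, written against the abstract interface:
    let double = Pretty::lambda(
        "a",
        Pretty::call(
            Pretty::call(Pretty::symbol("mul"), Pretty::data(2.0)),
            Pretty::symbol("a"),
        ),
    );
    println!("{}", double); // a -> ((mul 2) a)
}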

Code Generation and JIT

Have the ability to have multistage programming with code generation, JITing or compiling.

Add `match` expressions

Course of action:

  • Add match keyword to the lexer
  • Add match keyword to the parse tree
  • Implement pass to expand match arms into decision trees
  • Reduce decision trees to constructs present in the language core

Here are some papers that might help with the third step. There was this really good gist or issue that explained the algorithm in detail, but I can't seem to find it.

Alternative Backends (Minivm, Wasm, LLVM)

I may write some ideas down for codegen for multiple backends later. Requires some sort of typed (or type-erased) low level IR format. I plan to move away from the current Lambda + Closure format once we introduce the Qualm distributed runtime (formerly FLEx).

More Rust idioms

It's common (almost universal~) for Rust projects to use cargo fmt and cargo clippy to keep code predictable and easier to collaborate on.

Vaporization Memory Management

I'd like to know more about this particular technique mentioned here:

https://www.slightknack.dev/perma/master/131/53d2e32b388719c6165058f69621281fb78b5669434cf69b6f704fde2eb35969

Do you think you could finish the blog post or give me a brief rundown here?

Also, I think your software has a lot of potential, hitting "watch" on GitHub for future development ;)

One more thing, I'm intrigued by the Veritable Computation Initiative but little information is given about it. Is a website coming soon?

🐝

VM should be Send

The VM should be Send, but not Sync. This will most likely be resolved once we start implementing Passerine's concurrency model.
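
One cheap way to pin this down once the concurrency work lands is a compile-time assertion. The sketch below uses a stand-in type, since the real VM struct isn't named here:

// `assert_send::<T>()` only compiles if T is Send; the commented-out
// `assert_sync` call should fail to compile for a Send-but-not-Sync VM.
fn assert_send<T: Send>() {}
#[allow(dead_code)]
fn assert_sync<T: Sync>() {}

// Stand-in for the real VM type: a Cell field keeps it Send but not Sync.
struct StandInVm {
    counter: std::cell::Cell<usize>,
}

fn main() {
    assert_send::<StandInVm>();
    // assert_sync::<StandInVm>(); // would not compile: Cell is !Sync
    let _ = StandInVm { counter: std::cell::Cell::new(0) };
}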

Github Sponsorship

This project is really cool and I'd like to sponsor the development of it :)

Efficient parsing of UTF-8

for char in self.remaining().chars() {
    // \n indicates a token, so it isn't 'whitespace'
    if !char.is_whitespace() || char == '\n' {
        break;
    }
    len += char.len_utf8();
}

UTF-8's ability to represent ASCII in place allows lexers to handle ASCII characters extra-efficiently by skipping decoding of UTF-8 bytes. It's fine to simply scan directly through the bytes (and a crate like https://docs.rs/memchr can vectorize this) for characters like ' ' and '\t'.
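
As a rough illustration of that byte-level fast path (not the actual lexer code):

// UTF-8 is ASCII-transparent, so inline whitespace can be skipped by looking
// at raw bytes with no decoding. '\n' is excluded, mirroring the snippet
// above, since a newline is a token rather than whitespace.
fn skip_inline_whitespace(source: &str) -> usize {
    source
        .bytes()
        .take_while(|&byte| byte == b' ' || byte == b'\t' || byte == b'\r')
        .count()
}

fn main() {
    assert_eq!(skip_inline_whitespace("  \t x"), 4);
    assert_eq!(skip_inline_whitespace("\nrest"), 0); // newline is a token
}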

Refactor Compiler Pipeline

We've added two new compiler features, but haven't done much in the way of project restructuring to accommodate this change. Referencing #33, the new compiler pipeline should be:

  • lex
  • parse
  • desugar (split from macro expansion)
  • hoist (move up before expansion)
  • expand (split from desugar)
  • infer
  • gen

With the syntax trees moved out to another module. I think we should implement some traits that represent a compilation step / syntax tree.

This, along with the goals outlined in #34, will be the target of the 0.9.3 release.

Tracking Issue: Vaporization Checklist

Vaporization is not yet implemented, as there are other features that need to be in place before it makes sense to add it in. Here's how we'd need to do this, from #52:

To do so, we'll need to add a new pass to the compiler, potentially named last_use.rs or something, that walks the AST backwards and records the first occurrence (i.e. last use) of each variable. The second thing we need to do is add support for Rust's Cow pointers (clone on write) in tagged.rs. After that, we'll need to modify the generation step (gen.rs) to take last use into account, and emit instructions for moving out of shared Cow references. Finally, the VM will need to add minor runtime support around instructions for copying shared references. I'll open an issue with a tracking list so we can keep track of work wrt that.

Here's that issue! And, as promised, the checklist:

  • Remove mutation in closures, enforce only mutable in local scope rule.
  • Add last_use pass to compiler.
  • Add support for CoW in tagged.
  • Modify gen to work off of last_use pass.
  • Add instructions that manage CoW pointers.
  • Add support for instructions in vm
  • Tests and so on - fine-grained extension to more complex data types, like slices of lists.

For those interested, there's a forever-WIP post on vaporization on my blog: https://slightknack.dev/blog/vaporization/.
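
As a toy illustration of what the last_use pass records (the real pass would walk the AST, and the names here are made up):

use std::collections::HashSet;

// Walk a flat list of variable uses in reverse; the first time a name is
// seen from the end is its last use in program order. Sketch only — the
// real `last_use` pass would walk the AST, not a flat slice.
fn last_uses(uses_in_order: &[&str]) -> Vec<bool> {
    let mut seen = HashSet::new();
    let mut is_last = vec![false; uses_in_order.len()];
    for (index, name) in uses_in_order.iter().enumerate().rev() {
        is_last[index] = seen.insert(*name);
    }
    is_last
}

fn main() {
    let uses = ["x", "y", "x", "y", "z"];
    // The `x` at index 2, the `y` at index 3, and `z` at index 4 are last uses.
    assert_eq!(last_uses(&uses), [false, false, true, true, true]);
}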

Formalised Parallelism: Join Calculus, Transactional Events

Decimal floating point

It seems that C23 will contain native support for decimal floating-point number types. I think new languages (like Passerine) should also provide native support for decimal floating point as a much safer option than binary floating point (there are many important reasons for this, and only one viable reason against: performance, since emulated decimal floating point makes computation roughly 2x slower on average).

It should preferably become the new default floating-point type (by that I simply mean that floating-point literals should be treated as decimal floating-point literals unless explicitly cast otherwise).

In addition to supporting decimal floating point, I'd like to point out that Passerine should report, at compile time, any lossy conversion from a floating-point literal to a binary floating-point number. Otherwise Passerine would already be unsafe in the language specification itself 😮.
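
For context, the classic binary-floating-point surprise that motivates requests like this one (plain Rust, nothing Passerine-specific):

// 0.1 and 0.2 have no exact binary (f64) representation, so their sum is not
// exactly 0.3 — the kind of silent loss the issue wants reported.
fn main() {
    let sum: f64 = 0.1 + 0.2;
    println!("{}", sum == 0.3); // false
    println!("{:.17}", sum);    // 0.30000000000000004
}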

Records and Modules

Passerine's module system is pretty simple. At its core, a module is defined as:

my_module = mod {
    x = 0
    y = n -> x + 1
    double = a -> 2 * a
    -- ...
}

my_module::double 4
-- is 8

Basically a module turns a scope into a struct, where all top-level variables become fields on that struct. As seen in the above example, we use mod to do this.

All files are modules, and are imported using the use keyword.

-- my_module.pn
x = 0
y = n -> x + 1
double = a -> 2 * a
-- ...

-- main.pn
use my_module
my_module::double 3
-- is 6

A folder with a file named mod.pn in it is also a module. This is similar to how Rust behaves:

-- my_module/mod.pn
-- ...

-- main.pn
use nest::my_module
-- ...

Modules are resolved at compile time. Upon using use, Passerine:

  1. Looks for the file X.pn or X/mod.pn in the current scope of the file being compiled
  2. If the file has already been compiled, use the cached version
  3. Insert the compiled file into the current file, wrapping it all in a mod operation.
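
The lookup in step 1 might look roughly like this (illustrative Rust; the real resolution and caching in the compiler may differ):

use std::path::{Path, PathBuf};

// Given `use foo`, try `foo.pn`, then `foo/mod.pn`, relative to the directory
// of the file being compiled. Caching (step 2) and wrapping the result in a
// `mod` operation (step 3) are not shown.
fn resolve_module(current_dir: &Path, name: &str) -> Option<PathBuf> {
    let as_file = current_dir.join(format!("{}.pn", name));
    if as_file.is_file() {
        return Some(as_file);
    }
    let as_dir = current_dir.join(name).join("mod.pn");
    if as_dir.is_file() {
        return Some(as_dir);
    }
    None
}

fn main() {
    println!("{:?}", resolve_module(Path::new("."), "my_module"));
}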

All modules must form a tree hierarchy, i.e., there should be only one path from the root, nest, to any other module. Modules can be mutually recursive, but in this case the full path must be specified:

-- a.pn
use nest::b
print b::x
y = "Hello, "

-- b.pn
use nest::a
print a::y
x = "World!"

-- main.pn
use nest::a
use nest::b

Modules are only executed once, upon first load. Hello, World would be printed to the terminal in the above example. Here's the big ol' todo list. If you'd like to help, I'll mentor anyone through any one of these steps:

  • Implement a Record variant on Data in src/common/data.rs, including methods for displaying it.
  • TODO: This will require some discussion, but we basically need to add support for paths.
  • Implement lexing/parsing for the index (::), mod <block>, use <path>, and nest keywords.
  • Implement lexing/parsing for record types, e.g. { foo: "Bar", baz: 27.5 }
  • Implement destructuring on record types, e.g. { thing: banana } = { thing: "Hello" }
  • Add support for compiling mod blocks to bytecode, converting them to records
  • Add support for indexing on records, and the nest keyword.
  • Add support for loading modules from other files. TODO: We might have to loop in Aspen to resolve these files.
  • Add support for loading modules from other folders. TODO: Aspen, see above step.
  • Polish and make sure everything works together.

The dev branch is a bit of a mess right now as I'm doing some big refactoring, so please make a feature branch off of master and implement any changes there. This will be a 0.9.X feature.

Designing a Type System for Passerine

Goals of a Type System

Passerine's type system is a bit fuzzy at the moment, due to its dynamic nature in terms of implementation. However, Passerine is a functional programming language, and in that vein I'd like it to have an ergonomic albeit flexible type system. Here are the goals we should try to meet:

  • The type system (TS) should be both sound and decidable.
  • The TS should be inferred at compile time through the use of Hindley-Milner (HM) type inference.
  • The TS should reflect data representation; all base types should be destructurable.
    • As a rule of thumb, this means Algebraic Data Types: Records & Enums
    • Additionally, this means that Passerine is primarily structurally typed, with support for nominal types to resolve ambiguities in structure.
  • The TS should support generics and polymorphism:
    • Let polymorphism (i.e. forall x) for functions.
    • Row polymorphism for records.
    • Traits (basically typeclasses) for open enumerations (perhaps through row polymorphism).
    • Generics for types.
  • The TS should support Algebraic Effects:
    • Interaction with the FFI should be done through such handlers for effects.
    • same goes for fibers and Passerine's concurrency model
  • The TS should support runtime refinement types (the where predicate |) that are total (verified through algebraic effects) and verified upon construction.
  • The TS should support what I call 'local mutability' - in other words, variables can be mutated, but only in the current scope: this allows for vaporization as a memory management strategy.

What we need to know

For each of these points, we need to decide the following:

  1. How is this written out in terms of syntax?
  2. What does this mean in terms of semantics?
  3. How does this affect type inference?

TL;DR and how you can help

I'm not going to answer these questions in this first post; rather, I hope to start a discussion on the topic. I'll try to formalize most of the type system tomorrow, including rules for structuring/destructuring, type definitions, algebraic effects, and so on. I'd really like some input on what you think are needed features of the type system.

The goal of the type system is a simple one: make it feel as flexible as a dynamic type system while still providing the guarantees and speed of a static one. Either way, Passerine is strongly typed.

It doesn't matter if you have 10 years of experience with type systems or 10 minutes - This is the first time I've designed a full-on type system, and I'd like to get it right, from both a theoretical and practical point of view - so don't worry, any feedback is appreciated. Please reply to this issue with any thoughts, issues, comments, questions, suggestions, etc. If there are features in other languages you think have merit, bring them up. Thanks!

Syntax example

Am I stupid?

linear = m b x -> b + m * x
linear 2 3 5
-- evaluates to 17
  • 2 is matched against m, so m = 2,
  • 3 against b, so b = 3,
  • 5 against x, so x = 5,

If b = 3, why is this 2 + 3 * 5 instead of 3 + 2 * 5?

modulo vs remainder

x % y

passerine:

n % m     m = 7    m = -7
n =  4      3        1
n = -4      1        3

c rust haskell lua js python:

n % m     m = 7    m = -7
n =  4      3       -1
n = -4      1       -3

This behavior is not what I expected. I think the operator % should behave like it does in other languages.
An operator for what passerine has now is great, but it should be named something else.
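
For reference, Rust itself exposes both behaviours — % is the truncating remainder (the sign follows the dividend), while rem_euclid is always non-negative, which is closer to what the Passerine column above shows for a positive modulus:

// Truncating remainder vs Euclidean modulo in Rust.
fn main() {
    for n in [7i32, -7] {
        println!("{:>3} % 3 = {:>2}   rem_euclid(3) = {}", n, n % 3, n.rem_euclid(3));
    }
    // Output:
    //   7 % 3 =  1   rem_euclid(3) = 1
    //  -7 % 3 = -1   rem_euclid(3) = 2
}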

A few Exciting Announcements

A couple updates and announcements:

It seems like there's been a spike in interest, so hello to all you new people! Most discussion takes place on the Discord server (https://discord.gg/XQfxBcHnJj), so drop by there if you want to say hello!

Now, on to the big news:

Towards a Production-Ready Compiler

There has been a lot of interest around Passerine recently. The compiler currently in use, and the D compiler, have proven that the language is easy to compile and readily maps to existing architectures. (here's a toy self-hosting version of Passerine in about 600 lines of Passerine).

Despite this, the current architecture of the Rust Passerine compiler is not flexible enough to support generalized features, like an effect system and a better VM, without a lot of additional work. For this reason, we're rewriting the Passerine compiler, from start to end, with support for these systems in mind. Because the compiler isn't that large, this shouldn't be that big of an undertaking, and it gives us a fresh start.

Get Involved!

If you would like to be responsible for a portion of this new pipeline, send a message to let me know. Whatever your experience with Rust or compiler engineering may be, I'd be happy to get you up to speed. (And if you are experienced, please help get us up to speed!) This compiler will be developed until it reaches feature parity with the existing compiler, after which it will replace the existing compiler, licensed under the same permissive MIT terms. Here are the people who are already involved:

  • @slightknack (myself): Design of the type system and compiler middle end, as well as general organization and aspen.
  • @Plecra: Design of the new generalized token-based macro scheme and compile-time evaluation.
  • @ShawSumma: Implementation of backends, namely C, Wasm, and minivm. Shaw is also working on the D compiler.
  • You?

We're still looking for people to:

  • Help write language servers and syntax highlighters to improve editor integration.
  • Implement additional backends, like LLVM IR and so on.
  • Write a standard library for Passerine and extend the language core.
  • Improve the website and documentation, including writing a Rust-style The Book based on the existing README.

If you have experience in any of these areas and would like to pitch in, reply to this thread. Now, onto the topic meetings:

Consistent Monthly Meetings

We currently have weekly meetings on Discord in the discussion channel, whose contents are documented in the vrtbl/meta repository. (This repository will be updated soon, it's slightly out of date.) These meetings are usually inconsistent and unplanned though, and not many people can make them.

To make meetings more accessible while still allowing more structure, Weekly Meetings are moving to be Monthly Meetings, and will have more structure. This is not to diminish the pace of development, it's to improve it. More specifically, we'll start by reviewing the changes made to the compiler that month, have an optional speaker, then have a normal discussion as per usual: more information to come.

Of course, the discussion channel will still be open to discussion at any time. I'm hoping that by spacing out meetings some more, more people will be able to attend. If you would like to present a project you've been working on at the next monthly meeting, please send a message to let me know.

Sponsorship

Some people (@Kethku :P #54) have expressed interest in sponsoring the development of Passerine. We're still working out what sponsorship will look like, but it will be a viable option if you're looking to speed up the development of the language in the future. Cool beans!

Random last thing

If you write a toy project that uses Passerine, please put it up on GitHub! If we reach 200 repositories, Passerine will be an official language on GitHub, which would be pretty neat!

Thank you!

I would like to thank everyone who has been involved in the development of the language thus far. Everything, from discussing traits and effect systems, to rolling up your sleeves and slinging some code, is immensely helpful: I really appreciate it. Thank you!

(By the way — If you have made a contribution to Passerine, please feel free to open a PR adding yourself to the README.)

New Branch & Github rules

For the next few releases, I'd like to standardize our DevOps and release procedures (this is sort of related to #10, but a little bit more general; this is more meant to discuss the way that branches are laid out).

Additionally, I'd like to talk about what contributing guidelines, CoC, and PR/issue templates should look like.

Document effect system

I've locked down the type and effect system, but I haven't documented it anywhere. Seeing that the internet is seriously in need of an approachable explanation of what algebraic effects even are, I'd like to document Passerine's effect system either here or in a blog post of some form.

cargo run doesn't pretty-print errors

cc @Plecra.

The signature of main is Result<(), String>, which results in the error string not being pretty-printed. Easy fix: we can fix it now, or later when we add the compile and run functions to lib.
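
For context, a minimal sketch of the problem and one possible workaround (the run function and error text are made up, not the actual crate API):

// When `main` returns `Result<(), String>`, an `Err` is printed with the
// Debug formatter, so newlines come out as literal `\n` and the report is
// mangled. One fix is to catch the error and print it with Display ourselves.
fn run() -> Result<(), String> {
    Err("error: unexpected token\n   |\n 1 | (1 + 2\n   |       ^".to_string())
}

fn main() {
    if let Err(report) = run() {
        eprintln!("{}", report);
        std::process::exit(1);
    }
}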

The installation shell command on Passerine's website doesn't work.

The problem

The installation shell command on Passerine's website, which is:

sh <(curl -sSf https://www.passerine.io/install.sh)

doesn't work. Running it in every shell I have tested (bash and zsh) causes the following error:


    Welcome to Passerine!

This script will download and install the official compiler and
interpreter for the Passerine programming language, and its package
manager and CLI, Aspen.

The Aspen home, registry, and bin directories are:

    /home/firefly/.aspen
    /home/firefly/.aspen/registry
    /home/firefly/.aspen/bin

This can be modified with the ASPEN_HOME environment variable.

The bin directory will then be added to your PATH environment
variable by modifying the the profile files located at:

    /home/firefly/.bash_profile
    /home/firefly/.profile

Would you like to proceed with the installation? (y/N)

/proc/self/fd/15: 48: read: Illegal option -n


Installation aborted.

How to fix

The shell command above can be replaced with this one:

curl -sSf https://www.passerine.io/install.sh -o install.sh; chmod +x install.sh; bash install.sh; rm install.sh

This downloads the install script to install.sh, makes it executable, runs it with bash, and finally removes install.sh.

Remaining tasks from `big-refactor`

Remaining tasks copied from #52.

This PR still has a long way to go, but is getting closer to completion. Here's what's left:

Phase 1

  • Refactored the representation of syntax trees in the compiler, to reduce boilerplate
  • Rewrote lexer to be token-based macro friendly
  • Refactored parser to work with new lexer
    • Refactor the way spans are represented for richer errors
    • Take advantage of new error format in parser
  • Reintroduce rest of compilation pipeline, ensure it builds
    • Lex
    • Read
    • Parse
    • Desugar (type definition mismatch oversight, working on it now)
    • Hoist
    • Gen
  • Write tests for new pipeline
    • Lex
    • Read
    • Parse (only unit tests so far, need proptests)
    • Desugar
    • Hoist
    • Gen
  • Test pipeline E2E
  • Get snippet tests all working
  • Clean up and document new code
  • Remove overly generalized compilation pipeline
  • Clean up library exports
    • Figure out what really needs to be in passerine_common.
  • System Injection
    • Figure out Rust side of things
    • Proof of concept (in private pnx repo)
    • Writeup (WIP)
    • Remove magic (finally!)
    • Conversion for builtin types
    • Conversion proc derive macro
    • Add effect (eventually to be replaced with type)
    • Switch existing system code over to effects
  • Look at mlua/hlua, rhai, and pen to determine what the high-level API to using Passerine from Rust should look like.
    • Calling Rust from Passerine
    • Calling Passerine from Rust
    • Passing data to and from Passerine/Rust
    • Sandboxing side effects
    • Inspecting and modifying the results of the top level (requires modules)
    • Binding handlers through system injection (implemented in pnx, needs to be transferred)

Phase 2

  • Introduce a token-based macro system
    • In between lexing and parsing
    • That is hygienic (TODO: how to define hygiene)
  • Introduce prelude
    • standard library and core
    • reintroduce syntax-based macros in terms of token-based macros

Phase 3

  • Finish implementing type ... and record
  • Introduce rough modules
  • Match expressions using macros and fibers
  • Polish implementation, write some projects to test for bugs
  • Polish standard library and prelude a tad

Phase 4 (Maintenance stuff, post merge)

  • Polish high-level API, write documentation and examples
  • Go over README and correct anything that is inaccurate
  • Move README Overview to the codex and link to it
  • Update the code example in the repo and on the website
  • Redo the website, add a 'try it' wasm-based playground
