danilopedraza / symstatic Goto Github PK

2.0 1.0 0.0 551 KB

Use the type system to create restrictions over symbolic math operations.

Home Page: https://danilopedraza.github.io/symstatic/

License: GNU General Public License v3.0

Rust 98.92% CSS 0.39% HTML 0.69%

free-software interpreted-programming-language memoization pattern-matching programming-language prototyping symbolic-computation

symstatic's Introduction

Symstatic

Una calculadora simbólica.

Sobre el proyecto

Motivaciones

Llevo mucho tiempo pensando en un sistema que me permita hacer conjeturas matemáticas rápido y sin introducir demasiados detalles que me alejen del problema inicial. Lo que se suele utilizar en estos casos son paquetes como Mathematica, Maple o Sage. Hay cosas que todos estos tienen:

Son muy grandes, con muchas funcionalidades.
La sintaxis de sus lenguajes no me gusta mucho. La de Sage me gusta, pero es el sistema que menos conozco.
Nunca son realmente cómodos. Esto tiene sentido, considerando que no son específicos a ningún área de las matemáticas. Debo imaginar que nadie está del todo cómodo.

Además, Mathematica y Maple no son de código abierto.

Al principio pensé que las palabras clave del lenguaje estuvieran en español (¡por esto este README está en español!). Cambié de opinión.

Precedentes

Hay dos lenguajes que me parecen una buena forma de partir: Miranda y SETL. Aún no estoy seguro de si este lenguaje será completamente funcional. Tal vez no. Las cosas que más me interesa incluir son:

Laziness
Pattern matching robusto
List/Set comprehension

Además, si termino enfocando esto a problemas de matemáticas discretas, el compilador tendrá que poder optimizar diversas operaciones recursivas.

Lo que más quiero sacar de estos lenguajes de su expresividad. Como este es un lenguaje para hacer experimentos de matemáticas, no tengo gran interés en rastrear errores en tiempo de compilación. El sistema de tipos y la expresividad de los mismos sólo existe para ayudar al usuario a específicar las propiedades de los símbolos que manipula. Por ejemplo, si el usuario desea integrar simbólicamente una expresión pero no está interesado en obtener funciones en los complejos como resultado, la idea es que esto sea fácil de expresar a través del sistema de tipos.

Recientemente me crucé con Picat, un lenguaje con símbolos, predicados y pattern matching. También tiene utilidades para resolver CSPs, problemas de planeación y cosas por el estilo. Es casi como lo que busco con este lenguaje.

symstatic's People

Contributors

Stargazers

Watchers

symstatic's Issues

Functions

The language does not have functions yet. Functions can be defined with let expressions, like this:

let f(x, y) := x + y

and they can be called like this:

f(5, 1+1)

Code blocks

I thought of a syntax for code blocks with tuples, like this:

let f(a) := (
  let x = a + 1,
  x * 2
)

Refactoring the AST

The AST is messy. Is entirely recursive and its lack of structure does not let me take advantage on compile-time guarantees that would avoid code repetition and clumsy code (e.g. the exec function, where I have to make a lot of double-checks).

My first priority for the AST right now is adding information on each node of the text that generated it. I already modified the lexer to give that information to the parser, but I have to do a much harder effort in the parser to attach that information to every AST node.

Since the structure of the AST is blocking me and adding debt every time I add functionality, I'm going to refactor it first.

Set implementation

Right now, sets are implemented as lists. I'm thinking of implementing them as HashSets. The Rust standard library has that data structure already. The main issue is that every Symstatic object must be hashable, which means that ASTNodes must be hashable.

Integer exponentiation

I implemented the exponentiation on integers badly. When the exponent is non-negative, The result is an integer and I can keep going.
When the exponent is negative, I can return a fraction.

for loops

I need syntax for a for loop. Something like this:

let nums := [1, 2, 4, 8]
for num in nums: print(num)

if expression

I need an if expression. Something like this:

let abs(a) := if a < 0 then -a else a

It is not a procedural if. The else part is mandatory.

comments

The lexer does not have support for comments yet.
The only requirement is, for now, single line comments with the # prefix.

Parser bad token in expression error

I need to report in a better way when there is an unexpected token in the main expression handler in the parser. I should create a new error type, something like ExpectedExpressionError.

`parser.rs` code quality

This file has almost 1k lines of code (two thirds are tests). The code has a lot of inlining and some repetition. The tests have a lot of repetition and are difficult to understand. This needs to be improved.

Tests for file parsing

The file parsing function (in src/file.rs) needs testing.

list/set comprehension

I want to use list and set comprehensions, just like in Python, SETL and Picat (which I found recently). That will require the addition of a keyword like for, in, or both.

Symbol disambiguation

In the programming languages with free symbols that I know, there is some syntax-level mechanism to make clear that an identifier represents a symbol to be matched or a value to be replaced in the code. For example, the Wolfram Language does this:

f[u_] := u + 1 (* the value that u holds is going to be replaced in the expression *)
f[u] := 1 (* This definition is going to be matched when I call f with the 'u' symbol *)

This is how Picat does it:

f(val) = 1. % Matches the 'val' symbol
f(Val) = Val + 1. Replaces the value identified with Val

I need a way to make the same distinction. I propose to decide what interpretation to choose this way:

If the parameter is used in the function body, replace;
else, match.

Here is an example:

let f(val) := val + 1 # replace
let f(val) := x # match

anonymous functions

I want anonymous functions. I see two options:

Since let expressions return the value of the thing being defined, just use let expressions where I want to return functions. Something like this:

myFunc(5, let _(x, y) := x + y)

This code should do the same:

let f(x, y) := x + y
myFunc(5, f)

Use a JS-like syntax to write anonymous functions. Something like this:

myFunc(5, (x, y) => x + y)

Of course, we could define a function like this:

let f := (x, y) => x + y

Assignments

The language should let the user change the value of an existing symbol. After all, I'm not that compromised with functional style; I just promote it.
I think of this syntax:

let a := 5
a := 6
a # 6

Absence of tests for builtins

The standard environment is not tested. It just needs a tiny regression test.

Using REPL tests as e2e/acceptance tests

Recently I added some tests to the REPL with the sole purpose of implementing full features. These tests are more like little acceptance tests, and they execute a lot of code. I should put these in a special place and use a more convenient format (that gives more information on my progress).

Arbitrary precision numbers

Right now the language only has 64-bit signed integers. I need to add arbitrary precision numbers. This includes:

BigInts
BigDecimals
Fractions

I'm not sure if I should mix BigInts and BigDecimals into one type, or have them separated.

program function problem

The parser has a function called program, that consumes a token stream and returns a list of ASTNodes. It uses the external expression method to do this. It should use the external next method.

Error messages

This is an umbrella issue to track my ideas on the error messages of the compiler. Currently, I created error types on all stages of the pipeline to get information on common errors. I should think of an initial design for an error message creator, and see how other compilers do it.

add infix operators

There is a lot of operators to add. I already refactored the parser's code to make it easier, so there is some work to do:

End Of Expression token

There is not an explicit expression terminator. When I first thought the language, the candidate was the dot .. Right now it does a lot of stuff, and I don't to add another use. I think I will use linebreaks as expression terminators. This will require more coding, but not a lot, and will look nice.

Semantic analysis

I must start the semantic analyzer soon. I will put the things it should do here:

Check for congruence of function pattern declarations (not using symbols for a function and something else)

Absence of tests for REPL

A lot of the REPL logic is not tested. I already refactored it to be testable.

Playground

I need to make a playground, so it is easy to try the language.

I looked at the code of Dart, Go and Rust playgrounds, and all of them run with a backend where all the code runs.

Docs

I need to write documentation for the compiler and for the language (since this code is, right now, the only specification of it!).

Rustdoc looks nice for the code docs. I maybe could use mdBook for the language docs.

List comprehension correctness

Right now, the correctness of list comprehensions is checked in exec and solved by panicking. This is bad and should be done in the weeder.

Assert

I need a builtin function to verify a statement and stop the program with error if the statement is false. This is the first step to make tests with the language.

Using BigInts as machine words

Currently, the default integer type is a BigInt. It is not bounded, so I have to check for overflows. The casting behavior that I defined with code (not tests) is:

If the number is negative, interpret it as zero
Else, take the minimum between the number and the maximum word-sized unsigned integer.

types

For now, I want these types:

Booleans
Chars (Bytes)
Integers (i64)
Strings

Syntax highlighting

It will be very helpful to make syntax highlighting and analyzer plugins for some IDEs and text editors. My first candidate is VSCode, since it is what I use. VSCode needs a token grammar and a LSP interface for the language.

Error presentation

I need to create a message for every error instance. Some examples to follow are the error messages of

Python
Rust
Go

Ranges

I want a special syntax to specify a range. Something like this:

(1..3).reduce((acc, cur) -> acc + cur) # 3

Notice that a..b includes a and excludes b.

prefix operators

I need some prefix operators:

- Additive inverse
~ Bitwise negation
! Logic negation

I/O functions

I need two functions in the global scope: input() and print(). I think of these interfaces just like the python functions with the same name.

Scope logic in for loops

Right now, the code inside a for loop runs in the same scope where the loop was written. I did this to test the loop, but I must change it.

pattern matching

To start, I won't add a match-like expression, but to write patterns in let expressions. Something like this:

let fib(0) := 0
let fib(1) := 1
let fib(n) := fib(n - 1) + fib(n - 2)

pseudo-OOP function call

I'd like to add this feature. Calling a function like this:

myFunc(arg1, arg2)

Is equivalent to this:

arg1.myFunc(arg2)

I found this feature in Picat.

REPL

I need a REPL. Right now there is a basic REPL that evaluates single lines. I need some features:

Getting previous commands like in the terminal
Searching previous commands like in the terminal
Multiline evaluation (this looks difficult)
Remember assigned values (currently it does not have memory)

`in` keyword

I was thinking about the behavior of this keyword as an operator and as part of the for loops. The two syntax rules are easy to distinguish and parse. As an operator is more complex. I see two big cases:

Inside list/set comprehensions,
as a question.

Test syntax

I saw a nice way to write unit tests in r/ProgrammingLanguages: put a decorator above a function, with an input/output pair:

@test((2, 3), 8)
@test((3, 2), 9)
let f(n, k) := n**k

File evaluation

The system should get a file and process it as a Symstatic program.

Scopes

Currently, I don't have environments to take values out of symbols. First, I need to determine the behavior of the language scopes.
For now, it will be the typical one: Search for a symbol in the first scope, and if it is not there, search in the parent scopes.