Giter Site home page Giter Site logo

kaleido's Introduction

I'm Rahul Chhabra (he/they), a student of Information Technology at Indian Institute of Information Technology, Allahabad, graduating in 2025.

My primary interests are best described by "computational trinitarianism" - logic, category theory, dependent type theory (especially cubical ones), and the development and engineering of proof assistants.

Much of my work right now is on my realizability project - a formalisation of categorical realizability from the ground up in Cubical Agda.

kaleido's People

Contributors

rahulc29 avatar

Stargazers

 avatar  avatar

Watchers

 avatar

kaleido's Issues

Add function, variable and prototype structures for AST

The AST structure consists of nodes for simple unary and binary expressions such as addition, multiplication, division, etc. The basic structure therefore allows for parsing of expressions like 3 + 4 * ( 5 / 6 ) into an AST however not for more sophisticated ones like ( x + 5 ) * ( y - 3 )

Add GitHub Actions

Add CMake based GitHub actions so that every push builds successfully and passes all the unit tests.

Add conditionals to the language

The language currently only allows for one-expression functions and conditionals such as if, else, elseif, are not supported. The language is not Turing complete yet.

Add parser driver

The method kaleido::parser::RecursiveDescentParser::parse is currently unimplemented. This method should return an std::vector<std::unique_ptr<kaleido::ast::Function>>. This can be done as of now since top-level expressions separated by newlines are each parsed into one anonymous functions.
Example

def add(a b) a + b
3 + 4 * add(5, 7)

Should return an std::vector with two elements.

  1. The first element is a Function with a prototype consisting of 2 parameters (a and b), and the name add. The body should be an Addition with both the left and right children as Variables with names a and b respectively
  2. The second element is the anonymous function generated for all top level functions, it should be an Addition with left child a Literal with value 3, and the right child should be a Multiplication, the left child of which would be a Literal with value 4 and the right child would be a Invocation with function-name add and argument list consisting of two Literals with values 5 and 7.

Add `if` statements

  • Grammatically, if may be defined as a keyword. So, something like [if](\W)* would do.
  • if keyword must be followed by a (
  • parenthesis must be followed by a boolean expression (a >= b && (c != d || e == f)) -> this can be implemented with a little bit of modification to the recursive descent parsing algorithm
  • the expression should legally end with a )
  • the condition of the if should be followed by the body declaration. The body may be modelled with an std::vector<kaleido::ast::TreeNode>. Once we add assignment operations we'll get somewhat functional if blocks.

Add comparison operators

  • implement comparison operators (<, >, <=, >=, ==, !=) in lexer and parser
  • implement comparison operators in AST
  • implement code generation for comparisons using fcmp instruction of LLVM

Add mutable variable

  • A new bool property for node type Variable akin to mIsMutable. Could also add a new subtype of Variable, MutableVariable to handle the case but I feel polymorphism here is unnecessary. Kaleido is very simple and mutables and immutables don't defer significantly enough for polymorphism to actually achieve anything.
  • A new AST node type Reassignment : BinaryExpression based around the = operator to allow for reassignments
  • Add methods to generator API to prevent generation of Reassignment instances iff the Variable to reassign has mIsMutable true.

Add assignment operator

  • The assignment operator = can be understood as yet another binary expression with an LHS and an RHS.
  • New keywords such as val and var may also be added to aid in parsing and declaring mutability
  • A legal assignment may then be val a = 3. Here the LHS is a Variable and the RHS is a Literal.
  • RHS can easily be parsed using recursive descent parsing while LHS must necessarily be a valid identifier
  • In grammatic pseudocode, we may write [val|var](\s)+{id}(\s)*=(\s)*{expr}

Add Parser API

Add a parser API with the following abilities:

  • store and const-access a unique reference to a Lexer and obtains the Lexer using a (hardcoded for now) factory method
  • iterate through the tokens provided by the lexer and create AST
  • return unique reference to AST and provide ownership to client code

Add Fortran I inspired parser

Add an implementation of the parser API wherein the operator precedence strategy is the simpler one used by the Fortran I compiler as described here

Allow parsing of non-pretty expressions

Non-pretty expressions are expressions wherein every tokenizable is not separated by whitespace. So while 60 + 3 * 3 is pretty, 60 + 3*3 is not.
Currently, the lexer is only able to tokenize the first kind.

Add generator functions for all AST node types

The kaleido::gen::Generator class acts as an interface between AST nodes and the LLVM IR generator. Each node generates bitcode for itself and recursively calls the generator of all it's children. This converts the entirety of the AST into LLVM IR.
Each node must therefore implement it's generator function.

Dynamicise the matching of tokenizable classes

When tokenizable classes (such as keyword, identifier, operator, etc) are matched and the matching indices stored, the internal API insertMatches is called for each class. For a large number of classes, this is infeasible. The classes must therefore be put into a std::vector and dynamicise the insertions using for (auto &clazz : classes) of some sort.

Add Regex based Lexer

The current lexer is essentially an adhoc gluing of string processing. This string processing should be done via regex instead.

Add generated Lexer and implement Lexer API for it

The current Regex based Lexer RegexLexer uses an algorithm that runs in O(n^2 log(n)). Also it is incapable of handling even slightly complicated cases so that while for (int i=0;i<5;i++) can be tokenized for(int i=0;i<5;i++) does not tokenize perfectly (this particular case is flexible however I do not how many such corner cases exist). Also, it is not very extensible and adding if-else, loops and function definition using this lexer is going to be a nightmare.

Add basic structure of AST

AST must consist of

  • common tree node superclass
  • unary expression support (!, -, values themselves are unary expressions)
  • binary expression support (+, -, *, /, =, ==, !=, >, <, >=, <=, etc)
  • a different kind of node for everything else (function signatures, return statements, function calls, etc)

Add high-level client code

The high-level client code would perform the following tasks

  • factory methods to produce default implementations of the Lexer, Parser, and Generator APIs
  • determine the mode of input (standard input, file stream, HTTP, etc) and constructs the appropriate lexer
  • take ownership of the AST produced by Parser API
  • call the generator functions for the AST
  • provide the context for generation stage

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.