tjhancocks / bolt
The main repository for the Bolt Programming Language and Compiler
License: MIT License
libc.bolt requires compiled programs to link against the C standard library. This is currently hard-coded into the linking phase of the compiler. However, not all programs will require it, and some may even need the linking not to happen. There should be a way for libc.bolt to instruct the compiler to include it. This could then be extended so that future stdlib files can include additional libraries.
The current syntax idea for providing a compiler directive is this:
@pragma(<domain>, <flag>)
@pragma(linker, -lc)
Support for this will need to be added to the language, and an appropriate definition added to the Sublime Text syntax definition.
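As a rough illustration of how such a directive could feed the linking phase, here is a minimal sketch in Python (not the compiler's Swift code). The regular expression, the function names, and the math_bolt sample are assumptions made for the example; only the @pragma(linker, -lc) form comes from the proposal above.

```python
import re

# Hypothetical pattern for the proposed `@pragma(<domain>, <flag>)` directive.
PRAGMA = re.compile(r"@pragma\(\s*(\w+)\s*,\s*([^)]+?)\s*\)")

def collect_pragmas(source):
    """Group every pragma flag found in `source` by its domain."""
    flags = {}
    for domain, flag in PRAGMA.findall(source):
        flags.setdefault(domain, []).append(flag)
    return flags

def linker_arguments(sources):
    """Merge the `linker` flags of several source files, dropping duplicates."""
    merged = []
    for source in sources:
        for flag in collect_pragmas(source).get("linker", []):
            if flag not in merged:
                merged.append(flag)
    return merged

libc_bolt = "@pragma(linker, -lc)\n"
math_bolt = "@pragma(linker, -lm)\n@pragma(linker, -lc)\n"  # hypothetical file
print(linker_arguments([libc_bolt, math_bolt]))  # ['-lc', '-lm']
```

With something along these lines, a program that never imports libc.bolt would contribute no -lc flag, so the linker would not be asked for the C standard library.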
There is currently no branching in Bolt, which means programs execute in a fixed, linear manner. There needs to be support for conditions (if-elseif-else) so that programs can execute appropriate functionality based on runtime state.
A number of aspects will need to be added for this, including parsing, semantic analysis (sema), code generation, etc.
The basic idea for conditions will be as follows:
if condition {
// code here
}
elif condition {
// code here
}
else {
// code here
}
This will require further operators/operations to be implemented on top of those added in #38, in order to allow conditions to be formed (these include ==, !=, <, >, >=, <=, &&, &, ||, |, !).
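To make the intended semantics concrete, the following Python sketch models an if/elif/else chain as an ordered list of branches and picks the first one whose condition holds. The tuple shapes and function names are assumptions for illustration, and only the comparison operators are covered.

```python
import operator

# Comparison operators from the list above, mapped to Python equivalents.
COMPARISONS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    ">=": operator.ge, "<=": operator.le,
}

def holds(condition, variables):
    """Evaluate a (name, operator, constant) condition against runtime state."""
    name, op, constant = condition
    return COMPARISONS[op](variables[name], constant)

def select_branch(branches, else_body, variables):
    """Return the body of the first branch whose condition holds.

    `branches` lists the `if` branch followed by any `elif` branches, in
    source order; `else_body` is None when there is no `else`.
    """
    for condition, body in branches:
        if holds(condition, variables):
            return body
    return else_body

chain = [(("x", "<", 0), "negative"), (("x", "==", 0), "zero")]
print(select_branch(chain, "positive", {"x": 0}))  # zero
```

Code generation would presumably turn the same ordering into a sequence of conditional branches, testing each condition only if all earlier ones failed.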
The AST generated by the Parser may contain errors (due to mistakes in the source code) and thus needs validation to ensure correctness. This is the role of the Semantic Analyser.
It needs to be able to ensure matching types between components of an expression, correct arity of function calls, and so on.
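The two checks named above, arity and type matching, can be sketched for a single function call as follows. This is an illustrative Python model, not the analyser itself; the Function record and the puts declaration are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Function:
    name: str
    parameter_types: list
    return_type: str

def check_call(function, argument_types):
    """Return a list of error messages for one call; empty means valid."""
    expected = function.parameter_types
    # Arity is checked first: comparing types is meaningless if counts differ.
    if len(argument_types) != len(expected):
        return [f"{function.name}: expected {len(expected)} argument(s), "
                f"got {len(argument_types)}"]
    return [
        f"{function.name}: argument {index} should be {want}, got {got}"
        for index, (want, got) in enumerate(zip(expected, argument_types))
        if want != got
    ]

puts = Function("puts", ["String"], "Int")  # hypothetical declaration
print(check_call(puts, ["String"]))  # []
print(check_call(puts, ["Int"]))
print(check_call(puts, []))
```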
The initial implementation will likely be simple, serving to get to the goal of a Hello World executable.
Once semantic analysis has been completed, the compiler needs to begin the process of code generation (we're ignoring potential optimisations at the moment). For this it needs to produce the LLVM IR code necessary to represent the program being compiled.
v0.0.1 was about reaching Hello, World at all costs. v0.0.2 is about making a foundation to build upon. The AST built in v0.0.1 is very OO, and not even in a well-thought-out way. This should be rectified, and the whole thing cleaned up or rebuilt.
I see 2 main ways that this could be solved.
Clean up and massively reorganise the AST. Try to make use of generics and protocols to eliminate code duplication without relying on too much inheritance.
Use an enum, with each case representing a different type of expression in the AST.
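The second option can be pictured roughly as below. Python has no enum cases with associated values, so small dataclasses stand in for the cases here; the node kinds and the describe function are invented for the example and are not taken from the Bolt compiler.

```python
from dataclasses import dataclass

# Each class plays the role of one enum case with its associated values.
@dataclass
class IntLiteral:
    value: int

@dataclass
class Identifier:
    name: str

@dataclass
class Call:
    callee: str
    arguments: list

def describe(node):
    """Single traversal function with one arm per kind of node."""
    if isinstance(node, IntLiteral):
        return str(node.value)
    if isinstance(node, Identifier):
        return node.name
    if isinstance(node, Call):
        arguments = ", ".join(describe(argument) for argument in node.arguments)
        return f"{node.callee}({arguments})"
    raise TypeError(f"unhandled node: {node!r}")

print(describe(Call("add", [IntLiteral(1), Identifier("x")])))  # add(1, x)
```

The appeal of this shape is that every traversal lists its cases in one place, rather than spreading behaviour across a class hierarchy.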
The problem has arisen due to my desire to create a traversal algorithm whilst trying to reduce the number of nodes present. This means the number of distinct decisions that need to be made on each node increases, thus massively complicating a traversal algorithm. Whilst the traversal itself is simple, trying to act on some nodes and not others and mutate the tree at the same time has proved to be a bad idea and led to some questionable decisions.
There may be a requirement to add a Concrete Syntax Tree to the compiler, generated prior to the AST's creation.
The hope with the project is to provide integrations with various tools right alongside the compiler, and to make them as well supported by the toolchain as possible.
The Sublime Text syntax definition should stay as closely in sync with the current language grammar as possible and provide at least the following features:
Bolt is going to be using LLVM as the backend for the compiler as it will give access to assembly code generation, object file generation, linking, optimisation, etc. There may be some experimentation without LLVM in the future, but currently there is no desire to do so.
This project will be making use of LLVMSwift.
The compiler needs to parse a stream of tokens and identify syntactic structures within it. At its most basic it needs to be able to do the following:
The bare minimum should be built for this to allow for the target Hello World executable and basic standard library imports required in this version to be met.
Most programming languages and indeed programs require more than just Int, String, Int8 and None as types.
The following scalar types need to be added to the language, as well as a mechanism in the compiler that allows specifying the target architecture so that IntPointer, UIntPointer, Int, and UInt all adopt the correct bit width.
Int16
Int32
Int64
IntPointer
UInt8
UInt16
UInt32
UInt64
UInt
UIntPointer
Bool
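One possible shape for the width lookup is sketched below in Python. It assumes that Int, UInt, IntPointer and UIntPointer all take the pointer width of the target, which the text above implies but does not state outright; the function names are illustrative, and Bool is left out because its representation is not specified here.

```python
# Widths that never change with the target.
FIXED_WIDTHS = {
    "Int8": 8, "Int16": 16, "Int32": 32, "Int64": 64,
    "UInt8": 8, "UInt16": 16, "UInt32": 32, "UInt64": 64,
}
# Assumption: these four follow the pointer width of the target architecture.
TARGET_SIZED = {"Int", "UInt", "IntPointer", "UIntPointer"}

def bit_width(type_name, pointer_bits):
    """Return the width in bits of an integer type for a given target."""
    if type_name in TARGET_SIZED:
        return pointer_bits
    return FIXED_WIDTHS[type_name]

def value_range(type_name, pointer_bits):
    """Return the (minimum, maximum) values representable by the type."""
    bits = bit_width(type_name, pointer_bits)
    if type_name.startswith("UInt"):
        return (0, 2**bits - 1)
    # Signed types use two's complement.
    return (-(2**(bits - 1)), 2**(bits - 1) - 1)

print(bit_width("Int", 64), bit_width("Int", 32))  # 64 32
print(value_range("UInt8", 64))  # (0, 255)
```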
Currently the language has no means of performing basic arithmetic operations. This makes it extremely limiting. There needs to be support for the basic arithmetic operations added into the language.
In order to implement this, the concepts of prefix, postfix and infix operations need to be added to the language as either unary or binary operations. In addition, support for precedence levels between operations needs to be added, along with the appropriate semantic analysis, optimisation and code generation.
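Precedence climbing is one common way to handle precedence levels between infix operations. The Python sketch below is only an illustration of that technique, not the approach the compiler takes; the precedence numbers and function names are assumptions, and postfix operations are not covered.

```python
# Hypothetical precedence table: a higher number binds more tightly.
PRECEDENCE = {"+": 10, "-": 10, "*": 20, "/": 20}

def parse_expression(tokens, min_precedence=0):
    """Parse infix tokens into nested (operator, lhs, rhs) tuples.

    `tokens` is a list that is consumed from the front as it is parsed.
    """
    lhs = parse_operand(tokens)
    while (tokens and tokens[0] in PRECEDENCE
           and PRECEDENCE[tokens[0]] >= min_precedence):
        op = tokens.pop(0)
        # Parsing the right side one level higher makes operators
        # left-associative.
        rhs = parse_expression(tokens, PRECEDENCE[op] + 1)
        lhs = (op, lhs, rhs)
    return lhs

def parse_operand(tokens):
    """Parse an integer, optionally preceded by the prefix (unary) minus."""
    token = tokens.pop(0)
    if token == "-":
        return ("neg", parse_operand(tokens))
    return int(token)

print(parse_expression("1 + 2 * 3".split()))  # ('+', 1, ('*', 2, 3))
```

Here the prefix minus is handled inside parse_operand, so it binds more tightly than any infix operator.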
As a continuation of issue #12, once we have the LLVM IR, the compiler needs to produce any required object files for the program being compiled.
If a source code file is terminated by a comment, and that comment is not terminated by a newline, then the lexer will run outside of the bounds of the file and throw an error.
let<Int> foo = 24 // No newline at the end of this comment
There will likely need to be an improvement to character consumption in the lexer to ensure this doesn't occur in other scenarios either.
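A bounds-checked comment skipper might look like the Python sketch below. The function name and signature are assumptions; the point is only that the end-of-input test comes before the character is read.

```python
def skip_line_comment(source, position):
    """Return the index just past a `//` comment starting at `position`.

    The bounds check comes first, so a comment that ends the file without a
    trailing newline stops at len(source) instead of reading past it.
    """
    while position < len(source) and source[position] != "\n":
        position += 1
    return position

source = "let<Int> foo = 24 // No newline at the end of this comment"
start = source.index("//")
print(skip_line_comment(source, start) == len(source))  # True
```

Routing every character read through one helper that performs this check would guard the other consumption paths in the same way.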
Object files that were produced by the compiler should be linked together into a single executable. This executable should be able to run on the current host architecture.
A lot of initial test code and files were produced for early Bolt syntax tests. These will be utterly invalid going forwards and should be removed from the repository to avoid confusion.
The C standard library, and indeed the Bolt standard library, will need to support variadic arguments in functions for the implementation of aspects like printf() in libc.bolt.
Support needs to be added to the parameter representation in the AST to allow it to represent a variadic argument, and then into the code generation of functions so that they can correctly create the IR representation of such a function.
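As a sketch of the idea, the Python below flags a parameter as variadic and renders the textual LLVM IR declaration, where a trailing ... marks a variadic function. The Parameter record and declare function are hypothetical, and the i8* spelling assumes typed pointers; recent LLVM versions write ptr instead.

```python
from dataclasses import dataclass

@dataclass
class Parameter:
    ir_type: str = ""
    is_variadic: bool = False  # True for the trailing variadic marker

def declare(name, return_type, parameters):
    """Render an LLVM IR `declare` line for an external function."""
    parts = ["..." if parameter.is_variadic else parameter.ir_type
             for parameter in parameters]
    joined = ", ".join(parts)
    return f"declare {return_type} @{name}({joined})"

printf_parameters = [Parameter("i8*"), Parameter(is_variadic=True)]
print(declare("printf", "i32", printf_parameters))
# declare i32 @printf(i8*, ...)
```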
Further, to test this, a new function declaration should be added to libc.bolt for printf.
printf declaration to libc.
Although the compiler itself can be built using swift run or swift build, it would be nice if the project had make rules and triggers to kick off installing various integrations, builds and what have you.
Everything should be as easy to use as humanly possible and not require new users to torture themselves for no good reason.
Currently error reporting has rudimentary support in the compiler but it is not fantastic and it is not comprehensive. This really needs to be sorted before v0.0.1 so that it does not become tedious to manage later on.
Some error states are impossibilities for the compiler to be in, and thus should be treated as such. If the compiler gets into such a state it should raise a fatalError() with a message about why the fatal error was raised. This can help track down bugs in the compiler.
These are the kinds of errors that need to be reported to the user, regarding the code they are trying to compile. The compiler already has a concept of locations in the source code (see Mark) and should be passing this via the Error. However, the coverage of such errors is spotty and/or inconsistent. Furthermore, some errors are using .unknown for the Mark.
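To show what consistently located errors could look like, here is a small Python model. The fields given to Mark, the CompileError class, the mark_at helper, and the main.bolt file name are all assumptions for the example; the real Mark type in the compiler may differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mark:
    file: str
    line: int
    column: int

class CompileError(Exception):
    """A user-facing error that always carries a source location."""
    def __init__(self, mark, message):
        super().__init__(
            f"{mark.file}:{mark.line}:{mark.column}: error: {message}")
        self.mark = mark

def mark_at(file, source, offset):
    """Convert a character offset into a 1-based line and column."""
    line = source.count("\n", 0, offset) + 1
    column = offset - (source.rfind("\n", 0, offset) + 1) + 1
    return Mark(file, line, column)

source = "let<Int> foo = 24\nlet<Int> bar = \n"
error = CompileError(mark_at("main.bolt", source, 33), "expected an expression")
print(error)  # main.bolt:2:16: error: expected an expression
```

Requiring a location in the constructor is one way to make an unknown location an explicit choice rather than a default.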
The lexical analysis functionality of the compiler needs to be implemented to the point of being able to identify the following types of token:
The lexer should not attempt to resolve type information or symbol names, as this will be done by the parser later in the Compiler.