Cosec C Compiler

Cosec is a toy optimising C compiler, written to learn more about compiler theory. For the moment, Cosec only generates x86-64 assembly code in NASM format (x86-64 being my MacBook Pro's architecture).

My goals for the project are:

  • Maintainable: the source code is clear and well commented, and is written in a clean, modular, and extensible fashion.
  • Complete: the compiler is compliant with the C99 standard (excluding a few minor features, and including a few extra GCC conveniences), and can compile itself.
  • Technically unique: the compiler uses a select set of complex algorithms for parsing, optimisation, and code generation.
  • Standalone: the compiler doesn't have any external dependencies.

Current features include:

  • Three levels of IR: Cosec compiles C code through three levels of intermediate representation (IR), namely a high-level abstract syntax tree (AST), an intermediate-level IR in static single assignment (SSA) form, and a low-level assembly code IR.
  • Complex optimisations: the compiler performs a complex set of optimisation and analysis passes on the SSA IR, to try and generate more efficient assembly.
  • Register allocation: Cosec uses a complex graph-colouring algorithm for register allocation, including support for pre-coloured nodes, register coalescing, and spilling.
  • Tests: Cosec comes with a suite of (relatively basic) tests using a simple Python wrapper and CMake's unit testing feature.
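
To give a rough feel for the register allocation feature listed above, here is a minimal sketch of greedy graph colouring with pre-coloured nodes and spilling. It is not Cosec's allocator: it leaves out register coalescing and any spill-cost heuristics, and every name in it (Graph, graph_init, graph_add_edge, graph_colour, MAX_NODES, MAX_REGS) is hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16   // Hypothetical limits for this sketch
#define MAX_REGS  16
#define NO_COLOUR (-1) // Node hasn't been assigned a register yet
#define SPILLED   (-2) // Node has to live in memory instead

// Interference graph: interferes[a][b] is true when virtual registers a and
// b are live at the same time, so they can't share a physical register.
typedef struct {
    int  num_nodes;
    bool interferes[MAX_NODES][MAX_NODES];
    int  colour[MAX_NODES]; // Physical register index, NO_COLOUR, or SPILLED
} Graph;

void graph_init(Graph *g, int num_nodes) {
    assert(num_nodes >= 0 && num_nodes <= MAX_NODES);
    g->num_nodes = num_nodes;
    for (int a = 0; a < MAX_NODES; a++) {
        g->colour[a] = NO_COLOUR;
        for (int b = 0; b < MAX_NODES; b++) {
            g->interferes[a][b] = false;
        }
    }
}

void graph_add_edge(Graph *g, int a, int b) {
    assert(a != b && a >= 0 && b >= 0 && a < g->num_nodes && b < g->num_nodes);
    g->interferes[a][b] = true;
    g->interferes[b][a] = true;
}

// Greedily colour every node that isn't pre-coloured, using registers
// 0..num_regs-1. Returns the number of nodes that had to be spilled.
int graph_colour(Graph *g, int num_regs) {
    assert(num_regs >= 0 && num_regs <= MAX_REGS);
    int spills = 0;
    for (int n = 0; n < g->num_nodes; n++) {
        if (g->colour[n] != NO_COLOUR) {
            continue; // Pre-coloured node: its register is fixed
        }
        // Mark the registers already taken by coloured neighbours
        bool used[MAX_REGS] = {false};
        for (int m = 0; m < g->num_nodes; m++) {
            int c = g->colour[m];
            if (g->interferes[n][m] && c >= 0 && c < num_regs) {
                used[c] = true;
            }
        }
        // Pick the lowest free register, or spill if there isn't one
        g->colour[n] = SPILLED;
        for (int r = 0; r < num_regs; r++) {
            if (!used[r]) {
                g->colour[n] = r;
                break;
            }
        }
        if (g->colour[n] == SPILLED) {
            spills++;
        }
    }
    return spills;
}
```

To pre-colour a node (for example, a value that must end up in a particular register), set its entry in colour before calling graph_colour. A spilled node no longer blocks its neighbours, which is why colouring can continue after a spill.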

Compiler Pipeline

The compiler pipeline is the journey from C code to final output assembly. It occurs in multiple stages, with the output of one stage being fed into the next.

  1. Lexing: the C source code is read and converted into a sequence of tokens.
  2. Preprocessing: the C preprocessor is run on the output of the lexer, resolving directives such as #include and #define.
  3. Parsing: an abstract syntax tree (AST) is constructed from the preprocessed sequence of tokens. Several validation steps occur during this process, such as ensuring that the syntax is well-formed and that variables are defined before use, as well as type checking.
  4. Compilation: the static single assignment (SSA) form IR is generated from a well-formed AST. No validation occurs on the AST during this process; this is the responsibility of the parser (i.e., compilation shouldn't generate any errors).
  5. Optimisation and analysis: various SSA IR analysis and optimisation passes are interleaved to try and generate more efficient assembly.
  6. Assembling: the three-address SSA IR is lowered to the two-address target assembly language IR (only x86-64 is supported, for now).
  7. Register allocation: the assembler generates IR that uses an unlimited number of virtual registers; it's the job of the register allocator to assign virtual registers to physical ones.
  8. Emission: the final assembly code is written to an output file (only NASM assembly format is supported, for now).
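
As an illustration of the assembling stage (step 6), the sketch below lowers one three-address instruction to two-address form. On x86-64, an instruction like add only takes two operands and overwrites its first one, so dst = lhs + rhs becomes a mov followed by an add. The types and names here (Op, SsaIns, lower_ins, and the v0, v1, ... spelling for virtual registers) are assumptions made for the example and aren't taken from Cosec's source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical three-address SSA instruction: dst = lhs <op> rhs, where
// every operand is a virtual register number.
typedef enum { OP_ADD, OP_SUB } Op;

typedef struct {
    Op  op;
    int dst, lhs, rhs;
} SsaIns;

// Lower one three-address instruction to two-address assembly text:
//
//     dst = lhs <op> rhs   becomes   mov  dst, lhs
//                                    <op> dst, rhs
//
// Writes the text to `out` and returns the value returned by snprintf
// (the length of the full text, not counting the NUL terminator).
int lower_ins(const SsaIns *ins, char *out, size_t out_size) {
    static const char *mnemonic[] = { "add", "sub" };
    // In SSA form every instruction defines a fresh register, so the mov
    // into dst can't clobber lhs or rhs before they're read.
    assert(ins->dst != ins->lhs && ins->dst != ins->rhs);
    return snprintf(out, out_size, "mov v%d, v%d\n%s v%d, v%d\n",
                    ins->dst, ins->lhs,
                    mnemonic[ins->op], ins->dst, ins->rhs);
}
```

For example, lowering v2 = v0 + v1 yields "mov v2, v0" followed by "add v2, v1". The virtual register names are only replaced with physical ones such as rax later on, during register allocation (step 7).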

Using Cosec

You can compile a C file using Cosec with:

$ ./Cosec test.c

This generates the output x86-64 assembly file test.s in NASM format. You can then assemble and link this file with:

$ nasm -f macho64 test.s
$ ld test.o

On macOS Big Sur (which is my MacBook Pro's operating system), you need to add the following annoying linker arguments:

$ ld -lSystem -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib test.o

You can print the AST, (optimised) SSA IR, and assembly IR (prior to register allocation) for a C file with:

$ ./Cosec --ir test.c

Other command line arguments include:

Argument        Description
--version, -v   Print version information
--help, -h      Print usage information
-o <file>       Set the output file
--ir            Output AST, SSA, and assembly IR
--ast           Output the AST
--ssa           Output SSA IR
--asm           Output assembly IR (before register allocation)

Building Cosec

You can build a release version of Cosec with CMake:

$ mkdir build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release ..
$ make

You can build a debug version with:

$ cmake -DCMAKE_BUILD_TYPE=Debug ..
$ make

You can run the test suite with:

$ make test

Hopefully in the future, you'll be able to build Cosec with itself!

Contributors

benanders