Giter Site home page Giter Site logo

sysprog21 / shecc Goto Github PK

View Code? Open in Web Editor NEW
1.0K 27.0 108.0 1 MB

A self-hosting and educational C optimizing compiler

License: BSD 2-Clause "Simplified" License

Makefile 1.41% C 93.14% Shell 5.45%
arm armv7 compiler self-hosting c linux risc-v rv32i rv32im riscv

shecc's Introduction

shecc : self-hosting and educational C optimizing compiler

logo image

Introduction

shecc is built from scratch, targeting both 32-bit Arm and RISC-V architectures, as a self-compiling compiler for a subset of the C language. Despite its simplistic nature, it is capable of performing basic optimization strategies as a standalone optimizing compiler.

Features

  • Generate executable Linux ELF binaries for ARMv7-A and RV32IM.
  • Provide a minimal C standard library for basic I/O on GNU/Linux.
  • The cross-compiler is written in ANSI C, making it compatible with most platforms.
  • Include a self-contained C front-end with an integrated machine code generator; no external assembler or linker needed.
  • Utilize a two-pass compilation process: the first pass checks syntax and breaks down complex statements into basic operations, while the second pass translates these operations into Arm/RISC-V machine code.
  • Develop a register allocation system that is compatible with RISC-style architectures.
  • Implement an architecture-independent, static single assignment (SSA)-based middle-end for enhanced optimizations.

Compatibility

shecc is capable of compiling C source files written in the following syntax:

  • data types: char, int, struct, and pointer
  • condition statements: if, while, for, switch, case, break, return, and general expressions
  • compound assignments: +=, -=, *=
  • global/local variable initializations for supported data types
    • e.g. int i = [expr]
  • limited support for preprocessor directives: #define, #ifdef, #elif, #endif, #undef, and #error
  • non-nested variadic macros with __VA_ARGS__ identifier

The backend targets armv7hf with Linux ABI, verified on Raspberry Pi 3, and also supports RISC-V 32-bit architecture, verified with QEMU.

Bootstrapping

The steps to validate shecc bootstrapping:

  1. stage0: shecc source code is initially compiled using an ordinary compiler which generates a native executable. The generated compiler can be used as a cross-compiler.
  2. stage1: The built binary reads its own source code as input and generates an ARMv7-A/RV32IM binary.
  3. stage2: The generated ARMv7-A/RV32IM binary is invoked (via QEMU or running on Arm and RISC-V devices) with its own source code as input and generates another ARMv7-A/RV32IM binary.
  4. bootstrap: Build the stage1 and stage2 compilers, and verify that they are byte-wise identical. If so, shecc can compile its own source code and produce new versions of that same program.

Prerequisites

Code generator in shecc does not rely on external utilities. You only need ordinary C compilers such as gcc and clang. However, shecc would bootstrap itself, and Arm/RISC-V ISA emulation is required. Install QEMU for Arm/RISC-V user emulation on GNU/Linux:

$ sudo apt-get install qemu-user

It is still possible to build shecc on macOS or Microsoft Windows. However, the second stage bootstrapping would fail due to qemu-arm absence.

To execute the snapshot test, install the packages below:

$ sudo apt-get install graphviz jq

Build and Verify

Configure which backend you want, shecc supports ARMv7-A and RV32IM backend:

$ make config ARCH=arm
# Target machine code switch to Arm

$ make config ARCH=riscv
# Target machine code switch to RISC-V

Run make and you should see this:

  CC+LD	out/inliner
  GEN	out/libc.inc
  CC	out/src/main.o
  LD	out/shecc
  SHECC	out/shecc-stage1.elf
  SHECC	out/shecc-stage2.elf

File out/shecc is the first stage compiler. Its usage:

$ shecc [-o output] [+m] [--no-libc] [--dump-ir] <infile.c>

Compiler options:

  • -o : Specify output file name (default: out.elf)
  • +m : Use hardware multiplication/division instructions (default: disabled)
  • --no-libc : Exclude embedded C library (default: embedded)
  • --dump-ir : Dump intermediate representation (IR)

Example:

$ out/shecc -o fib tests/fib.c
$ chmod +x fib
$ qemu-arm fib

Verify that the emitted IRs are identical to the snapshots by specifying check-snapshots target when invoking make:

$ make check-snapshots

shecc comes with unit tests. To run the tests, give check as an argument:

$ make check

Reference output:

...
int main(int argc, int argv) { exit(sizeof(char)); } => 1
int main(int argc, int argv) { int a; a = 0; switch (3) { case 0: return 2; case 3: a = 10; break; case 1: return 0; } exit(a); } => 10
int main(int argc, int argv) { int a; a = 0; switch (3) { case 0: return 2; default: a = 10; break; } exit(a); } => 10
OK

To clean up the generated compiler files, execute the command make clean. For resetting architecture configurations, use the command make distclean.

Intermediate Representation

Once the option --dump-ir is passed to shecc, the intermediate representation (IR) will be generated. Take the file tests/fib.c for example. It consists of a recursive Fibonacci sequence function.

int fib(int n)
{
    if (n == 0)
        return 0;
    else if (n == 1)
        return 1;
    return fib(n - 1) + fib(n - 2);
}

Execute the following to generate IR:

$ out/shecc --dump-ir -o fib tests/fib.c

Line-by-line explanation between C source and IR:

 C Source                IR                                         Explanation
------------------+---------------------------------------+----------------------------------------------------------------------------------

int fib(int n)      def int @fib(int %n)                    Indicate a function definition
{                   {
  if (n == 0)         const %.t1001, $0                     Load constant 0 into a temporary variable ".t1001"
                      %.t1002 = eq %n, %.t1001              Test "n" equals ".t1001" or not, and write the result in temporary variable ".t1002"
                      br %.t1002, .label.1177, .label.1178  If ".t1002" equals zero, goto false label ".label.1178", otherwise,
                                                            goto true label ".label.1177"
                    .label.1177
    return 0;         const %.t1003, $0                     Load constant 0 into a temporary variable ".t1003"
                      ret %.t1003                           Return ".t1003"
                      j .label.1184                         Jump to endif label ".label.1184"
                    .label.1178
  else if (n == 1)    const %.t1004, $1                     Load constant 1 into a temporary variable ".t1004"
                      %.t1005 = eq %n, %.t1004              Test "n" equals ".t1004" or not, and write the result in temporary variable ".t1005"
                      br %.t1005, .label.1183, .label.1184  If ".t1005" equals zero, goto false label ".label.1184". Otherwise,
                                                            goto true label ".label.1183"
                    .label.1183
    return 1;         const %.t1006, $1                     Load constant 1 into a temporary variable ".t1006"
                      ret %.t1006                           Return ".t1006"
                    .label.1184
  return
    fib(n - 1)        const %.t1007, $1                     Load constant 1 into a temporary variable ".t1007"
                      %.t1008 = sub %n, %.t1007             Subtract ".t1007" from "n", and store the result in temporary variable ".t1008"
                      push %.t1008                          Prepare parameter for function call
                      call @fib, 1                          Call function "fib" with one parameter
    +                 retval %.t1009                        Store return value in temporary variable ".t1009"
    fib(n - 2);       const %.t1010, $2                     Load constant 2 into a temporary variable ".t1010"
                      %.t1011 = sub %n, %.t1010             Subtract ".t1010" from "n", and store the result in temporary variable ".t1011"
                      push %.t1011                          Prepare parameter for function call
                      call @fib, 1                          Call function "fib" with one parameter
                      retval %.t1012                        Store return value in temporary variable ".t1012"
                      %.t1013 = add %.t1009, %.t1012        Add ".t1009" and ".t1012", and store the result in temporary variable ".t1013"
                      ret %.t1013                           Return ".t1013"
}                   }

Known Issues

  1. The generated ELF lacks of .bss and .rodata section
  2. The support of varying number of function arguments is incomplete. No <stdarg.h> can be used. Alternatively, check the implementation printf in source lib/c.c for var_arg.
  3. The C front-end is a bit dirty because there is no effective AST.

License

shecc is freely redistributable under the BSD 2 clause license. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file.

shecc's People

Contributors

alx-lai avatar chaosunity avatar chiangkd avatar chinyikming avatar cratuki avatar drxiao avatar eecheng87 avatar erreth-akbe avatar idoleat avatar jserv avatar lecopzer avatar oscarshiang avatar tigercosmos avatar vacantron avatar visitorckw avatar wanghanchi avatar zoanana990 avatar zoo868e avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

shecc's Issues

Handle non-zero integers in if statements

In C, non-zero integers are considered truthy, implying that they will be evaluated as true in if statements and similar constructs. However, shecc does not accurately account for this behavior.

Example:

--- a/src/cfront.c
+++ b/src/cfront.c
@@ -1968,7 +1968,7 @@ int eval_expression_imm(opcode_t op, int op1, int op2)
         /* TODO: provide arithmetic & operation instead of '&=' */
         /* TODO: do optimization for local expression */
         tmp &= (tmp - 1);
-        if ((op2 != 0) && (tmp == 0)) {
+        if (op2 && (tmp == 0)) {
             res = op1;
             res &= (op2 - 1);
         } else

The alteration mentioned above results in an incomplete bootstrap.

Not enough register for reading expression

There's a test case:

a = 1 + (2 + (3 + (4 + (5 + (6 + (7 + (8 + (9 + (10 + (11 + (12 + (13 + (14 + (15 + (16 + (17 + (18 + (19 + 20))))))))))))))))))

In shecc, it will store all constants in register. Obviously, there's not enough register.
In modern compiler, I found they just load an immediate. e.g.

li a4, 210

Even I mix identifier into expression, modern compiler still load an immediate value. Do we need to rewrite read expression? Maybe we need to evaluate expression before run time. Also, I think we need symbol table to store these identifier.

Can't define symbol as string

Current shecc can't define symbol as string, as a result, #47 fail to refactor.

#define ARCH_PREDEFIND "__arm__"

Above expression is invalid, later patch will fix it.

Unable to bootstrap when "init_val != 0" is reduced into "init_val"

Reproducible with the following change:

--- a/src/codegen.c
+++ b/src/codegen.c
@@ -77,7 +77,7 @@ void size_funcs(int data_start)
                        data_start + elf_data_idx);
         /* TODO: add .bss section */
         if (strcmp(blk->locals[i].type_name, "int") == 0 &&
-            blk->locals[i].init_val != 0)
+            blk->locals[i].init_val)
             elf_write_data_int(blk->locals[i].init_val);
         else
             elf_data_idx += size_var(&blk->locals[i]);

Stage 2 of the bootstrapping would fail.

Fail to write global variable

Hi, I am trying to add feature: global variable initialization.
I modify some code and it seems good when I dump IR, following is simple sample

...
0x000122d4         x1 := 44
0x000122d8         x0 = &a
0x000122e0         *x0 = x1 (4)
0x000122e4     main:
0x000122fc         {
0x000122fc             x0 := &data[36]
0x00012304             x1 = &a
0x0001230c             x1 = *x1 (4)
0x00012310             x0 := printf() @ 978
0x00012314             x0 := 0
0x00012318             return (from main)
0x0001231c         }
0x0001231c         exit main

corresponding C code

int a = 44;
int main(int argc, char *argv[])
{
    printf("%d\n", a);
    return 0;
}

As I mention, it seems good for assigning initial value to a in IR, but the value printed is 0. This weird stuff stuck me a few day. Anything I ignore about writing value to global variable? Thanks!

Refactor code generations

Both src/arm-codegen and src/riscv-codegen.c have some functions in common. The code generation can be refactored with the following changes:

  1. split the shared functions/variables to new file src/codegen.c from src/{arm,riscv}-codegen.c
  2. In the end of src/codegen.c, there should be a statement `#include "src/arch-codegen.c" which links to Arm or RISC-V code generation implementation.
  3. The generated shecc executable file should be capable of showing its configurations such as the supported architecture and ABI.

Parse syntax for include macro in parser.c

Concern about code from line 3288 in parser.c,

    if (lex_peek(T_include, token)) {
        if (!strcmp(token_str, "<stdio.h>")) {
            /* ignore, we include libc by default */
        }
        lex_expect(T_include);

If the comment is valid, the lex_expect line should be inside the if block. As it is, stdio.h include lines will be processed by a lex_expect call.

If the comment is valid, here is a change that would put the lex_expect is put in an else block, and removing negation on the strcmp clause,

    if (lex_peek(T_include, token)) {
        if (strcmp(token_str, "<stdio.h>")) {
            /* ignore, we include libc by default */
        } else {
            lex_expect(T_include);
        }

Uninitialized variable: pred

Reported by Cppcheck:

rc/ssa.c:180:33: warning: Uninitialized variable: pred [uninitvar]
                if (bb->idom != pred) {
                                ^
src/ssa.c:163:31: note: Assuming condition is false
                for (i = 0; i < MAX_BB_PRED; i++) {
                              ^
src/ssa.c:180:33: note: Uninitialized variable: pred
                if (bb->idom != pred) {
                                ^

Implement basic optimizations

With the inclusion of SSA in the middle-end, it is now time to implement some common optimizations on the new SSA-based IR. These include constant folding, copy propagation, and dead code elimination.

Support macros defined in <stdbool.h>

C99 introduces a built-in type called _Bool for handling Boolean values. Additionally, the standard header <stdbool.h> defines macros bool, true, and false to facilitate boolean operations.

As per C99's specifications, except for bit-fields, all data types, including bool, must be allocated at least one byte of memory. This allocation ensures that objects consist of contiguous byte sequences, the specifics of which depend on the implementation or are explicitly stated. Since a char's size is universally one byte, it follows logically that the size of a bool must also be one byte, at most. In summary, based on C99 standards, the size of a bool is definitively set to one byte.

Incorporating the bool type into shecc would enhance the expressiveness of source code and improve memory usage when expressions solely need to represent binary relationships.

C99 6.2.6.1 General

Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.

C99 6.3.1.1 Boolean, characters, and integers

The rank of _Bool shall be less than the rank of all other standard integer types.

Fail to pass stage1

commit 5979cb8 introduces preliminary support of C99 _Bool. However, it fails to pass stage1:

$ make 
  SHECC	out/shecc-stage1.elf
Aborted (core dumped)
make: *** [Makefile:80: out/shecc-stage1.elf] Error 134

Tested on Ubuntu 20.04.6 LTS.

Eliminate compilation warnings

There are several compilation warnings during the generation of the cross-compiler, as partially shown below.

src/ssa.c: In function โ€˜bb_build_dfโ€™:
src/ssa.c:231:24: warning: unused parameter โ€˜fnโ€™ [-Wunused-parameter]
  231 | void bb_build_df(fn_t *fn, basic_block_t *bb)
      |                  ~~~~~~^~
...
src/ssa.c:771:36: warning: format โ€˜%pโ€™ expects argument of type โ€˜void *โ€™, but argument 3 has type โ€˜basic_block_t *โ€™ {aka โ€˜struct basic_block *โ€™} [-Wformat=]
  771 |     fprintf(fd, "subgraph cluster_%p {\n", bb);
      |                                   ~^       ~~
      |                                    |       |
      |                                    void *  basic_block_t * {aka struct basic_block *}
...
src/ssa.c:1056:35: warning: unused parameter โ€˜fnโ€™ [-Wunused-parameter]
 1056 | void bb_reset_live_kill_idx(fn_t *fn, basic_block_t *bb)

Support Arm/RISC-V targets without integer division instructions

Some Arm targets such as Cortex-A9 might lack of integer division instructions, and the current shecc would not generate the correct instruction sequence for them. Consequently, we have to provide the fallback by means of software emulation. Here is the reference implementation for integer division:

__div:
        push  {fp, lr}
        add   fp, sp, #4
        // check for divide by zero
        cmp   r1, #0
        beq   __div_end
        push  {r0, r1}
        // variables
        mov   r0, #0        // quotient
        pop   {r1, r2}       // dividend / remainder, divisor
        mov   r3, #1        // bit field
 __div_shift:
        // shift divisor left until it exceeds dividend
        // the bit field will be shifted by one less
        cmp   r2, r1
        lslls r2, r2, #1
        lslls r3, r3, #1
        bls   __div_shift
__div_sub:
        // cmp sets the carry flag if r1 - r2 is positive, which is weird
        cmp   r1, r2
        // subtract divisor from the remainder if it was smaller
        // this also sets the carry flag since the result is positive
        subcs r1, r1, r2
        // add bit field to the quotient if the divisor was smaller
        addcs r0, r0, r3
        // shift bit field right, setting the carry flag if it underflows
        lsrs  r3, r3, #1
        // shift divisor right if bit field has not underflowed
        lsrcc r2, r2, #1
        // loop if bit field has not underflowed
        bcc   __div_sub
__div_end:
        sub   sp, fp, #4
        pop   {fp, pc}

Reference:

Refactor architecture specific configurations

Instead of #43, I would propose an elegant way to specify the architecture and its configurations. The WIP changes are shown as following:

diff --git a/Makefile b/Makefile
index 9d3e9ca..8a31660 100644
--- a/Makefile
+++ b/Makefile
@@ -20,21 +20,22 @@ OBJS := $(SRCS:%.c=$(OUT)/%.o)
 deps := $(OBJS:%.o=%.o.d)
 TESTS := $(wildcard tests/*.c)
 TESTBINS := $(TESTS:%.c=$(OUT)/%.elf)
-TARGET_EXEC := `cat $(OUT)/target`
 
 all: config bootstrap
 
-config:
-ifeq (riscv,$(ARCH))
-	@$(VECHO) "$(RISCV_EXEC)" > $(OUT)/target
-	@$(VECHO) "#define TARGET_RISCV 1" > $@
-	@ln -s $(PWD)/$(SRCDIR)/riscv-codegen.c $(SRCDIR)/codegen.c
+# set ARM by default
+ifeq ($(strip $(ARCH)),riscv)
+ARCH = riscv
 else
-	@$(VECHO) "$(ARM_EXEC)" > $(OUT)/target
-	@$(VECHO) "#define TARGET_ARM 1" > $@
-	@ln -s $(PWD)/$(SRCDIR)/arm-codegen.c $(SRCDIR)/codegen.c
+ARCH = arm
 endif
-	@$(VECHO) "Target machine code switch to %s\n" "$$(cat out/target | sed 's/.*qemu-\([^ ]*\).*/\1/')"
+
+TARGET_EXEC = $($(shell echo $(ARCH) | tr a-z A-Z)_EXEC)
+
+config:
+	$(Q)ln -s $(PWD)/$(SRCDIR)/$(ARCH)-codegen.c $(SRCDIR)/codegen.c
+	$(call $(ARCH)-specific-defs) > $@
+	$(VECHO) "Target machine code switch to %s\n" $(ARCH)
 
 $(OUT)/tests/%.elf: tests/%.c $(OUT)/$(STAGE0)
 	$(VECHO) "  SHECC\t$@\n"
diff --git a/mk/arm.mk b/mk/arm.mk
index 0195288..5dbddd3 100644
--- a/mk/arm.mk
+++ b/mk/arm.mk
@@ -6,3 +6,10 @@ ARM_EXEC = echo WARN: unable to run
 endif
 
 export ARM_EXEC
+
+arm-specific-defs = \
+    $(Q)$(PRINTF) \
+" \#define ARCH_PREDEFIND \"__arm__\" /* defined by GNU C and RealView */\n\
+  \#define ELF_MACHINE 0x28 /* up to ARMv7/Aarch32 */\n \
+  \#define ELF_FLAGS 0x5000200\n \
+"
diff --git a/mk/riscv.mk b/mk/riscv.mk
index 19fe194..7c0fb62 100644
--- a/mk/riscv.mk
+++ b/mk/riscv.mk
@@ -5,4 +5,11 @@ $(warning "no qemu-riscv32 found. Please check package installation")
 RISCV_EXEC = echo WARN: unable to run
 endif
 
-export RISCV_EXEC
\ No newline at end of file
+export RISCV_EXEC
+
+riscv-specific-defs = \
+    $(Q)$(PRINTF) \
+" \#define ARCH_PREDEFIND \"__riscv\" /* Older versions of the GCC toolchain defined __riscv__ */\n\
+  \#define ELF_MACHINE 0xf3\n\
+  \#define ELF_FLAGS 0\n\
+"
diff --git a/src/cfront.c b/src/cfront.c
index af5f8d6..6da5206 100644
--- a/src/cfront.c
+++ b/src/cfront.c
@@ -2145,15 +2145,8 @@ void parse_internal()
     add_block(NULL, NULL);    /* global block */
     elf_add_symbol("", 0, 0); /* undef symbol */
 
-/* architecture defines */
-/* FIXME: use #ifdef ... #else ... #endif */
-#ifdef TARGET_ARM
-    add_alias("__arm__", "1"); /* defined by GNU C and RealView */
-#endif
-#ifdef TARGET_RISCV
-    /* Older versions of the GCC toolchain defined __riscv__ */
-    add_alias("__riscv", "1");
-#endif
+    /* architecture defines */
+    add_alias(ARCH_PREDEFIND, "1");
 
     /* binary entry point: read params, call main, exit */
     ii = add_instr(OP_label);
diff --git a/src/elf.c b/src/elf.c
index 27bfa12..e266aed 100644
--- a/src/elf.c
+++ b/src/elf.c
@@ -82,13 +82,7 @@ void elf_generate_header()
     elf_write_header_int(0);          /* EI_PAD: unused */
     elf_write_header_byte(2);         /* ET_EXEC */
     elf_write_header_byte(0);
-/* FIXME: use #ifdef ... #else ... #endif */
-#ifdef TARGET_ARM
-    elf_write_header_byte(0x28); /* ARM (up to ARMv7/Aarch32) */
-#endif
-#ifdef TARGET_RISCV
-    elf_write_header_byte(0xf3); /* RISC-V */
-#endif
+    elf_write_header_byte(ELF_MACHINE);
     elf_write_header_byte(0);
     elf_write_header_int(1);                          /* ELF version */
     elf_write_header_int(ELF_START + elf_header_len); /* entry point */
@@ -96,14 +90,7 @@ void elf_generate_header()
     elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx + 39 +
                          elf_symtab_index +
                          elf_strtab_index); /* section header offset */
-/* flags */
-/* FIXME: use #ifdef ... #else ... #endif */
-#ifdef TARGET_ARM
-    elf_write_header_int(0x5000200); /* ARM */
-#endif
-#ifdef TARGET_RISCV
-    elf_write_header_int(0);
-#endif
+    elf_write_header_int(ELF_FLAGS);
     elf_write_header_byte(0x34); /* header size */
     elf_write_header_byte(0);
     elf_write_header_byte(0x20); /* program header size */

Warning: the patch did not work yet, but it should be fairly expressive.

Questions about Op Priority

In "get_operator_prio(opcode_t op)" in line 836, cfront.c,
what is the reason you put lower priority Op in the front and lower priority Op in the rear?

If for reducing the clock cycle,
isn't the way should be putting the most used Operation in the front?

Thoughts on cfront's potential improvements

Currently cfront is using scanless parser with IR emitter binds into it, and it contains ~3000 LOC. But based on my contributions experience to industrial grade programming languages (V Lang in this case), shecc's frontend parser is lack of ease to either debug or for others to contribute (Even though shecc is meant to be educational).

Here's a list of my thoughts on improving the frontend of shecc:

  1. Discard scanerless parser, rewrite it into both lexer and parser.
  2. Introducing Abstract Syntax Tree for better IR emission and backend generation.
  3. Separation on the compilation phases into multiple files would be a better idea for potential contributors to learn shecc's architecture.

And by accepting the suggestion described above, the possible major changes would be:

  1. Having more than 3 phases compilation in shecc.
  2. The code generation logic must rewrite based on the introduced AST.

Support mmap on shecc

I am trying to implement mmap2 for shecc, but I found that there are no enough registers for calling the syscall based on test_mmap.s. So, I modified OP_syscall in src/riscv-codegen.c as following,

case OP_syscall:
            emit(__addi(__a7, __a0, 0));
            emit(__addi(__a0, __a1, 0));
            emit(__addi(__a1, __a2, 0));
            emit(__addi(__a2, __a3, 0));
+           emit(__addi(__a3, __a4, 0));
+           emit(__addi(__a4, __a5, 0));
+           emit(__addi(__a5, __a6, 0));
            emit(__ecall());
            if (dump_ir == 1)
                printf("    syscall");
            break;

On the other hand, to test the syscall, I modified the malloc help function as following:

block_meta_t *__malloc_request_space(int size)
{
    char *brk;
    block_meta_t *block;

+   void *tmp = __syscall(__syscall_mmap2, 0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

    brk = __syscall(__syscall_brk, 0);

After the mentioned modification, the segmentation fault is occurred when using qemu to execute the code.

$ make VERBOSE=1
out/inliner lib/c.c out/libc.inc
cc -o out/src/main.o -O -g -ansi -pedantic -Wall -Wextra -c -MMD -MF out/src/main.o.d src/main.c
cc out/src/main.o -o out/shecc
out/shecc --dump-ir -o out/shecc-stage1.elf src/main.c > out/shecc-stage1.log
chmod a+x out/shecc-stage1.elf
/usr/bin/qemu-riscv32 out/shecc-stage1.elf -o out/shecc-stage2.elf src/main.c
make: *** [Makefile:76: out/shecc-stage2.elf] Segmentation fault (core dumped)

Then, I dump the disassembly code to check. The caller and callee are shown below:

  • Caller:
   123e4:	ff010113          	addi	sp,sp,-16
   123e8:	00a12023          	sw	a0,0(sp)
   123ec:	0de00513          	li	a0,222
   123f0:	00000593          	li	a1,0
   123f4:	00001637          	lui	a2,0x1
   123f8:	00060613          	mv	a2,a2
   123fc:	00300693          	li	a3,3
   12400:	02200713          	li	a4,34
   12404:	fff00793          	li	a5,-1
   12408:	00000813          	li	a6,0
   1240c:	c61fd0ef          	jal	ra,0x1006c
  • Callee:
   1006c:	ff010113          	addi	sp,sp,-16
   10070:	00812623          	sw	s0,12(sp)
   10074:	00112423          	sw	ra,8(sp)
   10078:	00010413          	mv	s0,sp
   1007c:	00050893          	mv	a7,a0
   10080:	00058513          	mv	a0,a1
   10084:	00060593          	mv	a1,a2
   10088:	00068613          	mv	a2,a3
   1008c:	00070693          	mv	a3,a4
   10090:	00078713          	mv	a4,a5
   10094:	00080793          	mv	a5,a6
   10098:	00000073          	ecall
   1009c:	01040113          	addi	sp,s0,16
   100a0:	ff812083          	lw	ra,-8(sp)
   100a4:	ffc12403          	lw	s0,-4(sp)
   100a8:	00008067          	ret

It looks like a correct implementation and is similar to the test_mmap.s, but it is failed. I have no idea how to solve it.

For a coding question: about parser

I would like to ask, in parser, is "shecc" a directly generated instruction set linked list sequence? I don't seem to have found the relevant code for AST. If so, please point out.

If possible, may you comment on each structure and field in def.h?

Heartfelt thanks.

The peephole optimization breaks the macro expansion

Considering the following test case:

#define M(a, b) a + b

int main()
{
    return M(1, 2) * 3;
}

The stage-1 shecc (i.e. out/shecc-stage1.elf, testing with command: qemu-arm out/shecc-stage1.elf --no-libc <file>) reports:

Error Unexpected token at source location 20
#define M(a, b) a + b
                    ^ Error occurs here
Abnormal program termination

The current test driver runs on the stage-0 shecc and doesn't detect this error.

Fail to self-host

The stage-1 and stage-2 ELF are not identical. Check them with the following diff command:

$ diff <(arm-linux-gnueabihf-objdump -d out/shecc-stage1.elf) <(arm-linux-gnueabihf-objdump -d out/shecc-stage2.elf)

output:

35856,35861c35856,35861
<    33074:     e59d5000        ldr     r5, [sp]
<    33078:     e58d50bc        str     r5, [sp, #188]  ; 0xbc
<    3307c:     e59d6000        ldr     r6, [sp]
<    33080:     e58d60c0        str     r6, [sp, #192]  ; 0xc0
<    33084:     e59d7000        ldr     r7, [sp]
<    33088:     e58d70c4        str     r7, [sp, #196]  ; 0xc4
---
>    33074:     e59d6000        ldr     r6, [sp]
>    33078:     e58d60bc        str     r6, [sp, #188]  ; 0xbc
>    3307c:     e59d7000        ldr     r7, [sp]
>    33080:     e58d70c0        str     r7, [sp, #192]  ; 0xc0
>    33084:     e59d2000        ldr     r2, [sp]
>    33088:     e58d20c4        str     r2, [sp, #196]  ; 0xc4
36201,36206c36201,36206
<    335d8:     e59d5000        ldr     r5, [sp]
<    335dc:     e58d50bc        str     r5, [sp, #188]  ; 0xbc
<    335e0:     e59d6000        ldr     r6, [sp]
<    335e4:     e58d60c0        str     r6, [sp, #192]  ; 0xc0
<    335e8:     e59d7000        ldr     r7, [sp]
<    335ec:     e58d70c4        str     r7, [sp, #196]  ; 0xc4
---
>    335d8:     e59d6000        ldr     r6, [sp]
>    335dc:     e58d60bc        str     r6, [sp, #188]  ; 0xbc
>    335e0:     e59d7000        ldr     r7, [sp]
>    335e4:     e58d70c0        str     r7, [sp, #192]  ; 0xc0
>    335e8:     e59d2000        ldr     r2, [sp]
>    335ec:     e58d20c4        str     r2, [sp, #196]  ; 0xc4
36463,36468c36463,36468
<    339f0:     e59d5000        ldr     r5, [sp]
<    339f4:     e58d50bc        str     r5, [sp, #188]  ; 0xbc
<    339f8:     e59d6000        ldr     r6, [sp]
<    339fc:     e58d60c0        str     r6, [sp, #192]  ; 0xc0
<    33a00:     e59d7000        ldr     r7, [sp]
<    33a04:     e58d70c4        str     r7, [sp, #196]  ; 0xc4
---
>    339f0:     e59d6000        ldr     r6, [sp]
>    339f4:     e58d60bc        str     r6, [sp, #188]  ; 0xbc
>    339f8:     e59d7000        ldr     r7, [sp]
>    339fc:     e58d70c0        str     r7, [sp, #192]  ; 0xc0
>    33a00:     e59d2000        ldr     r2, [sp]
>    33a04:     e58d20c4        str     r2, [sp, #196]  ; 0xc4

Integrate with semu

  • Port semu to the shecc subset of c
  • Use semu instead of qemu in bootstrapping

how to compile

HI dear author,
It's truly a honor to write a letter to you, I'm building your project nowadays and found the error when building as following, I wonder if there is a chance that you know the reason? :)

image

thank you
best regards to you
William

qemu: uncaught target signal 4 (Illegal instruction) - core dumped

I tried to have fun with instructions with README.md, but got error like the following,

$ make
  SHECC	out/shecc-stage2.elf
qemu: uncaught target signal 4 (Illegal instruction) - core dumped
make: *** [Makefile:59: out/shecc-stage2.elf] Illegal instruction (core dumped)

Then I made a simple test,

  1. Passed to execute native shecc binary on host.
  2. Failed to execute shecc-stage2.elf with qemu-arm shecc-stage2.elf hello.c command.

2020-09-06_101439

Do we need to change how shecc evaluate expression?

The way recent shecc evaluating local expression is generating relative arithmetic instruction and calculating in run-time. I think we can change the scheme into evaluating in compile time.

Although it will increase compile time, there're still lots of benefits:

  • we can do more tricks(optimization) on expression because we know the characteristic(value) of the operand before run-time. e.g. quick modulo
  • no operand limit because we use stack. Also we can pass relative test bench in driver.sh(expr 210 "1 + (2 + (3 + (4 + (5 + (6 + (7 + (8 + (9 + (10 + (11 + (12 + (13 + (14 + (15 + (16 + (17 + (18 + (19 + 20))))))))))))))))))")

need to optimize symbol lookup?

I used uftrace to analyze the flow and performance of shecc and found the most time shecc spent is strcmp():
$ uftrace record out/shecc src/main.c # check stage0 for now, since stage1 didn't support something like gcc -pg
$ uftrace report
Total time Self time Calls Function
========== ========== ========== ====================
166.474 ms 0.776 us 1 main
...
41.193 ms 41.193 ms 1008573 strcmp //<- for stage1, this would be even higher since strcmp() is naive

It's not too hard to realize this since strcmp() was used a lot by find_func() and other similar functions (with linear search). Do you think we need to use some kind of dictionary to optimize this? (I can add it. :-) )

After all, hash table or trie might not be too complicated, we can still have the educational purpose. Or this is intentional for students to add?

Improve intermediate representation and also register allocation

Description

The current register allocator uses the strategy of the stack-based virtual machine and is connected to the generation of the intermediate representation. These might result in the inefficient utilization of the available registers and make the optimization harder.

To improve these, we could decouple the IR generation and the allocator, and introduce the new algorithm (e.g. linear-scan, graph coloring, etc.) to support the register-based virtual machine model.

Related issues

Determine the factors contributing to unexpected slowdowns during self-hosting

This project has the capability for self-hosting, transitioning from native compilation in stage 1 to stage 2. Given that shecc is composed in ANSI C, it allows for the use of any compiler across various platforms for the initial compilation. This process is further facilitated through the use of RISC-V and/or Arm emulators for bootstrapping. Nevertheless, we are experiencing unforeseen delays during the self-hosting process, particularly in the execution times of stages 1 and 2. Our aim is to conduct a thorough investigation into the causes of these slowdowns.

We can use uftrace to identify the sources of slowdowns.

High branch-miss rate when hosting shecc on the Raspberry Pi 3B

The built executable file (here is shecc itself at 2b5d7b6) has much higher branch-miss rate (19.91%) compare with GCC built (5.74%) when running on the Raspberry Pi 3B. The branch misprediction might potentially slowdown the execution.

How to reproduce

  1. Build 2b5d7b6
$ make config
$ make out/shecc
$ out/shecc -o out/shecc-stage1.elf ../shecc-2b5d7b6/src/main.c
$ chmod +x out/shecc-stage1.elf
  1. Measure with perf
$ perf stat -r 5 -e branches,branch-misses out/shecc-stage1.elf tests/fib.c
  1. Get the statistics
 Performance counter stats for 'out/shecc-stage1.elf tests/fib.c' (5 runs):

         2,308,457      branches:u                #   33.270 M/sec                    ( +-  0.00% )
           459,550      branch-misses:u           #   19.91% of all branches          ( +-  0.02% )

         0.078757 +- 0.000616 seconds time elapsed  ( +-  0.78% )

Related issues

Declare variables where needed

In C99, it is possible to declare variables precisely where they are needed, a feature also present in C++. Variables can also be declared within the control structure of a for loop, a feature introduced in C99:

 for (int x = 0; x < 10; x++) {  /* This is permitted starting with C99 */
    ...
}

The C99 standard says: (6.8.5.3 The for statement)

for ( clause-1 ; expression-2 ; expression-3 ) statement
behaves as follows: The expression expression-2 is the controlling expression that is evaluated before each execution of the loop body. The expression expression-3 is evaluated as a void expression after each execution of the loop body. If clause-1 is a declaration, the scope of any variables it declares is the remainder of the declaration and the entire loop, including the other two expressions; it is reached in the order of execution before the first evaluation of the controlling expression. If clause-1 is an expression, it is evaluated as a void expression before the first evaluation of the controlling expression.

This enhancement simplifies the code by allowing variable declarations closer to where they are used, thereby improving readability and maintainability. shecc should support this feature.

Unable to self compile stage 1

In commit 5f4f6d6, by applying the following command make distclean config ARCH=arm and then make to build stage 1 compiler, the compiler will report unexpected token error at unreasonable position:

Error Unexpected token at source location 20771
    ref_block_t *head, *tail;
                       ^ Error occurs here

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.