kpd's People

Contributors

felixangell

Forkers

yuknight

kpd's Issues

error checks for private/public symbols in modules

we need to make sure that we only reference module symbols that have been made public, i.e. that are capitalized.

an interesting thought/assumption:

module_name.some_module_member

rather than looking up whether some_module_member is public, we already know it's private because it starts with a lowercase letter... we could then check whether the member exists at all and complain accordingly?
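A minimal sketch of that check, written in Python for illustration rather than in the compiler's D. It assumes a module exposes its members as a plain set of names; `is_public`, `check_member_access` and the error strings are hypothetical.

```python
from typing import Optional


def is_public(name: str) -> bool:
    """A symbol counts as public when its first character is uppercase."""
    return name[:1].isupper()


def check_member_access(module_name: str, members: set, member: str) -> Optional[str]:
    """Return an error message for an illegal access, or None if it is fine."""
    if is_public(member):
        if member not in members:
            return f"{module_name} has no member '{member}'"
        return None
    # Lowercase names are private by construction, so no visibility lookup is
    # needed; the existence check only picks the more helpful message.
    if member in members:
        return f"'{member}' is private to module {module_name}"
    return f"{module_name} has no member '{member}'"
```

The existence check is only there to choose between "private" and "does not exist"; the access is rejected either way.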

store token info for types

the type system doesn't store any token info, so producing good error messages that involve types is a little tricky
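One possible shape for this, sketched in Python rather than D. The `Token` and `Type` classes and the `origin` field are assumptions made for the example, not the compiler's actual data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    lexeme: str
    line: int
    column: int


@dataclass(frozen=True)
class Type:
    name: str
    # Hypothetical field: the token the type was parsed from, None for built-ins.
    origin: Optional[Token] = None


def type_mismatch(expected: Type, got: Type) -> str:
    """Format an error that points at the offending type when a token is known."""
    where = f"{got.origin.line}:{got.origin.column}: " if got.origin else ""
    return f"{where}expected type '{expected.name}' but found '{got.name}'"
```

With the token attached, `type_mismatch(Type("s32"), Type("u8", Token("u8", 3, 9)))` can report a line and column instead of only the type names.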

switch statement

there is no switch statement yet. would this be a match? I'll probably implement this once the entire language as it is now has code generation (to kir).

parser

  • expressions
    • indexing foo[12]
    • select foo.bar.baz
    • slice foo[start : end]
    • tuple access foo.0
    • calls foo()
      • call arguments... foo(blah, baz, 5, blah, whatever)
  • lambda
  • global variables
  • composites?!
  • destructuring
    • tuples
    • structures
  • types
    • structure
    • trait
    • function
    • pointer
    • slice
    • tuple
    • union
    • tagged union
  • function receivers
  • operators: size_of, type_of, len_of?
  • generic arguments
  • attributes, e.g. #[no_mangle], etc.
  • nested blocks
  • only one func receiver (func_type doesn't obey this)

code generation tracking issue

  • if statements
  • else if
  • else
  • while loops
  • loops
  • function calls (no arguments)
  • function calls (arguments)
  • global variables
  • local variables
  • function return statement
  • function return with expression
  • breaking out of loops
  • skipping iteration in loop
  • block evaluation with eval
  • lambdas
  • array indexing
  • allocating arrays
  • unary expressions
  • binary expressions
  • pointers
  • address of

function lookup fails in the type environment when a call initializes a variable

This runs fine:

func min(a s32, b s32) s32 {
    if a < b {
        return a;
    }
    return b;
}

func main() {
    let smaller = 3;
    let larger = 33;
    return min(larger, smaller); // NOTICE ME
}

/// .stdout
/// 3

However this does not:

func min(a s32, b s32) s32 {
    if a < b {
        return a;
    }
    return b;
}

func main() {
    let smaller = 3;
    let larger = 33;
    let c = min(larger, smaller);
    return c;
}

/// .stdout
/// 3

This is the error:

Fatal: type_infer: unhandled statement if([path: [sym a]]<[path: [sym b]])
Fatal: type_infer: unhandled statement ast.Return_Statement_Node
Error: Couldn't find type 'min' in environment:
core.exception.AssertError@source/middle/infer.d(191): Assertion failure
----------------
4

nested blocks

for parsing and semantic analysis

{
    {
        // this is not allowed yet!
    }
}

mangle c_symbol defined functions properly

not sure if I have to follow a C name mangling convention, whatever that is, but for example:

#{no_mangle, c_symbol, variadic}
func printf(fmt *u8) s32;

Code for a call to this would need to be generated as:

call _printf
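A small sketch of how the symbol name could be chosen, in Python for illustration. It relies on the fact that Mach-O (macOS) toolchains prefix C symbols with an underscore while ELF targets such as Linux do not; the function name and the `target_os` strings are hypothetical.

```python
def c_symbol_name(name: str, target_os: str) -> str:
    """Return the linker-level name for a function marked c_symbol/no_mangle."""
    # Mach-O (macOS) prefixes C symbols with an underscore;
    # ELF targets such as Linux use the name unchanged.
    if target_os == "macos":
        return "_" + name
    return name
```

So `printf` becomes `_printf` on macOS, matching the call above, and stays `printf` on Linux.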

check that else if/else chains are correct

this could probably be done in the parser, throw an error for things like:

if a == x {

} else {

} else {

} else if {

} ... etc

unreachable code check

checks to see if there is unreachable code, e.g. statements after a loop that will never terminate, etc.

boolean operator codegen is incorrect

func main() {
	let a = 3 > 5;
	let b = 5 > 3;
	if b > 1 {
		b = 32;
	}
	return a + b;
}

/// .stdout
/// 32

b > 1 evaluates to true even though b is equal to 1, not greater than it

check for symbol conflicts in tagged unions

The following compiles with no errors:

type Foo enum {
	Label,
	Tagged { x s32, y s32 },
	Tagged { x s32, y s32 },
	Tagged { x s32, y s32 },
	Tagged { x s32, y s32 },
	Anonymous (s32, s32),
	Anonymous (s32, s32),
	Anonymous (s32, s32),
	Anonymous (s32, s32),
	Anonymous (s32, s32),
	Foo,
	Bar,
	Baz,
};
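A sketch of the missing check, in Python for illustration. It assumes the variant names of a tagged union are available as a list in declaration order; `duplicate_variants` is a hypothetical helper.

```python
from collections import Counter


def duplicate_variants(variant_names: list) -> list:
    """Return each variant name declared more than once, in first-seen order."""
    counts = Counter(variant_names)
    duplicates = []
    for name in variant_names:
        if counts[name] > 1 and name not in duplicates:
            duplicates.append(name)
    return duplicates
```

For the Foo type above this would flag Tagged and Anonymous, and the compiler could then report one conflict per repeated name.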

check position of elif and else

else if must be after an if
else must be after an else if or an if

some illegal cases:

else if _ {

}
if _ {

}

...

else {

} if _ {

}
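The two rules can be checked with a single pass over the branch kinds. This is a sketch in Python for illustration, assuming each statement in a block has already been reduced to a kind string such as "if", "else if" or "else"; the function name and message format are hypothetical.

```python
def check_branch_order(kinds: list) -> list:
    """Return one error per 'else if' or 'else' that is not preceded by 'if' or 'else if'."""
    errors = []
    prev = None
    for index, kind in enumerate(kinds):
        if kind in ("else if", "else") and prev not in ("if", "else if"):
            errors.append(f"statement {index}: '{kind}' must follow an 'if' or 'else if'")
        prev = kind
    return errors
```

Because an else never counts as a valid predecessor, this also rejects chains with a second else or an else if placed after an else.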

change func args, etc? to an array or some ordered hashmap?

order is important for some of these structures, mostly functions. D associative arrays do not seem to keep their items in insertion order, so I would need to use either an array or an ordered hashmap.

this might also be necessary for structures, because field order may determine how they are laid out in memory.
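A sketch of the array-plus-index option, in Python for illustration. Python dicts already preserve insertion order, so the explicit list and index here only mirror what the D side would need; `OrderedParams` and its methods are hypothetical.

```python
class OrderedParams:
    """Keeps parameters in declaration order while still allowing lookup by name."""

    def __init__(self):
        self._entries = []  # (name, type_name) pairs in declaration order
        self._index = {}    # name -> position in _entries

    def add(self, name: str, type_name: str) -> None:
        if name in self._index:
            raise KeyError(f"duplicate parameter '{name}'")
        self._index[name] = len(self._entries)
        self._entries.append((name, type_name))

    def lookup(self, name: str) -> str:
        return self._entries[self._index[name]][1]

    def in_order(self) -> list:
        return list(self._entries)
```

The same shape would work for structure fields, where `in_order` gives the sequence to use for layout.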

global variables not introduced into type environment

let global_var = 32;

func main() {
    let j = global_var - global_var;
    while j != global_var {
        j = j + 1;
    }
    return j;
}

Stack trace:

KRUG COMPILER, VERSION 0.0.1
* Executing compiler, optimization level O0
* Operating system: Mac OS X
* Architecture: x86_64
* Target Architecture: X64
* Compiler is in debug mode

Warning: decl: Unhandled statement [path: [sym j]]=[path: [sym j]]+1
Warning: decl: Unhandled statement ast.Return_Statement_Node
Error: Couldn't find type 'global_var' in environment:
core.exception.AssertError@source/middle/infer.d(191): Assertion failure
----------------
4   krug                                0x00000001097c3bad _d_assertp + 117
5   krug                                0x000000010978ab35 sema.type.Type sema.infer.Type_Inferrer.get_type(immutable(char)[], sema.type.Type_Variable[immutable(char)[]]) + 593
6   krug                                0x000000010978abbf sema.type.Type sema.infer.Type_Inferrer.get_symbol_type(immutable(char)[], sema.type.Type_Variable[immutable(char)[]]) + 59
7   krug                                0x000000010978b06b sema.type.Type sema.infer.Type_Inferrer.analyze(ast.Node, sema.infer.Type_Environment, sema.type.Type_Variable[immutable(char)[]]) + 487
8   krug                                0x000000010978ac73 sema.type.Type sema.infer.Type_Inferrer.analyze_path(ast.Path_Expression_Node, sema.type.Type_Variable[immutable(char)[]]) + 95
9   krug                                0x000000010978b1f0 sema.type.Type sema.infer.Type_Inferrer.analyze(ast.Node, sema.infer.Type_Environment, sema.type.Type_Variable[immutable(char)[]]) + 876
10  krug                                0x000000010978af7c sema.type.Type sema.infer.Type_Inferrer.analyze(ast.Node, sema.infer.Type_Environment, sema.type.Type_Variable[immutable(char)[]]) + 248
11  krug                                0x000000010978ae23 sema.type.Type sema.infer.Type_Inferrer.analyze_variable(ast.Variable_Statement_Node, sema.type.Type_Variable[immutable(char)[]]) + 111
12  krug                                0x000000010978af31 sema.type.Type sema.infer.Type_Inferrer.analyze(ast.Node, sema.infer.Type_Environment, sema.type.Type_Variable[immutable(char)[]]) + 173
13  krug                                0x000000010978ae81 sema.type.Type sema.infer.Type_Inferrer.analyze(ast.Node, sema.infer.Type_Environment) + 53
14  krug                                0x0000000109790a77 void sema.type_infer_pass.Type_Infer_Pass.analyze_let_node(ast.Variable_Statement_Node) + 291
15  krug                                0x0000000109790c03 void sema.type_infer_pass.Type_Infer_Pass.visit_stat(ast.Statement_Node) + 55
16  krug                                0x00000001097911bd void sema.visitor.Top_Level_Node_Visitor.visit_block(ast.Block_Node, void delegate()) + 625
17  krug                                0x0000000109790bc5 void sema.type_infer_pass.Type_Infer_Pass.analyze_function_node(ast.Function_Node) + 49
18  krug                                0x0000000109791262 void sema.visitor.Top_Level_Node_Visitor.process_node(ast.Node) + 110
19  krug                                0x0000000109790e6b void sema.type_infer_pass.Type_Infer_Pass.execute(ref krug_module.Module, immutable(char)[]) + 503
20  krug                                0x000000010978c8ff void sema.analyzer.Semantic_Analysis.process(ref krug_module.Module, immutable(char)[]) + 339
21  krug                                0x000000010975faa4 void kargs.build.Build_Command.process(immutable(char)[][]) + 2880
22  krug                                0x000000010974726e _Dmain + 862
23  krug                                0x00000001097d7503 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll().__lambda1() + 39
24  krug                                0x00000001097d7393 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 31
25  krug                                0x00000001097d746e void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll() + 138
26  krug                                0x00000001097d7393 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 31
27  krug                                0x00000001097d7301 _d_run_main + 485
28  krug                                0x00000001097472f1 main + 33
29  libdyld.dylib                       0x00007fffc14df254 start + 0
30  ???                                 0x0000000000000002 0x0 + 2
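One way to avoid this class of failure is to register every top-level name before analysing any function body. The sketch below is Python for illustration and is not how infer.d works; the tuple shape, the type strings and both function names are assumptions.

```python
def build_environment(top_level_decls: list) -> dict:
    """Map every top-level name to its type before any function body is analysed."""
    env = {}
    for kind, name, type_name in top_level_decls:
        # kind is "func" or "let"; both are visible to every function body.
        env[name] = type_name
    return env


def lookup(env: dict, local_scope: dict, name: str) -> str:
    """Resolve a name, preferring locals and falling back to top-level names."""
    if name in local_scope:
        return local_scope[name]
    if name in env:
        return env[name]
    raise LookupError(f"couldn't find type '{name}' in environment")
```

With `global_var` registered up front, the lookup inside main succeeds instead of hitting the assertion.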

misc semantic checks

per-function

  • unused variables
  • unreachable code (i.e. infinite loop)
  • parameter symbol conflicts with function variables (bug)

type-checks

  • types in expressions are the same
  • return type matches parent function
  • const correctness

per-submodule

  • unused functions, but only if they are unexported/private? this is hard for exported functions because other modules may use them

note: this is not an exhaustive list.

type inference for eval blocks

		Alloc a = new Alloc(get_int(32), bb.name() ~ "_" ~ gen_temp());
		curr_func.add_instr(a);

We currently just allocate an s32 for the eval block regardless of its actual type; this is obviously not how it should work!
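A sketch of what the inference could look like, in Python for illustration. It assumes the type of each yield expression in the block has already been inferred and that all yields must agree, which is a guess at the intended semantics; `infer_eval_type` is hypothetical.

```python
def infer_eval_type(yield_types: list) -> str:
    """Return the single type shared by every yield in an eval block."""
    if not yield_types:
        raise TypeError("eval block has no yield, so it has no value type")
    first = yield_types[0]
    for other in yield_types[1:]:
        if other != first:
            raise TypeError(f"eval block yields both '{first}' and '{other}'")
    return first
```

The result would then replace the hard-coded get_int(32) when the allocation is created.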

c-style string literals

I forgot whether these are actually implemented in the lexer/parser. I feel like they are, because the tests seem to parse fine...

either way, these need to be handled during ir-gen so that c-style strings create a null-terminated string rather than a structure (the krug string type)
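A sketch of the lowering step, in Python for illustration. It assumes literals are encoded as UTF-8 and that an embedded NUL should be rejected; both are assumptions, and `lower_c_string` is a hypothetical name.

```python
def lower_c_string(text: str) -> bytes:
    """Encode a C-style literal as UTF-8 bytes followed by a single NUL."""
    data = text.encode("utf-8")
    if b"\x00" in data:
        raise ValueError("C-style string literal contains an embedded NUL")
    return data + b"\x00"
```

The emitted data is just the raw bytes plus the terminator, with no length or pointer wrapper around it.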

kir codegen

tracking issue for kir codegen, some stuff to do:

  • break
  • for loops
  • structure type gen
  • function parameters
  • structure field access
  • function receivers
  • tuples
  • tagged union
  • else if
  • else
  • lambdas/anonymous functions
  • defer
  • slices
  • global variables
  • casting types?
  • type promotion?
  • de-structuring statements
    • tuples
    • structures
  • path expressions
  • next
  • return
  • while loops
  • infinite loops
  • function code gen
  • variables
  • function calls
  • if statements
  • eval block (block_expr_node)
  • yield statement (part of eval blocks)
  • floating point types
  • pointers
  • address of

code cleanups

gonna make a long list here, since the code is pretty messy right now:

  • remove the scope/range stuff
  • avoid all runtime string concatenation where possible
  • clean up variable naming (sometimes I forgot whether I was using snake_case or camelCase...)
  • refactor some utility functions (e.g. BlameToken)
  • make the symbol table/scope thing cleaner
  • improve general compiler error logging (have a dev mode flag option too)

argument parser

krug uses D's built-in argument parsing, which is kind of rubbish and gives some crazy error messages.
I want to implement this myself with a system similar to go's, i.e. the krug compiler has multiple sub-commands: ./krug <sub_command>

sub commands would be:

  • run - runs the program
  • compile - compiles the program into an executable
  • explain - explains the given error message
  • help - shows a help message with a list of sub-commands

additionally, these sub-commands will have short forms, e.g. ./krug r is shorthand for ./krug run
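A sketch of the sub-command resolution, in Python for illustration. The issue only states that r is shorthand for run; treating any unambiguous prefix as a short form is an assumption, as are the function name and error messages.

```python
SUB_COMMANDS = {
    "run": "runs the program",
    "compile": "compiles the program into an executable",
    "explain": "explains the given error message",
    "help": "shows a help message with a list of sub-commands",
}


def resolve_sub_command(word: str) -> str:
    """Resolve a full name or an unambiguous prefix such as 'r' to a sub-command."""
    if word in SUB_COMMANDS:
        return word
    matches = [name for name in SUB_COMMANDS if name.startswith(word)] if word else []
    if len(matches) == 1:
        return matches[0]
    if matches:
        raise ValueError(f"ambiguous sub-command '{word}': {', '.join(matches)}")
    raise ValueError(f"unknown sub-command '{word}', try 'help'")
```

Resolving by prefix keeps the short forms in sync with the command table, at the cost of a short form becoming ambiguous if a later sub-command shares its first letter.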

refactor camelcase naming stuff

I'm pretty sure there are a few cases where the code deviates from the style guide, e.g.

fooBar
FooBar

should be

foo_bar
Foo_Bar
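A sketch of the renaming rule implied by those two examples, in Python for illustration. It assumes names starting with a lowercase letter become all-lowercase snake_case and names starting with an uppercase letter keep a capital on each word; `to_style_guide` is a hypothetical helper, and runs of capitals such as acronyms are not split.

```python
import re


def to_style_guide(name: str) -> str:
    """Convert fooBar to foo_bar and FooBar to Foo_Bar."""
    # Split at each lowercase-or-digit to uppercase boundary: fooBar -> foo, Bar.
    words = re.split(r"(?<=[a-z0-9])(?=[A-Z])", name)
    if name[:1].isupper():
        return "_".join(word[:1].upper() + word[1:] for word in words)
    return "_".join(word.lower() for word in words)
```

Names that already follow the style guide pass through unchanged, so the helper could be run over every identifier to list the ones that differ.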

"poisoned" ast nodes

when we encounter a parser error, return a bad node instead (as well as doing the recovery skips?)
there would be bad nodes for Decls, Exprs, Stats, etc., and these will capture the token spans.
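A sketch of the idea for a single expression kind, in Python for illustration. The node classes, the use of token indices as spans, and recovering at the next ';' are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: int  # index of the first token covered
    end: int    # index one past the last token covered


@dataclass
class Literal:
    value: str
    span: Span


@dataclass
class BadExpr:
    span: Span


def parse_operand(tokens: list, pos: int):
    """Parse one integer literal; on failure return a BadExpr covering the skipped tokens."""
    if pos < len(tokens) and tokens[pos].isdigit():
        return Literal(tokens[pos], Span(pos, pos + 1)), pos + 1
    # Recovery: skip to the next ';' so parsing can resume at a statement boundary.
    end = pos
    while end < len(tokens) and tokens[end] != ";":
        end += 1
    return BadExpr(Span(pos, end)), end
```

Later passes can skip any subtree containing a BadExpr, while the span still lets the error message point at the tokens that were skipped.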

ensure that every krug program's dependency graph is acyclic

this should be as simple as implementing Tarjan's strongly connected components algorithm and running it over the program's dependency graph.
before we begin the actual parse/sema phase we want to run Tarjan's algorithm on the dependency graph and notify the user of any cycles in the program.
this would be done before we flatten the graph; if the program is not acyclic then we terminate execution of the compiler.
there is a flaw in this: we have to build and check the program's dependency graph every time we compile a krug program. this could be quite slow, and we wouldn't even have started lexing or compiling anything. in the future we could either:

  • start lex/parsing stuff anyways?
  • cache the dependency graph of the program somewhere

I have a feeling the first option would be worse, because if the program is flawed it will take longer to report the error since we have to lex/parse everything first.
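A sketch of the cycle check using Tarjan's algorithm, in Python for illustration. It assumes the dependency graph is a mapping from each module name to the list of modules it depends on; `find_cycles` is a hypothetical name, and the recursive form would need converting to an explicit stack for very deep graphs.

```python
def find_cycles(graph: dict) -> list:
    """Return the strongly connected components of graph that contain a cycle."""
    index_of = {}     # discovery index per node
    low_link = {}     # smallest discovery index reachable from the node's subtree
    stack = []
    on_stack = set()
    cycles = []

    def visit(node):
        index_of[node] = low_link[node] = len(index_of)
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, ()):
            if dep not in index_of:
                visit(dep)
                low_link[node] = min(low_link[node], low_link[dep])
            elif dep in on_stack:
                low_link[node] = min(low_link[node], index_of[dep])
        if low_link[node] == index_of[node]:
            # node is the root of a component: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            # A component is cyclic if it has several members or a self-dependency.
            if len(component) > 1 or node in graph.get(node, ()):
                cycles.append(sorted(component))

    for node in graph:
        if node not in index_of:
            visit(node)
    return cycles
```

An empty result means the graph is acyclic and can be flattened; otherwise each returned list names the modules involved in one cycle, which is what the user would need to see before the compiler terminates.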

cleanup module layout

not really sure how the D module system works, but I feel like my use of it isn't particularly idiomatic and could be cleaned up

mangling scheme

need a proper name mangling scheme for code generation to use
