laurenarnett / tensorflock Goto Github PK

A small functional tensor language with Einstein summation notation convention and shape-checking at compile-time.

TeX 27.70% Makefile 1.95% OCaml 55.57% HCL 6.92% Shell 3.35% C 4.36% Haskell 0.15%

tensorflock's Introduction

Hi there, I'm Lauren 👋

I graduated as a Computer Science major from Columbia University in 2019. I have been working as a Full-Stack Software Developer since then, using Clojure and ClojureScript. I'm currently in a batch at the Recurse Center, where I'm making lots of Haskell projects and trying out IHP.

🌱 Currently learning: Haskell; SIMD & working with vector intrinsics; distributed systems

😄 Pronouns: she/her
⚡ Fun fact: CFG parsing reduces to Boolean matrix multiplication (Lee, 2002), meaning that the fastest parsing algorithms can be computed in less than O(n^3)!

You can find more details about me and ways to reach out at laurenar.net!

tensorflock's People

Contributors

Stargazers

Watchers

Forkers

jmorag

tensorflock's Issues

Code Sample

We need to write some code samples at the end of the proposal.

Flatten a tensor (Joseph)
Update a tensor element (maybe a primitive, Joseph)
Matrix multiplication (Nick)
GCD algorithm (~~Nick~~) (actually done by Joseph)
Perceptron algorithm (Lauren)
Nearest neighbor (Lauren)

Semantic Bug

We have a pretty interesting bug in the semantic checker that lets things like

wrong_type : Nat;
wrong_type = True;

pass through unscathed. This is because

When the semantic checker validates expressions, for literals, it doesn't actually look at their declared type, but infers it straight from their value.
We don't write enough tests that the semantic checker should reject. These are just as important as writing the ones that are valid in our language.

Tensor syntax

What do we want our tensor syntax to be? I liked declaring them

T : Tensor (r : Rank) <d_1..d_r>

but this is a pretty "functional" way of programming, and Edwards was pretty explicit about us having to give up on that if we're doing tensors. Thoughts?

Implement a block floating pass

In lambda-lift.ml once sfuncs are uniquely named, and have their parameters lifted (#117 & #118) implement a pass to float these blocks into the global scope. This can be done by consing them to sprogram's sfunc list.

Index binding in expressions

Sticking sindices in sfuncs isn't good enough because you can still use indices in main expressions; thus, index binding needs to be added to expressions.

Transforming everything to a Forall - goal is to figure out how to stick things into this.

Codegen for tensor literals

Now that we can semantically check tensor literals for their shapes, we need to generate code for them. This involves linking up the code in baby_runtime to codegen. We need to

Finish writing print_tensor in C.
Declare it under the other two printing functions we have in codegen.
Somehow figure out how to call Talloc and Tfree from codegen when we encounter a tensor literal.

Long Compilation Times

I have a sneaking suspicion that whenever we run make, ocamlbuild -clean gets called, forcing the scanner, parser, ast, etc. to get recompiled, even if we only make changes to codegen or semant. From just looking at the Makefile, I'm not sure how easy this is to fix, but maybe @nbuonin could take a look at this.

Do local variables for scopes post lifting

transpose : T<n,m> -> T<m,n>;
transpose mat = mat';
{
mat':T<m,n>;
mat'[j,i] = mat[i,j];
}

allocate args
allocate stuff in scope

Reorganize test files

Our current test structure leaves much to be desired. Ideally, we should have tests grouped into semantically meaningful folders. Maybe one big pass folder and one big fail folder. We can then move tests from pass into fail as we begin to implement a stricter compiler, instead of just rewriting all of the pass ones.

Tensor struct

Define a C struct representing a tensor. As we discussed in the proposal phase, tensors have

shape
contents

In an effort to cut down on our ambition, we're supporting only doubles (64-bit floats) in our tensors. We can capture the notion of a natural-valued tensor index with an unsigned int. We should probably make these of the 64-bit variety as well, since then we can avoid running into any memory allocation subtleties that come with having to put things with different sizes on the heap.

My first guess at what this struct would be is

struct Tensor {
    unsigned int size; //Internal, the actual number of doubles that the tensor is storing
    unsigned int rank;
    unsigned int *shape;
    double *array;
};

but perhaps someone has more ideas about this.

Implement tensor indexing

Record Syntax

Record syntax needs to be added to the proposal, in the syntax section.

data Tensor = Tensor { numType : (Num a) => a
, rank : Nat
, shape : Vect rank (\ x -> Fin x)
}

Broken parens

Parens only work on arithmetic expressions. Yikes! Trying to fix it, but it's rather annoying.

functions and variables

The time has come for TensorFlock to compile programs with functions! Probably start out with generating code for functions of the form f : SomeUnitType; f = some_unit_value and branch off of that for actual functions.

Implement fixed-size tensor allocation

This one's not too bad. Given a tensor of known size, rank, and shape, implement the c function that mallocs the struct on the heap. The real challenge is trying to do this for tensors whose structure is unknown at compile time.

Require that failing tests actually fail in the testrunner

Right now, we mostly just test for passing things. This leaves us open to the compiler not properly rejecting programs like it should. @nbuonin, can you update the test_runner so that if anything in the fail directory passes, the test itself fails?

Access a component of a tensor

This is the low level function in C that, given a Tensor struct pointer and indices, returns a pointer to the memory location of those indices.

typedef struct Tensor T;
typedef unsigned int uint;

double* access(T *tensor, ...) 
{
    va_list indices;
    va_start(indices, tensor);
    uint index = 0;
    uint i, j, prod;

    for(i = 0; i < tensor->rank; ++i) {
        prod = 1;
        for(j = 0; j < i - 1; ++j) {
            prod *= (tensor->shape)[j];
        }
        index += va_arg(indices, uint) + prod;
    }
    va_end(indices);
    return &((tensor->data)[index]);
}

Right now, it is agnostic to the rank of the tensor it is acting on. For efficiency, we may want to perform manual loop unrolling and specialize this for tensors of ranks 1-4. For now, though, this can serve as a black box that in our manual compilation for the next meeting.

Range function

Please compile the range function to C

range : Int -> T<n>;
range n = t; { t : T<n>; t[i] = cast i; }

Matrix-vector multiplication

Please compile matrix-vector multiplication

matTimesVec : T<n,m> -> T<m> -> T<n>;
matTimesVec mat vec = mat[i,j] * vec[j];

Pin llvm 3.8

Would it be possible, along with all the other stuff that our makefile drags in, to pin llvm to our dependencies so that it comes along?

Function to extract ids from expr

In order to create a dependency graph of a program, we need to be able to deduce which identifiers an expression depends on so that we can look for them in the rest of the program.

It should have type ids_in_expr : expr -> string list. Name can be changed to something with fewer underscores.

Implement Operators

https://github.com/laurenarnett/TensorFlock/wiki/Syntax#operators

blah

test

test1
test2

Streamline TLit codegen

#100 works, but we can do better. I'm in an issue-happy mood now so I'm opening this up here. Probably will involve some changes to the Talloc function in the runtime. I'm thinking that we should just use it minimally to allocate space given a size and rank and let the LLVM deal with actually sticking contents inside of the returned struct.

Codegen var bug

Found a bug in codegen for global vars. We currently have a test that looks like:

main = res;

three : Nat;
three = 3;

res : Nat;
res = 2 ^ three;

It returns 8 like it should. However, if we reorder the functions like so:

main = res;
res : Nat;
res = 2 ^ three;

three : Nat;
three = 3;

then it returns 1. This is because when we do codegen for global variables, we initialize Nats to 0, and llvm is quite happy to accept that as a valid value for three and raise 2 to the 0'th power and only later reassign three, since reassignment is a thing that is allowed in llvm-land.

I think that this stems from a larger problem of us having copied large swaths of microc before we realized that our language is about as far away from microc as possible. Anyway, to solve this particular problem, we might need to change what we call globals into functions with 0 arguments and constant return values so that order of initialization stops mattering.

Implement tensor shape

Makefile improvements

Modify "make state" to move generated files to a _state directory
Make a target to package project for first deliverable

Additions to Docker container

Add:

entr and/or watcher script
man pages
valgrind
LLVM suffix free
gcc extension that permits nested scope (for emitting LLVM that demonstrates nested scope)

Implement dynamic tensor allocation

As I mentioned in #13, this one is somewhat harder. If we declare something super general, like
T : Tensor r <d_1..d_r>, can we write the corresponding C (later LLVM) code that would actually perform such an operation under the hood? I'll take this one.

Implement Function Declaration and Definition

https://github.com/laurenarnett/TensorFlock/wiki/Syntax#fexpr
and function declaration and definition:
https://github.com/laurenarnett/TensorFlock/wiki/Syntax#function-declaration-and-definition

Implement IDs

https://github.com/laurenarnett/TensorFlock/wiki/Syntax#ids
keywords:
https://github.com/laurenarnett/TensorFlock/wiki/Syntax#keywords
and types:
https://github.com/laurenarnett/TensorFlock/wiki/Syntax#farg

Rewrite LRM

Someday before turning in final project stuff we should rewrite some of the LRM, especially stuff on scope, lambda lifting, etc. keeping in mind the feedback that it's "not supposed to be a tutorial" like documentation.

Restricted class of expressions for bracket indexing

I implemented a restricted subset of expr, aexpr, for tensor shapes. It only has IDs, integer literals, arithmetic operations, and function application. Should we also restrict bracket indices to be of this class as well? The only argument against this would be that boolean statements can possibly be useful in conditionals, say "select this index if such-and-such and select this other one if otherwise." However, it's probably not super necessary to have this feature and this does a fair amount of type checking for us for free at parse time.

Script for generating llvm output from a C file

Fairly low priority, but since Edwards really wants us to have easy access to what Clang outputs, we probably could do with a script that automates that a little.

Read up on Lamba lifting algorithms

Tensor literal initialization

Implement function calls

Average of vector and tensor

Please write equivalent C for the following functions:

vecAverage : T<n> -> T<>;
vecAverage v = (v[i] * ones[i]) / (cast n); {
    ones : T<n>; ones[i] = 1.0;
}

matAverage : T<n,m> -> T<>;
matAverage m = ((m[i,j] * ones'[j]) * ones[i]) / (cast (n * m)); {
    ones  : T<n>; ones[i] = 1.0;
    ones' : T<m>; ones[j] = 1.0;
}

C function to do einstein summation

@dawnstar63, could you try implementing something in C that contracts to tensor structs? I think it would look something like

T *contract(T *t1, T *t2, int idx1, int idx2, uint *new_shape) {
  // allocate the new tensor using the new shape (which is something that we'll have to pass from codegen)
  // a bunch of for loops to stick all of the correct components into the new tensor, summing over the indices given as arguments
  // return the pointer to the new tensor
}

Arbitrary Tensor Shape

How do we represent the arbitrary shape? Ideally, we would like to be able to operate on individual shape dimensions of tensors without knowing how many there are ahead of time in a type safe way, but creating syntax for this is somewhat hard.

Implement parameter lifting pass

In lambda-lift.ml implement a pass over the uniquely named sfuncs (see #117) to lift free variables in sfunc.sscope to sfunc.sfparams.

Do this by finding the intersection of ids in a sfunc's enclosing scope and sfunc.ssope, and then subtract the ids in sfunc.sfparams. The remaining ids are free variables; copy these into sfunc.sfparams.

Then update each call site in the sfuncs enclosing scope and sfunc.sscope with the additional parameters.

give each sfunc.sfname a unique name, use incrementing ~s b/c we don't use those in our language
update each call site in a sfunc's enclosing scope and sfunc.sscope