Giter Site home page Giter Site logo

at15 / reika Goto Github PK

View Code? Open in Web Editor NEW
1.0 2.0 0.0 795 KB

[WIP] A DSL for query and benchmark TSDB

License: MIT License

Makefile 1.04% ANTLR 2.33% Java 73.35% Forth 0.19% Fortran 0.06% Shell 0.55% Go 8.25% JavaScript 7.97% Python 6.26%
programming-language time-series-database antlr dsl

reika's Introduction

reika's People

Contributors

at15 avatar

Stargazers

 avatar

Watchers

 avatar  avatar

reika's Issues

[impl-j][anltr] Variable Syntax and Semantics

Before introducing list and records, it's better to first have variable up and running, which also requires the type checker (infer?) #24 working as well, if we have just pure number .... everything can be done at compile time ...

Syntax

  • allow/force type in variable declaration a = 1 or int a = 1, we should be able to infer some type
    • use let style, allow (not forced) to specify the type, but will infer right hand side and compare, error if what the user want on left side is not what he write on right side

Semantics

Untyped visitor

  • symbol table?

[impl-go] Go implementation of RCL

Same as the python one #51, the main issue for Go is its runtime lack supports of visitor, there is several PR for that antlr/antlr4#1841 (should be the latest on, updated on 2017/10/03)

One thing to note is, we may not need generated BaseVisitor from ANTLR, most DSL in Go are using handwritten parser and have a walk function, we should be able to find a way to use the context w/o the base

[antlr] Grammar rule for record is wrong: extraneous input ' '

literal
    : BOOL # BoolLiteral
    | INT # IntLiteral
    | DOUBLE # DoubleLiteral
    | STRING # StringLiteral
    ;

// TODO: [0:123]
list
    : '[' literal (' ' literal)* ']'
    ;

term
    : literal # LiteralTerm
    | list # ListTerm
    // FIXME: the record rule is wrong
    | '{' ID ':' literal (',' ID ':' literal)* '}' # RecordTerm // TODO: allow record to have non-literal types? nested record etc.
    | '(' term ')' # BracketsTerm
    ;
java -cp ./third_party/antlr-4.7-complete.jar:./build/libs/reika-0.0.1-SNAPSHOT.jar org.antlr.v4.gui.TestRig me.at15.reika.parser.Reika prog -gui
{a: 123, b: 124};
line 1:3 extraneous input ' ' expecting {BOOL, INT, DOUBLE, STRING}
line 1:8 extraneous input ' ' expecting ID
line 1:11 extraneous input ' ' expecting {BOOL, INT, DOUBLE, STRING}

[impl-j] Visitor pattern

It is used when traverse ANTLR generated parse tree, also mentioned in ANTLR3 book P113 External Tree Visitor

  • it seems the accept method, although identical, need to be declared in every subclass of node

[antlr] Support negative number

We need negative number, but if we put the sign into the token INT and DOUBLE, it would create trouble for 1 - 2, it is no longer INT MINUS INT, but INT INT.

// what we want
irk> -1
-1 : float
irk > 1 - 2
-1 : float

Don't bother trying to handle them in the lexer, handle uniary-minus in the parser instead

[impl-j][test] Add test coverage to CI

The results of any tests run via the JUnit Platform Gradle plugin will not be included in the standard test report generated by Gradle; however, the test results can typically be aggregated by your CI server software. See the reportsDir property of the plugin
reportsDir file('build/test-results/junit-platform') // this is the default

  • some code quality platforms (codecay, code beat) also support coverage report
    • didn't find the entrance in codebeat

Other code quality tools

[impl-j][antlr] Grammar

Current spec is 0.1

  • the Java9.g4 could be an example, actually too complex to be helpful
  • literal (I don't know if we really need true false based on current syntax, we don't even have if else ...), renamed to Val
    • integer
    • float double
    • symbol
    • date, time? or datetime
  • container (or built in data structure)
    • list (mutable or immutable?)
    • dictionary
    • table
  • operator
    • assign value = ?
    • assign reference := ?
    • overloaded built in infix arithmetic op +, -, *, /
    • prefix unary + - and postfix unary ++ --
  • type
    • not sure how much inference we can/need to do now, seems very little w/o function

Optional

  • use new line instead of ;, it is mentioned in ANTLR 4 P216 Fun with Python new lines
  • support Java Doc like block comment for @author etc.

[git] Invalid folder name breaks checkout on windows

I was trying take notes of the dotty video #35 and download games while on windows, but failed to checkout when clone

fatal: cannot create directory at 'doc/talk/cmps253-Re:Reika': Invalid argument
  • rename the talk folder

[tapl] Untyped lambda-calculus

From at15/mini-impl#6

Note TAPL use nameless representation for function parameter (binding), we chose to use name, it is easier to implement but error pron, more discussion can be found in the original issue

  • syntax
  • parser
  • ast builder
  • evaluator

[antlr] Lexer and Parser

Related #18

Actually I didn't really figured out parser lexer problem when building old reika, because it was so simple and I was mainly following grammar on the book that I didn't meet any problem

  • why we should not put the negative sign in lexer, and make it a parser rule instead
  • what happens to the symbols we used in parser rules
    • based on generated tokens file, they become token with higher precedence
      • why there are two same tokens file ....
  • difference between lexer and parser

[impl-j] Better AST Nodes

Current AST node class has several problems

  • where to put type, now we have type property in the abstract class, but we also treat user specified type as a node, and that type is stored in property v, which may cause mistake when writing visitor, further more, we may not need to store this as a AST node
  • when compute type, we use type.Checker which generates new tree instead of updating existing one, which waste resource since we never use the old one
  • keep information of token location, so when there is error, we can specify the exact location, in old reika, there is just one pass (convert parse tree to AST and type check) #31
  • use annotation ?decorator for the identical accept method
  • enhance visitor interface #27 , it seems use different name is a more common practice (to avoid error?) or faster?
  • treat REPL and file differently, when in REPL, every input is a block, while in file, it is not the case, so simple call parser.prog is not correct

Ref

[antlr] Avoid duplicated rules for binary operations including minus

Related #18

Since the syntax for all the binary operations are the same, it's a waste to specify them differently

e   :  e '*' e # Mul
     | e '-' e # Minus
     | INT
     ;

Currently, I have two way to solve this

1. Put all binary operators into one token in lexer

e : e BINARY_OP e # BinaryOp;

BINARY_OP : '*' | '/' | '+' | '-';

which caused two troubles

  • lost the precedence between '*' and '+' #19
  • can't handle negative number, because - is for both negative and minus #18

2. Use same label for multiple alternatives

in ANTLR4 P264 15.3 Parser Rules

you can reuse the same label on multiple alternatives to indicate that the parse-tree walker should trigger the same event for those alternatives

e  : e '*' e # BinaryOp
   |  e '+' e # BinaryOp
   | INT
   ;

but you can't get token defined in parser rule, the context for visitor does not have it

[impl-j] Settings

Compiler Settings, what should be printed etc. scalc could be a good? example again. Need to allow passing in configuration from command line flags using JCommander #37

  • settings class #45
  • parse command line arguments #45
  • ? read default setting from home?

[impl-j][std] Plot graph in browser when using java runtime

It is possible to draw graph in Java using its own UI library, but it's way better if we can generate graph in browser. We might want to split the runtime to separate project in the future, so using the compiled java file don't need to bring the compiler around (along with antl's icu deps #39 )

  • find a web framework that supports websocket (so we can push to client)
  • embed static assets
  • might just use netty instead of using a full framework

[impl-j][type] Type Checker

It's better to introduce type checker before we start introducing operator overload for +, -

  • should we have type inference, if so, how much? #26 , actually, we don't have function, there is no need for type inference, variable declaration is just let binding, and user specified type is just a hint
  • the name for the type checker class, Checker, Reconstructor, Inferer

Ref

  • ANTLR3 Chapter 8. Enforcing Static Type Rules

[antlr] Setup ANTLR

Although for implementing the checker in TAPL we don't need a parser, but the doc in old Reika is not a step by step guide to tell how to use ANTLR from scratch, and it's for Java only while we are planning to support multiple language in the newer version

  • generate the parser
  • visualization (don't remember if I have to generate parser and put in class path)
  • try a different language, js? it would be cool to run it in browser
  • use it with gradle in real project, might need to setup the project structure first
  • test helpers
  • error trackers

[tapl] Arithmetic

From at15/mini-impl#6 we start our implementation of TAPL's type checker from simple untyped arithmetic expressions (a.k.a calculator)

  • define syntax based on OCaml implementation
    • everything is expression (I think I saw this in the book, statement is just expression of type unit)
    • might need to annotate the grammar so generated parser can have correct context
  • java ast builder
  • java interpreter (eval)
  • python ast builder
    • does the annotation work the same way as Java?

[antlr] Precedence of arithmetic operators

If we mix all the binary operators (+, -, *, /) together, we lost the precedence, and user need to explicitly add () to have multiply executed before plus. The reason we put them together is we want to avoid duplicated code in visitor when building AST, because the operator is the only difference

Currently we have (this is wrong)

term
    : literal # TmLiteral
    | '-' term # TmNegative
    | term BINARY_OP term # TmBinaryOp
//    | list # TmList
//    | record # TmRecord
    | '(' term ')' # TmBrackets
    ;

BINARY_OP
    : '+'
    | '-'
    | '*'
    | '/'
    ;

for 1 + 2 * 3 we have

     *
  +   3
1  2

but we should have

   +
1   *
   2  3

[dialect] Reika Configuration Language RCL

Due to need from xephonhq/awesome-time-series-database#38 , we want a configuration language that we can point out which line in the configuration file is wrong, i.e. get an AST w/ position instead of clean object (if we want)

  • syntax, consider YAML, TOML, HCL
  • ANTLR grammar
  • implementation, since we want to use it in awesome-tsdb Java may not be the best chocie
    • javascript
    • python

In the long term, RCL should be merged into Reika compiler

Ref

[tool] Setup Gradle

ANTLR works best with Java, old Reika tsdb-proxy-java/ql is using Gradle instead of maven, we will continue using Gradle, but I forgot how to use Gradle entirely, this issue is used to track the progress of having a gradle project using ANTLR

  • multiple projects? TAPL is quite independent from others, and might be split out in the future
    • Reika needs a type checker, a interpreter, but they may not need two different projects, just a command line parameter should work just fine
    • use different main class would just work, no need to specify the only main class for jar and build multiple jar for different project
  • gradle tasks for compile the parser
    • can't use gradle w/ ANLTR's tree visualization, Makefile may not work as well, ayi did work
  • bug, it seems gradle task always run when gradle build, use doLast solve this
  • error-prone https://github.com/google/error-prone
  • filter using tag, but it seems there is no way except changing gradle file manually
  • tried to use the standalone launcher, but can't find the class .... https://github.com/junit-team/junit5/blob/master/documentation/src/docs/asciidoc/running-tests.adoc#running-tests-console-launcher-options java -jar junit-platform-console-standalone-1.0.2.jar -d src/main, but no test found ...

[antlr] Left Recursion

When reading the book Language Implementation Patterns (ANTLR3), found a example of list, which seems to have indirect left recursion. But as I remember ANTLR4 handles LR better but still can't handle indirect left recursion (it is mentioned in the ANTLR4 book somewhere)

From ANTLR3 book P26, example text is [a, b, c], [a, [b, c], d]

grammar NestedNameList;
list : '[' elements ']' ;
elements : element (',' element)* ;
element: NAME | list ;
NAME : [a-zA-Z]+;

[tapl] Untyped lambda-calculus left side for Application does not have to be Abs node, can be a var node

At first I defined abs as type Node and ASTBuilder works, but after switching to Abs, it broke

public static class App extends Node {
  public final Abs abs;
  public final Node param;

guess it is due to (lambda x. x) (lambda x. x x);, in (lambda x. x x) there is a application and x is a var instead of a abs, it can become abs if some one bind it, the var here can be anything ... and (lambda x. x x) seems never terminate?

java.lang.ClassCastException: me.at15.tapl.untyped.UntypedAst$Var cannot be cast to me.at15.tapl.untyped.UntypedAst$Abs

[impl-j] Profiler

We need to know how long we spend on each compile phase, also it would be good if we could count time in our test and REPL, after all ... these data are kind of like time series .... scalac's could be an example

  • simple timer
  • collect jvm metrics

[ref] Weld

https://github.com/at15/papers-i-read/issues/89

Weld is a language and runtime for improving the performance of data-intensive applications. It optimizes across libraries and functions by expressing the core computations in libraries using a common intermediate representation, and optimizing across each framework

[doc] Course project report

Similar to at15/snowbot#20 due on tomorrow ... end of finals week, Friday Dec 15. 6-20 pages ... so again ... it's report driven development

Introduction

  • motivation
  • brief summary of the accomplishment
  • description of notation

Body

  • related work

Conclusion

  • accomplishment
  • open problem

[impl-j][test] Re-structure test by level of language features

Currently we have two folder for test scripts, arith and arith_typed which are quite misleading, arith_typed is actually the one with let binding. Since we are adding list a more ideal folder structure should be

primitve
   - int_double_float_only
   - negative
   - binary_op
   - let (previous arith_typed) 
list
   - binary_op_overload (1 + [1, 2, 3])

[impl-j[std] Standard library in Java

We need standard library if we want to more than just calculator, it can be used in both REPL (interpreted) and compiled (actually transpile, we convert to Java file and let javac do the hard work, and run it with our runtime in class path)

Compiler

  • inject symbol of standard libary

Numeric

  • vector
  • random etc.

Graphic

[tracker] Migrate code from tsdb-proxy java

Currently Reika lies in https://github.com/xephonhq/tsdb-proxy-java , it should be a project of its own and imported in tsdb-proxy-java(go) as a library for query and benchmark. Also we can't just copy the code or use git subtree, clean up need to be done and existing git history is not that useful.

Tooling

Code

  • migrate the parser and interpreter

Extra

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.