masr's Introduction

masr

Meta ASR: replacement for aging ASDL

Pronounced like "MASER," Microwave Amplification by Stimulated Emission of Radiation. Yes, it's a physics pun. We like physics puns.

Run

  • Install Leiningen (look it up on the web; there is only one).

  • Type lein test at the terminal. Interacting with the tests is the best way, by far, to learn the code.

  • Type lein test :only masr.core-test/Module-test, for example, to run just one of the tests.

  • Type lein run at the terminal; doesn't do much, yet.

Read

  • core_tests.clj

  • example inputs in the resources/reference subdirectory

  • specs.md

    Looks best in the IntelliJ IDEs. Doesn't look so good in Visual-Studio Code. Any renderer of Markdown should be serviceable.

Modify

  • Write Clojure code and Markdown in specs.clj and core_tests.clj.

  • Extract the Markdown file from the code as follows:

awk -f md4code.awk < ./src/masr/specs.clj > specs.md

masr's People

Contributors

rebcabin

masr's Issues

type-declaration

We recommend that type-declaration be changed to print nil rather than () or [] for its null case.

Here is the reasoning:

type-declaration is currently spec'ced as

(s/def ::type-declaration
  (s/nilable ::symtab-id))

meaning that the following are conforming instances:

       (s/valid? ::type-declaration
                 (type-declaration nil))    := true
       (s/valid? ::type-declaration
                 (type-declaration 42))     := true

--show-asr prints () for a null type-declaration. There are two problems with this:

  1. type-declaration is never a collection, so () or [] are not suitable null values for it.
  2. The kosher way to spec this is
(s/def ::type-declaration
  (s/or :null #(or (= % []) (= % ()))
        :id   ::symtab-id))

This is very difficult to deal with because conformance adds the tag :null or :id.

(type-declaration ())
;; => [:null ()]
(type-declaration 42)
;; => [:id 42]

(s/conform ::type-declaration (second (type-declaration ())))
;; => [:null ()]
(s/conform ::type-declaration (second (type-declaration 42)))
;; => [:id 42]

Recursive conformance calls, say for Variable, must then be polluted with calls of second in multiple places.
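The difference between the two specs can be seen in a minimal sketch. It assumes a stand-in ::symtab-id that accepts any non-negative integer (MASR's real ::symtab-id may differ), and the names ::td-nilable and ::td-tagged are hypothetical, chosen only so both variants can coexist:

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumption: stand-in for MASR's ::symtab-id.
(s/def ::symtab-id nat-int?)

;; Recommended: nil is the null case; conformance returns the bare value.
(s/def ::td-nilable (s/nilable ::symtab-id))

;; Needed if () or [] is the null case; conformance tags every value.
(s/def ::td-tagged
  (s/or :null #(and (sequential? %) (empty? %))
        :id   ::symtab-id))

(s/conform ::td-nilable nil) ;; => nil
(s/conform ::td-nilable 42)  ;; => 42
(s/conform ::td-tagged [])   ;; => [:null []]
(s/conform ::td-tagged 42)   ;; => [:id 42]
```

With the nilable form, callers get the value back unchanged, so no unwrapping is needed downstream.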

In `ttype`, how does `List` differ from `Tuple`?

I assume List is homogeneous and Tuple is heterogeneous, as suggested (if not implied ?!?) by the arguments. Otherwise, they're both ordered collections, start-index 0, with duplicate entries allowed?

ttype
    = Integer(int kind, dimension* dims)
...
    | Set(ttype type)
    | List(ttype type)   --- homogeneous
    | Tuple(ttype* type)  --- heterogeneous
    | Struct(symbol derived_type, dimension* dims)
...

Proposition:

If ttype* types in Tuple denotes a sequence of types for a heterogeneous tuple, then ttype* must denote a sequence (the ASDL is ambiguous), and the length of any tuple that conforms to the spec must be equal to the length of the ttype*.

Remark:

The length of conforming tuples is implied by the length of the ttype*. I consider this spec to be implied and explicit, rather than implicit, with implicit meaning "not evident from the manifest text." This spec for Tuple, once ttype* is clarified as a sequence, is therefore acceptable under the tenet that all MASR specs must be explicit.
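A minimal sketch of the proposition, assuming the element types are represented by ordinary predicates or specs in a vector. The helper name tuple-conforms? is hypothetical and is not part of MASR:

```clojure
(require '[clojure.spec.alpha :as s])

(defn tuple-conforms?
  "True when `values` has exactly one value per element spec and each
  value satisfies the spec in the same position."
  [element-specs values]
  (and (= (count element-specs) (count values))
       (every? true? (map s/valid? element-specs values))))

(tuple-conforms? [int? int? string?] [1 2 "a"]) ;; => true
(tuple-conforms? [int? int? string?] [1 2])     ;; => false, wrong length
(tuple-conforms? [int? int? string?] [1 "a" 2]) ;; => false, wrong order
```

The length check comes first because map would otherwise stop silently at the shorter of the two sequences.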

What are allowed `node*` in `TranslationUnit`?

The ASDL spec for TranslationUnit says node*

unit
    = TranslationUnit(symbol_table global_scope, node* items)

Surely not all nodes are allowed. Can it be Function, Program, etc.? That is, what is the actual list of allowed nodes at this level?

ASDL `expr` is not sufficiently discriminating

Consider my notes on the following. I will fix this in MASR.

;; | LogicalCompare(expr left,   ;; must have type ::Logical
;;                  cmpop op,    ;; not all cmpop, only Eq and NotEq
;;                  expr right,  ;; must have type ::Logical
;;                  ttype type,
;;                  expr? value)

Ambiguity in identifier-list, -set, -suit

This issue is for me: an implementation TODO.

Consider

(s/valid? ::identifier-set (identifier-list '(a a b))) := true

It should not be true because of the duplicates in the list. This is a bug in the implementation; the design is good.
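One possible fix, sketched under the assumption that identifiers are plain Clojure symbols held in a bare collection (MASR's own identifier entities are richer), is the :distinct option of s/coll-of:

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumption: identifiers are plain Clojure symbols.
(s/def ::identifier symbol?)

;; :distinct true rejects collections with duplicate elements.
(s/def ::identifier-set
  (s/coll-of ::identifier :distinct true))

(s/valid? ::identifier-set '(a b))   ;; => true
(s/valid? ::identifier-set '(a a b)) ;; => false
```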

Scalar detection

Some compile-time checks can tell whether a numerical term like Integer or Character is a scalar. Scalar types are helpful for checking array indices, for example. Each index should be an integer scalar. Character scalars are useful for checking Strings that should contain a single character.

Other terms like Var often cannot be checked at compile time for a scalar property, but sometimes they can by a propagation of the scalar property from a constant into the variable.

We should debate whether such a level of checking is worth the work needed to implement it.
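For the cases that can be decided from the type alone, the check is small. The sketch below assumes a simplified ttype represented as a plain map with :head and :dimensions keys. That shape and the function names are illustrative only, not MASR's actual representation:

```clojure
;; Assumption: a ttype is a plain map such as
;; {:head :Integer, :kind 4, :dimensions []}.
(defn scalar-ttype?
  "A ttype is scalar when its dimension* is empty."
  [ttype]
  (empty? (:dimensions ttype)))

(defn integer-scalar?
  [ttype]
  (and (= :Integer (:head ttype))
       (scalar-ttype? ttype)))

(defn valid-index-ttypes?
  "Every array index must have an integer scalar type."
  [index-ttypes]
  (every? integer-scalar? index-ttypes))

(valid-index-ttypes? [{:head :Integer, :kind 4, :dimensions []}])
;; => true
(valid-index-ttypes? [{:head :Integer, :kind 4, :dimensions [[1 10]]}])
;; => false
```

Propagating the scalar property through Var would need symbol-table lookups, which this sketch does not attempt.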

Another `dimension` question

I want to tighten the spec for dimension to (informally)

(s/or [::integer-scalar, ::integer-scalar]
      [])

i.e., zero or two integer scalars, where a scalar has empty dimensions, instead of only

expr? start, expr? end

Is that appropriate? In particular, could a dimension itself be an array, say of length 2?
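The informal spec above could be written as runnable clojure.spec roughly as follows. This is a sketch that assumes an integer scalar can be modelled as a plain Clojure integer, which is a simplification of MASR's expr terms:

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumption: an integer scalar is modelled as a plain Clojure integer.
(s/def ::integer-scalar int?)

;; Zero or exactly two integer scalars, in a list or a vector.
(s/def ::dimension
  (s/or :empty (s/coll-of ::integer-scalar :kind sequential? :count 0)
        :pair  (s/coll-of ::integer-scalar :kind sequential? :count 2)))

(s/valid? ::dimension [])      ;; => true
(s/valid? ::dimension [1 10])  ;; => true
(s/valid? ::dimension [1])     ;; => false
(s/valid? ::dimension [1 2 3]) ;; => false
```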

consider adding `Implies` to `logicalbinop` in ASR

Of course, (Implies A B) is equal to (Or (Not A) B) or to (Not (And A (Not B))) (by De Morgan's law), but it might be common enough to have its own logicalbinop, especially in the face of Eqv and NEqv, each of which is at least as complex and already exists in ASR.
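The equivalence of the two expansions can be checked exhaustively over the four Boolean input pairs. The function names here are illustrative and are not ASR or MASR terms:

```clojure
(defn implies
  "(Or (Not a) b)"
  [a b]
  (or (not a) b))

(defn implies-de-morgan
  "(Not (And a (Not b)))"
  [a b]
  (not (and a (not b))))

;; Rows are [a b (implies a b) (implies-de-morgan a b)].
(def truth-table
  (for [a [true false], b [true false]]
    [a b (implies a b) (implies-de-morgan a b)]))
;; => ([true true true true] [true false false false]
;;     [false true true true] [false false true true])
```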

Characters with negative length ?

Example expr9-c6fe692.stdout has Characters with negative length:

                                                (Variable
                                                    6
                                                    s
                                                    []
                                                    Local
                                                    ()
                                                    ()
                                                    Default
                                                    (Character 1 -2 () [])
                                                    Source
                                                    Public
                                                    Required
                                                    .false.
                                                )

What does this mean? Should it be allowed?

I disallowed it out of an abundance of caution, but this file fails my type-checks!

Is the parameter `symbol` for `Var` really a `symbol`? Or just an `identifier`?

I see (Var 2 a) in a recent output from LPython:

            [(= (Var 2 a)
              (LogicalConstant false (Logical 4 []))
              ())

This does not match the spec in ASR.asdl

    | Var(symbol v)

as a symbol is one of

symbol
    = Program(symbol_table symtab, identifier name, identifier* dependencies,
        stmt* body)
    | Module(symbol_table symtab, identifier name, identifier* dependencies,
        bool loaded_from_mod, bool intrinsic)
    | Function(symbol_table symtab, identifier name, ttype function_signature,
        identifier* dependencies, expr* args, stmt* body, expr? return_var,
        access access, bool deterministic, bool side_effect_free)

etc.

I am guessing that what is really meant is

    | Var(symtab_id stid, identifier it)

and I'll go with this as a workaround, but this is a BLOCKER.

better sugar for identifiers

Reminder for me

(identifier boofar) instead of (identifier 'boofar)

(identifier-suit [a b]) instead of (identifier ['a 'b])

etc.

Return type of `StringChr`: string or character ?

In the examples, I see only

(StringChr
                                            (Var 4 p)
                                            (Character 1 1 () [])
                                            ()
                                        )

with an empty return value.

I've spec'ced it to return a string, but is this right ?

(defn StringChr
  ([str-expr, char-ttype, string-val?]
   "trinary ... Return ascii value of the indicated
   character in the string."
   (let [cnd {::term ::expr,
              ::asr-expr-head
              {::expr-head ::StringChr
               ::string-expr       str-expr
               ::Character         char-ttype
               ::string-value?     string-val?}}]
     (if (s/valid? ::StringChr cnd)
       cnd
       :invalid-string-chr)))
  ([str-expr, string-val?]
   (StringChr str-expr, (Character) string-val?)))

What does `StringItem` do?

Seems to return an integer, but I can't tell what it's supposed to mean.

(StringItem
 (Var 5 __tmp_assign_for_loop)
 (IntegerBinOp
  (Var 5 __explicit_iterator)
  Add
  (IntegerConstant 1 (Integer 4 []))
  (Integer 4 [])
  ()
  )
 (Integer 4 [])
 ()
 )

Meanings of empty dimension and dimension*?

A dimension instance of (dimension ()) or (dimension []) is legal, according to the ASDL spec. What does it mean for a ttype instance like (Integer 4 [[]]) that enjoys this dimension instance (inside its dimension*, which has one dimension inside)?

Presumably, it means that the object enjoying the ttype is a naked Integer in this case, not a 1x1 array, not necessarily a scalar?

For that matter, please write about the distinctions between the following

(Integer 4 [])  ;; empty dimension* 
(Integer 4 [[]])  ;; singleton dimension* with one empty dimension [sic]

Meaning of `=` in `--show-asr`?

In recent output format, I see

(= (Var 2 a)
              (LogicalConstant false (Logical 4 []))
              ())

I can't find anything in ASR.asdl that matches =. I presume it's shorthand for one or more of the binops, but it's ambiguous and creates a problem for MASR, which now must inspect the arguments to figure out which binop.

logicalbinop = And | Or | Xor | NEqv | Eqv
cmpop = Eq | NotEq | Lt | LtE | Gt | GtE

`symbol` for `name` in `SubroutineCall` ?

ASDL says

    | SubroutineCall(symbol name, symbol? original_name, call_arg* args, expr? dt)

name is a symbol instead of the expected identifier. symbol is a huge type with many heads. Can you give an example of a name that is not an identifier? Can we refine the type of name a little more precisely, i.e., say what heads of symbol are allowed? Can you reveal some of the secret metasemantics now implemented in C++?

Ditto for original_name.

Informative: discriminating types

To cut down on the looseness of types like the following:

    | IntegerUnaryMinus(expr arg, ttype type, expr? value)
...
    | RealUnaryMinus(expr arg, ttype type, expr? value)

I am doing the following in full-form (legacy continues to support the looser ASDL):

(s/def ::logical-expr
  (s/or :logical-constant   ::LogicalConstant
        :logical-compare    ::LogicalCompare
        :integer-compare    ::IntegerCompare
        :logical-binop      ::LogicalBinOp
        :logical-not        ::LogicalNot
        :cast               ::Cast      ;; TODO check return type!
        :if-expr            ::IfExp     ;; TODO check return type!
        :named-expr         ::NamedExpr ;; TODO check return type!
        :var                ::Var       ;; TODO check return type!
        ;; TODO: integer-compare, etc.
        ))
;; #+end_src

;; #+begin_src clojure

(s/def ::logical-expr?  (.? ::logical-expr))
(s/def ::logical-value?     ::logical-expr?)

(s/def ::logical-left       ::logical-expr)
(s/def ::logical-right      ::logical-expr)
;; #+end_src

;;
;;
;; ### Integer Types
;;
;;

;; #+begin_src clojure

(s/def ::integer-expr
  (s/or :integer-constant    ::IntegerConstant
        :integer-binop       ::IntegerBinOp
        :integer-unary-minus ::IntegerUnaryMinus
        :integer-bit-not     ::IntegerBitNot
        :cast                ::Cast      ;; TODO check return type!
        :if-expr             ::IfExp     ;; TODO check return type!
        :named-expr          ::NamedExpr ;; TODO check return type!
        :var                 ::Var       ;; TODO check return type!
        ))
;; #+end_src

;; #+begin_src clojure

(s/def ::integer-expr?  (.? ::integer-expr))
(s/def ::integer-value?     ::integer-expr?)

(s/def ::integer-left  ::integer-expr)
(s/def ::integer-right ::integer-expr)
;; #+end_src

;;
;;
;; ### Real Types
;;
;;

;; #+begin_src clojure

(s/def ::real-expr
  (s/or :real-constant    ::RealConstant
        :real-binop       ::RealBinOp
        :real-unary-minus ::RealUnaryMinus
        :cast             ::Cast      ;; TODO check return type!
        :if-expr          ::IfExp     ;; TODO check return type!
        :named-expr       ::NamedExpr ;; TODO check return type!
        :var              ::Var       ;; TODO check return type!
        ))
;; #+end_src

;; #+begin_src clojure

(s/def ::real-expr?  (.? ::real-expr))
(s/def ::real-value?     ::real-expr?)

(s/def ::real-left  ::real-expr)
(s/def ::real-right ::real-expr)
;; #+end_src

Character kind in `ttype`

The character kind is listed as 1 byte and utf8. Isn't utf8 a variable-length type, up to 4 bytes?

;; kind: The `kind` member selects the kind of a given type. We currently
;; support the following:
;; Integer kinds: 1 (i8), 2 (i16), 4 (i32), 8 (i64)
;; Real kinds: 4 (f32), 8 (f64)
;; Complex kinds: 4 (c32), 8 (c64)
;; Character kinds: 1 (utf8 string)

Also, as an aside, there is an awkwardness in the following comment:

;; Logical kinds: 1, 2, 4: (boolean represented by 1, 2, 4 bytes; the default
;;     kind is 4, just like the default integer kind, consistent with Python
;;     and Fortran: in Python "Booleans in Python are implemented as a subclass
                    _________           _________
;;     of integers", in Fortran the "default logical kind has the same storage
;;     size as the default integer"; we currently use kind=4 as default
;;     integer, so we also use kind=4 for the default logical.)

Request unambiguous and explicit outputs in `--show-asr`

Issue #21 brings up the requirement that --show-asr really must be completely unambiguous. = might mean Eq or it might mean Eqv or Assignment? Any kind of operator notation like that must be replaced with either a symbol like Assignment or must be explicitly written in ASR.asdl.

Secret sugar like operator synonyms or multinyms hidden in the C++ implementation of ASR makes MASR's ASDL back-channel really difficult!

As we move toward greater formality for MASR, it's important that --show-asr be unambiguous, explicitly written in full in ASR.asdl, and context-free. Leave the sugar to MASR! Don't make MASR do dispatch-on-type or lookahead on the printouts from --show-asr!

`StringItem` results can be either `Character` or `Integer`

Found by bisecting false positives:

(StringItem
  (Var 5 __tmp_assign_for_loop)
  (IntegerBinOp
    (Var 5 __explicit_iterator)
    Add
    (IntegerConstant 1 (Integer 4 []))
    (Integer 4 [])
    ()  )
  (Integer 4 [])  ;; <~~???~~ type is an Integer 4 
  ()  )
(StringItem
  (Var 5 d)
  (IntegerBinOp
    (Var 5 i)
    Add
    (IntegerConstant 1 (Integer 4 []))
    (Integer 4 [])
    ()  )
  (Character 1 -2 () [])  ;; <~~???~~ type is a Character 1
  ()  )

Is this intentional? The ASDL is ambiguous:

    | StringItem(expr arg, expr idx, ttype type, expr? value)

`WhileLoop` doesn't conform to ASDL

Example from asr-expr1-dde511e.stdout has three parameters:

(WhileLoop
                                        () ;; one
                                        (NamedExpr
                                            (Var 2 a)
                                            (IntegerConstant 1 (Integer 4 []))
                                            (Integer 4 [])
                                        ) ;; two
                                        [(=
                                            (Var 2 y)
                                            (IntegerConstant 1 (Integer 4 []))
                                            ()
                                        )] ;; three
                                    )

but ASDL from ASDL_2023_APR_06_snapshot.asdl specifies only two:

    | WhileLoop(expr test, stmt* body)

The first argument in the Example is mysterious. I'll work around it for now.

`IntrinsicFunction` and `call-args` versus `call-arg`

I've had call-args as a vector of lists working with SubroutineCall and FunctionCall for a while (see Issue #32). Here is the spec and an example lifted from the LPython .stdout reference outputs:

(s/def ::call-args (s/coll-of ::call-arg)) ;; <~~~ top level of list/vector
(s/def ::call-arg ;; <~~~ second level of list
  (s/coll-of ::expr?
             :min-count 1   ;; Issue 32
             :max-count 1))
(FunctionCall
  2 pow__AT____lpython_overloaded_0__pow
  2 pow
  [((IntegerConstant 2 (Integer 4 []))) ;; <~~~ call-args HERE
   ((IntegerConstant 2 (Integer 4 [])))]
  (Real 8 [])
  (RealConstant 4.000000 (Real 8 [])  )  ())

However, I encountered the following usage in expr_14-6023c49:

(IntrinsicFunction
  Abs
  [(RealBinOp ;; <~~~ not enough nesting
   (Var 2 a3)
   Sub
   (RealConstant 9.000000 (Real 8 []))
   (Real 8 [])  ()  )]
  0
  (Real 8 [])
  ()  )

This has one too few levels of nesting for a call-args. I solved it with a spec like this:

(s/def ::call-arg-or-args
  (s/or :call-arg  ::call-arg
        :call-args ::call-args))

(defmasrtype
  IntrinsicFunction expr
  (intrinsic-ident    call-arg-or-args      overload-id
                      return-type           value?))

I guess this is OK, but it's a little "hinky."

Dimensions of length 0?

The following are legal dimensions in the ASDL grammar, written as an array of dimensions [sic] in the context of an Integer ttype for convenience:

(Integer 4 [[0]])  ;; array of length 0? pointer?
(Integer 4 [[6 0]])  ;; array of length 0 with starting index = 6? meaning?

Is an array of length 0 a naked integer? Or a typed pointer with value nil? Or something else?

Design Question forced by Ambiguous Return Type of `StringItem`

This is a top-level design issue as the types for Character and String are currently conflated, and some ASR terms further conflate Character scalars with Integer scalars.

By Issue #51, StringItem might return a String (which is actually a Character), or an Integer. Terms that take a StringItem, therefore, must be prepared to take either a Character or an Integer. This need complicates ("complexifies," in Hickey-speak) the fine-grained types for string-expr and integer-expr.

Resolution is possibly a redesign of those fine-grained types, or perhaps new, top-level types that distinguish String and Character and permit a character to be either an Integer or a new type for a primitive character.

Logical `Gt`, `GtE`, etc.

I began by restricting logical cmpops to Eq and NotEq, but then discovered LogicalCompare expressions with Gt, GtE, etc. in expr13-10040d8. I assume these are C-like, in that true == 1 is Gt false == 0. In any event, they're a little weird.

Confused about `binop` -- polymorphic?

Looks like binop contains polymorphic operators, valid for Real, Integer, Complex, plus some things that are valid only for Integer. Do we define tighter types for these operators? If so, how?

(enum-like binop        #{'Add 'Sub 'Mul 'Div 'Pow
                          'BitAnd 'BitOr 'BitXor
                          'BitLShift 'BitRShift})
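One way to tighten the types, sketched here with plain sets rather than MASR's enum-like macro, is to split binop into per-type subsets. The split itself is an assumption: the five arithmetic operators are taken to apply to Integer, Real, and Complex, and the Bit* operators to Integer only. The spec names are hypothetical.

```clojure
(require '[clojure.set :as set]
         '[clojure.spec.alpha :as s])

(def binop
  '#{Add Sub Mul Div Pow BitAnd BitOr BitXor BitLShift BitRShift})

;; Assumption: arithmetic operators are valid for every numeric type.
(def arithmetic-binop '#{Add Sub Mul Div Pow})

;; Assumption: the remaining operators are valid for Integer only.
(def bitwise-binop (set/difference binop arithmetic-binop))

;; A set is a predicate, so it can serve directly as a spec.
(s/def ::integer-binop binop)
(s/def ::real-binop    arithmetic-binop)
(s/def ::complex-binop arithmetic-binop)

(s/valid? ::integer-binop 'BitAnd) ;; => true
(s/valid? ::real-binop 'BitAnd)    ;; => false
(s/valid? ::real-binop 'Pow)       ;; => true
```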

`TupleConstant` looks wrong

    | TupleConstant(expr* elements, ttype type)

shouldn't it be

    | TupleConstant(expr* elements, ttype* types)

? Such would match examples like the following from 922cf65

(is (s/valid? ::asr/TupleConstant
                (TupleConstant
                 [(IntegerConstant 1 (Integer 4 []))
                  (IntegerConstant 2 (Integer 4 []))
                  (StringConstant
                   "a"
                   (Character 1 1 () [])
                   )]
                 (Tuple
                  [(Integer 4 [])
                   (Integer 4 [])
                   (Character 1 1 () [])]))))

Request for preference in syntactic sugar for `ttype`

An example of a fully explicit ttype entity for Integer, is, say

(Integer :kind 4, :dimensions [])

which produces

          {::term ::ttype,
           ::asr-ttype-head
           {::ttype-head ::Integer,
            ::bytes-kind 4
            ::dimensions []}}

This latter is the 'ground truth', fully explicit hash-map that specifies all details explicitly.

This issue concerns the syntax-sugar constructor functions.

This Issue requests a preference for defaulting. Clojure functions can take optional keyword arguments, and keyword arguments absent from a call have value nil unless a default is supplied. This issue boils down to interpreting nil in the syntax-sugar function Integer, that is, what does

(Integer)

mean?

Alternatives:

  1. Missing keyword arguments produce non-conforming specs, i.e., errors. (Integer), (Integer :kind 4), and (Integer :dimensions [6, 42]) would be errors. Upside: explicitness. Downside: verbosity. Every valid call of Integer, say, is as long as (Integer :kind 4, :dimensions []).

  2. Missing keyword arguments are defaulted to :kind 4 and :dimensions [], so that (Integer), (Integer :kind 4), and (Integer :dimensions []) all mean exactly (Integer :kind 4, :dimensions []). Upside: brevity. Downside: missing keywords are defaulted, so that a naive mistake like

(Integer 'foobar [42 45 "foo"])

means exactly (Integer :kind 4, :dimensions [])

Which alternative do you prefer? If neither, do you have a better proposal?
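The two alternatives can be sketched as follows. The function names integer-strict and integer-defaulted are hypothetical, and the returned maps are simplified stand-ins for the fully explicit hash-map shown earlier:

```clojure
;; Alternative 1: missing keyword arguments are errors.
(defn integer-strict
  [& {:keys [kind dimensions]}]
  (if (and (some? kind) (some? dimensions))
    {:kind kind, :dimensions dimensions}
    :invalid-integer))

;; Alternative 2: missing keyword arguments are defaulted.
(defn integer-defaulted
  [& {:keys [kind dimensions]
      :or   {kind 4, dimensions []}}]
  {:kind kind, :dimensions dimensions})

(integer-strict)                          ;; => :invalid-integer
(integer-strict :kind 4 :dimensions [])   ;; => {:kind 4, :dimensions []}
(integer-defaulted)                       ;; => {:kind 4, :dimensions []}
(integer-defaulted :dimensions [])        ;; => {:kind 4, :dimensions []}

;; The naive mistake: 'foobar is read as a key that nothing looks up,
;; so both defaults apply silently.
(integer-defaulted 'foobar [42 45 "foo"]) ;; => {:kind 4, :dimensions []}
```

The last call shows the downside of Alternative 2 concretely: positional arguments are absorbed as an unrelated key-value pair and no error is raised.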

identifier* versus dimension*

I'm assuming that dimension* is an ordered collection of dimension instances. I'll also assume that identifier* is an unordered collection of identifier instances with no duplicates, i.e., a set. @lcompilers, please close this issue if identifier* should be a set, otherwise discuss :)

`Integer` value in the test position of `If` statement ?

I found this in one of the stdout examples. It has an Integer in the pocket. Seems too C-like.

(If ;; TODO: weird: integer value in the pocket?
         (NamedExpr
          (Var 2 a)
          (StringOrd
           (StringConstant "3" (Character 1 1 () []) )
           (Integer 4 [])
           (IntegerConstant 51 (Integer 4 [])) )
          (Integer 4 []) )
         [(=
           (Var 2 x)
           (IntegerConstant 1 (Integer 4 []))
           () )]
         []
         )

I spec it as valid for now:

(s/def ::logical-expr
  (s/or :logical-constant     ::LogicalConstant
        :logical-compare      ::LogicalCompare
        :integer-compare      ::IntegerCompare
        :real-compare         ::RealCompare
        :complex-compare      ::ComplexCompare
        :logical-binop        ::LogicalBinOp
        :logical-not          ::LogicalNot
        :cast                 ::Cast      ;; TODO check return type!
        :if-expr              ::IfExp     ;; TODO check return type!
        :named-expr           ::NamedExpr ;; TODO check return type!
        :var                  ::Var       ;; TODO check return type!
        ))

`ListAppend` : should there be `ttype`'s ?

ASDL:

    | ListAppend(expr a, expr ele)

I'll be more expressive and explicit in Clojure, but wondered ... the list a has a ttype and the element ele has a ttype. They must match, but should the ASDL say something about the ttypes here?

dependencies: what kind of collection of identifiers?

I have already spec'ced identifier-set (unordered, no duplicates allowed), identifier-list (ordered, duplicates allowed), and identifier-suit (ordered, no duplicates).

What kind of collection is the dependencies field in Variable?
