mental32 / monty

A language toolchain for explicitly typed annotated Python.

License: MIT License

Python 0.79% Rust 99.20% Shell 0.01%
python monty cranelift python3 python-language compiler interpreter strongly-typed

monty's Introduction

Monty

A language toolchain for explicitly typed annotated Python.

Abstract

Monty (/ˈmɒntɪ/) is a language toolchain for a subset of Python 3.0+. It is designed from the ground up to be an LSP-first compiler, emphasizing high responsiveness and extraordinarily fast compile times.

Similar alternatives are already mature and pretty impressive, namely:

The compiler aims to maintain the inherent dynamism of Python while incorporating advanced type inference and type-checking capabilities.

Key Differentiators:

  1. Compile-Time Execution of Global Scope: Monty evaluates code in the global scope at compile-time, thereby making the import process static. This architectural decision is intended to encourage the segregation of business logic from initialization code.

  2. Advanced Typing Semantics: Monty's type system is influenced by prominent type checkers such as pytype, pyright, and pyre. It employs advanced type inference algorithms and supports a variety of type-narrowing techniques.
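
To illustrate the first differentiator, here is a minimal sketch in plain Python. Under the model described above, the module-level statement that builds `SQUARES` is the kind of initialization code that would be evaluated at compile time, while `lookup` and `main` hold the logic left for runtime. The names `SQUARES`, `lookup` and `main` are made up for this example, and whether Monty currently accepts this exact program is an assumption.

```python
# Module (global) scope: initialization code. Under the model described
# above, this statement is what would be evaluated at compile time.
SQUARES = tuple(n * n for n in range(8))


def lookup(n: int) -> int:
    # Business logic: only reads the table that was built up front.
    return SQUARES[n]


def main() -> int:
    return lookup(3)
```

Keeping the table construction at module level and the lookups inside functions is the kind of separation between initialization and business logic that the design is meant to encourage.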

Monty is not intended to ever be a production-grade language. It is a project I enjoy working on in my spare time, and most recently I have decided to transition away from total coverage of Python and pick out a strict, toy subset of the language.

Building the compiler

You will need a fairly recent version of rustc; I am building locally with 1.72.0.

Then specify that you would like to build the montyc binary, dropping the --release flag as needed.

cargo build --release --bin montyc

Crate/Repository Layout

  • /montyc is the compiler binary, it is a thin wrapper around montyc_driver
  • /montyc_driver is where all the magic happens, type checking/inference, calls into codegen, etc...
  • /montyc_codegen is where codegen providers are, currently only Cranelift is supported but I'd like to support both LLVM and GCC in the future.
  • /montyc_hlirt is a High Level Interpreter Runtime (HLIRT) and is a minimal but genuine Python interpreter used mainly for compile time evaluation.
  • /montyc_query is where the query interface is defined for the driver.
  • /montyc_flatcode is where AST -> FlatCode lowering happens.
  • /montyc_parser is the parser implementation.
  • /montyc_core is where all fundamental types used in this project go to live.

Related projects

monty's People

Contributors

dependabot[bot], gitter-badger, mental32

monty's Issues

De cranelift-ify the IR

Currently our MIR, Ebb, FluidBlock and Basic block instructions are all more fragile copies of the ones in cranelift-codegen.

Typically this wasn't an issue, since monty was being developed for montyc, which uses cranelift as its main backend, so an IR API that matched made life easier. However, monty is intended to be consumed by many others, such as MIRI, where mirroring the cranelift IR builder API doesn't make sense.

Support class and basic user-creatable types

Allow the following to be valid code:

class Foo:
    pass

def make_foo() -> Foo:
    return Foo()

def main():
    make_foo()

once that is working let's do associated methods:

class Foo:
-     pass
+    def bar(self) -> Foo: return self

builtins support

Currently we don't have any importing mechanisms or module infrastructure, but this shouldn't matter too much for the built-ins since people don't typically use them through the builtins module import.

We now have minimal functioning support for modules & importing!
At the moment builtins have to be explicitly imported but we can fix this easily enough :)

Big list of stuff to implement:

  • abs
  • all
  • any
  • ascii
  • bin
  • bool
  • breakpoint (I don't have a god damn clue how to deal with this one)
  • bytearray
  • callable (Actually this should be fairly simple!)
  • chr
  • classmethod
  • compile
  • complex
  • delattr
  • dict
  • dir
  • divmod
  • enumerate
  • filter
  • float
  • format
  • frozenset
  • getattr
  • globals (comptime only)
  • hasattr (comptime only?)
  • hash
  • hex
  • id (this is basically the "address of")
  • input
  • int (a zero argument version of this should be easy asf)
  • isinstance (comptime only?)
  • issubclass (comptime only?)
  • iter (shrug)
  • len (ehhh kinda UB)
  • list
  • locals (comptime only)
  • map (shrug)
  • max
  • memoryview (what was this?)
  • min
  • next (shrug)
  • object (big shrug)
  • oct
  • open
  • ord
  • pow
  • print (big oof, nearly done)
  • property (SHRUG)
  • range
  • repr
  • reversed
  • round
  • set
  • setattr
  • slice
  • sorted
  • staticmethod
  • str (lmao no, we need heap allocation for this baby)
  • sum
  • super
  • tuple
  • type
  • vars
  • zip
  • __import__

Un-spaghettify the codebase

The codebase is a complete mess at the moment :(

Granted, I expected this. I've just been hacking it together haphazardly and it's probably a good idea to go and look at the overall architectural approach we want to take.

At the moment the flow of information is pretty messy: there are several ways to get the same answer to a question. We still have direct access to the AST when we parse the source into a module node. We do use the Item abstraction to alleviate the pain of working over the AST directly, but this makes it difficult to rewrite and modify the tree (which would be needed for a class of tree-based optimizations and constant operations).
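
As a rough illustration of the kind of tree rewrite mentioned above, here is a minimal constant-folding sketch. It uses CPython's standard `ast` module, not Monty's own AST or Item types, so it only shows the shape of the technique. The names `FoldIntConstants` and `fold_source` are invented for this example.

```python
import ast


class FoldIntConstants(ast.NodeTransformer):
    """Fold `int <op> int` for +, - and * into a single constant node."""

    _OPS = {
        ast.Add: lambda a, b: a + b,
        ast.Sub: lambda a, b: a - b,
        ast.Mult: lambda a, b: a * b,
    }

    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        # Fold the operands first so nested expressions collapse bottom-up.
        self.generic_visit(node)
        fold = self._OPS.get(type(node.op))
        left, right = node.left, node.right
        if (
            fold is not None
            and isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            # `type(...) is int` deliberately excludes bool constants.
            and type(left.value) is int
            and type(right.value) is int
        ):
            folded = ast.Constant(value=fold(left.value, right.value))
            return ast.copy_location(folded, node)
        return node


def fold_source(source: str) -> str:
    """Parse, fold integer constants, and return the rewritten source."""
    tree = FoldIntConstants().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

The rewrite depends on being able to replace a node in place, which is exactly what is awkward when the tree is only reachable through a read-oriented abstraction.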

Support for comptime macros?

I was thinking this might be a nice thing to have; macros are pretty much everywhere and there's even a PEP for it!

I'm not planning on implementing PEP 638 (Syntactic Macros) yet since it's not supported in any official parser anyway.
Instead I was thinking something closer to Rust-like attribute macros.

Here's a contrived syntax example:

import __monty

@__monty.decorator_macro
def discard(node):
    return None 

@discard
class Never:
    """This class never exists."""

The idea is that these macros can only be applied as decorators and accept only one argument node of type ast.AST and return Optional[Union[ast.AST, Iterable[ast.AST]]] where:

  • None means "remove {node}"
  • a single AST instance means "replace {node} with {rv}"
  • any iterable value of AST nodes means "insert {rv} in place of {node}"

At the moment I assume that the implementation, upon discovering a macro function, would "pop" it from the tree, then compile it and call it whenever the macro is used.
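
The three return-value rules above can be sketched with CPython's standard `ast` module. This is an illustration under assumptions, not Monty's implementation: `expand_macros`, `_macro_for` and the `macros` registry are hypothetical names, macros are looked up by bare decorator name, only top-level statements are handled, and at most one macro is applied per node.

```python
import ast
from typing import Callable, Dict, Iterable, List, Optional, Union

MacroResult = Optional[Union[ast.AST, Iterable[ast.AST]]]
Macro = Callable[[ast.AST], MacroResult]


def _macro_for(node: ast.stmt, macros: Dict[str, Macro]) -> Optional[str]:
    """Return the name of the first decorator on `node` that is a known macro."""
    for deco in getattr(node, "decorator_list", []):
        if isinstance(deco, ast.Name) and deco.id in macros:
            return deco.id
    return None


def expand_macros(body: List[ast.stmt], macros: Dict[str, Macro]) -> List[ast.stmt]:
    """Apply decorator macros to a list of statements."""
    out: List[ast.stmt] = []
    for node in body:
        name = _macro_for(node, macros)
        if name is None:
            out.append(node)
            continue
        # Strip the macro decorator so it is not seen again later.
        node.decorator_list = [
            d for d in node.decorator_list
            if not (isinstance(d, ast.Name) and d.id == name)
        ]
        rv = macros[name](node)
        if rv is None:
            continue            # None: remove the node
        if isinstance(rv, ast.AST):
            out.append(rv)      # single node: replace
        else:
            out.extend(rv)      # iterable of nodes: insert in place
    return out
```

For example, with `macros = {"discard": lambda node: None}`, expanding the body of a module that contains `@discard class Never: ...` drops the class entirely.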

Support for data structures

Currently we only support integers and booleans as legal types. This is fun and all, but we should be able to express more complex structures, e.g. (C-like?) strings and tuples.

  • Strings (#3)
  • Tuples
  • Lists
  • Sets
  • Dicts

Tracking Issue: type inference

We need to figure out type inference at some point if we even want a chance at a pleasant experience writing code.

Type annotations are fine in places where we fail to infer types and in ClassDef/FunctionDef nodes.

Everywhere else we kinda need it to work.

Build an IR interpreter

MIRI: let's build an interpreter to consume our IR and execute it in a more controlled environment so we can test more thoroughly for stuff like memory leaks or use-after-frees.

Totally did not borrow this from Rust...
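
To make the idea concrete, here is a toy sketch of an interpreter that tracks allocations while executing a program. The instruction set (`alloc`, `use`, `free` on named slots) is entirely hypothetical and has nothing to do with monty's actual IR; it only shows how running code in a controlled environment lets leaks and use-after-free be detected.

```python
from typing import List, Set, Tuple

# Hypothetical toy IR: each instruction is (opcode, slot name).
Instr = Tuple[str, str]


class UseAfterFree(Exception):
    """Raised when a freed slot is used or freed again."""


def run(program: List[Instr]) -> List[str]:
    """Execute `program` and return the slots that were never freed (leaks)."""
    live: Set[str] = set()
    freed: Set[str] = set()
    for op, slot in program:
        if op == "alloc":
            freed.discard(slot)
            live.add(slot)
        elif op in ("use", "free"):
            if slot in freed:
                raise UseAfterFree(f"{op} of freed slot {slot!r}")
            if slot not in live:
                raise ValueError(f"{op} of unknown slot {slot!r}")
            if op == "free":
                live.remove(slot)
                freed.add(slot)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    # Anything still live at the end was leaked.
    return sorted(live)
```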

Support tracking string constants

Right now our internal visitors just skip over constant string nodes; we should be able to at least represent their presence, potentially in a global string interning table?
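
A string interning table of the kind suggested above could look roughly like this. It is a minimal sketch; `StringInterner`, `intern` and `resolve` are hypothetical names and not part of monty.

```python
from typing import Dict, List


class StringInterner:
    """Map each distinct string to a small, stable integer id."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._strings: List[str] = []

    def intern(self, value: str) -> int:
        """Return the id for `value`, allocating a new one on first sight."""
        existing = self._ids.get(value)
        if existing is not None:
            return existing
        new_id = len(self._strings)
        self._strings.append(value)
        self._ids[value] = new_id
        return new_id

    def resolve(self, string_id: int) -> str:
        """Return the string that `string_id` was allocated for."""
        return self._strings[string_id]
```

A visitor could then record `interner.intern(node_value)` for each constant string node instead of skipping it, and later passes would only need to carry the integer id around.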

Tracking Issue: Documentation

Documentation efforts have been started in /dev/: there is a developer guide book and Sphinx auto-generated API docs.

These need to be expanded upon of course.

Just a few ideas on various narrowing ways

You've probably already considered this, just tried to think of commonly used ways to branch on type :)

# same narrowing
if isinstance(x, int): ...
if type(x) is int: ...
if issubclass(type(x), int): ...

Special cases with singleton stuff, obviously, like:

# same narrowing
if x is None: ...  # (!)
if type(x) is type(None): ...
if isinstance(x, type(None)): ...
if issubclass(type(x), type(None)): ...

More obscure narrowing cases, but should follow automatically if everything's done properly:

# say y is an int
if isinstance(x, type(y)): ...
if type(x) is type(y): ...
if issubclass(type(x), type(y)): ...
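
One caveat worth keeping in mind, shown here with plain CPython semantics: the forms listed as "same narrowing" agree for exact instances but differ for subclasses, because `type(x) is int` rejects subclasses such as `bool` while `isinstance` and `issubclass` accept them. Whether Monty should treat them identically is a design decision; the helper names `describe` and `exact_int` below are made up for this sketch.

```python
from typing import Union


def describe(x: Union[int, str, None]) -> str:
    if x is None:
        # Narrowed to None.
        return "none"
    if isinstance(x, int):
        # Narrowed to int (including subclasses such as bool).
        return f"int:{x + 1}"
    # Only str remains after the two checks above.
    return f"str:{x.upper()}"


def exact_int(x: object) -> bool:
    # Stricter than isinstance: subclasses of int do not pass.
    return type(x) is int
```

In CPython, `isinstance(True, int)` and `issubclass(type(True), int)` are both true, while `exact_int(True)` is false.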

Tracking issue: Better error tracebacks

Currently these don't exist; the current model is to just hit an assertion error or a type exception somewhere deep in the code and figure out the solution yourself.

Missing core "import" machinery.

We gotta have modules but for that we gotta have importlib/import machinery in order to resolve and locate the modules when parsing out from-import and import nodes.

See also importlib._bootstrap. CPython actually appears to have its core import logic implemented in Python; when the interpreter gets built, that source file is compiled to raw bytecode, dumped into a C array, and marked as "frozen".
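
For reference, this is roughly what "resolve and locate" looks like when done with CPython's own machinery. The sketch collects absolute module names from import and from-import nodes and asks `importlib.util.find_spec` where each one lives. It illustrates the task only; monty would need its own resolver, and `imported_modules` and `locate` are hypothetical helper names.

```python
import ast
import importlib.util
from typing import List, Optional


def imported_modules(source: str) -> List[str]:
    """Collect absolute module names from import and from-import nodes."""
    names: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Relative imports (level > 0) are skipped in this sketch.
            names.append(node.module)
    return names


def locate(name: str) -> Optional[str]:
    """Return the origin of a top-level module, or None if it is not found.

    For dotted names find_spec imports the parent package and raises
    ModuleNotFoundError if the parent is missing, so this sketch is only
    intended for top-level names.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```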

`len` always returns 20

Expected behaviour:

assert(len([1, 2, 3]) == 3)

Current behaviour:

assert(len([1, 2, 3]) == 20)
