pyjion's Introduction

Pyjion

Designing a JIT API for CPython

A note on development

Development has moved to https://github.com/tonybaloney/Pyjion

FAQ

What are the goals of this project?

There are three goals for this project.

  1. Add a C API to CPython for plugging in a JIT
  2. Develop a JIT module using CoreCLR utilizing the C API mentioned in goal #1
  3. Develop a C++ framework that any JIT targeting the API in goal #1 can use to make development easier

Goal #1 is to make it so that CPython can have a JIT plugged in as desired (CPython is the Python implementation you download from https://www.python.org/). That would allow for an ecosystem of JIT implementations for Python where users can choose the JIT that works best for their use-case. And by using CPython we hope to have compatibility with all code that it can run (both Python code as well as C extension modules).

Goal #2 is to develop a JIT for CPython using the JIT provided by the CoreCLR. It's cross-platform, liberally licensed, and the original creator of Pyjion has a lot of experience with it.

Goal #3 is to abstract out all of the common bits required to write a JIT implementation for CPython. The idea is to create a framework where JIT implementations only have to worry about JIT-specific stuff like how to do addition and not when to do addition.

How do you pronounce "Pyjion"?

Like the word "pigeon". @DinoV wanted a name that had something with "Python" -- the "Py" part -- and something with "JIT" -- the "JI" part -- and have it be pronounceable.

How does this compare to ...

PyPy is an implementation of Python with its own JIT. The biggest difference compared to Pyjion is that PyPy doesn't support all C extension modules without modification unless they use CFFI or work with the select subset of CPython's C API that PyPy does support. Pyjion also aims to support many JIT compilers while PyPy only supports its own custom JIT compiler.

Pyston is an implementation of Python using LLVM as a JIT compiler. Compared to Pyjion, Pyston has partial CPython C API support but not complete support. Pyston also only supports LLVM as a JIT compiler.

Numba is a JIT compiler for "array-oriented and math-heavy Python code". This means that Numba is focused on scientific computing while Pyjion tries to optimize all Python code. Numba also only supports LLVM.

IronPython is an implementation of Python that is implemented using .NET. While IronPython tries to be usable from within .NET, Pyjion does not have a compatibility story with .NET. This also means IronPython cannot use C extension modules while Pyjion can.

Psyco was a module that monkeypatched CPython to add a custom JIT compiler. Pyjion wants to introduce a proper C API for adding a JIT compiler to CPython instead of monkeypatching it. It should be noted the creator of Psyco went on to be one of the co-founders of PyPy.

Unladen Swallow was an attempt to make LLVM be a JIT compiler for CPython. Unfortunately the project lost funding before finishing its work, after having to spend a large amount of time fixing issues in LLVM's JIT compiler (which has greatly improved over the subsequent years).

Both Nuitka and Shedskin are Python-to-C++ transpilers, which means they translate Python code into equivalent C++ code. Being a JIT, Pyjion is not a transpiler.

Are you going to support OS X and/or Linux?

Yes! Goals #1 and #3 are entirely platform-agnostic, and goal #2 of using CoreCLR as a JIT compiler is not an impediment to supporting OS X or Linux, as CoreCLR already supports the major OSs. The only reason Pyjion doesn't directly support Linux or OS X is momentum/laziness: since the work is being driven by Microsoft employees, it was simply easier to get going on Windows.

Will this ever ship with CPython?

Goal #1 is explicitly to add a C API to CPython to support JIT compilers. There is no expectation, though, to ship a JIT compiler with CPython. This is because CPython compiles with nothing more than a C89 compiler, which allows it to run on many platforms. But adding a JIT compiler to CPython would immediately limit it to only the platforms that the JIT supports.

Does this help with using CPython w/ .NET or UWP?

No.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

pyjion's People

Contributors

bachmann1234, brettcannon, dinov, dwillmer, ethanhs, gvanrossum, hexchain, krysros, meadori, mordechaimaman, msftgits, skyline75489, timgates42, tintoy, tonybaloney, yasoob

pyjion's Issues

Need asynchronous exception support

test_threading has a failing test test_PyThreadState_SetAsyncExc.

We need some way to support asynchronous exceptions. Ideally we wouldn't have to do checks inside the function, but we might need to resort to polling on back branches.
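The polling fallback mentioned above can be sketched in plain Python. This is only an illustration of the idea, not generated code: `ThreadState`, `async_exc`, and `run_loop` are hypothetical names and do not correspond to Pyjion or CPython APIs.

```python
class ThreadState:
    """Hypothetical stand-in for per-thread state holding a pending exception."""

    def __init__(self):
        self.async_exc = None


def run_loop(state, iterations, on_iteration):
    """Simulate a compiled loop that polls for async exceptions on each back branch."""
    i = 0
    while i < iterations:
        on_iteration(i)  # the loop body
        i += 1
        # Back branch: before jumping to the loop header, check whether
        # another thread has scheduled an exception for this thread.
        if state.async_exc is not None:
            exc, state.async_exc = state.async_exc, None
            raise exc
    return i
```

Straight-line code pays nothing under this scheme; only loops carry the cost of one check per iteration.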

Improve ROT_TWO/ROT_THREE support

Currently we don't support ROT_TWO/ROT_THREE with optimized locals, and instead just say they escape if we hit those opcodes. We need to update the code generator to use locals of the correct type to spill into and make sure all of our code tracking them is correct.

Infer the types of dict indexing

If we know the type contents of a dict we can then infer the return type of a subscription operation if we know what the key is. This only works if the dict has not escaped the local scope.
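As a rough sketch of what such an abstract value might look like, written in Python for brevity (the real abstract interpreter lives in C++). `AbstractDict`, `store`, and `subscript_type` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AbstractDict:
    """Abstract value for a dict whose contents are known to the analysis."""

    # Maps a constant key to the inferred type of the value stored under it.
    value_types: Dict[object, type] = field(default_factory=dict)
    # Once the dict escapes the local scope, other code may mutate it.
    escaped: bool = False

    def store(self, key, value_type: type) -> None:
        """Record that d[key] was assigned a value of value_type."""
        self.value_types[key] = value_type

    def subscript_type(self, key) -> Optional[type]:
        """Return the inferred type of d[key], or None when it is unknown."""
        if self.escaped:
            return None
        return self.value_types.get(key)


d = AbstractDict()
d.store("count", int)
print(d.subscript_type("count"))
```

Returning None for an escaped dict keeps the analysis conservative: the compiler would fall back to the generic object path.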

Type infer the complex type

absvalue.cpp and absvalue.h have very limited type inference support for various operations. We should flesh it out and add full type inference support for all unary, binary, and comparison operations for all built-in types.

Need to make locals available in traceback objects

Similar to #9, but we need to do it for traceback objects after an exception has occurred. It probably means we need to save the values when we unwind a function for exceptions, but maybe we could do something smarter. It causes test_frame.py to fail the test_locals test.

Emit direct calls to built-ins

If we've done a LOAD_GLOBAL and know it resolves to a known builtin we should generate a call directly to it. Depends upon dictionary watcher support to get it right (#6).

Consider moving build to CMake

We already need it for RyuJIT anyway, plus it would make our build cross-platform (whether our C++ code is cross-platform is another question :) ).

Add project motivation to README

This is probably premature to ask for as I gather this project is not ready for the limelight, but at some point it would be good to add some explanation of the motivation behind this project to the README.

Some basic questions to answer:

  • What spurred the creation of this project?
  • How does it relate to projects like Pyston and PyPy?
  • Who's most going to want to use this implementation?
  • What level of compatibility with CPython is this implementation targeting?

And so forth.

Use git submodules

Currently we have our GetDeps.bat file do git checkouts into subdirectories for dependencies and then do hard resets on them in order to apply patches. What would be better would be to use git submodule support for those project dependencies. That way all we need to do is apply the patches to the subprojects to get all the code together.

Refactor END_FINALLY

The bytecode is horribly overloaded. Instead of making the JIT have to jump through so many hoops, leading to possible bugs, we should instead fix the bytecode to not try and do so much.

Patch sys._getframe, provide access to optimized locals

Somehow we should have access to the locations of variables in the JITed code (because the CLR needs to provide it for debugging). We should patch sys._getframe and offer a version which can introspect and update our optimized functions.

Infer the type of items of a tuple

Since tuples are immutable, if we know a tuple's contents when it is constructed, we know what indexing or slicing it will return, provided we also know the index/slice value.
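A minimal sketch of that inference in Python, assuming the analysis tracks a tuple as a tuple of element types. The helper names `tuple_index_type` and `tuple_slice_types` are hypothetical.

```python
from typing import Optional, Tuple


def tuple_index_type(item_types: Tuple[type, ...], index: int) -> Optional[type]:
    """Type of t[index] for a tuple with known element types, or None if out of range."""
    if -len(item_types) <= index < len(item_types):
        return item_types[index]
    return None  # indexing would raise IndexError at run time


def tuple_slice_types(item_types: Tuple[type, ...], s: slice) -> Tuple[type, ...]:
    """Element types of t[s]; slicing the type tuple mirrors slicing the value."""
    return item_types[s]


known = (int, str, float)  # e.g. inferred from building (1, "a", 2.0)
print(tuple_index_type(known, 1))
print(tuple_slice_types(known, slice(0, 2)))
```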

Resurrect unladen swallow's dictionary watchers

We need to be able to know when to invalidate code generation that's based upon globals/built-ins. We can probably just resurrect unladen swallow's dictionary watchers to accomplish that.
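To illustrate the invalidation idea only: the sketch below uses a Python `dict` subclass, whereas Unladen Swallow's watchers were implemented in C inside the dict type. `WatchedDict` and `CompiledCode` are hypothetical names for this example.

```python
class WatchedDict(dict):
    """A dict that notifies registered callbacks when a key is set or deleted.

    Only __setitem__ and __delitem__ are intercepted here; methods such as
    update() or pop() bypass them, which a real implementation must handle.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._watchers = []

    def add_watcher(self, callback):
        self._watchers.append(callback)

    def _notify(self, key):
        for callback in self._watchers:
            callback(key)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._notify(key)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._notify(key)


class CompiledCode:
    """Stand-in for generated code that assumed `name` never changes."""

    def __init__(self, namespace, name):
        self.valid = True
        self._name = name
        namespace.add_watcher(self._on_change)

    def _on_change(self, key):
        if key == self._name:
            self.valid = False  # force a fallback or recompile


builtins_ns = WatchedDict(len=len)
code = CompiledCode(builtins_ns, "len")
builtins_ns["len"] = lambda obj: 0  # rebinding invalidates the compiled code
print(code.valid)
```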

Tagged pointer support

If we know we have non-escaping integer values we should emit tagged pointers for them. That should improve integer arithmetic performance while still supporting overflow to long ints. We should only do it on non-escaping values to avoid repeatedly needing to rebox values.
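One common tagging scheme, sketched in Python for readability. The layout (low bit set means "small integer", payload in the remaining 63 bits of a 64-bit word) is an assumption made for this example and is not taken from the Pyjion source. `Boxed` is a hypothetical stand-in for a heap-allocated long int.

```python
from dataclasses import dataclass

MAX_TAGGED = (1 << 62) - 1   # largest value that fits in 63 signed bits
MIN_TAGGED = -(1 << 62)      # smallest value that fits in 63 signed bits


@dataclass
class Boxed:
    """Stand-in for a heap-allocated integer object."""
    value: int


def fits(value: int) -> bool:
    return MIN_TAGGED <= value <= MAX_TAGGED


def tag(value: int) -> int:
    """Encode a small integer: shift left one bit and set the low tag bit."""
    if not fits(value):
        raise OverflowError("value does not fit in a tagged word")
    return (value << 1) | 1


def is_tagged(word: int) -> bool:
    return word & 1 == 1


def untag(word: int) -> int:
    """Decode a tagged word; the arithmetic shift preserves the sign."""
    return word >> 1


def tagged_add(a: int, b: int):
    """Add two tagged words, falling back to a boxed value on overflow."""
    result = untag(a) + untag(b)
    if fits(result):
        return tag(result)
    return Boxed(result)


print(untag(tagged_add(tag(2), tag(3))))
```

The fast path never allocates; only the overflow case produces a boxed object.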

Assigning to f_trace needs to somehow enable tracing

Once a tracing function is installed with sys.settrace it can simply return None all the time. Later, any code can call sys._getframe() and assign to f_trace of a running function, which will cause that function to be traced:

import sys

def f():
    sys.settrace(lambda *args: None)
    print(1)
    g()
    print(2)
    print(3)

def g():
    sys._getframe().f_back.f_trace = lambda *args: print('hi')

f()

W/o the JIT this will print:

1
hi
2
hi
3
hi

We'll need to detect when a frame has its trace function updated and force the code back into the interpreter. It'll probably be a little bit tricky in that we don't want to do a check on every call; instead we probably want to patch the return address from our call out of the frame on the stack.

Can't execute `python -m test`

Triggers an assertion failure at https://github.com/Microsoft/Pyjion/blob/master/Pyjion/absint.h#L176. If you run with test -n you end up with:

PS C:\Users\brcan\Documents\Repositories\Pyjion\Python> .\python.bat -m test -n
Running Debug|x64 interpreter...
== CPython 3.6.0a0 (default, Oct 26 2015, 11:12:33) [MSC v.1900 64 bit (AMD64)]
c:\users\brcan\documents\repositories\pyjion\pyjion\absint.h(176) : Assertion failed: !(valueInfo.Value == &Undefined && !isUndefined)
==   Windows-10.0.10240 little-endian
==   hash algorithm: siphash24 64bit
==   C:\Users\brcan\Documents\Repositories\Pyjion\Python\build\test_python_6592
Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1, isolated=0)
[  1/398] test_grammar
Unknown unsupported opcode: YIELD_FROMC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\vector(159) : Assertion failed: vector iterator + offset out of range
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\vector(160) : Assertion failed: "Standard C++ Libraries Out of Range" && 0

Inline critical builtins

There are probably some built-ins which are simple enough and important enough in various scenarios that they deserve to be inlined: min, max, abs, etc. This also depends upon dictionary watchers (#6) for a correct implementation.

The chameleon_v2 benchmark locks up in an infinite loop

Looks like it's a nasty case of PyJit_RichEquals_Str using PyUnicode_CheckExact and PyJit_RichEquals_Generic using PyUnicode_Check and them never agreeing on who should handle things (although changing PyJit_RichEquals_Generic to use PyUnicode_CheckExact causes the benchmark to fail).

Profile function arguments

The Python/JIT interface is now updated so that we can provide an address and a data address which is provided to us at callback time. We can use this to return a single function which does profiling of the Python code and records the argument types. We can then call back to the no-jit version of PyEval. After a sufficient number of traces we can then run the abstract interpreter with the input argument types and generate a function with guards, or a guard function + an implementation function that is specialized for the input types.
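The profile-then-specialize flow can be sketched at the Python level with a decorator. This is an illustration of the control flow only: `profile_then_specialize`, `threshold`, and the `specialize` callback are hypothetical, and the real mechanism would operate on code objects through the C-level JIT interface.

```python
import functools
from collections import Counter


def profile_then_specialize(threshold, specialize):
    """Record argument types; after `threshold` calls with the same types,
    ask `specialize(func, types)` for a version specialized to them."""

    def decorator(func):
        seen = Counter()   # argument-type signature -> number of calls observed
        specialized = {}   # argument-type signature -> specialized callable

        @functools.wraps(func)
        def wrapper(*args):
            signature = tuple(type(arg) for arg in args)
            fast = specialized.get(signature)  # guard: exact type match
            if fast is not None:
                return fast(*args)
            # Guard failed or not specialized yet: profile and run the
            # generic (unspecialized) version.
            seen[signature] += 1
            if seen[signature] >= threshold:
                specialized[signature] = specialize(func, signature)
            return func(*args)

        wrapper.specialized = specialized
        return wrapper

    return decorator


def identity_specializer(func, signature):
    # A real JIT would compile code specialized for `signature` here.
    return func


@profile_then_specialize(threshold=2, specialize=identity_specializer)
def add(a, b):
    return a + b


add(1, 2)
add(1, 2)
print(list(add.specialized))
```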

Look into improving looping bytecode

It has been said that the fact Python leaves something on the stack for each loop iteration is a bit of a pain for the CLR. Python's bytecode should probably be examined to see why the item is left on and if it is really worth keeping those semantics or if it would work out better to not leave the item on the stack and thus make the corresponding code in the JIT easier to work with.
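The behavior in question can be inspected with the standard `dis` module. In CPython, `GET_ITER` pushes the iterator, and `FOR_ITER` leaves it on the stack while pushing each next item on top of it. Exact opcode names and the surrounding instructions vary between CPython versions, so treat the output as version-specific.

```python
import dis


def loop(items):
    for item in items:
        pass


# Collect the opcode names for the loop; the iterator created by GET_ITER
# stays on the value stack for the whole duration of the loop.
opnames = [instruction.opname for instruction in dis.get_instructions(loop)]
print(opnames)
```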

Refactor abstract interpreter to drive code gen

Currently the compiler is querying the abstract interpreter for how it should emit code. Instead the compiler interface shouldn't deal w/ byte code directly - it should just support primitive operations:

push_const(PyObject* value)
object_binary(int op)
float_binary(int op)
tagged_pointer_binary(int op)
object_unary(int op)
float_unary(int op)

etc... And the abstract interpreter can call back on the interface knowing the types that need to be operated on. Over time the abstract interpreter could also be refactored to use this in a cleaner way than a ton of if's...
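A small Python sketch of the proposed direction, where the abstract interpreter tracks types and drives a compiler interface made of primitive operations. The method names mirror the list above; the tiny instruction format, `RecordingCompiler`, and `interpret` are hypothetical and exist only for this illustration.

```python
class RecordingCompiler:
    """Compiler interface that records the primitive operations it is asked to emit."""

    def __init__(self):
        self.ops = []

    def push_const(self, value):
        self.ops.append(("push_const", value))

    def object_binary(self, op):
        self.ops.append(("object_binary", op))

    def float_binary(self, op):
        self.ops.append(("float_binary", op))


def interpret(instructions, compiler):
    """Walk ("const", value) / ("binary", op) instructions, tracking abstract
    types on a stack and choosing the typed primitive to emit.

    Only arithmetic operators are assumed here, so float op float is float.
    """
    type_stack = []
    for kind, arg in instructions:
        if kind == "const":
            type_stack.append(type(arg))
            compiler.push_const(arg)
        elif kind == "binary":
            right = type_stack.pop()
            left = type_stack.pop()
            if left is float and right is float:
                compiler.float_binary(arg)   # unboxed fast path
                type_stack.append(float)
            else:
                compiler.object_binary(arg)  # generic object path
                type_stack.append(object)
        else:
            raise ValueError(f"unknown instruction kind: {kind}")


compiler = RecordingCompiler()
interpret([("const", 1.5), ("const", 2.0), ("binary", "+")], compiler)
print(compiler.ops)
```

With this split, the compiler never inspects bytecode or types itself; it only implements each primitive.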

Add co_jitted to code objects, with dictionary for setting options

It exists at the C level, but isn't exposed at the Python level. We should also add a dictionary object which allows users to put in arbitrary key/values. Various JITs can pick those key/values up and use them.

There's also probably a bad circular reference between jitted code and code objects right now which should go away.
