kljohann / genpybind

Autogeneration of Python bindings from manually annotated C++ headers

License: Other

genpybind

genpybind is a clang-based tool that automatically generates the code needed to expose a C++ API as a Python extension via pybind11. To reduce the complexity and required heuristics, it relies on additional manual hints that are added to the header file in the form of unobtrusive annotation macros (see footnote 1). While this requires that you are able to change the actual interface declaration, it results in a succinct file that describes both the C++ and Python interface of your library. As a consequence, however, manually writing pybind11 code is still necessary for code which is not under your control.

That said, a simple class that should be exposed via a Python extension could look as follows:

#pragma once

#include "genpybind.h"

class GENPYBIND(visible) Example {
public:
  static constexpr int GENPYBIND(hidden) not_exposed = 10;

  /// \brief Do a complicated calculation.
  int calculate(int some_argument = 5) const;

  GENPYBIND(getter_for(something))
  int getSomething() const;

  GENPYBIND(setter_for(something))
  void setSomething(int value);

private:
  int _value = 0;
};

The resulting extension can then be used like this:

>>> import pyexample as m
>>> obj = m.Example()
>>> obj.something
0
>>> obj.something = 42
>>> obj.something
42
>>> obj.calculate() # default argument
47
>>> obj.calculate(2)
44
>>> help(obj.calculate)
Help on method calculate in module pyexample:

calculate(...) method of pyexample.Example instance
    calculate(self: pyexample.Example, some_argument: int=5) -> int

    Do a complicated calculation.

As you can see, annotations are included inline to control what is exposed to the Python extension, whether getters or setters are exposed as a class property, and so on. The resulting Python extension will, among other things, include docstrings, argument names and default arguments for functions. Imagine how much time you will save by not manually keeping the Python bindings and header files in sync! For the example presented above, genpybind will generate the following:

auto genpybind_class_decl__Example_Example =
    py::class_<::Example>(m, "Example");

{
  typedef int (::Example::*genpybind_calculate_type)(int) const;
  genpybind_class_decl__Example_Example.def(
      "calculate", (genpybind_calculate_type) & ::Example::calculate,
      "Do a complicated calculation.", py::arg("some_argument") = 5);
}
genpybind_class_decl__Example_Example.def(py::init<>(), "");
genpybind_class_decl__Example_Example.def(py::init<const ::Example &>(), "");
genpybind_class_decl__Example_Example.def_property(
    "something", py::cpp_function(&::Example::getSomething),
    py::cpp_function(&::Example::setSomething));

Implementation

The current implementation was started as a proof-of-concept to see whether the described approach was viable for an existing code base. Due to its prototypical and initially fast-changing nature it was based off the libclang bindings. However, as of clang 5.0.0 not all necessary information was available via this API. (For example, implicitly instantiated constructors are not exposed.) To work around this issue, several patches on top of libclang are included in this repository (some of them have already been merged upstream; some are rather hacky or not yet finished/fully tested). In addition, a genpybind-parse tool based on the internal libtooling clang API is used to extend/amend the abstract syntax tree (e.g. instantiate implicit member functions) and store it in a temporary file. This file is then read by the Python-based tool via the patched libclang API.

Evidently, now that the approach has been shown to work, the implementation could transition to be a single C++ tool based on the internal libtooling API. I eventually plan to go down that road.

Known defects and shortcomings

  • Documentation is minimal at the moment. If you want to look at example use-cases the integration tests might provide a starting point.
  • Expressions and types in default arguments, return values or GENPYBIND_MANUAL instructions are not consistently expanded to their fully qualified form. As a workaround it is suggested to use the fully-qualified name where necessary.

Installation

  1. Build and install llvm/clang 5.0.0 with the patches provided in llvm-patches. You can use a different prefix when installing, to prevent the patched clang from interfering with the version provided by your distribution. Let's assume you unpacked the source code to $HOME/llvm-src and used -DCMAKE_INSTALL_PREFIX=$HOME/llvm.

  2. Make sure genpybind can find the libclang Python bindings:

    export PYTHONPATH=$HOME/llvm-src/tools/clang/bindings/python \
      LD_LIBRARY_PATH=$HOME/llvm/lib

  3. Build the genpybind-parse executable:

    PYTHON=/usr/bin/python2 CXX=/bin/clang++ CC=/bin/clang \
      LLVM_CONFIG=$HOME/llvm/bin/llvm-config \
      ./waf configure --disable-tests
    ./waf build

    Note that custom Python/compiler/llvm-config executables can be provided via environment variables. If you happened to use the -DBUILD_SHARED_LIBS=ON option when building clang you need to pass --clang-use-shared to waf configure.

    Optional: If you want to build and run the integration tests you need to install pytest and pybind11 and should remove the --disable-tests argument to waf configure. You can use the --pybind11-includes option to point to the include path required for pybind11.

  4. Install the genpybind tool:

    ./waf install

    By default genpybind will be installed to /usr/local/. Use the --prefix argument of waf configure if you prefer a different location.

  5. Create Python bindings from your C++ header files:

    # Remember to set up your environment as done in step 2 each time you run genpybind:
    export PYTHONPATH=$HOME/llvm-src/tools/clang/bindings/python \
      LD_LIBRARY_PATH=$HOME/llvm/lib
    # The following assumes that both `genpybind` and `genpybind-parse` are on your path.
    genpybind --genpybind-module pyexample --genpybind-include example.h -- \
      /path/to/example.h -- \
      -D__GENPYBIND__ -xc++ -std=c++14 \
      -I/path/to/some/includes \
      -resource-dir=$HOME/llvm/lib/clang/5.0.0

    The flags after the second -- are essentially what you would pass to the compiler when processing the translation unit corresponding to the header file.
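
For scripting, the invocation from step 5 can also be assembled programmatically. The following is a minimal sketch that only builds the argument list. The helper name genpybind_command and its parameters are hypothetical; the flags are the ones shown in step 5.

```python
import shlex


def genpybind_command(module, header, compiler_flags, include_name=None):
    """Assemble the genpybind argument list from step 5 (hypothetical helper)."""
    # By default, use the header's file name for --genpybind-include.
    include_name = include_name or header.rsplit("/", 1)[-1]
    return [
        "genpybind",
        "--genpybind-module", module,
        "--genpybind-include", include_name,
        "--",
        header,
        "--",
        # Define and language options taken from the example in step 5.
        "-D__GENPYBIND__", "-xc++", "-std=c++14",
        # Everything else is what you would pass to the compiler anyway.
        *compiler_flags,
    ]


cmd = genpybind_command(
    "pyexample",
    "/path/to/example.h",
    ["-I/path/to/some/includes"],
)
print(shlex.join(cmd))
```

Further compiler flags such as -resource-dir can be appended through compiler_flags in the same way. The resulting list can be handed to subprocess.run(cmd, check=True) once the environment from step 2 is set up.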

Keywords

arithmetic

enum GENPYBIND(arithmetic) Access { Read = 4, Write = 2, Execute = 1 };

To allow arithmetic on enum elements use the arithmetic keyword.

See enums.h and enums_test.py.
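
To give an idea of what arithmetic on enum elements means on the Python side, here is a rough pure-Python analogue built on the standard library's IntFlag. It is not the type that genpybind generates, and the result type of operations on the bound enum may differ; only the enumerator values are taken from the example above.

```python
from enum import IntFlag


class Access(IntFlag):
    """Pure-Python stand-in mirroring the C++ enumerator values above."""
    Read = 4
    Write = 2
    Execute = 1


# Combine two enumerators with a bitwise operator.
read_write = Access.Read | Access.Write
print(int(read_write))

# Test whether a particular bit is set in the combination.
print(bool(read_write & Access.Execute))
```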

dynamic_attr

The dynamic_attr keyword controls whether dynamic attributes (adding additional members at run-time) are allowed:

struct GENPYBIND(visible) Default {
  void some_function() const {}
  bool existing_field = true;
};

struct GENPYBIND(dynamic_attr) WithDynamic {
  void some_function() const {}
  bool existing_field = true;
};

See dynamic_attr.h and dynamic_attr_test.py.
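
The difference as seen from Python can be sketched with two pure-Python stand-ins. This assumes the behaviour pybind11 documents for its dynamic_attr option: without it, assigning an unknown attribute raises an AttributeError. The classes below only imitate that behaviour (using __slots__) and are not generated bindings; can_add_attribute is a hypothetical helper.

```python
class Default:
    """Stand-in for a binding without dynamic_attr: the attribute set is fixed."""
    __slots__ = ("existing_field",)

    def __init__(self):
        self.existing_field = True


class WithDynamic:
    """Stand-in for a binding with dynamic_attr: new attributes are accepted."""

    def __init__(self):
        self.existing_field = True


def can_add_attribute(obj):
    """Try to attach a new attribute at run-time and report whether it worked."""
    try:
        obj.new_field = 1
    except AttributeError:
        return False
    return True


print(can_add_attribute(Default()))
print(can_add_attribute(WithDynamic()))
```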

expose_as

expose_as lets you give the Python binding a name different from the one in the C++ source:

GENPYBIND(expose_as(some_other_name))
bool name;

This also makes it possible to populate Python's private/name-mangled or special (double-underscore) variables and functions:

GENPYBIND(expose_as(__hash__))
int hash() const;

See expose_as.h and expose_as_test.py.

getter_for/setter_for/accessor_for

Python properties are supported by the getter_for and setter_for keywords:

GENPYBIND(getter_for(value))
int get_value() const;

GENPYBIND(setter_for(value))
void set_value(int value);

GENPYBIND(getter_for(readonly))
bool computed() const;

getter_for and setter_for are shorthands for accessor_for(..., get/set).

See properties.h and properties_test.py.
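
From Python, the annotated accessors behave like ordinary properties. The following pure-Python stand-in sketches the intended effect. The class name Something, the stored _value and the return value of computed are assumptions made for illustration; the accessor methods are only kept here so that property has something to call.

```python
class Something:
    """Stand-in showing the Python-visible effect of getter_for/setter_for."""

    def __init__(self):
        self._value = 0

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = value

    def computed(self):
        return True  # placeholder result; the header does not specify it

    # getter_for(value) and setter_for(value) form a read/write property.
    value = property(get_value, set_value)
    # getter_for(readonly) without a matching setter is read-only.
    readonly = property(computed)


obj = Something()
obj.value = 42
print(obj.value)

try:
    obj.readonly = False
except AttributeError as exc:
    print("read-only:", exc)
```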

hidden

See visible.

hide_base

See hide_base.h and hide_base_test.py.

holder_type

Cf. pybind11's PYBIND11_DECLARE_HOLDER_TYPE.

See holder_type.h and holder_type_test.py.

inline_base

See inline_base.h and inline_base_test.py.

keep_alive

To control the lifetime of objects passed to or returned from (member) functions, keep_alive can be used. keep_alive(bound, who) indicates that who should be kept alive at least until bound is garbage collected. An argument to keep_alive can be either the name of a function parameter or one of return or this, where return refers to the return value of the function and this refers to the instance a member function is called on.

GENPYBIND(keep_alive(this, child))
Parent(Child *child);

As long as the instance of Parent is alive, child will not be garbage collected, even if no other Python reference to it remains.

See keep_alive.h and keep_alive_test.py.
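
The lifetime relationship described by keep_alive(this, child) can be imitated in pure Python by letting the parent hold a strong reference to the child. This is only an analogy under CPython's reference counting, not how the generated binding is implemented; the classes and the _keep_alive attribute are stand-ins.

```python
import gc
import weakref


class Child:
    pass


class Parent:
    def __init__(self, child):
        # Stand-in for keep_alive(this, child): the parent keeps the child alive.
        self._keep_alive = child


child = Child()
child_ref = weakref.ref(child)   # lets us observe the child without owning it
parent = Parent(child)

del child                        # drop our own reference to the child
gc.collect()
alive_with_parent = child_ref() is not None
print(alive_with_parent)         # the parent still holds the child

del parent                       # now nothing keeps the child alive
gc.collect()
print(child_ref() is None)
```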

module

Using the module keyword, C++ namespaces can be turned into submodules of the generated Python module. In the following example, X would be exposed as name_of_module.submodule.X, where name_of_module is the name of the outer Python module.

namespace submodule GENPYBIND(module) {
class GENPYBIND(visible) X {};
} // namespace submodule

See submodule.h and submodule_test.py.

noconvert

Implicit conversion of function arguments can be controlled with the noconvert keyword:

GENPYBIND(noconvert(value))
double noconvert(double value);

GENPYBIND(noconvert(first))
double noconvert_first(double first, double second);

If noconvert(...) is called with an argument of any type other than double, a TypeError is raised instead of converting it implicitly. For multi-argument functions, the behaviour can be controlled on a per-argument basis.

See noconvert.h and noconvert_test.py.

opaque

Allows you to "inline" the underlying type at the location of a typedef, as if it were defined there. As the name of this feature may lead to confusion with pybind11's PYBIND11_MAKE_OPAQUE, it will likely be renamed or redesigned in an upcoming release. More details can be found in issue #24.

See expose_as.h and expose_as_test.py.

postamble

Unscoped GENPYBIND_MANUAL macros can be used to add preamble and postamble code to the generated bindings, e.g. for importing required libraries or executing Python code that dynamically patches the generated bindings:

GENPYBIND(postamble)
GENPYBIND_MANUAL({
  auto env = parent->py::module::import("os").attr("environ");
  // should not have any effect as this will be run after preamble code
  env.attr("setdefault")("genpybind", "postamble");
  env.attr("setdefault")("genpybind_post", "postamble");
})

GENPYBIND_MANUAL({
  auto env = parent->py::module::import("os").attr("environ");
  env.attr("setdefault")("genpybind", "preamble");
})

See manual.h and manual_test.py.

readonly

readonly is an alias for writable(false).

required

GENPYBIND(required(child))
void required(Child *child);

If a pointer argument is annotated with required, calling the function from Python with None for that argument will raise a TypeError.

See required.h and required_test.py.

return_value_policy

The return value policy controls how returned references are exposed to Python:

Nested &ref();

const Nested &cref() const;

GENPYBIND(return_value_policy(copy))
Nested &ref_as_copy();

struct GENPYBIND(visible) Parent {
  GENPYBIND(return_value_policy(reference_internal))
  Nested &ref_as_ref_int();
};

By default, the automatic return value policy of pybind11 is used. In the case of ref and cref in the example this amounts to "return by value" for the wrapped Python functions. This behavior is unchanged when the function is explicitly annotated to return by value (see ref_as_copy). As ref_as_ref_int demonstrates, any other return value policy supported by pybind11 can be set. In this case reference_internal is used to return a reference to an existing object, whose lifetime is tied to the parent object.

See return_by_value_policy.h and return_by_value_policy_test.py.

stringstream

The stringstream keyword populates the __str__ and __repr__ functionality:

GENPYBIND(stringstream)
friend std::ostream &operator<<(std::ostream &os, const Something &) {
  return os << "uiae";
}

See stringstream.h and stringstream_test.py.

visible

Whether a binding is generated is controlled by the visibility keywords visible and hidden:

class Unannotated {};

class GENPYBIND(hidden) Hidden {};

class GENPYBIND(visible) Visible {};

Any GENPYBIND annotation will make the annotated entity visible. As a consequence visible can be removed from the argument list, as soon as there are any other arguments to GENPYBIND.

Anything without an annotation is excluded by default, but the intent of hiding it from bindings can be explicitly stated by the keyword hidden.

If a namespace is annotated with visible, any contained entity will be made visible by default, even if it has no GENPYBIND annotations. The hidden keyword can then be used to hide it.

See visibility.h and visibility_test.py.

writeable

Constness is transported from C++ to Python automatically. In addition, variables can be set to be read-only by the writable keyword:

const int const_field = 2;
GENPYBIND(writeable(false))
int readonly_field = 4;

For both const_field and readonly_field, an AttributeError will be raised if set from Python.

See variables.h and variables_test.py.

License

See License.


Footnotes

  1. During normal compilation these macros have no effect on the generated code, as they are defined to be empty. The annotation system is implemented using the annotate attribute specifier, which is available as a GNU language extension via __attribute__((...)). As the annotation macros only have to be parsed by clang and are empty during normal compilation, the annotated code can still be compiled by any C++ compiler. See genpybind.h for the definition of the macros.

Contributors

cpehle, kljohann, muffgaga, phispilger, schmitts

genpybind's Issues

genpybind not found, when loading spack-installed package

When doing the following:

$ spack load -r genpybind ^[email protected] %[email protected].
[email protected]  [email protected]  
$ genpybind
Traceback (most recent call last):
  File "/wang/environment/software/jessie/spack/spack-staging/opt/spack/linux-debian8-x86_64/gcc-7.2.0/genpybind-develop-ymkuj5q5gyxfuaqwgcg2oqcols2hv5os/bin/genpybind", line 4, in <module>
    from genpybind.tool import main; main()
ModuleNotFoundError: No module named 'genpybind'

So genpybind and genpybind-parse are on PATH, but PYTHONPATH does not contain the lib/pythonX.X/site-packages/ folder of the genpybind package. Is there an extends('python') missing in the package.py?
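
A quick way to check this from the same environment is to ask the interpreter whether the package can be found at all. The snippet below is a small diagnostic sketch using only the standard library; is_importable is a hypothetical helper name. It has to be run with the same Python interpreter that the genpybind script uses.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if a top-level module can be found on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if not is_importable("genpybind"):
    # List the search path so a missing site-packages folder is easy to spot.
    print("genpybind is not importable; sys.path entries:")
    for entry in sys.path:
        print("  ", entry)
```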

Integration into popular build systems

In order for people to use genpybind in projects based on e.g. CMake, bazel (and its clones), Meson, gn, … it would be helpful to provide them with instructions on how to integrate the code-generating step of genpybind into the build.

Aliased base class yields `child.referenced.kind not in cutils.RECORD_KINDS`

When trying to expose a class with "alias" bases, e.g.

using AliasForSomething = Something<Whatever>;

struct GENPYBIND(expose_as(Something), inline_base("*Something*")) PySomething
    : public AliasForSomething /* Something<Whatever> would work */
{
// ...

an assertion in genpybind/decls/klasses.py is triggered, printing CursorKind.TYPE_ALIAS_DECL.

Improve `opaque` keyword

The name is misleading and the documentation is wrong (updated to point to this issue in commit 7a1a62f). Consider a new name, e.g. expose_here for opaque(true) / opaque. (Or redesign the feature…) The status quo:

  • If opaque is used, the referred type is not exposed at its original location, but at the location of the typedef (and with the spelling of the typedef). Any extra arguments to GENPYBIND on the typedef are passed on to the underlying declaration.
  • If opaque(false) is used, we wait until other things have been processed. Iff the underlying declaration has already been exposed elsewhere once the typedef reaches the expose_later stage, a simple alias is created (m.attr("name_of_typedef") = underlying_decl;). Else it is exposed at the location of the typedef as if opaque(true) had been used at its original location and an alias is created.

Operators aren't exposed for deeper-than-1-stage inheritance

namespace detail {
template<typename Derived, typename T>
struct BaseType
{
    // comparison operator here
};
} // namespace detail

namespace X {
struct GENPYBIND(inline_base("*")) Value : public Y::Value
{
    constexpr explicit Value(uintmax_t value = 0) GENPYBIND(implicit_conversion) : base_t(value) {}
};
} // namespace X

namespace Y {
struct GENPYBIND(inline_base("*")) Value : public detail::BaseType<Value, uint32_t>
{
    constexpr explicit Value(uintmax_t value = 0) GENPYBIND(implicit_conversion) : base_t(value) {}
};
} // namespace Y

yields

// FIXME: expose [Operator CursorKind.FUNCTION_DECL detail::operator==]
//        ('operator==', 2) -> (('eq', 'eq'), 'l == r')
//        ['const ::Y::Value', 'const ::Y::Timer::Value']

Using `{g,s}etter_for` in CRTP-base classes does not expose the attribute

When deriving from a CRTP base, the getter/setter-annotated attributes are not exposed correctly.
A minimal example will follow… cf. PR #29.

Getter (42 == obj.mvalue):

TypeError: (): incompatible function arguments. The following argument types are supported:
    1. (arg0: Base<Enum>) -> int

Invoked with: [42]

Setter: (obj.mvalue = 42)

TypeError: (): incompatible function arguments. The following argument types are supported:
    1. (arg0: Base<Enum>, arg1: int) -> None

Invoked with: [42], 23

exposed via:

genpybind_struct_decl__Enum_Enum.def_property("mvalue", py::cpp_function(&::Enum::get_value), py::cpp_function(&::Enum::set_value));

call to genpybind fails if TMPDIR exists

@obreitwi reported (cf. electronicvisions#1)

If TMPDIR is set and points to an existing directory, calls to genpybind will fail:

Traceback (most recent call last):
  File "/wang/environment/software/jessie/spack/2017-12-01/view/bin/genpybind", line 4, in <module>
    from genpybind.tool import main; main()
  File "/wang/environment/software/jessie/spack/2017-12-01/view/lib/python2.7/site-packages/genpybind/tool.py", line 51, in main
    shutil.rmtree(name)
  File "/wang/environment/software/jessie/spack/2017-12-01/view/lib/python2.7/shutil.py", line 270, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "/wang/environment/software/jessie/spack/2017-12-01/view/lib/python2.7/shutil.py", line 268, in rmtree
    os.rmdir(path)
OSError: [Errno 39] Directory not empty: '/scratch/obreitwi/2018-01-29_haldls/build/shitness/abc/genpybindAtSREY'

I could not reproduce this using:

$ find; TMPDIR=$PWD/def python -c "import tempfile, shutil, subprocess; name = tempfile.mkdtemp(prefix='juhu'); open('{}/test'.format(name), 'a').close(); print 'Temp:', name; print subprocess.check_output('find', shell=True); shutil.rmtree(name)"; find
.
./def
Temp: /wang/users/mueller/cluster_home/abc/def/juhutRn2nG
.
./def
./def/juhutRn2nG
./def/juhutRn2nG/test

.
./def

It could be related to some NFS-specific behavior? (@obreitwi is it always happening?)
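
For rerunning the check on the affected file system, here is a Python 3 port of the one-liner above, as a sketch. The function name create_and_remove is hypothetical, and the prefix is only chosen to resemble the directory name in the traceback. Unlike the one-liner, it passes the base directory explicitly instead of relying on tempfile picking up TMPDIR.

```python
import os
import shutil
import tempfile


def create_and_remove(base_dir):
    """Create a temp dir with one file under base_dir, then remove it again."""
    name = tempfile.mkdtemp(prefix="genpybind", dir=base_dir)
    # Put a file inside so the directory is not empty when it is removed.
    with open(os.path.join(name, "test"), "a"):
        pass
    shutil.rmtree(name)  # the call that raised OSError in the report
    return name


# Use TMPDIR if it is set, otherwise fall back to the system default.
base = os.environ.get("TMPDIR") or tempfile.gettempdir()
removed = create_and_remove(base)
print("removed:", removed, "still exists:", os.path.exists(removed))
```

If this raises the same OSError on the NFS mount but not on a local disk, that would point towards file-system behavior rather than genpybind itself.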

Adding wrapper code for already wrapped types

struct GENPYBIND(visible) TopLevel
{
    // already wrapped somewhere else
    typedef SomeComponents some_type GENPYBIND(opaque);

    GENPYBIND_MANUAL({
        auto attr = parent.attr("some_type");
        auto ism = parent->py::is_method(attr);
        attr.attr("to_numpy") = parent->py::cpp_function(/* code */, ism);
    })
};

In some cases it might be nice to modify the wrapper code of some element contained in the wrapper code of a TopLevel class instead of adding the member dynamically via attr.

Extend documentation: Use-cases, examples and motivation

In order to better understand what benefit genpybind could provide to a project, some use cases and links to existing projects might be helpful. For example:

  • Expose a C++ API to Python: This one is obvious, but can also be achieved using hand-written bindings. In addition to reducing the manual effort, genpybind helps in those cases where your interface is still undergoing regular changes, where manual bindings might be a hindrance.
  • Interactive use and exploration of a library's API: Without much overhead you can add a single line in order to exercise your newly-written function using the REPL.
  • Facilitated unit testing: (This is one of my favourite points.)
    • Writing tests in Python could be easier, faster or more approachable. (In addition it does not require knowledge of C++, which might help depending on your developers'/testers' backgrounds.)
    • Unit testing libraries like https://docs.pytest.org could reduce the boilerplate involved in writing a test. It may be faster to go from interactive exploration of your implementation in the REPL to a unit test.
    • You can take advantage of property based testing using a powerful, stable and well-tested framework like https://hypothesis.readthedocs.io.

Also, standalone examples and links to projects that make use of genpybind (e.g. https://github.com/electronicvisions) would likely be of interest for potential users.
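
As an illustration of the unit-testing use case, the following sketch shows what pytest-style tests for the README's Example class could look like. The expected values are the ones from the README's interactive session. The fallback class is an assumption that only exists so the sketch runs without the built extension; its calculate body is inferred from that session, not taken from the C++ source.

```python
try:
    import pyexample as m  # the generated extension, if it has been built
except ImportError:
    class _StandIn:
        """Fallback so the sketch runs without the extension (assumption)."""

        class Example:
            def __init__(self):
                self.something = 0

            def calculate(self, some_argument=5):
                # Inferred from the README's session, not from the C++ source.
                return self.something + some_argument

    m = _StandIn


def test_default_argument():
    obj = m.Example()
    obj.something = 42
    assert obj.calculate() == 47


def test_explicit_argument():
    obj = m.Example()
    obj.something = 42
    assert obj.calculate(2) == 44


if __name__ == "__main__":
    test_default_argument()
    test_explicit_argument()
    print("ok")
```

Functions named test_* containing plain assert statements are collected by pytest as they are, so no further boilerplate is needed.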

Update pybind11 version

The latest pybind11 version genpybind has been developed against is 2.2.1. Consequently it's also used in the CI and unit tests. We should update to a recent pybind11 version and introduce support for potential new features, e.g.:

There have also been some changes that have an effect on our unit tests, which need to be adjusted:
