mmomtchev / pymport
Use Python libraries from Node.js
License: ISC License
Currently, there is no special type for proxified `PyObject`s - they have to be declared as `any` or `unknown`.
`pymport('numpy')` on Node.js 16.x / Ubuntu 22.04 leads to a crash because of a conflict between the two OpenSSL versions.
When using `PyObject.fromJS()`, `Infinity` gets transformed to an integer and loses its special value.
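A minimal sketch of the guard a conversion layer could apply; `describePythonNumber` is a hypothetical helper, not the pymport internals - it only illustrates that non-finite values must map to Python floats, never to integers:

```javascript
// Hypothetical guard before handing a JS number to Python: non-finite
// values should become Python floats (float('inf'), float('nan')),
// never integers, so their special value survives the round trip.
function describePythonNumber(n) {
  if (!Number.isFinite(n)) {
    return { pyType: 'float', special: String(n) };
  }
  return Number.isInteger(n)
    ? { pyType: 'int', value: n }
    : { pyType: 'float', value: n };
}

console.log(describePythonNumber(Infinity)); // { pyType: 'float', special: 'Infinity' }
console.log(describePythonNumber(2));        // { pyType: 'int', value: 2 }
```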
The `[]` operator in JavaScript returns `undefined` if the key is not defined, while the same operator in Python raises an exception. `PyObject.item()` should follow the JS convention.
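The JS convention can be illustrated with a plain `Proxy` get trap - purely illustrative, not pymport code - which returns `undefined` for missing keys instead of raising a `KeyError` the way Python's `[]` does:

```javascript
// A get trap following the JS convention: missing keys yield undefined
// rather than throwing, unlike Python's __getitem__.
const d = new Proxy({ a: 1 }, {
  get(target, key) {
    return key in target ? target[key] : undefined;
  }
});

console.log(d.a);       // 1
console.log(d.missing); // undefined
```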
The manual use of `dlopen` with `module` has side-effects on the Node.js binary addon loading mechanism:
Welcome to Node.js v18.17.1.
Type ".help" for more information.
> require('pymport')
{
PyObject: [Function: PyObject],
pymport: [Function (anonymous)],
pyval: [Function (anonymous)],
version: [Getter],
proxify: [Function: proxify]
}
> require('/home/mmom/src/scdb/node_modules/everything-json/lib/binding/linux-x64/everything-json.node')
{
PyObject: [Function: PyObject],
pymport: [Function (anonymous)],
pyval: [Function (anonymous)],
version: [Getter]
}
Calling a Python function with a single `Buffer` argument triggers the `kwargs` parsing:

```js
const { pymport } = require('pymport/proxified');
const np = pymport('numpy');
np.frombuffer(Buffer.from([1, 2]));
```

This throws `class objects cannot be converted to Python`, as the `Buffer` is interpreted as `kwargs` and the error is thrown on its internal fields.
Consider the following TypeScript code:

```ts
import { PyObject } from 'pymport/proxified';
const d = PyObject.dict({ a: 1, b: 2 });
for (const i of d) console.log(i);
```

Instead of outputting

a
b

it outputs

undefined
undefined
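The expected behavior is that iterating the dict yields its keys, mirroring Python. A self-contained sketch with a hypothetical `makeIterableDict` helper (not part of pymport) shows the `Symbol.iterator` wiring a proxy would need:

```javascript
// A dict-like proxy whose Symbol.iterator yields its keys,
// mirroring Python's iteration over a dict.
function makeIterableDict(obj) {
  return new Proxy(obj, {
    get(target, key) {
      if (key === Symbol.iterator) {
        // Return an iterator over the keys, as Python's __iter__ would
        return () => Object.keys(target)[Symbol.iterator]();
      }
      return target[key];
    }
  });
}

const d = makeIterableDict({ a: 1, b: 2 });
for (const i of d) console.log(i); // a, then b
```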
`npm install --build-from-source` fails because `node-addon-api` is a `devDependency`:
Error: Cannot find module 'node-addon-api'
Require stack:
- /home/mmom/src/test/node_modules/pymport/[eval]
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:902:15)
at Function.Module._load (internal/modules/cjs/loader.js:746:27)
at Module.require (internal/modules/cjs/loader.js:974:19)
at require (internal/modules/cjs/helpers.js:101:18)
at [eval]:1:1
at Script.runInThisContext (vm.js:134:12)
at Object.runInThisContext (vm.js:310:38)
at internal/process/execution.js:81:19
at [eval]-wrapper:6:22
at evalScript (internal/process/execution.js:80:60) {
code: 'MODULE_NOT_FOUND',
  requireStack: [ '/home/mmom/src/test/node_modules/pymport/[eval]' ]
}
The workaround is to manually install `node-addon-api`:

npm install node-addon-api
The following way to use multiple values works:

```js
df.item(['Age', 'Fare']).median().type
// 'Series'
```

Is it possible to support this without relying on manually calling `.item()`?

This works:

```js
df['Age'].median().type
// 'numpy.float64'
```

This is wrong (since it should be a `Series`, not a float value):

```js
df['Age', 'Fare'].median().type
// 'numpy.float64'
```

This does not work:

```js
df[['Age', 'Fare']].median().type
// Uncaught TypeError: Cannot read properties of undefined (reading 'median')
```
`pymport` cannot be rebuilt from source with Python 3.11 and npm 6. Old versions of `node-gyp` are not compatible with Python 3.11. There is no solution to this problem except to upgrade npm. The error message is:
Traceback (most recent call last):
File "/opt/hostedtoolcache/node/14.20.1/x64/lib/node_modules/npm/node_modules/node-gyp/gyp/gyp_main.py", line 50, in <module>
sys.exit(gyp.script_main())
^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/node/14.[20](https://github.com/mmomtchev/pymport/actions/runs/3446910836/jobs/5753009688#step:7:21).1/x64/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__init__.py", line 554, in script_main
return main(sys.argv[1:])
^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/node/14.20.1/x64/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__init__.py", line 547, in main
return gyp_main(args)
^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/node/14.20.1/x64/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__init__.py", line 520, in gyp_main
[generator, flat_list, targets, data] = Load(
^^^^^
File "/opt/hostedtoolcache/node/14.20.1/x64/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__init__.py", line 136, in Load
result = gyp.input.Load(build_files, default_variables, includes[:],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/node/14.20.1/x64/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/input.py", line 2782, in Load
LoadTargetBuildFile(build_file, data, aux_data,
File "/opt/hostedtoolcache/node/14.20.1/x64/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/input.py", line 391, in LoadTargetBuildFile
build_file_data = LoadOneBuildFile(build_file_path, data, aux_data,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/node/14.20.1/x64/lib/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/input.py", line 234, in LoadOneBuildFile
build_file_contents = open(build_file_path, 'rU').read()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid mode: 'rU' while trying to load binding.gyp
gyp ERR! configure error
gyp ERR! stack Error: `gyp` failed with exit code: 1
gyp ERR! stack at ChildProcess.onCpExit (/opt/hostedtoolcache/node/14.20.1/x64/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:351:16)
gyp ERR! stack at ChildProcess.emit (events.js:400:28)
gyp ERR! stack at Process.ChildProcess._handle.onexit (internal/child_process.js:285:12)
gyp ERR! System Linux 5.15.0-1022-azure
This code is leaking memory:

```js
for (let i = 0; i < 1000; i++) {
  const a = PyObject.list([1]);
  a.get('__getitem__');
}
```
FATAL ERROR: Error::Error napi_define_properties
1: 0xa3ac30 node::Abort() [node]
2: 0x970199 node::FatalError(char const*, char const*) [node]
3: 0x9701a2 [node]
4: 0xa07f1b napi_fatal_error [node]
`PyObject.prototype.constr` should be a Python callable and not a JS function.
`.toString()` on a proxified object calls the JS `.toString()` on the proxy, which in turn calls the Python `str()` built-in, resulting in a slightly different output that contains an `[object` prefix.
In a Node-based Docker container with Python 3 installed, `npm i pymport` succeeds but `npx pympip3 install numpy` fails with:
You don't appear to have the pymport built-in Python environment installed: /app/node_modules/pymport/lib/binding/linux-arm64/bin/python3
Hello! This is such a cool project, I'm very excited about it. Thank you for all of your work on it!
I've got this working with some Python packages (e.g. `fuzzysearch`). I just tried it with whisperx, though, and I got the following error:
Error: Python exception: libsqlite3.so.0: cannot open shared object file: No such file or directory
I feel like there's some path issue here that I don't know enough to dig into, because I can import and execute `whisperx` in Python with no issue.
I've tried setting `PYTHONHOME`/`PYTHONPATH`, which seems to do what I'd expect (it seems to be using the Python virtualenv that I point it at), but I get the same error about being unable to find `libsqlite3.so.0`.
Some more info:
If you have any guidance, it'd be much appreciated! And please let me know if there's any more detail I can provide that would be helpful.
```js
const { PyObject } = require('pymport/proxified');
const list = PyObject.list([]);
console.log(list.length); // <- undefined
```
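The expected mapping is Python's `len()` surfaced as `.length` on the proxy. A sketch with a hypothetical `proxifyList` helper, where `pyListLen` stands in for a call into Python's `__len__`:

```javascript
// Surface a len()-style query as .length through a Proxy get trap.
function proxifyList(items) {
  const pyListLen = () => items.length; // stand-in for Python's __len__
  return new Proxy(items, {
    get(target, key) {
      if (key === 'length') return pyListLen();
      return target[key];
    }
  });
}

const list = proxifyList([]);
console.log(list.length); // 0, not undefined
```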
On Linux and macOS, `npx pympip` fails because of a missing executable bit.
I encountered the error below when running `npx pympip3 install pandas`, even though I had made sure Python and pandas were installed beforehand.
You don't appear to have the pymport built-in Python environment installed: /Users/jadyyang/code/test-project/node_modules/.pnpm/[email protected]/node_modules/pymport/lib/binding/darwin-arm64/bin/python3
`node-gyp` does not work with Python 3.12.
Consider the following Python code declaring a static class member:

```py
class SomeClass:
    static_member = 42
```

and the following proxified pymport JS code:

```js
const klass = pymport('above_code.py').SomeClass;
```

`klass` will be proxified as a Python callable, rendering `static_member` inaccessible.
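One way a callable proxy can still expose static members is to make the proxy target a function while forwarding property reads to the class. `proxifyClass` and the plain-object "class" below are hypothetical stand-ins, not pymport internals:

```javascript
// Keep the proxy callable (function target) but forward property
// access to the underlying class so static members remain visible.
function proxifyClass(pyClass) {
  const callable = (...args) => pyClass.construct(...args);
  return new Proxy(callable, {
    get(_, key) {
      return key in pyClass ? pyClass[key] : undefined;
    }
  });
}

const SomeClass = { static_member: 42, construct: () => ({}) };
const klass = proxifyClass(SomeClass);
console.log(typeof klass);        // 'function' - still callable
console.log(klass.static_member); // 42 - static member accessible
```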
Let `py` be a callable `PyObject`. In the following call

```js
py.call((x) => x + 1);
```

the function will get interpreted as named arguments.
Proxified objects are never freed due to incorrect use of the `WeakMap` that guarantees the uniqueness of the references.
Every time the `pymport` module is loaded and then unloaded, it leaks about 15 KB to 20 KB of memory. The only way this can happen more than once is if the main thread never loads the module but repeatedly spawns `worker_threads` that load the module and then quit, allowing Node.js to unload the library.
The package node-calls-python seems to achieve the same goal as you, in the same way (using N-API and the Python C API). Any differences in the implementations, limits, or performance in your opinion?
For further reference, here are some packages achieving the same goals with different means:
Pybridge uses child_process, communicates through stdin/stdout, and uses serialization/deserialization for data exchange.
Pyodide runs inside the browser using WebAssembly - slower than the other solutions.
Hey @mmomtchev thanks for the great project!
Is it possible to run the execution inside a REPL like the IPython kernel, so that we can use some magic commands like `%%run` and others?
```js
const bool = PyObject.fromJS(true);
const val = bool.toJS(); // val is 1 instead of true
```
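The fix requires the conversion to remember that the Python object was a `bool`. A sketch with a hypothetical tagged representation (not the pymport internals) shows the distinction:

```javascript
// Keep a type tag so the bool -> JS conversion can return true/false
// rather than the underlying 1/0.
function toJS(py) {
  if (py.pyType === 'bool') return py.value !== 0;
  return py.value;
}

console.log(toJS({ pyType: 'bool', value: 1 })); // true, not 1
console.log(toJS({ pyType: 'int', value: 1 }));  // 1
```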
Currently nodejs/node#45088 prevents clean shutdown of the Python environment.
As this shutdown usually takes place just before exiting the process, most of the time it has no consequences.
However, if the main instance does not use `pymport` but spawns `worker_threads` that use Python, then `pymport` will get loaded by the `worker_thread`. When this thread exits, Node.js will unload the `pymport` dynamic library from memory without Python being properly shut down. This will leak very precious TLS (Thread-Local Storage) memory and will eventually lead to a crash.
`pymport` incorrectly always uses the main event loop to schedule async messages, which can result in a crash when using async calls in a `worker_thread`.
Consider this code:

```js
const list = proxify(PyObject.list([1]));
console.log(list.length); // displays 1
list.append(2);
console.log(list.length); // still displays 1, stored in __pymport_proxy__
```
The store of proxified objects incorrectly uses the object reference as key, resulting in mismatches.
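The mismatch can be illustrated in isolation: two wrapper objects may reference the same underlying Python object, so a store keyed by wrapper identity misses the second lookup. The `pyId` field below is a hypothetical stand-in for the underlying PyObject reference:

```javascript
// Keying a cache by wrapper identity misses matches when two wrappers
// reference the same underlying object.
const byWrapper = new Map();
const wrapperA = { pyId: 42 };
const wrapperB = { pyId: 42 }; // same Python object, different wrapper
byWrapper.set(wrapperA, 'proxy');
console.log(byWrapper.has(wrapperB)); // false - the store misses the match

// Keying by the stable underlying reference restores uniqueness.
const byPyRef = new Map();
byPyRef.set(wrapperA.pyId, 'proxy');
console.log(byPyRef.has(wrapperB.pyId)); // true
```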
It should be possible to automatically convert to JS primitive values such as `True`, `False`, `None`, and in some cases strings and numbers.
When using the Node.js inspector to step over the final iteration of a loop iterating over a Python iterable, when Python throws the final exception, V8 crashes.
Seems to happen only in async contexts.
The crash is caused by V8 being unable to identify the correct handler to step over.
Python exception: <NULL> @ C:\Users\mmom\src\pymport\src\call.cc:265
#
# Fatal error in , line 0
# Check failed: CodeKind::INTERPRETED_FUNCTION == code->kind().
#
#
#
#FailureMessage Object: 000000E8F97FAF20
1: 00007FF66A0A9E7F node_api_throw_syntax_error+175967
2: 00007FF669FC036F v8::CTypeInfoBuilder<void>::Build+11999
3: 00007FF66AE2D182 V8_Fatal+162
4: 00007FF66A9F29E5 v8::internal::Debug::PrepareStepOnThrow+1253
5: 00007FF66A9F1099 v8::internal::Debug::OnThrow+345
6: 00007FF66A9A51D0 v8::internal::Isolate::ThrowInternal+544
7: 00007FF66A9A4059 v8::internal::Isolate::ScheduleThrow+25
8: 00007FF66AAD1CCB v8::Isolate::ThrowException+59
9: 00007FF66A07DF17 napi_throw+119
10: 00007FFDC38E642C Napi::Error::ThrowAsJavaScriptException+188 [C:\Users\mmom\src\pymport\node_modules\node-addon-api\napi-inl.h]:L2408
11: 00007FFDC39C29C7 `Napi::details::WrapCallback<<lambda_484baa0b744a2c3b2bec5c95d3446a94> >'::`1'::catch$0+103 [C:\Users\mmom\src\pymport\node_modules\node-addon-api\napi-inl.h]:L65
12: 00007FFDC394D3F0 _CallSettingFrame_LookupContinuationIndex+32 [D:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm]:L98
13: 00007FFDC394B51E __FrameHandler4::CxxCallCatchBlock+478 [D:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\frame.cpp]:L1439
14: 00007FFDF2E33CE6 RtlCaptureContext2+1190
15: 00007FFDC38F9B75 Napi::details::WrapCallback<<lambda_484baa0b744a2c3b2bec5c95d3446a94> >+53 [C:\Users\mmom\src\pymport\node_modules\node-addon-api\napi-inl.h]:L60
16: 00007FFDC38F86E0 Napi::InstanceWrap<pymport::PyObjectWrap>::InstanceMethodCallbackWrapper+64 [C:\Users\mmom\src\pymport\node_modules\node-addon-api\napi-inl.h]:L3463
17: 00007FF66A0760B0 cppgc::internal::BaseSpace::pages_mutex+32752
18: 00007FF66AAECAE2 v8::internal::SetupIsolateDelegate::SetupHeap+54946
19: 00007FF5EB0FB5C4
Howdy! This is a great project! I've been using it to integrate Storyteller with the whisperx and fuzzysearch python libraries. It works very well in the general case, but:
Storyteller runs `pymport` from a Worker thread, via Piscina. Piscina supports cancelling workers, and does so by calling `worker.terminate()`. When this call site is reached, if the Worker is running Python code, the Node.js runtime crashes with one of a few different C++ errors:
terminate called after throwing an instance of 'Napi::Error'
what():
OR
double free or corruption (out)
After which the entire Node.js runtime crashes, up through the parent process that kicked off the Piscina worker.
For what it's worth, https://github.com/hmenyus/node-calls-python, a similar project, has very similar issues.
Anyway, I don't even know if this is something you can resolve here, or if it's actually an issue with terminating worker threads while Node.js is running any addon code, but I figured I would give it a shot!
Here's a minimal reproduction, if that helps at all: https://github.com/smoores-dev/node-python-worker-repro
How do I spin up a second interpreter to get past the async blocking issue? I do not need to share info between the interpreters. Also, can Python stdout and stderr be redirected to Node.js?
When using the built-in interpreter, the reported directory for the include files is not correct:
× Building wheel for xxhash (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-311
creating build/lib.linux-x86_64-cpython-311/xxhash
copying xxhash/__init__.py -> build/lib.linux-x86_64-cpython-311/xxhash
copying xxhash/version.py -> build/lib.linux-x86_64-cpython-311/xxhash
copying xxhash/py.typed -> build/lib.linux-x86_64-cpython-311/xxhash
copying xxhash/__init__.pyi -> build/lib.linux-x86_64-cpython-311/xxhash
running build_ext
building '_xxhash' extension
creating build/temp.linux-x86_64-cpython-311
creating build/temp.linux-x86_64-cpython-311/deps
creating build/temp.linux-x86_64-cpython-311/deps/xxhash
creating build/temp.linux-x86_64-cpython-311/src
gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ideps/xxhash -I/home/runner/work/pymport/pymport/lib/binding/linux-x64/include/python3.11 -c deps/xxhash/xxhash.c -o build/temp.linux-x86_64-cpython-311/deps/xxhash/xxhash.o
error: command 'gcc' failed: No such file or directory
[end of output]
When throwing a Python exception from a proxified Python object back to the user, the `proxify` layer should ideally hide itself in the stack trace so that the user can immediately see the origin of the exception in their code.
```js
const obj = PyObject.fromJS({ test: 'test' });
assert.isUndefined(obj.get('notAtest'));
// This will throw the still pending exception from the previous statement
assert.throws(() => pyval('invalid'), /invalid/);
```
Consider the following scenario:

```js
const q = pyCallable.callAsync(pyArg);
const b = somePyObject.attribute;
await q;
```

The first operation launches a Python operation in a background thread (using an existing thread in `libuv`'s pool).
Python, however, is a single-threaded interpreter with limited shared-memory multi-threading support - but not one that involves actually interpreting Python code simultaneously. Be sure to read Async in the wiki.
This means that the second line will effectively block the event loop until the first operation concludes.
A small but notable exception to this rule is native Python extensions such as `numpy`, which release the Python GIL while running C code.
No Python object can be accessed while a Python background operation is running.
This is a fundamental Python limitation that is unlikely to go away in the foreseeable future.
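The safe ordering implied above is to finish the background Python call before accessing any other Python object. In the sketch below the stubs stand in for real pymport objects (`callAsync` here is just a timer-based placeholder, not the real API), so the snippet is self-contained:

```javascript
// Stand-in stubs: callAsync simulates Python work on a background thread.
const pyCallable = {
  callAsync: (x) => new Promise((resolve) => setTimeout(() => resolve(x * 2), 10))
};
const somePyObject = { attribute: 'ready' };

async function safeSequence(arg) {
  const q = pyCallable.callAsync(arg); // Python work starts in the background
  const result = await q;              // wait: the interpreter is busy until here
  const b = somePyObject.attribute;    // now safe to touch other Python objects
  return [result, b];
}

safeSequence(21).then(console.log); // [ 42, 'ready' ]
```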
The V8 GC has a complex algorithm that decides when it is going to make a collection run, and this requires tracking of the allocated memory. Currently `pymport` reports only the size of the objects it has referenced, and does so only once.
In particular, a `dict` or a `list` that gradually becomes huge can be problematic - V8 won't be aware of the size of this object and won't be in a hurry to free it when it is no longer referenced.
These problems do not cause leaks - as the memory is still tracked - it is simply not correctly included in the GC statistics.
Alas, solving these problems would require instrumenting Python's allocator (which is costly), and there is no possible solution compatible with `worker_threads` - as all V8 threads share the same Python instance, it won't be possible to know whose account is to be charged when allocating memory.
Error: Python exception: No module named '_ctypes'
at pymport (D:\a\pymport\pymport\proxified\index.js:22:44)
at Suite.<anonymous> (D:\a\pymport\pymport\test\proxy.test.ts:7:21)
at Object.create (D:\a\pymport\pymport\node_modules\mocha\lib\interfaces\common.js:148:19)
at context.describe.context.context (D:\a\pymport\pymport\node_modules\mocha\lib\interfaces\bdd.js:42:27)
at Object.<anonymous> (D:\a\pymport\pymport\test\proxy.test.ts:5:1)
at Module._compile (node:internal/modules/cjs/loader:1159:14)
at Module.m._compile (D:\a\pymport\pymport\node_modules\ts-node\src\index.ts:1618:23)
at Module._extensions..js (node:internal/modules/cjs/loader:1213:10)
at Object.require.extensions.<computed> [as .ts] (D:\a\pymport\pymport\node_modules\ts-node\src\index.ts:1621:12)
at Module.load (node:internal/modules/cjs/loader:1037:32)
at Function.Module._load (node:internal/modules/cjs/loader:878:12)
at Module.require (node:internal/modules/cjs/loader:1061:19)
at require (node:internal/modules/cjs/helpers:103:18)
at Object.exports.requireOrImport (D:\a\pymport\pymport\node_modules\mocha\lib\nodejs\esm-utils.js:49:16)
at async Object.exports.loadFilesAsync (D:\a\pymport\pymport\node_modules\mocha\lib\nodejs\esm-utils.js:91:20)
at async singleRun (D:\a\pymport\pymport\node_modules\mocha\lib\cli\run-helpers.js:125:3)
at async Object.exports.handler (D:\a\pymport\pymport\node_modules\mocha\lib\cli\run.js:370:5)
Caused by python/cpython#100320
The `PYTHONHOME` environment variable does not allow using another installation as root.
standard Ubuntu 22.04
node v20.12.0
npm 10.5.0

`npm install pymport` leads to:
npm ERR! code 7
npm ERR! path /data/dev/dev/snoot/snoot_pcd/node_modules/pymport
npm ERR! command failed
npm ERR! command sh -c node-pre-gyp install --fallback-to-build && node scripts/patch-prefix.js
What is the workaround to install pymport?
Causes `abort` on process exit:
Assertion failed: (handle->flags & UV_HANDLE_CLOSING), function uv__finish_close, file ../deps/uv/src/unix/core.c, line 258.
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-11.7-x86_64-cpython-310
creating build/lib.macosx-11.7-x86_64-cpython-310/xxhash
copying xxhash/version.py -> build/lib.macosx-11.7-x86_64-cpython-310/xxhash
copying xxhash/__init__.py -> build/lib.macosx-11.7-x86_64-cpython-310/xxhash
copying xxhash/py.typed -> build/lib.macosx-11.7-x86_64-cpython-310/xxhash
copying xxhash/__init__.pyi -> build/lib.macosx-11.7-x86_64-cpython-310/xxhash
running build_ext
building '_xxhash' extension
creating build/temp.macosx-11.7-x86_64-cpython-310
creating build/temp.macosx-11.7-x86_64-cpython-310/deps
creating build/temp.macosx-11.7-x86_64-cpython-310/deps/xxhash
creating build/temp.macosx-11.7-x86_64-cpython-310/src
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Ideps/xxhash -I/Users/runner/work/pymport/pymport/lib/binding/darwin-x64/include/python3.10 -c deps/xxhash/xxhash.c -o build/temp.macosx-11.7-x86_64-cpython-310/deps/xxhash/xxhash.o
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Ideps/xxhash -I/Users/runner/work/pymport/pymport/lib/binding/darwin-x64/include/python3.10 -c src/_xxhash.c -o build/temp.macosx-11.7-x86_64-cpython-310/src/_xxhash.o
src/_xxhash.c:31:10: fatal error: 'Python.h' file not found
#include <Python.h>
^~~~~~~~~~
1 error generated.
error: command '/usr/bin/gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for xxhash
Failed to build xxhash
ERROR: Could not build wheels for xxhash, which is required to install pyproject.toml-based projects
On macOS the built-in Python interpreter fails to correctly set up its installation home.
Collecting auto_gptq
Using cached auto_gptq-0.7.1.tar.gz (126 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
Building cuda extension requires PyTorch (>=1.13.0) being installed, please install PyTorch first: No module named 'torch'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
I have installed PyTorch before and it worked fine in my own project; I think the environment used to build the wheel may be different from the in-project one.