Comments (15)
Yes, but PyO3 writes zeros to the entire slice.
from pyo3.
It should also be possible to create a new type that has the buffer protocol and pass that to PyBytes without copying.
Considering the resizeable PyBytes might be an XY problem, I would also suggest looking into creating a Python object implementing the buffer protocol. For example, in rust-numpy we have PySliceContainer, which we use as the base object for NumPy arrays that are backed by Rust-allocated Vec (or rather ndarray::Array) instances. In other words, writing the data into a Python object is not really required; all that is needed is a Python object that properly deallocates the Rust allocation and acts as the base object for the buffer protocol implementor.
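To make the "base object that deallocates the Rust allocation" idea concrete, here is a minimal std-only sketch. It is not rust-numpy's actual code: the `SliceOwner` type and its fields are hypothetical names chosen for illustration, and the Python side (registering such an owner as a Python object) is left out entirely.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical type-erased owner of a Rust allocation. It remembers how to
/// free the memory without knowing the element type at the use site.
struct SliceOwner {
    ptr: *mut u8,
    len: usize,
    cap: usize,
    drop_fn: unsafe fn(*mut u8, usize, usize),
}

impl SliceOwner {
    /// Takes the `Vec` apart and stores a matching drop function for it.
    fn from_vec<T>(vec: Vec<T>) -> Self {
        unsafe fn drop_vec<U>(ptr: *mut u8, len: usize, cap: usize) {
            // SAFETY: only ever called with the parts of a `Vec<U>` that
            // `from_vec` took apart, so rebuilding the `Vec` is sound.
            drop(unsafe { Vec::from_raw_parts(ptr as *mut U, len, cap) });
        }
        // Prevent the Vec's own destructor from running; we own the memory now.
        let mut vec = ManuallyDrop::new(vec);
        SliceOwner {
            ptr: vec.as_mut_ptr() as *mut u8,
            len: vec.len(),
            cap: vec.capacity(),
            drop_fn: drop_vec::<T>,
        }
    }
}

impl Drop for SliceOwner {
    fn drop(&mut self) {
        // Runs the element destructors and frees the allocation exactly once.
        unsafe { (self.drop_fn)(self.ptr, self.len, self.cap) }
    }
}
```

An object like this only needs to stay alive for as long as something exposes the memory; dropping it releases the Rust allocation.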
from pyo3.
If the first proposal is the chosen one, then it would probably be good to have a version that doesn't zero out the bytes first. Something like:
fn with_new_uninit<F>(py: Python<'_>, len: usize, init: F) -> PyResult<&Self>
where
    F: FnMut(&mut [MaybeUninit<u8>]) -> PyResult<()>
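As a rough illustration of how such a callback could be driven, here is a std-only sketch that uses a plain `Vec` instead of a Python bytes object. The names `vec_with_uninit` and `counting_bytes` are hypothetical, and the sketch uses `FnOnce` and a generic error type purely for simplicity.

```rust
use std::mem::{ManuallyDrop, MaybeUninit};

/// Hypothetical std-only analogue of the proposed `with_new_uninit`.
///
/// # Safety
/// `init` must write every element of the slice before returning `Ok`.
unsafe fn vec_with_uninit<F, E>(len: usize, init: F) -> Result<Vec<u8>, E>
where
    F: FnOnce(&mut [MaybeUninit<u8>]) -> Result<(), E>,
{
    let mut buf: Vec<MaybeUninit<u8>> = Vec::with_capacity(len);
    // SAFETY: `MaybeUninit<u8>` is valid without initialisation and `len` <= capacity.
    unsafe { buf.set_len(len) };
    // On `Err`, `buf` is dropped here and the allocation is released.
    init(&mut buf[..])?;
    // SAFETY: the caller promises `init` wrote all `len` bytes, and
    // `MaybeUninit<u8>` has the same layout as `u8`.
    let mut buf = ManuallyDrop::new(buf);
    Ok(unsafe { Vec::from_raw_parts(buf.as_mut_ptr() as *mut u8, buf.len(), buf.capacity()) })
}

/// Example caller: fills the buffer with 0, 1, 2, ...
fn counting_bytes(len: usize) -> Vec<u8> {
    let filled: Result<Vec<u8>, ()> = unsafe {
        vec_with_uninit(len, |slots| {
            for (i, slot) in slots.iter_mut().enumerate() {
                slot.write(i as u8);
            }
            Ok(())
        })
    };
    filled.unwrap()
}
```

The function has to be `unsafe` because nothing in the type system forces the callback to write every byte; that obligation would carry over to any PyBytes variant with the same shape.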
from pyo3.
I think that is already the case?
pyo3 calls PyBytes_FromStringAndSize with a null pointer, and according to the docs:
If v is NULL, the contents of the bytes object are uninitialized.
from pyo3.
Ah, yes, missed that.
from pyo3.
There's some previous discussion of this in #1074.
We could explore making it work, but _Py_Resize is private, so I'd need to ask the CPython core devs how they would like this to work.
from pyo3.
What's the process to get that going?
I assumed that the function had an underscore just because it's usable only in a very specific context.
from pyo3.
Could we do this by adding an alternative utility function, something like:
pub fn resizeable_new_with<F>(py: Python<'_>, len: usize, init: F) -> PyResult<&PyBytes>
where
    F: FnOnce(&mut [u8]) -> PyResult<usize>,
{
    unsafe {
        let mut pyptr =
            ffi::PyBytes_FromStringAndSize(std::ptr::null(), len as ffi::Py_ssize_t);
        // Check for an allocation error and return it
        if pyptr.is_null() {
            return Err(PyRuntimeError::new_err(format!(
                "failed to allocate python bytes object of size {}",
                len
            )));
        }
        let buffer = ffi::PyBytes_AsString(pyptr) as *mut u8;
        debug_assert!(!buffer.is_null());
        // If init returns an Err, release the bytes object before propagating
        // the error; the raw pointer is not deallocated automatically
        let new_len = match init(std::slice::from_raw_parts_mut(buffer, len)) {
            Ok(new_len) => new_len,
            Err(err) => {
                ffi::Py_DECREF(pyptr);
                return Err(err);
            }
        };
        // _PyBytes_Resize is a private CPython function; it may move the object
        if ffi::_PyBytes_Resize(std::ptr::addr_of_mut!(pyptr), new_len as ffi::Py_ssize_t) != 0 {
            return Err(PyRuntimeError::new_err("failed to resize bytes object"));
        }
        let pybytes: Py<PyBytes> = Py::from_owned_ptr_or_err(py, pyptr)?;
        Ok(pybytes.into_ref(py))
    }
}
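The callback contract used above (fill a buffer of the maximum size, then report how many bytes were actually written) can be tried out without any Python involvement. The following is a std-only sketch with hypothetical names (`fill_then_shrink`, `read_at_most`); it uses a zeroed `Vec` rather than a bytes object and adds a bounds check on the reported length, which is an assumption about what a robust version would want.

```rust
use std::io::{self, Read};

/// Allocates `max_len` zeroed bytes, lets `init` fill a prefix, and shrinks
/// the buffer to the number of bytes that `init` reports as written.
fn fill_then_shrink<F, E>(max_len: usize, init: F) -> Result<Vec<u8>, E>
where
    F: FnOnce(&mut [u8]) -> Result<usize, E>,
{
    let mut buf = vec![0u8; max_len];
    let written = init(&mut buf[..])?;
    // Guard against a callback that claims more than the buffer can hold.
    assert!(written <= max_len, "init reported more bytes than the buffer holds");
    buf.truncate(written);
    Ok(buf)
}

/// Example caller: a single `read` call into a buffer of at most `max_len` bytes.
fn read_at_most(mut source: &[u8], max_len: usize) -> io::Result<Vec<u8>> {
    fill_then_shrink(max_len, |buf| source.read(buf))
}
```

This is the typical use case for the resizable variant: the final size is only known after a read or decompression step has run.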
from pyo3.
@damelLP: The _PyBytes_Resize that the code uses is the potential problem here, not necessarily the method name.
@AudriusButkevicius and @davidhewitt: I'm talking about things I'm not very familiar with, but wouldn't it be possible to construct the data with PyByteArrays and then simply make a PyBytes object from the array? I'm not fully certain how much overhead that would add but it would be a stable / public way to do this.
from pyo3.
That would perform a copy (as far as I understand), which is what I want to avoid. As it stands, I don't think there is a zero-copy way to allocate PyBytes.
from pyo3.
It uses the buffer protocol which, if I understand correctly, doesn't copy the underlying data.
While each of these types have their own semantics, they share the common characteristic of being backed by a possibly large memory buffer. It is then desirable, in some situations, to access that buffer directly and without intermediate copying.
from pyo3.
It should also be possible to create a new type that has the buffer protocol and pass that to PyBytes without copying.
from pyo3.
I assumed it was copying. Do you have docs that suggest it's zero-copy?
from pyo3.
It's scattered throughout the buffer protocol documentation:
While each of these types have their own semantics, they share the common characteristic of being backed by a possibly large memory buffer. It is then desirable, in some situations, to access that buffer directly and without intermediate copying.
Buffer structures (or simply “buffers”) are useful as a way to expose the binary data from another object to the Python programmer. They can also be used as a zero-copy slicing mechanism. Using their ability to reference a block of memory, it is possible to expose any data to the Python programmer quite easily. The memory could be a large, constant array in a C extension, it could be a raw block of memory for manipulation before passing to an operating system library, or it could be used to pass around structured data in its native, in-memory format.
These are not things I've used before, but the documentation heavily suggests, if not says outright, that it should be zero copy.
from pyo3.
Sorry that I fell off this thread a little.
What's the process to get that going?
I assumed that the function had an underscore just because it's usable only in a very specific context.
There are probably good reasons why a general _Py_Resize function is private. The right process would be to create an issue upstream in CPython with a clear proposal of what's needed, and possibly even help implement it.
Considering the resizeable PyBytes might be an XY problem, I would also suggest looking into creating a Python object implementing the buffer protocol.
👍 on this, and in particular we plan to make this easier with #3148.
from pyo3.