
memory mapping and fuse (oneclient, OPEN)

image357 commented on August 16, 2024
memory mapping and fuse

from oneclient.

Comments (3)

bkryza commented on August 16, 2024

@image357
Hi,
the problem here is indeed with FUSE. For some reason, mmap on FUSE requires the MAP_PRIVATE flag to be passed to the mmap call. Unfortunately, this flag cannot be passed through the high-level NumPy API, but it is possible to create the mmap manually and pass it to ndarray like this:

import numpy as np
import os
import mmap

save_array = np.arange(9).reshape(3,3)
np.save("array.npy", save_array, fix_imports=False)

# works
load_array1 = np.load("array.npy", mmap_mode="c", fix_imports=False)

# works
size = os.path.getsize("array.npy")
with open("array.npy", "rb") as f2:
    mm = mmap.mmap(f2.fileno(), size, offset=0, flags=mmap.MAP_PRIVATE)
    array2 = np.ndarray((3,3), buffer=mm)
    print(array2)

Please let us know whether this approach is acceptable.

image357 commented on August 16, 2024

Thanks for your answer and sorry for the late reply.
This doesn't work, though. I guess the reason is that .npy files have a specific header that saves dtype, shape and other information. Hence, putting the plain file as the ndarray buffer can't work.

The output of your code is

[[1.87585069e-309 1.17119999e+171 5.93271341e-037]
 [8.44740097e+252 2.65141232e+180 9.92152605e+247]
 [2.16209968e+233 1.05161974e-153 6.01399921e-154]]

which is not the original array.
I also played around using different dtype and order arguments.

Also, mmap.MAP_PRIVATE effectively creates a copy-on-write array, which is equivalent to the 'c' option for np.load(..., mmap_mode='c', ...).
I suppose this is something that has to be fixed on the FUSE side, or it might not be fixable at all.
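
For reference, the header problem described above could in principle be handled by parsing the .npy header first and then passing the resulting data offset, dtype, and shape to ndarray. The following is only a minimal sketch under stated assumptions: the helper name load_npy_private is hypothetical, it assumes a Unix platform (where mmap.MAP_PRIVATE exists) and a version 1.0 or 2.0 .npy file without object dtypes, and it has not been verified on a oneclient FUSE mount.

```python
import mmap
import os
import tempfile

import numpy as np
from numpy.lib import format as npy_format


def load_npy_private(path):
    """Map a .npy file with MAP_PRIVATE and view its data region as an ndarray.

    Hypothetical helper for illustration; not part of NumPy or oneclient.
    """
    with open(path, "rb") as f:
        # Parse the magic string and header; afterwards the file position
        # points at the first byte of the raw array data.
        version = npy_format.read_magic(f)
        if version == (1, 0):
            shape, fortran_order, dtype = npy_format.read_array_header_1_0(f)
        elif version == (2, 0):
            shape, fortran_order, dtype = npy_format.read_array_header_2_0(f)
        else:
            raise ValueError(f"unsupported .npy version: {version}")
        data_offset = f.tell()
        # Length 0 maps the whole file; MAP_PRIVATE makes it copy-on-write.
        mm = mmap.mmap(f.fileno(), 0, flags=mmap.MAP_PRIVATE)
    # Skip the header via `offset` and use the dtype/shape/order it declared.
    return np.ndarray(
        shape,
        dtype=dtype,
        buffer=mm,
        offset=data_offset,
        order="F" if fortran_order else "C",
    )


if __name__ == "__main__":
    original = np.arange(9).reshape(3, 3)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "array.npy")
        np.save(path, original)
        loaded = load_npy_private(path)
        print(np.array_equal(original, loaded))
        del loaded  # drop the mapping before the directory is removed
```

Because the mapping is private, writes to the returned array stay in memory and are not written back to the file, which matches the copy-on-write behaviour of mmap_mode='c'.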

bkryza commented on August 16, 2024

@image357 OK, thanks for the information. Unfortunately, it looks like this will not be possible through oneclient.

However, we also provide a Python library, OnedataFS, which gives direct access to our filesystem without FUSE. I will try to check whether it works with mmap().

OnedataFS is available by default in the oneclient Docker image, or it can be installed from packages. It implements the PyFilesystem API (https://docs.pyfilesystem.org/en/latest/index.html). Basic usage looks like this:

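from fs.onedatafs import OnedataFS
# Oneprovider host name and access token (placeholder values)
oneprovider_host = "example.com"
oneprovider_token = "ABCD...."
# Connect directly to the Oneprovider, without going through FUSE
odfs = OnedataFS(oneprovider_host, oneprovider_token)
# List the entries at the root of the filesystem (the spaces)
spaces = odfs.listdir('')
...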

Even if mmap doesn't work, please note that each file opened through OnedataFS has an internal memory buffer that prefetches from the storage only the blocks requested by IO operations on the handle. It won't read the entire file into memory unless necessary, so mmap may not be needed in your case.
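
To illustrate reading only part of a .npy file through ordinary seek and read calls instead of mmap, here is a minimal sketch. The helper name read_rows is hypothetical, and the example uses an in-memory io.BytesIO object as a stand-in. The assumption is that a binary handle obtained from OnedataFS (for example through PyFilesystem's openbin) supports seek, tell, and read in the same way, which has not been verified here. It also assumes a C-ordered array stored in .npy version 1.0 or 2.0.

```python
import io

import numpy as np
from numpy.lib import format as npy_format


def read_rows(fileobj, start, stop):
    """Read rows [start, stop) of a C-ordered .npy array from a seekable
    binary file object, without loading the rest of the data.

    Hypothetical helper for illustration; not part of NumPy or OnedataFS.
    """
    version = npy_format.read_magic(fileobj)
    if version == (1, 0):
        shape, fortran_order, dtype = npy_format.read_array_header_1_0(fileobj)
    elif version == (2, 0):
        shape, fortran_order, dtype = npy_format.read_array_header_2_0(fileobj)
    else:
        raise ValueError(f"unsupported .npy version: {version}")
    if fortran_order or len(shape) == 0:
        raise ValueError("only C-ordered arrays with at least one axis are supported")

    data_offset = fileobj.tell()  # first byte after the header
    start, stop, _ = slice(start, stop).indices(shape[0])
    n_rows = max(stop - start, 0)
    row_shape = shape[1:]
    row_bytes = dtype.itemsize * int(np.prod(row_shape, dtype=np.int64))

    # Jump straight to the first requested row and read only those bytes.
    fileobj.seek(data_offset + start * row_bytes)
    raw = fileobj.read(n_rows * row_bytes)
    if len(raw) != n_rows * row_bytes:
        raise IOError("file ended before the requested rows were read")
    return np.frombuffer(raw, dtype=dtype).reshape((n_rows,) + row_shape)


if __name__ == "__main__":
    buf = io.BytesIO()  # stand-in for a remote file handle
    np.save(buf, np.arange(12).reshape(4, 3))
    buf.seek(0)
    print(read_rows(buf, 1, 3))
```

The returned array is backed by the bytes that were read, so it is read-only; call .copy() on it if a writable array is needed.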
