
Comments (7)

martindurant commented on September 15, 2024

There are many levers to pull, actually. How are you setting the pool size, what kind of benchmark are you running, and do you have an idea of what your current bottleneck might be? Since fsspec generally maintains its own IO thread and event loop, a significant increase in performance is something I'd be happy to bake in.

from s3fs.

ion-elgreco commented on September 15, 2024

@martindurant I am currently passing this to the S3FileSystem: config_kwargs={"max_pool_connections": 50}.

I was checking the peak transfer rate with iftop: it was just 50Mb out of 1Gbps network capacity (AKS -> LakeFS on AKS -> Azure Blob). It took around 15 seconds to read 6000 txt files. I think it could go faster, but I'm not sure :)
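For scale, the figures above can be turned into rough rates. This is a back-of-the-envelope sketch, assuming that the "50Mb" reported by iftop means 50 megabits per second and that the 15 seconds covers all 6000 files; the variable names are illustrative only.

```python
# Figures quoted in the comment above. The units are an assumption:
# "50Mb" is read here as 50 megabits per second.
peak_rate_mbps = 50         # observed peak transfer rate (assumed Mbit/s)
link_capacity_mbps = 1000   # 1 Gbps link
n_files = 6000
elapsed_s = 15

utilization = peak_rate_mbps / link_capacity_mbps   # fraction of the link used at peak
files_per_second = n_files / elapsed_s
ms_per_file = 1000 * elapsed_s / n_files            # average wall time per file

print(f"link utilization at peak: {utilization:.0%}")
print(f"files per second: {files_per_second:.0f}")
print(f"average wall time per file: {ms_per_file:.1f} ms")
```

Under those assumptions the link is far from saturated, so per-request overhead would be a plausible suspect, though that stays a guess until it is measured.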

martindurant commented on September 15, 2024

Would you mind making a graph of max_pool versus throughput? How many files (~ coroutines) are in flight?
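One way such a sweep could be collected is sketched below. It is a generic timing harness, not anything provided by s3fs or fsspec: sweep_pool_sizes, read_all and fake_read_all are hypothetical names, and the stand-in reader only simulates work so the example runs without a bucket.

```python
import time
from typing import Callable, Dict, Iterable, Sequence


def sweep_pool_sizes(
    pool_sizes: Iterable[int],
    read_all: Callable[[int, Sequence[str]], int],
    paths: Sequence[str],
) -> Dict[int, float]:
    """Return {pool_size: throughput in bytes/second} for each pool size.

    read_all(pool_size, paths) is a hypothetical callable supplied by the
    caller; it must read every path and return the total bytes read.
    """
    results = {}
    for size in pool_sizes:
        start = time.perf_counter()
        total_bytes = read_all(size, paths)
        elapsed = time.perf_counter() - start
        results[size] = total_bytes / elapsed if elapsed > 0 else float("inf")
    return results


def fake_read_all(pool_size: int, paths: Sequence[str]) -> int:
    """Stand-in reader: pretends each file is 1 KiB and that more
    connections shorten the run. Replace with a real read."""
    time.sleep(0.05 / pool_size)
    return 1024 * len(paths)


if __name__ == "__main__":
    paths = [f"file-{i}.txt" for i in range(100)]
    for size, rate in sweep_pool_sizes([1, 5, 10], fake_read_all, paths).items():
        print(f"max_pool_connections={size}: {rate:,.0f} bytes/s")
```

For a real measurement, read_all could build an S3FileSystem with config_kwargs={"max_pool_connections": pool_size}, call fs.cat(paths) and sum the lengths of the returned values; the resulting dictionary can then be plotted as pool size against throughput.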

ion-elgreco commented on September 15, 2024

@martindurant do you have some examples of how to access these things during execution?

martindurant commented on September 15, 2024
  • I thought throughput was exactly what you were already measuring.
  • You should be able to get the number of files from a normal glob or expand_path call.
  • You could maybe use callbacks to measure the coroutines, but you would probably need to hack something into fsspec.asyn._runner.
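The idea of counting coroutines in flight can be illustrated with plain asyncio. This sketch does not touch fsspec internals (fsspec.asyn._runner is not used here); InFlightCounter, fake_fetch and the semaphore limit are hypothetical, and the fetch is a stand-in for a real download.

```python
import asyncio


class InFlightCounter:
    """Tracks how many wrapped coroutines are running at once."""

    def __init__(self):
        self.current = 0
        self.peak = 0

    async def track(self, coro):
        # Count the coroutine as in flight for as long as it is awaited.
        self.current += 1
        self.peak = max(self.peak, self.current)
        try:
            return await coro
        finally:
            self.current -= 1


async def fake_fetch(path: str) -> bytes:
    """Stand-in for a real download (hypothetical)."""
    await asyncio.sleep(0.01)
    return path.encode()


async def main(paths, limit):
    counter = InFlightCounter()
    sem = asyncio.Semaphore(limit)  # caps concurrency, like a connection pool would

    async def bounded(path):
        async with sem:
            return await counter.track(fake_fetch(path))

    data = await asyncio.gather(*(bounded(p) for p in paths))
    return data, counter.peak


if __name__ == "__main__":
    data, peak = asyncio.run(main([f"file-{i}.txt" for i in range(20)], limit=5))
    print(f"fetched {len(data)} files, peak in flight: {peak}")
```

Because the counter sits inside the semaphore, the peak it reports can never exceed the limit; wrapping the real request coroutines in the same way would show how many are actually active at once.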
