
Quick Questions about onnx-scala

ivanthewebber commented on May 29, 2024
Quick Questions

from onnx-scala.

Comments (3)

EmergentOrder commented on May 29, 2024

Hey Ivan,
I'm curious what kind of memory problems you encountered using the ORT Java API. Perhaps share some links to issues on that project if you have them? From time to time in the past I have encountered memory leaks here due to such issues. At the moment, I believe this project should be free of such leaks, but please let me know here and provide details to reproduce in case you encounter them.

In terms of throughput, you should get similar results from both projects, as the overhead introduced here is minimal. The most significant factor in both cases is data copying between the JVM and the native layer of ORT. It is possible, however, that because this project uses cats-effect you might see a difference in some use cases (hopefully in favor of ONNX-Scala).

I haven't done any evaluation in the context of Flink or other streaming contexts.
If you are interested in a comparison with NumPy and PyTorch, you can find Python scripts and a Scala benchmarking app in my project NDScala, which uses ONNX-Scala under the hood. The quick version is that NDScala / ONNX-Scala is faster than NumPy, but slower than PyTorch (on the simple multi-layer network example).

I wonder if you could explain why you believe ORTModelBackend is not thread-safe? I made an effort to make everything in this project as functional / immutable as possible, which should also mean that it is thread-safe. If it is not, I would consider that a bug.


EmergentOrder commented on May 29, 2024

A related ORT Java API memory issue is here: microsoft/onnxruntime#18845


ivanthewebber commented on May 29, 2024

My containers are getting OOMKilled as described in the related issue you linked. This is likely because the ONNX runtime is requesting too much data and/or data is getting fragmented, as I create a lot of large, nested arrays. I have been trying to improve performance by reusing objects (the flyweight pattern) and by using a direct ByteBuffer to create tensors, which avoids an unnecessary copy of the input data.
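For readers unfamiliar with the direct-buffer approach mentioned above, here is a minimal sketch in plain java.nio. The class name ReusableInputBuffer and its methods are hypothetical and are not taken from my code or from onnx-scala. The sketch allocates one direct, native-order buffer up front and refills it for each batch, so nested arrays are flattened into the same off-heap memory instead of allocating a new buffer per call. It assumes a fixed [rows, cols] float input; the hand-off to ORT is only indicated in a comment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical helper: one direct buffer, reused across inference calls. */
public final class ReusableInputBuffer {
    private final FloatBuffer buffer;
    private final int rows;
    private final int cols;

    public ReusableInputBuffer(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        // Direct buffers live outside the Java heap; native byte order is
        // what native code expects to read.
        this.buffer = ByteBuffer.allocateDirect(rows * cols * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /** Flattens a nested array into the shared buffer, row by row. */
    public FloatBuffer fill(float[][] batch) {
        if (batch.length != rows) {
            throw new IllegalArgumentException("expected " + rows + " rows");
        }
        buffer.clear(); // reset position and limit; no reallocation
        for (float[] row : batch) {
            if (row.length != cols) {
                throw new IllegalArgumentException("expected " + cols + " columns");
            }
            buffer.put(row);
        }
        buffer.flip(); // position = 0, limit = number of floats written
        return buffer;
    }

    /** Shape to pass alongside the buffer when creating a tensor. */
    public long[] shape() {
        return new long[] {rows, cols};
    }
}
```

With the ORT Java API, the filled buffer and shape would then go to OnnxTensor.createTensor(env, buffer, shape). Because the returned buffer is shared, it must not be refilled while a tensor built on it is still in use, and in a highly parallel setup each thread would need its own instance.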

Thanks for the info. I haven't tried switching to this project yet, but it's on my radar. Looking at the source again, I'm not sure what I was worried about. My use case is highly parallelized, so thread safety is something I try to check for.

Since you answered my questions, feel free to close; thanks.
