
Dynamic output types (onnxruntime-rs issue, open, 6 comments)

nbigaouette commented on August 19, 2024
Dynamic output types

from onnxruntime-rs.

Comments (6)

Narsil commented on August 19, 2024

Hi,

I wanted to give this library a spin, but this issue is blocking me, as is the fact that input types can have the same problem.
In NLP, some graphs consume both IDs and masks (which are integers) AND some cached attention layers, which do NOT have the same dimensions AND are floats.

The same goes for the outputs: different shapes and different data types (returning new clues to be used for the cache + some logits or other heads).

I'm not sure whether it's currently feasible on the input side. I've seen https://docs.rs/ndarray/0.14.0/ndarray/type.IxDyn.html, which could be useful for the dimensions, but it does not seem to solve the element type problem (i32 vs f32).
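To make the i32/i64 vs f32 problem concrete, here is a minimal sketch of one way to hold inputs of different element types and shapes in a single collection: an enum with one variant per element type and a runtime shape. `DynTensor`, its variants, and `example_inputs` are hypothetical names for illustration only, not part of onnxruntime-rs or ndarray, and the integer type is assumed to be i64.

```rust
// Hypothetical type for illustration; not part of onnxruntime-rs or ndarray.
// Each variant carries its own element type, and the shape is only known at runtime.
#[derive(Debug, Clone, PartialEq)]
enum DynTensor {
    F32 { shape: Vec<usize>, data: Vec<f32> },
    I64 { shape: Vec<usize>, data: Vec<i64> },
}

impl DynTensor {
    /// Runtime shape, regardless of element type.
    fn shape(&self) -> &[usize] {
        match self {
            DynTensor::F32 { shape, .. } | DynTensor::I64 { shape, .. } => shape,
        }
    }

    /// Name of the element type, for error messages or dispatch.
    fn dtype(&self) -> &'static str {
        match self {
            DynTensor::F32 { .. } => "f32",
            DynTensor::I64 { .. } => "i64",
        }
    }
}

/// Inputs in the spirit of the NLP case above: integer ids and mask,
/// plus a float cache tensor with a different rank.
fn example_inputs() -> Vec<DynTensor> {
    vec![
        DynTensor::I64 { shape: vec![1, 3], data: vec![101, 2023, 102] },
        DynTensor::I64 { shape: vec![1, 3], data: vec![1, 1, 1] },
        DynTensor::F32 { shape: vec![1, 2, 1, 4], data: vec![0.0; 8] },
    ]
}

fn main() {
    for (i, t) in example_inputs().iter().enumerate() {
        println!("input {}: dtype={} shape={:?}", i, t.dtype(), t.shape());
    }
}
```

The trade-off is that type errors move from compile time to run time: the caller has to match on the variant before touching the data.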

Thanks for this lib!


Narsil commented on August 19, 2024

tch-rs uses Tensor as the backbone type, which is completely opaque to the Rust type system, so there are no issues building a Vec<Tensor> of different types and shapes.

> If by something you mean the reverse, like a generic .next(),

That is what I meant.
If you could have a fully statically typed call signature with all necessary tensors + types, that would be ideal (100% typed + named tensors, so that callers cannot miscall the forward pass). However, I think this is not really realistic, is it? I mean, onnx uses session objects, and unwrapping the signature at compile time is not really feasible, right? (At least that's why I went with .next() in my tentative PR.)

A bit more context on the PR: I was attempting to remove Python entirely for inference on GPT2 + ORT quantized (in a webserver context). The end result was that we were still winning something like 2x (I remember 3x, but I don't want to boast about something wrong, and I don't really remember) over something relatively naïve with a Python webserver + GPT2 + ORT quantized.


ahirner commented on August 19, 2024

Another very common use case is indices alongside float values (e.g. torch.topk). When int64 is inferred as f32, all values turn out to be near-zero floats. Thus, casting them back won't work either.
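A small sketch of why the values come out near zero, assuming the int64 buffer is simply reinterpreted byte for byte as f32 (the thread does not say exactly how the mis-inference happens, so treat this as one plausible mechanism). `misread_i64_as_f32` is a hypothetical helper, not a library function. Each 8-byte index becomes two 4-byte floats; for small non-negative indices the low half is a tiny subnormal and the high half is 0.0, so a numeric cast back to an integer yields 0 rather than the original index.

```rust
/// Hypothetical helper: reinterpret the little-endian bytes of each i64
/// as two consecutive f32 values, mimicking an int64 buffer that is read
/// with the wrong element type.
fn misread_i64_as_f32(indices: &[i64]) -> Vec<f32> {
    indices
        .iter()
        .flat_map(|v| {
            let b = v.to_le_bytes();
            [
                f32::from_le_bytes([b[0], b[1], b[2], b[3]]), // low 32 bits
                f32::from_le_bytes([b[4], b[5], b[6], b[7]]), // high 32 bits
            ]
        })
        .collect()
}

fn main() {
    // Indices such as those a top-k operator might return.
    let indices = [3i64, 17, 42];
    let floats = misread_i64_as_f32(&indices);
    println!("misread as f32: {:?}", floats);

    // A numeric cast does not recover the indices: every value truncates to 0.
    let cast_back: Vec<i64> = floats.iter().map(|x| *x as i64).collect();
    println!("cast back:      {:?}", cast_back);
}
```

Under this assumption the original bits are still present in the buffer, but only a bit-level reinterpretation with the correct element type would recover them; a value cast cannot.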

Are you still interested in working on #65, @marshallpierce?


marshallpierce commented on August 19, 2024

Sadly I don't have time. I hope someone picks it up though.


Narsil commented on August 19, 2024

Is this something that would work: #69 ?


ahirner commented on August 19, 2024

> Is this something that would work: #69 ?

This PR is about inputs. If by something you mean the reverse, like a generic .next(), I think this could be an interesting solution. I haven't quite got the gist of #65 yet. I had tuples with a finite number of lengths in mind. To use such a feature, the caller has to know their target types anyway.
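For discussion, a rough sketch of what a generic .next() over outputs could look like. This is not the code from #65 or #69; `DynOutput`, `FromOutput`, `Outputs`, and `OutputError` are hypothetical names, and the outputs are plain Vecs rather than real tensors. The caller names the target type at each call, and a mismatch becomes a runtime error instead of silently misread data.

```rust
// Hypothetical types for illustration; not the onnxruntime-rs API.
#[derive(Debug, Clone, PartialEq)]
enum DynOutput {
    F32(Vec<f32>),
    I64(Vec<i64>),
}

#[derive(Debug, PartialEq)]
enum OutputError {
    Exhausted, // no outputs left
    WrongType, // the next output does not hold the requested element type
}

/// Conversion from a dynamically typed output into a concrete Rust type.
trait FromOutput: Sized {
    fn from_output(o: DynOutput) -> Option<Self>;
}

impl FromOutput for Vec<f32> {
    fn from_output(o: DynOutput) -> Option<Self> {
        match o {
            DynOutput::F32(v) => Some(v),
            _ => None,
        }
    }
}

impl FromOutput for Vec<i64> {
    fn from_output(o: DynOutput) -> Option<Self> {
        match o {
            DynOutput::I64(v) => Some(v),
            _ => None,
        }
    }
}

/// Wrapper that hands out outputs one at a time.
struct Outputs {
    inner: std::vec::IntoIter<DynOutput>,
}

impl Outputs {
    fn new(outputs: Vec<DynOutput>) -> Self {
        Outputs { inner: outputs.into_iter() }
    }

    /// Take the next output and convert it to the type the caller asks for.
    /// The output is consumed even when the conversion fails.
    fn next<T: FromOutput>(&mut self) -> Result<T, OutputError> {
        let o = self.inner.next().ok_or(OutputError::Exhausted)?;
        T::from_output(o).ok_or(OutputError::WrongType)
    }
}

fn main() {
    // A topk-like result: float values followed by integer indices.
    let mut outputs = Outputs::new(vec![
        DynOutput::F32(vec![0.9, 0.05]),
        DynOutput::I64(vec![7, 2]),
    ]);
    let values: Vec<f32> = outputs.next().unwrap();
    let indices: Vec<i64> = outputs.next().unwrap();
    println!("values={:?} indices={:?}", values, indices);
}
```

The tuple idea could be layered on top by implementing a similar trait for (A, B), (A, B, C), and so on, each calling next() once per element.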

It's probably a good idea to study the tch-rs API as well.

