gbolmier commented on June 3, 2024

Actually, it's not really the structure that is inconvenient; it's more about writing the ormbfile.yaml artifact config file. I opened a separate issue (#180) to discuss this further. I'm closing this one, as nothing prevents users from publishing other ML artifacts such as transformers.


gaocegege commented on June 3, 2024

Hi @gbolmier

> ML models often require stateful transformers to process data for them (e.g. standard scaler). Unfortunately, this kind of artifact isn't supported as of now.

Do you mean the model with transformer structure, or some transformation functions to process the data?

> Also some ML frameworks aren't supported, yet? Especially frameworks that don't use specific serialization formats, but rely on e.g. the pickle protocol.

The formats are defined in https://github.com/kleveross/ormb/blob/master/pkg/model/format.go. You can add a new pickle format there.

Contributions are welcome!


gbolmier commented on June 3, 2024

Hi @gaocegege, thanks a lot for the prompt answer.

> Do you mean the model with transformer structure, or some transformation functions to process the data?

I'm referring to the second: transformation functions such as a standard scaler, PCA, or a TF-IDF vectorizer. These transformers are closely tied to the model. Like models, they often have hyperparameters that affect the model's performance, and a state that is updated while processing the training data. The model's performance on unseen data depends on the transformers used during training, which is why stateful transformers are persisted: unseen data must be processed the same way the training data was.
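As a rough illustration of what "stateful" means here, the sketch below fits a toy scaler on training data, persists its state with the pickle protocol, and reuses the restored state on unseen data. The helper names `fit_scaler` and `apply_scaler` are hypothetical and stand in for a real transformer such as a scikit-learn scaler; this is not how ormb or scikit-learn implements it.

```python
import pickle
from statistics import fmean, pstdev


def fit_scaler(values):
    """Learn the state (mean and standard deviation) from training data."""
    # Fall back to 1.0 so constant data does not cause a division by zero.
    return {"mean": fmean(values), "scale": pstdev(values) or 1.0}


def apply_scaler(state, values):
    """Scale values using a previously learned state."""
    return [(v - state["mean"]) / state["scale"] for v in values]


train = [1.0, 2.0, 3.0, 4.0, 5.0]
state = fit_scaler(train)

# Persist the learned state, as one would persist the trained model.
blob = pickle.dumps(state)
restored = pickle.loads(blob)

# Unseen data is scaled with the statistics learned during training,
# not with statistics recomputed from the unseen data itself.
unseen = [3.0, 6.0]
scaled_unseen = apply_scaler(restored, unseen)
```

If the restored state were lost, the unseen data could not be processed consistently with the training data, which is why the transformer has to be published along with the model.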

> The formats are defined in https://github.com/kleveross/ormb/blob/master/pkg/model/format.go. You can add a new pickle format there.
>
> Contributions are welcome!

Thanks a lot for the pointer. This looks pretty straightforward.

Follow-up question: let's say I want to publish and share some transformers tied to my ML model. Do I have to create a similar tree structure for each transformer, alongside the model's?

$ tree .
.
├── sklearn_model
│   ├── model
│   │   └── sklearn_model.joblib
│   └── ormbfile.yaml
├── sklearn_transformer_a
│   ├── model
│   │   └── transformer_a.joblib
│   └── ormbfile.yaml
└── sklearn_transformer_b
    ├── model
    │   └── transformer_b.joblib
    └── ormbfile.yaml

6 directories, 6 files

If that's the case, could we make it more convenient in practice?
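To make the repetition concrete, here is a minimal sketch of a script that generates the layout above in a temporary directory. The `layout_artifacts` helper is hypothetical, the `.joblib` files are empty placeholders, and each ormbfile.yaml is left empty because its schema is not covered in this thread.

```python
import tempfile
from pathlib import Path


def layout_artifacts(root, artifacts):
    """Create <root>/<name>/model/<file> plus an ormbfile.yaml per artifact."""
    created = []
    for name, filename in artifacts.items():
        artifact_dir = Path(root) / name
        model_dir = artifact_dir / "model"
        model_dir.mkdir(parents=True, exist_ok=True)
        (model_dir / filename).touch()  # placeholder for the serialized artifact
        (artifact_dir / "ormbfile.yaml").touch()  # config intentionally left empty
        created.append(artifact_dir)
    return created


artifacts = {
    "sklearn_model": "sklearn_model.joblib",
    "sklearn_transformer_a": "transformer_a.joblib",
    "sklearn_transformer_b": "transformer_b.joblib",
}

with tempfile.TemporaryDirectory() as root:
    dirs = layout_artifacts(root, artifacts)
    # Collect relative file paths while the temporary directory still exists.
    files = sorted(
        p.relative_to(root).as_posix()
        for p in Path(root).rglob("*")
        if p.is_file()
    )
```

Every additional transformer adds one more directory and one more ormbfile.yaml to write by hand, which is the inconvenience in question.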


gaocegege commented on June 3, 2024

> If that's the case, could we make it more convenient in practice?

What structure would you prefer? As you know, OCI supports layer-based storage, like Docker images, so maybe we could discuss it further.
