Comments (8)

nok commented on May 29, 2024

Hello @lichard49 @chappers, I have good news: with the very latest commit on the master branch you can transpile a RandomForestClassifier with imported data. Have a look at the prepared notebook for a demonstration, which uses the export_data=True argument in the predict method.

You can use the following commands to install the latest version:

pip uninstall -y sklearn-porter
pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master

from sklearn-porter.

nok commented on May 29, 2024

Hello @lichard49,

I noticed this issue in the past when porting and using a large SVM classifier. In my case I fixed it manually by using a properties file which stores the model data (support vectors).

But in Java ...

A single method in a Java class may be at most 64KB of bytecode.

Currently I'm working on the next release, where you can run predictions against the ported models in Python.

After that I will fix this issue by adding an alternative export for larger models (in Java), because most models are larger than 64KB of bytecode.

Happy coding,
Darius 🌵

8bit-pixies commented on May 29, 2024

@nok hopefully this isn't a stupid request as I don't normally use Java; could you provide a template of how you got around this for Java export?

nok commented on May 29, 2024

Hello @chappers, I tested different ways to store large model data in separate files.

First I tested .properties files:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class Tmp {

    // Load a .properties file; try-with-resources closes the stream even if load() throws.
    public static Properties load(String path) throws IOException {
        Properties props = new Properties();
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            props.load(in);
        }
        return props;
    }

    // Fill a preallocated 3D array in row-major order from the flat list of string values.
    public static double[][][] convert(double[][][] output, String[] data) {
        int i = 0;
        for (int x = 0; x < output.length; x++) {
            for (int y = 0; y < output[x].length; y++) {
                for (int z = 0; z < output[x][y].length; z++) {
                    output[x][y][z] = Double.parseDouble(data[i++]);
                }
            }
        }
        return output;
    }

    public static void main(String[] args) throws IOException {
        Properties model = load(System.getProperty("user.dir") + "/src/model.properties");
        // model.properties: "inters=0.0, 0.0, 10.0, 12. ... "
        double[][][] inters = convert(new double[2][3][4], model.getProperty("inters").split(","));
        System.out.println(inters[0][1][1]);
    }
}

But I don't like that solution because it's not generic (<?> ...), which means that multiple versions of the convert method (method overloading) are required. Furthermore, the other programming languages don't really work well with properties files. So I decided to use the JSON format for storing all dynamic model data, but again Java is the black sheep: unfortunately, it doesn't have a JSON parser in the standard packages. For now, I will give org.json a go.
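
For anyone staying with the manual properties-file workaround in the meantime, the overloading problem can be sidestepped with reflection. The following is only a minimal sketch, not what sklearn-porter does: the class name ArrayFiller and the method fill are hypothetical, and it assumes the target array is preallocated and the values are stored in row-major order. It walks a double array of any depth with java.lang.reflect.Array, so one method covers double[], double[][], double[][][] and so on.

```java
import java.lang.reflect.Array;

// Hypothetical helper, not part of sklearn-porter.
class ArrayFiller {

    // Recursively fills a preallocated (possibly nested) double array in
    // row-major order from data, starting at offset.
    // Returns the offset of the next unused value.
    static int fill(Object array, String[] data, int offset) {
        if (array instanceof double[]) {
            // Innermost dimension: parse the values directly.
            double[] row = (double[]) array;
            for (int i = 0; i < row.length; i++) {
                row[i] = Double.parseDouble(data[offset++]);
            }
            return offset;
        }
        // Outer dimension: descend into each sub-array.
        int length = Array.getLength(array);
        for (int i = 0; i < length; i++) {
            offset = fill(Array.get(array, i), data, offset);
        }
        return offset;
    }

    public static void main(String[] args) {
        String[] data = "0.0, 0.0, 10.0, 12.0, 1.5, 2.5".split(",");
        double[][] matrix = new double[2][3];
        int used = fill(matrix, data, 0);
        System.out.println(used);          // number of values consumed
        System.out.println(matrix[1][0]);  // first value of the second row
    }
}
```

The returned offset makes it easy to check that the number of stored values matches the array shape before the model is used.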

8bit-pixies commented on May 29, 2024

Thank you so much. I'm keen on seeing a more fleshed-out version in the future, but at least I have an ad hoc, manual way working in the interim.

nok commented on May 29, 2024

Okay, that's good 👍 !

In the future the transpiled estimators will be cleaner, faster and more dynamic. Today, small changes can affect over 40 different transformations and the related test cases.

pernorc85 commented on May 29, 2024

I tried C with export_data=True, but it doesn't seem to work.
Do you plan to support exported model data in C in the future?

Vasilissk-prog commented on May 29, 2024

Hi and thank you very much for your contribution.

I am trying to export a RandomForestClassifier(n_estimators=100, max_features='sqrt', max_depth=100, n_jobs=-1, verbose=1), but I think that my laptop runs out of memory. Should I try a server with better specifications, or is the only option to reduce n_estimators and max_depth?
