Comments (8)

laolv421 commented on June 16, 2024

@garyzhang99 It worked, thanks a lot.

from data-juicer.

garyzhang99 commented on June 16, 2024

I was unable to reproduce this bug when running the command. Could you try running it again after executing pip install -v -e .[sci]?

laolv421 commented on June 16, 2024

This error occurred again after I executed pip install -v -e .[sci].

  • ray version=2.7.0
  • python version=3.8.18
  • data juicer =0.2.0

[Screenshot 5]
This error is weird: after I commented out line 83, the error was raised from line 88 instead.
[Screenshot 6]

garyzhang99 commented on June 16, 2024

When Ray runs in distributed mode, it evaluates lazily (via Ray futures): a line of code only schedules the computation, and the work actually runs when its result is requested. After you commented out line 83, the computation that was previously triggered at line 83 was triggered at line 88 instead, so the error surfaced at line 88. The actual fault lies somewhere before line 83.

Ray's error reporting makes it hard to pinpoint the failing step. Could you first run the same code without Ray, in single-machine mode, to see whether it produces a more complete error message?
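
The deferred-error behaviour described above can be illustrated without Ray. The sketch below is only an analogy built on Python's standard concurrent.futures module, not Ray's own API: failing_step and run_pipeline are hypothetical names, and the "line 83" / "line 88" comments simply mirror the line numbers mentioned in this thread. It shows that an exception raised inside a worker is stored in the future and re-raised only where the result is requested.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_step(x):
    # Hypothetical stand-in for an operator that fails inside a worker.
    raise ValueError(f"bad sample: {x}")


def run_pipeline():
    events = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Analogous to "line 83": the task is only scheduled, no error is raised here.
        future = pool.submit(failing_step, 1)
        events.append("submitted")
        try:
            # Analogous to "line 88": the stored exception is re-raised here.
            future.result()
        except ValueError as exc:
            events.append(f"raised at result(): {exc}")
    return events


events = run_pipeline()
print(events)
```

The traceback therefore points at the line that asks for the result, which is why commenting out one consumer line can move the reported error to the next one.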

laolv421 commented on June 16, 2024

Thanks for your advice. I tested the code without Ray, and everything worked as expected. I then double-checked the demo.yaml file, changed ray_address: 'ray://localhost:10001' to ray_address: 'auto', and ran the code again. Everything worked except for the two operators that use models, language_id_score_filter and perplexity_filter. With these two operators commented out, the pipeline ran fine. Unit tests for both operators also passed, but under local Ray they were unable to find the model.

# language_id_score_filter
  File "/home/lzj/project/open-source/data-juicer/data_juicer/ops/filter/language_id_score_filter.py", line 53, in compute_stats
    raise ValueError(err_msg)
ValueError: Model not loaded. Please retry later.
# perplexity_filter
  File "/home/lzj/project/open-source/data-juicer/data_juicer/ops/filter/perplexity_filter.py", line 71, in compute_stats
    logits += kenlm_model.score(line)
AttributeError: 'NoneType' object has no attribute 'score'

garyzhang99 commented on June 16, 2024

Based solely on the provided description, we have not been able to reproduce the bug, nor can we pinpoint the specific issue. It appears it might be an environmental problem or an issue with the get_model and check_model functions. Could you provide more information?

Additionally, I would like to ask whether CUDA is enabled in your local environment and whether the corresponding Data-Juicer version is up to date.
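
One way to gather the requested environment details in one go is a small version report. This is a minimal sketch using only the standard library; collect_versions is a hypothetical helper, and the distribution names passed to it (in particular "py-data-juicer") are assumptions that should be checked against the output of pip list.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Return the Python version plus each package's installed version."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is absent or installed under a different name.
            report[name] = "not installed"
    return report


# Distribution names are assumptions; adjust them to match `pip list`.
report = collect_versions(["ray", "torch", "py-data-juicer"])
for name, ver in report.items():
    print(f"{name}: {ver}")
```

Whether CUDA is usable still has to be checked separately, for example with torch.cuda.is_available() in the same environment.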

laolv421 commented on June 16, 2024

  • torch.cuda.is_available() = True
  • data-juicer = v0.2.0

Here are the screenshots. I hope they are helpful for you.

[Screenshots 1-4]

garyzhang99 commented on June 16, 2024

The issue is probably that you are using an older version of data-juicer, whose Ray distributed mode did not support CUDA well. Try these two solutions separately:

  • Pull the latest data-juicer code from the main branch on GitHub, then build from source (pip install -v -e .).
  • Avoid using CUDA by setting the use_cuda related configurations to False in the code and modifying the CUDA environment variables accordingly.

One of these should solve your problem.
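
For the second option, the thread does not name the environment variables involved. A common choice, assumed here, is CUDA_VISIBLE_DEVICES: setting it to an empty string before any CUDA library initialises hides all GPUs from that process. The sketch below uses a hypothetical helper, hide_gpus, and only the standard library.

```python
import os


def hide_gpus(env=None):
    """Set CUDA_VISIBLE_DEVICES to an empty string so CUDA sees no devices."""
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


# Must run before torch (or any other CUDA library) initialises CUDA in this process.
hide_gpus()
print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))
```

This only affects the current process and the children it spawns afterwards. Under Ray, the variable also has to reach the worker processes, for example by exporting it in the shell before the Ray cluster is started. The use_cuda settings mentioned above still need to be changed separately.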
