
Comments (8)

bzz commented on July 25, 2024

Hi @rajurajvijay619, did you try using just a single language for the experiments?

E.g., for Java, I find the total of 500k samples from 184 MB of .gz files to be very comfortably manageable on a laptop.

As one can see from the published analysis example, languages like Go, JS, or Ruby would give even smaller dataset sizes and fit on almost any local machine.
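To check how large one language actually is before committing to an experiment, a minimal sketch along these lines may help. It assumes the data for one language has been unpacked into a directory of gzip-compressed JSON Lines shards; the directory name `data/java` and the `*.jsonl.gz` pattern are illustrative assumptions, not paths taken from this thread.

```python
import gzip
from pathlib import Path


def count_records(data_dir, pattern="*.jsonl.gz"):
    """Count non-empty lines per shard, streaming so no file is loaded whole."""
    counts = {}
    for path in sorted(Path(data_dir).glob(pattern)):
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            counts[path.name] = sum(1 for line in handle if line.strip())
    return counts


if __name__ == "__main__":
    # Hypothetical directory holding the shards for a single language.
    for name, n in count_records("data/java").items():
        print(f"{name}: {n} records")
```

Because each shard is read line by line, memory use stays small regardless of how large the compressed files are.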

Hope this helps and good luck with experiments!

from codesearchnet.

hamelsmu commented on July 25, 2024

I'll go ahead and close this issue. Please let me know if there are any more questions.


sara-02 commented on July 25, 2024

@bzz just one question: when running for a single language (on a local machine), does the setup still require GPUs?


hamelsmu commented on July 25, 2024

@sara-02 you can download the data without GPUs; however, running the default models in this repo will be painfully slow without them. You can try training on a smaller sample of the data, as @bzz proposes, and you can also set this parameter to limit the size of the data.

Also, Google Colab notebooks are great for free GPUs. Thanks for getting involved with this project ❤️


hamelsmu commented on July 25, 2024

@rajurajvijay619 can you describe your constraints a bit more? Is it disk size for downloading the dataset? Can you download the entire dataset and just sample from that?
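If disk space is not the constraint, one way to sample from the full download is reservoir sampling, which draws a fixed-size uniform sample in a single pass without holding the whole dataset in memory. The sketch below assumes the files are gzip-compressed JSON Lines; the function name `reservoir_sample` and the shard paths are hypothetical and not part of the repository.

```python
import gzip
import json
import random


def reservoir_sample(paths, k, seed=0):
    """Return up to k records drawn uniformly at random from .jsonl.gz files."""
    rng = random.Random(seed)
    sample = []
    seen = 0
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                if not line.strip():
                    continue
                seen += 1
                if len(sample) < k:
                    # Fill the reservoir with the first k records.
                    sample.append(json.loads(line))
                else:
                    # Keep the new record with probability k / seen.
                    j = rng.randrange(seen)
                    if j < k:
                        sample[j] = json.loads(line)
    return sample
```

A call such as `reservoir_sample(["shard_0.jsonl.gz", "shard_1.jsonl.gz"], k=10000)` would return at most 10,000 parsed records. Fixing `seed` keeps the sample reproducible between runs.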

Thanks for your feedback


sara-02 commented on July 25, 2024

> @sara-02 you can download the data without GPUs; however, running the default models in this repo will be painfully slow without them. You can try training on a smaller sample of the data, as @bzz proposes, and you can also set this parameter to limit the size of the data.
>
> Also, Google Colab notebooks are great for free GPUs. Thanks for getting involved with this project ❤️

Thanks. I will look into Colab as well as running it locally with only one language. I was hesitant to start because the first step in the setup states: "Additionally, you must install Nvidia-Docker to satisfy GPU-compute related dependencies." So I thought the code might not run as-is on a local system without GPUs.


hamelsmu commented on July 25, 2024

@sara-02 you are correct regarding Docker. I think in the end it could make your life easier to use the Docker setup, as installing all the dependencies by hand can become very cumbersome and brittle.

Let me know where you are struggling with Docker and I will be more than happy to help! I wrote this tutorial regarding Docker in case a gentle introduction is useful.

Looking forward to seeing what you do with this dataset! Please do not be shy about asking questions!


hamelsmu commented on July 25, 2024

If you are using Colab, I do not believe you will be able to use Docker. In that case, you will have to use pip to install, inside the Colab notebook, all the dependencies defined in the Dockerfile.
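As a rough aid for that step, the sketch below pulls the `pip install` commands out of a Dockerfile's `RUN` instructions so they can be reviewed and re-run by hand. The helper `pip_install_commands` is hypothetical, and it makes simplifying assumptions: it handles shell-form `RUN` lines with backslash continuations and commands chained with `&&` or `;`, and it ignores everything else, including non-pip dependencies such as system packages.

```python
import re


def pip_install_commands(dockerfile_text):
    """Return the pip install commands found in shell-form RUN instructions."""
    # Join lines that end with a backslash continuation.
    joined = re.sub(r"\\\s*\n", " ", dockerfile_text)
    commands = []
    for line in joined.splitlines():
        line = line.strip()
        if not line.upper().startswith("RUN "):
            continue
        # A RUN instruction may chain several shell commands.
        for part in re.split(r"&&|;", line[4:]):
            part = " ".join(part.split())  # collapse repeated whitespace
            if re.match(r"(python3? -m )?pip3? install\b", part):
                commands.append(part)
    return commands


if __name__ == "__main__":
    # Made-up Dockerfile content, for illustration only.
    example = "FROM python:3.6\nRUN pip install --upgrade pip && \\\n    pip install numpy\n"
    for command in pip_install_commands(example):
        print(command)
```

In a notebook cell, each extracted command would typically be run with a leading `!`, for example `!pip install numpy`.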
