Comments (10)

kylepjohnson avatar kylepjohnson commented on June 2, 2024 1

Hi. Yes, that's probably too large. However, when we serve on the web, we'll want to pull down everything. Can you think of a way to make the downloads optional? Or maybe allow users to download everything for one language?

from cltk_docker.

 avatar commented on June 2, 2024 1

Yes, I am interested in starting on the preliminary repos.

kylepjohnson avatar kylepjohnson commented on June 2, 2024

For this issue, I'd like you to look at how some NLTK containers are built, such as this one here: https://hub.docker.com/r/trackmaven/nltk/

I'm interested in a few things:

  • Can you add a "Source Repository" field, as in the above?
  • Add Usage in the "Full Description"
  • See how you can get the Dockerfile in it, as above.
  • Look into the Webhook option. This is how we might be able to pull the latest version automatically.

 avatar commented on June 2, 2024

Thanks for the resources, I will start working on this.

 avatar commented on June 2, 2024

Hello sir,

  • I have figured out how to build the container and pushed it to my Docker Hub repo: https://hub.docker.com/r/achaitanyasai/cltk_docker/
  • Now, coming to your requests:
    • Can you add a "Source Repository" field, as in the above?
    • Add Usage in the "Full Description"
    • See how you can get the Dockerfile in it, as above.
    • Look into the Webhook option. This is how we might be able to pull the latest version automatically.
    • I will look into the pending task (the last one) and complete it soon.
  • It also seems that I don't have access to https://hub.docker.com/r/cltk/cltk/. Could you please look into it? My username is achaitanyasai.
  • Also, the container size is 3 GB (because of the corpora). Isn't that too large? If so, we could download only the required or frequently used corpora instead of downloading all the corpora at once.

kylepjohnson avatar kylepjohnson commented on June 2, 2024

It also seems that I don't have access to https://hub.docker.com/r/cltk/cltk/. Could you please look into it? My username is achaitanyasai.

I have re-added you as an admin and confirmed you have write access. Try again and let me know if I did it right.

 avatar commented on June 2, 2024
Can you think of a way to make the downloads optional? Or maybe allow users to download everything for 1 language?

I think the first idea is OK, but instead of making the downloads optional, downloading the most-used corpora is better. By "most-used corpora" I mean the corpora that get downloaded into ~/cltk_data by default.
If we allow users to download everything for one language, that is more beneficial to a user who works on only one particular language. But we would end up making a container for each language, so I feel it's a bad idea. What is your opinion?
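
One way to offer per-language downloads without building a container for each language might be to let a single image read the wanted languages at startup. The following is only a minimal sketch under stated assumptions: the CLTK_LANGUAGES variable name, the catalogue contents, and the downloader callable are all hypothetical placeholders, not part of the CLTK or of the existing image.

```python
import os
from typing import Callable, Dict, List, Tuple

# Hypothetical catalogue mapping a language to corpus names.
# The names are placeholders, not real CLTK corpus identifiers.
CATALOGUE: Dict[str, List[str]] = {
    "greek": ["greek_corpus_a", "greek_corpus_b"],
    "latin": ["latin_corpus_a"],
}


def select_corpora(requested: str, catalogue: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (language, corpus) pairs for a comma-separated language list.

    An empty string selects nothing; the word "all" selects every language.
    """
    names = [part.strip().lower() for part in requested.split(",") if part.strip()]
    if "all" in names:
        names = sorted(catalogue)
    # Drop repeated languages while keeping the requested order.
    names = list(dict.fromkeys(names))
    unknown = [name for name in names if name not in catalogue]
    if unknown:
        raise ValueError("unknown language(s): " + ", ".join(unknown))
    return [(lang, corpus) for lang in names for corpus in catalogue[lang]]


def download_selected(
    requested: str,
    downloader: Callable[[str, str], None],
    catalogue: Dict[str, List[str]] = CATALOGUE,
) -> int:
    """Call downloader(language, corpus) for each selected corpus; return the count."""
    pairs = select_corpora(requested, catalogue)
    for lang, corpus in pairs:
        downloader(lang, corpus)
    return len(pairs)


if __name__ == "__main__":
    # CLTK_LANGUAGES is an assumed environment variable name.
    # The lambda only prints; a real downloader would fetch the corpus.
    wanted = os.environ.get("CLTK_LANGUAGES", "")
    download_selected(wanted, lambda lang, corpus: print(f"would download {corpus} ({lang})"))
```

With something like this as the container's startup step, leaving the variable unset would download nothing, a value such as "latin" would fetch one language, and "all" would pull everything for the web deployment.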

I have re-added you as an admin and confirmed you have write access. Try again and let me know if I did it right.

Thanks, it works now. I will deploy the container to cltk/cltk very soon.

kylepjohnson avatar kylepjohnson commented on June 2, 2024

@achaitanyasai I'm finally getting back to this.

I think the first idea is OK, but instead of making the downloads optional, downloading the most-used corpora is better. By "most-used corpora" I mean the corpora that get downloaded into ~/cltk_data by default.

There will be so many different users of the CLTK that there is no "most used". In a few months, I'll have an idea of which corpora we'll serve through the API. At that point I'd like to slim down the repos to just those.

Are we ready to close this now? Anything else you want to take care of?

 avatar commented on June 2, 2024

Yeah, serving corpora through an API is a very nice idea.

No, we can close this issue.

kylepjohnson avatar kylepjohnson commented on June 2, 2024

I'll reach out when we know more about the website and API.

Are you interested in starting on preliminary repos for cltk_api (Python/Flask) and cltk_frontend (JavaScript/Meteor)?
