Comments (10)
Hi, yes, that's probably too large. However, when we serve on the web, we'll want to pull down everything. Can you think of a way to make the downloads optional? Or maybe allow users to download everything for one language?
from cltk_docker.
Yes I am interested in starting on preliminary repos.
from cltk_docker.
For this issue, I'd like you to look at how some NLTK containers are built, such as this one here: https://hub.docker.com/r/trackmaven/nltk/
I'm interested in a few things:
- Can you add a "Source Repository" field, as in the above?
- Add Usage in the "Full Description"
- See how you can get the Dockerfile in it, as above.
- Look into the Webhook option. This is how we might be able to pull the latest version automatically
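For the webhook item, here is a minimal sketch of what the receiving side could look like, assuming the server gets a JSON POST from Docker Hub. The field names `repository.repo_name` and `push_data.tag` are how I recall the Docker Hub webhook payload and should be checked against the current documentation. The function name `pull_command` and the default `cltk/cltk` are only for illustration.

```python
import json


def pull_command(payload_text, expected_repo="cltk/cltk"):
    """Turn a webhook payload into a `docker pull` command, or None.

    Assumption: the payload carries the repository name under
    repository.repo_name and the pushed tag under push_data.tag.
    """
    payload = json.loads(payload_text)
    repo = payload.get("repository", {}).get("repo_name")
    tag = payload.get("push_data", {}).get("tag")
    # Ignore pushes to other repositories and payloads without a tag.
    if repo != expected_repo or not tag:
        return None
    return ["docker", "pull", f"{repo}:{tag}"]
```

The returned list could be handed to `subprocess.run` by whatever small web handler receives the POST. Returning the command instead of running it keeps the parsing easy to test.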
from cltk_docker.
Thanks for the resources, I will start working on this.
from cltk_docker.
Hello sir,
- I have figured out how to build the container and pushed it to my Docker Hub repo: https://hub.docker.com/r/achaitanyasai/cltk_docker/
- Now, coming to your requests:
- Can you add a "Source Repository" field, as in the above?
- Add Usage in the "Full Description"
- See how you can get the Dockerfile in it, as above.
- Look into the Webhook option. This is how we might be able to pull the latest version automatically.
- I will find out and complete the pending task (the last task) soon.
- It seems I don't have access to https://hub.docker.com/r/cltk/cltk/. Can you please look into it? My username is achaitanyasai.
- Also, the container size is 3 GB (because of the corpora). Isn't that too large? If so, we could download only the required or frequently used corpora instead of downloading all the corpora at once.
from cltk_docker.
It seems I don't have access to https://hub.docker.com/r/cltk/cltk/. Can you please look into it? My username is achaitanyasai.
I have re-added you as an admin and confirmed you have write access. Try again and let me know if I did it right.
from cltk_docker.
Can you think of a way to make the downloads optional? Or maybe allow users to download everything for 1 language?
I think the first idea is OK, but instead of making downloads optional, downloading the most-used corpora is better. By "most-used corpora" I mean the corpora that get downloaded into ~/cltk_data by default.
If we allow users to download everything for one language, that benefits users who work on only one language. But we would end up making a container for each language, so I feel it's a bad idea. What is your opinion?
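One possible way around a container per language is a single image that picks the languages at run time, for example from an environment variable. The sketch below is only an illustration under that assumption: the variable name `CLTK_LANGUAGES`, the manifest, and the corpus names are all hypothetical placeholders, and the actual download call is left out.

```python
import os

# Hypothetical manifest mapping a language to corpus names.
# The names are placeholders, not a list of what the CLTK ships.
CORPORA_BY_LANGUAGE = {
    "greek": ["greek_corpus_a", "greek_corpus_b"],
    "latin": ["latin_corpus_a"],
}


def select_corpora(requested, manifest=CORPORA_BY_LANGUAGE):
    """Return (language, corpus) pairs for a comma-separated language list.

    An empty request selects nothing, and "all" selects every language.
    """
    requested = requested.strip().lower()
    if not requested:
        return []
    if requested == "all":
        languages = sorted(manifest)
    else:
        languages = [name.strip() for name in requested.split(",") if name.strip()]
    unknown = [lang for lang in languages if lang not in manifest]
    if unknown:
        raise ValueError("unknown language(s): " + ", ".join(unknown))
    return [(lang, corpus) for lang in languages for corpus in manifest[lang]]


if __name__ == "__main__":
    # Hypothetical variable name; it would be set with `docker run -e`.
    for language, corpus in select_corpora(os.environ.get("CLTK_LANGUAGES", "")):
        print(f"would download {corpus} ({language})")
```

With this shape, a user who works on one language would start the same image with `-e CLTK_LANGUAGES=latin`, and the web deployment would use `all`.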
I have re-added you as an admin and confirmed you have write access. Try again and let me know if I did it right.
Thanks, now it's OK. I will deploy the container into cltk/cltk very soon.
from cltk_docker.
@achaitanyasai I'm finally getting back to this.
I think the first idea is OK, but instead of making downloads optional, downloading the most-used corpora is better. By "most-used corpora" I mean the corpora that get downloaded into ~/cltk_data by default.
There will be so many different users of the CLTK that there is no "most used". In a few months, I'll have an idea of which corpora we'll serve with the API. At that point I'd like to slim down the repos to just those.
Are we ready to close this now? Anything else you want to take care of?
from cltk_docker.
Yeah, serving corpora through an API is a very nice idea.
No, we can close this issue.
from cltk_docker.
I'll reach out when we know more about the website and API.
Are you interested in starting on preliminary repos for cltk_api (Python/Flask) and cltk_frontend (JavaScript/Meteor)?
from cltk_docker.