w261-environment's People
Forkers
chabobo dbalck kevinjfoley datadaveh ravneetg fred-fan jollylst unakhat dimitrioshytiroglou tpanza nscsekharucb chadharness-mids jerrysong tiffapedia cynthiahu ucb-info-python ctpham1 cameronbell75 sstorey-nephila yankinschweiz rrdkent kathleenyw samm55 fa-mc funguide1 sioharr avinashsc jrtuenge angelawugithub mwinton accaminero ce88888 steve-dille montyhallgoat aditihegde zengm71 jungy06 bhuvneshsharma qwertypo888 hienrepo kevin-hartman girijaghali skopimos bjbecerra xiaowanzio8 juliantsang1 lateflare blulightspecial barbaraitotia frankyftang john-woolley brandonscolieri cal-dortiz jrday93 gohaike harinandansrikanth dfristau dchacon-berkeley sudhrity m-chau toby-p krislee2524 malay-patel akikoiwamizu cvolticz summer5e55 abelninan
w261-environment's Issues
Altiscale Jupyter Script
Take this off ephemeral storage.
My user directory with read access?
Add Altiscale Config Example
Fix the location of the docker-compose.yml to copy from
The second step in the "How to use" section of README.md says to copy the contents of docker-compose.yml from "this" repo. This needs to be updated to say "class repo".
Java 9
Add nbdime to python environment
Please consider adding the nbdime package to the Python environment on Altiscale.
http://nbdime.readthedocs.io/en/stable/index.html
It has nice features for diffing and merging Jupyter notebooks, which is critical for team projects.
Docker compose file version
Please reconcile this apparent conflict (v2 or v3??) ...
From README.md:
version: '3'
services:
  quickstart.cloudera:
    image: w261/w261-environment:latest
    hostname: docker.w261
    privileged: true
version: this item says use v2 syntax
Add mrjob.conf
Write a conf file in the runner.sh script, placing it in the home directory and containing the following options. Make sure the username is that of the logged-in Altiscale user (instead of kylehamilton).
This is where mrjob looks for the configuration file:
~/.mrjob.conf
runners:
  hadoop:
    python_archives: hdfs:///user/kylehamilton/virtualenv/py27.zip#py27
    python_bin: ./py27/bin/python
    cleanup_on_failure: ALL
    cleanup: TMP
This will remove the need to specify those options in the command in the notebooks:
!python mrjob_versions_test.py \
-r hadoop one_line.txt \
# --python-archive={pyArchive} \
# --python-bin={pyBin} \
--output-dir={OUTPUT_PATH_1} \
--no-output
(THIS IS TESTED)
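A minimal sketch of how the conf file could be generated per user rather than hard-coding kylehamilton. It is written in Python for illustration only; the issue asks for this to happen inside runner.sh, and the names `CONF_TEMPLATE`, `render_mrjob_conf`, and `write_mrjob_conf` are hypothetical. It assumes the logged-in Altiscale user matches the HDFS home directory name.

```python
import getpass
from pathlib import Path

# Same options as listed in the issue; only the username is parameterized.
CONF_TEMPLATE = """\
runners:
  hadoop:
    python_archives: hdfs:///user/{username}/virtualenv/py27.zip#py27
    python_bin: ./py27/bin/python
    cleanup_on_failure: ALL
    cleanup: TMP
"""


def render_mrjob_conf(username=None):
    """Return the conf text, defaulting to the currently logged-in user."""
    if username is None:
        username = getpass.getuser()
    return CONF_TEMPLATE.format(username=username)


def write_mrjob_conf(path, username=None):
    """Write the rendered conf to `path` and return the path."""
    path = Path(path)
    path.write_text(render_mrjob_conf(username))
    return path
```

Calling `write_mrjob_conf(Path.home() / ".mrjob.conf")` would place the file where the issue says mrjob looks for it; the sketch deliberately does not do this on import, so it cannot overwrite an existing conf by accident.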
Link for GC credits
Could you please update the URL for Google Cloud Platform credits? The one in Slack currently states "Invalid Course".
Thank you!
Details on what `docker-compose up` does
Students asked for more clarity on what this does and where to run it.
Spark Packages
Add the ability to specify packages
Spark 2.3
Altiscale PAC file - Mac
Installing the PAC file on a Mac blocks many common locations. Root cause?
additional modules for Docker container
- pygithub - https://pygithub.readthedocs.io/en/latest/introduction.html
- gsutil, gcloud, etc.
- conda install -y python-graphviz (for the decision tree demo notebook)
- conda install line_profiler
- conda install memory_profiler
Nice to haves:
- nltk
- spacy
- sklearn
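One way to confirm which of the requested Python modules are already present in the container is to check whether each can be located for import. The sketch below is an assumption-laden illustration: `missing_modules` is a hypothetical helper, and the import names (for example `github` for pygithub) are the usual top-level module names rather than anything stated in the issue. Command-line tools such as gsutil and gcloud are not covered.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Assumed import names for the packages requested in the issue.
REQUESTED = [
    "github",           # pygithub
    "graphviz",         # python-graphviz
    "line_profiler",
    "memory_profiler",
    "nltk",
    "spacy",
    "sklearn",
]
```

Running `missing_modules(REQUESTED)` inside the container would list whatever still needs to be installed.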
update requirements file
From a student: I had to run “pip install google-cloud”, which installed the latest version, 0.34.0, successfully. This was where I got stuck when running “pip install -r requirements.txt”, which had 0.32.0 in it.
Should we update the requirements.txt file?
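To see which pins in requirements.txt have drifted from what actually installs, the pinned versions can be compared against the installed ones. This is a rough sketch, not the repo's tooling: `parse_pins` and `stale_pins` are hypothetical names, only exact `==` pins are handled, and anything after a `#` on a line is treated as a comment.

```python
def parse_pins(requirements_text):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # unpinned or otherwise unsupported line
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def stale_pins(requirements_text, installed):
    """Return {name: (pinned, installed)} where the two versions differ."""
    pins = parse_pins(requirements_text)
    return {
        name: (pinned, installed[name])
        for name, pinned in pins.items()
        if name in installed and installed[name] != pinned
    }
```

With the versions from the student's report, `stale_pins("google-cloud==0.32.0\n", {"google-cloud": "0.34.0"})` returns `{"google-cloud": ("0.32.0", "0.34.0")}`.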
Script for Initial Altiscale setup
Create GCP - gcloud for budget alert
allocating CPUs and RAM on macOS
Is the text "Click Docker in the clock(?) area" referring to the menu bar?
https://support.apple.com/guide/mac-help/menu-bar-mchlp1446/mac
Also FYI, it's under Preferences rather than Settings.
vestigial MRJob reference
If we're moving away from MRJob, should it be removed from this text on the README page?
"Add the following parameter to Map Reduce Streaming and MRJob commands"
Add documentation for Altiscale/runner.sh
Add documentation for Altiscale/runner.sh to the README.md. Include high-level objectives, such as: Why does the script need to do what it does?