The datascience-project's intro from devilpepper

#DataScience-Project

##About

##Getting Started

Please use Anaconda, if for some strange reason you aren't already.
Create a virtual environment like so:

conda env create -f environment.yml

Then activate the environment before you get to work:

source activate DataScience-Project

Deactivate when you are done like this:

source deactivate

##Contributing

###Work on your own branch

Merge conflicts are ugly, and it's really bad when somebody "accidentally" pushes binaries, temp files, and other large files. If that only happens on your own branch, you can revert to a clean state and nobody has to merge dirty project files into master. Create a new branch and switch to it with one command:

git checkout -b **_BranchName_**

You'll now be working on your own branch until you check out a different one. Try not to check out someone else's branch unless you have that developer's OK.
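
To make the "revert to a clean state" idea concrete, here is a minimal sketch that plays the whole scenario out in a throwaway repository. The branch name my-feature, the file names, and the demo identity are all placeholders, and git reset --hard is just one way to get back to a clean commit; use it only on your own branch, since it discards the unwanted commit.

```shell
set -eu

# Throwaway repository so nothing here touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of git defaults
git config user.name "Example Dev"        # placeholder identity for the demo commits
git config user.email "dev@example.com"

echo "# demo" > README.md
git add README.md
git commit -q -m "Initial commit"

# Create a new branch and switch to it with one command.
git checkout -q -b my-feature
current_branch="$(git rev-parse --abbrev-ref HEAD)"

# Simulate "accidentally" committing a temp file on the branch.
echo "scratch" > notes.tmp
git add notes.tmp
git commit -q -m "Oops: committed a temp file"

# Drop the bad commit; master was never affected.
git reset -q --hard HEAD~1
```

After the reset, the branch is back at the initial commit and notes.tmp is gone from the working tree.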

###master

Someone who knows what they are doing should take care of merging to master. Create a pull request to have your work merged in. In the end, your merged work is what counts, so make sure your branch is clean.

By keeping master clean like this and working on separate branches, we never have to receive a message from a developer saying something like "Don't pull! My notebook is corrupted!" It sounds like a ridiculous statement, but it happens. It really shouldn't.
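
One possible way to check that a branch is clean before opening a pull request is sketched below. Both function names are hypothetical helpers, not part of this project; they only wrap standard git commands and assume the main branch is called master.

```shell
# Hypothetical helper: succeeds only if there are no uncommitted
# changes and no untracked files in the working tree.
branch_is_clean() {
    [ -z "$(git status --porcelain)" ]
}

# Hypothetical helper: lists the files the current branch has changed
# since it forked from master (three-dot diff against the merge base).
files_changed_vs_master() {
    git diff --name-only master...HEAD
}
```

Run inside the repository, branch_is_clean && files_changed_vs_master prints the files a pull request would bring in, which makes stray binaries or temp files easy to spot.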

##Updates

###Getting them

Just pull the latest changes from the repo! If anything goes wrong due to dependencies, make sure you have the latest requirements, like so:

conda env update -f environment.yml

Do this while your virtual environment is active, of course.

###Making them

If you added a dependency to the environment, keep everyone on the bleeding edge too! conda has a nice command for exporting an environment (presumably conda env export), but since we are using a couple of packages from GitHub, conda doesn't export these correctly, and we have to edit environment.yml manually. Run conda list and find the row for the package you installed. The details on that row need to be added to the file.

Then just push these changes and everyone else can refer to the "Getting them" section above. :D
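
As a rough sketch of the manual step, the helper below turns one row of conda list output into a line you could paste under dependencies: in environment.yml. The function name conda_row_to_yaml is hypothetical, and the seaborn row is an illustrative sample rather than this project's real version. It assumes the usual conda list column order (name, version, build) and only covers conda packages; anything installed with pip from GitHub would go under the pip: subsection of the file instead.

```shell
# Hypothetical helper: reads `conda list` output on stdin and prints
# an environment.yml dependency line for the named package.
conda_row_to_yaml() {
    pkg="$1"
    # Column 1 is the package name, column 2 its version.
    awk -v pkg="$pkg" '$1 == pkg { printf "  - %s=%s\n", $1, $2 }'
}

# Typical use (requires conda):
#   conda list | conda_row_to_yaml seaborn
# Demo with a captured sample row so the sketch runs without conda:
sample_row='seaborn                   0.7.1                    py35_0'
yaml_line="$(printf '%s\n' "$sample_row" | conda_row_to_yaml seaborn)"
echo "$yaml_line"
```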

##Misc.

Here are some data science best practices. I like the one that says to mark your notebooks with your initials:

http://svds.com/jupyter-notebook-best-practices-for-data-science/

For more conda stuff, here's a conda vs pip vs virtualenv table:

http://conda.pydata.org/docs/_downloads/conda-pip-virtualenv-translator.html
