Giter Site home page Giter Site logo

datascience-project's Introduction

#DataScience-Project

##About

:p

##Getting Started Please use Anaconda if you aren't already for some strange reason.
Create a virtual environment like so:

conda env create -f environment.yml

Then activate the environment before you get to work:

source activate DataScience-Project

Deactivate when you are done like this:

source deactivate

##Contributing

###Work on your own branch Merge conflicts are ugly, and it's really bad when somebody "accidentally" pushes binaries, temps, and other large files. If it only happened in your branch, revert to a clean state so nobody has to merge dirty project files to master. Create a new branch and switch to it with one command:

git checkout -b **_BranchName_**

You'll now be working on your own branch until you checkout to a different branch. Try not to checkout another branch unless you got corresponding developer's OK.

###master Someone who knows what they are doing should take care of merging to master. Create a pull request to have your work merged in. In the end, your merged work is what counts, so make sure your branch is clean.

By keeping master clean like this and working on separate branches, we never have to receive a message from a developer saying something like "Don't pull! My notebook is corrupted!" It sounds like a ridiculous statement, but it happens. It really shouldn't.

##Updates

###Getting them Just pull the latest changes from the repo! If anything goes wrong due to dependencies, make sure you get the latest requirements. Like so:

conda env update -f environment.yml

Do this while your virtual environment is active, of course.

###Making them If you added a dependency to the environment, keep everyone on the bleeding edge too! There exists a nice command, but since we are using a couple of packages from github, conda doesn't export these correctly, and we have to manually edit environment.yml. Do conda list and find only the package that you installed. The details on that row need to be added to the file.

Then just push these changes and everyone else can refer to the "Getting them" section above. :D

##Misc.

Here are some data science best practice. I like the one that says to mark your notebooks with your initials:

http://svds.com/jupyter-notebook-best-practices-for-data-science/

For more conda stuff, here's a conda vs pip vs virtualenv table:

http://conda.pydata.org/docs/_downloads/conda-pip-virtualenv-translator.html

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.