datascience-project's Introduction

#DataScience-Project

##About

:p

##Getting Started

Please use Anaconda, if for some strange reason you aren't already. Create a virtual environment like so:

conda create --name **_EnvName_** --file requirements.txt

Then activate the environment before you get to work (on conda 4.4 and later, conda activate is the preferred form of this command):

source activate **_EnvName_**

Deactivate when you are done like this:

source deactivate

##Contributing

###Work on your own branch

Merge conflicts are ugly, and it's really bad when somebody "accidentally" pushes binaries, temp files, and other large files. If that only happened on your branch, you can revert it to a clean state so nobody has to merge dirty project files into master. Create a new branch and switch to it with one command:

git checkout -b **_BranchName_**

You'll now be working on your own branch until you check out a different one. Try not to check out someone else's branch unless you have that developer's OK.
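To catch binaries and other large files before they reach a commit, a small check along these lines could be run from the project root. This is only a sketch: the `find_large_files` helper and the 5 MB default threshold are assumptions for illustration, not part of this project.

```python
import os


def find_large_files(root, max_bytes=5 * 1024 * 1024):
    """Return sorted (path, size) pairs for files under root larger than max_bytes.

    The 5 MB default is an arbitrary, hypothetical threshold.
    """
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Git's own object store.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Broken symlinks or files that vanished mid-scan are skipped.
                continue
            if size > max_bytes:
                large.append((path, size))
    return sorted(large)


if __name__ == "__main__":
    for path, size in find_large_files("."):
        print(f"{size:>12d}  {path}")
```

Anything the script prints is worth a second look before you stage it.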

###master

Someone who knows what they are doing should take care of merging to master. Create a pull request to have your work merged in. In the end, your merged work is what counts, so make sure your branch is clean.

By keeping master clean like this and working on separate branches, we never have to receive a message from a developer saying something like "Don't pull! My notebook is corrupted!" It sounds like a ridiculous statement, but it happens. It really shouldn't.
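One cheap way to spot a corrupted notebook before opening a pull request is to check that every .ipynb file still parses as JSON. The sketch below assumes notebooks in nbformat 4 (which have a top-level "cells" key); the `find_broken_notebooks` helper is hypothetical and not something this repo ships.

```python
import json
from pathlib import Path


def find_broken_notebooks(root):
    """Return paths of .ipynb files under root that fail a basic sanity check.

    A notebook counts as broken if it is not valid JSON (for example, it
    contains leftover merge-conflict markers) or lacks a top-level "cells" key.
    """
    broken = []
    for path in sorted(Path(root).rglob("*.ipynb")):
        # Jupyter's autosave copies are not part of the project.
        if ".ipynb_checkpoints" in path.parts:
            continue
        try:
            with path.open(encoding="utf-8") as handle:
                notebook = json.load(handle)
        except (json.JSONDecodeError, UnicodeDecodeError):
            broken.append(path)
            continue
        if not isinstance(notebook, dict) or "cells" not in notebook:
            broken.append(path)
    return broken


if __name__ == "__main__":
    for path in find_broken_notebooks("."):
        print(f"broken notebook: {path}")
```

This only checks structure, so a notebook that passes can still fail to run.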

##Updates

###Getting them

Just pull the latest changes from the repo! If anything goes wrong due to dependencies, make sure you install the latest requirements, like so:

conda install --yes --file requirements.txt

Do this while your virtual environment is active, of course.

###Making them

If you added a dependency to the environment, keep everyone on the bleeding edge too! Do this by running:

conda list --export > requirements.txt

Once again, do this inside your environment. Then push the updated requirements.txt and everyone else can refer to the "Getting them" section above. :D
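Before pushing a regenerated requirements.txt, it can help to see exactly which packages changed. The sketch below compares two exports in the name=version=build format that conda list --export writes. The helper names and the sample package versions are made up for illustration.

```python
def parse_conda_export(text):
    """Map package name to version from `conda list --export` style text."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#" header comments conda writes.
        if not line or line.startswith("#"):
            continue
        parts = line.split("=")
        name = parts[0]
        version = parts[1] if len(parts) > 1 else ""
        packages[name] = version
    return packages


def diff_requirements(old_text, new_text):
    """Return (added, removed, changed) dictionaries between two exports."""
    old = parse_conda_export(old_text)
    new = parse_conda_export(new_text)
    added = {name: new[name] for name in new.keys() - old.keys()}
    removed = {name: old[name] for name in old.keys() - new.keys()}
    changed = {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }
    return added, removed, changed


if __name__ == "__main__":
    # Hypothetical sample data, not this project's real requirements.
    before = "# platform: linux-64\nnumpy=1.11.1=py35_0\npandas=0.18.1=np111py35_0\n"
    after = "# platform: linux-64\nnumpy=1.11.2=py35_0\nnltk=3.2.1=py35_0\n"
    added, removed, changed = diff_requirements(before, after)
    print("added:", added)
    print("removed:", removed)
    print("changed:", changed)
```

In practice you would feed it the committed requirements.txt and the freshly exported one.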

##Misc.

Here are some data science best practices. I like the one that says to mark your notebooks with your initials:

http://svds.com/jupyter-notebook-best-practices-for-data-science/

For more conda stuff, here's a conda vs pip vs virtualenv table:

http://conda.pydata.org/docs/_downloads/conda-pip-virtualenv-translator.html

datascience-project's People

Contributors

hbenzek, cjallow, devilpepper

datascience-project's Issues

What tweets say about the market

Twitter is not a great source for stock market discussions, but StockTwits is good for this:
http://stocktwits.com/
Visiting http://stocktwits.com/symbol/_**SYMBOL**_ brings you to a feed for SYMBOL. We can scrape this page for historical tweets to use for NLP and training, since the StockTwits API only gives us the last 30 tweets. Those last 30 tweets can be used for validation.
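As a starting point for the scraper, the feed URLs can be generated from a list of symbols. This is a minimal sketch that only builds URLs and fetches nothing. The helper names are hypothetical, and upper-casing the symbol is an assumption about how StockTwits expects it.

```python
from urllib.parse import quote

# URL pattern taken from the text above; {symbol} is filled in per ticker.
STOCKTWITS_SYMBOL_URL = "http://stocktwits.com/symbol/{symbol}"


def symbol_feed_url(symbol):
    """Build the StockTwits feed URL for one ticker symbol."""
    cleaned = symbol.strip().upper()  # assumption: symbols are upper case
    if not cleaned:
        raise ValueError("symbol must not be empty")
    # Percent-encode anything that is not URL-safe.
    return STOCKTWITS_SYMBOL_URL.format(symbol=quote(cleaned, safe=""))


def symbol_feed_urls(symbols):
    """Map each cleaned symbol to its feed URL."""
    return {s.strip().upper(): symbol_feed_url(s) for s in symbols}


if __name__ == "__main__":
    for symbol, url in symbol_feed_urls(["GE", "aapl"]).items():
        print(symbol, url)
```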

S&P 500 has 500 stock symbols we can look at.
https://en.wikipedia.org/wiki/List_of_S%26P_500_companies
This gives us plenty of data to learn from and model to make some predictions.

Yahoo Finance has historical data by day. It looks pretty nice. For GE, we could visit:
https://finance.yahoo.com/quote/GE/history?p=GE
There's probably an API too, but I haven't checked.
Interactive Brokers has a pretty sweet dataset that I haven't had a chance to look at. It is probably better than Yahoo Finance.
