Stock2Vec

Stock prices fluctuate every day. For each trading day, the stocks are sorted by price change and written out as one sentence. With a given window size, each stock then appears frequently next to highly related stocks, because related stocks tend to move together.
For example, 005380 (Hyundai Motors) moves together with 000270 (Kia Motors), not only because they are in the same industry but also because Hyundai owns Kia.
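
As a rough illustration of how one day's sentence could be built, here is a minimal sketch. It is not the code used for this repo. The function name `build_sentence` and the prices are hypothetical, and it assumes that "price change" means the relative change from the previous close, sorted from largest gain to largest loss.

```python
from typing import Dict, List


def build_sentence(prev_close: Dict[str, float], close: Dict[str, float]) -> List[str]:
    """Return stock codes ordered by price change from the previous day.

    Assumption: change is the relative change versus the previous close,
    sorted from the largest gain to the largest loss.
    """
    changes = {}
    for code, price in close.items():
        prev = prev_close.get(code)
        if prev is None or prev == 0:
            continue  # skip stocks without a usable previous close
        changes[code] = (price - prev) / prev
    # Sort by change, descending; ties are broken by code for determinism.
    return sorted(changes, key=lambda c: (-changes[c], c))


# Hypothetical prices, for illustration only.
prev_close = {"005380": 100.0, "000270": 50.0, "012330": 200.0}
close = {"005380": 103.0, "000270": 51.4, "012330": 198.0}

sentence = build_sentence(prev_close, close)
print(" ".join(sentence))  # one row of the corpus: one day, one sentence
```

Stock codes are kept as strings so that leading zeros such as in 005380 are preserved.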

Why does it matter?

Every stock is related to others, but we could not represent that relationship as a vector. With these embeddings, we can compute a similarity between stocks. Moreover, we can feed the vectors into a classifier such as a deep neural network.

In this repo

Embedding is done with Skip-gram (from Word2Vec) and with GloVe. The embedding code itself is not included because implementations are already widely available.
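
To show what Skip-gram sees in a daily sentence, the sketch below enumerates the (center, context) pairs for a fixed window size. This is a simplified view rather than the training code: the name `skipgram_pairs` is hypothetical, and real Word2Vec implementations add further steps on top of this pairing.

```python
from typing import Iterator, List, Tuple


def skipgram_pairs(sentence: List[str], window: int) -> Iterator[Tuple[str, str]]:
    """Yield (center, context) pairs for tokens at most `window` positions apart."""
    for i, center in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, sentence[j]


# A hypothetical ordering for one day, using stock codes from this README.
sentence = ["005380", "000270", "012330", "005387"]
pairs = list(skipgram_pairs(sentence, window=1))

for center, context in pairs:
    print(center, context)
```

Stocks that land next to each other in the daily ordering on many days produce many shared pairs, which is what pulls their vectors together.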

You can find out which stocks are related by using cosine similarity. Some of the result files are delimited with "\001" because I wanted to load them into Hive. Below is a short description of the included files.
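
For reference, cosine similarity between two embedding vectors can be computed as in this minimal, dependency-free sketch. The function name and the zero-vector handling are choices made here for illustration, not something taken from this repo.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional vectors, for illustration only.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0.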

File description

  • sentences.txt: Source file. Each row corresponds to one day and lists the stocks sorted by the price difference between that day and the previous day.
  • sentences.refined.output.txt: Three columns (source stock, destination stock, similarity based on Skip-gram), delimited by "\001".
  • sentences.refined.output_glove.txt: Three columns (source stock, destination stock, similarity based on GloVe), delimited by "\001".
  • sentences.refined.vectors.txt: Columns are "\t" delimited. The first column is the stock code and the remaining columns are the vector components from Skip-gram.
  • sentences.refined.gloves.txt: Columns are "\t" delimited. The first column is the stock code and the remaining columns are the vector components from GloVe.
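
The formats above could be read with something like the following sketch. It makes several assumptions: "\001" is taken to mean the control character U+0001 (Hive's default field delimiter), the files are assumed to have no header row, and the function names are hypothetical. The parsers take any iterable of lines, so an open file object works as well as the in-memory samples used here.

```python
from typing import Dict, Iterable, List, Tuple


def parse_vectors(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse tab-delimited rows: stock code, then the vector components."""
    vectors = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        # Keep the code as a string so leading zeros survive.
        vectors[fields[0]] = [float(x) for x in fields[1:]]
    return vectors


def parse_similarities(lines: Iterable[str]) -> List[Tuple[str, str, float]]:
    """Parse U+0001-delimited rows: source code, destination code, similarity."""
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\x01")
        if len(fields) != 3:
            continue  # skip blank or malformed rows
        rows.append((fields[0], fields[1], float(fields[2])))
    return rows


# In-memory samples; the vector values are made up for illustration.
vector_lines = ["005380\t0.12\t-0.34\t0.56\n"]
similarity_lines = ["\x01".join(["005380", "012330", "0.9679"]) + "\n"]

print(parse_vectors(vector_lines))
print(parse_similarities(similarity_lines))
```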

Sample of related stocks for 005380 (Hyundai Motors)

Skip-gram

code dest_code similarity
005380 012330 0.9679
005380 005387 0.9677
005380 005385 0.9491
005380 000270 0.9306
005380 005389 0.9176
005380 000240 0.8904
005380 009155 0.8859
005380 009150 0.8849
005380 005850 0.8805
005380 006405 0.8723
005380 034220 0.8704
005380 010690 0.8679
005380 007860 0.8672
005380 018880 0.8667
005380 086280 0.8654

GloVe

code dest_code similarity
005380 005387 0.9273
005380 000270 0.9233
005380 005385 0.9218
005380 012330 0.9100
005380 005389 0.8879
005380 033530 0.7575
005380 009150 0.7364
005380 002350 0.7042
005380 005850 0.7012
005380 018880 0.7001
005380 091180 0.6935
005380 000240 0.6934
005380 002550 0.6719
005380 003620 0.6705
005380 007860 0.6686
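
One quick way to compare the two samples is to check which destination codes appear in both top-15 lists. The sketch below hard-codes the codes from the two tables above. It is only an illustration of the comparison, not part of the repo.

```python
# Destination codes for 005380, in rank order, copied from the tables above.
skipgram_top = [
    "012330", "005387", "005385", "000270", "005389",
    "000240", "009155", "009150", "005850", "006405",
    "034220", "010690", "007860", "018880", "086280",
]
glove_top = [
    "005387", "000270", "005385", "012330", "005389",
    "033530", "009150", "002350", "005850", "018880",
    "091180", "000240", "002550", "003620", "007860",
]

glove_set = set(glove_top)
# Keep the Skip-gram rank order for the shared codes.
common = [code for code in skipgram_top if code in glove_set]

print(len(common), "codes appear in both lists")
print(common)
```

In this sample, 10 of the 15 codes are shared, and the top five are the same set of codes in both methods, with 000270 (Kia Motors) among them.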

Contributors

kh-kim
