glove.py's Introduction

This repository contains an implementation of the GloVe word vector learning algorithm in Python 2 (NumPy + SciPy). (A contributed Python 3 version is available here.)

You can follow along with the accompanying tutorial on my blog.

The implementation is for educational purposes only; you should look elsewhere if you are looking for an efficient / robust solution.

glove.py's Issues

Bug in gradient updating

Your article and sample code are very useful! However, I think there is a bug in run_iter: vectors are updated in the same loop in which the gradients are calculated. Is that correct? I would expect all gradients to be calculated in one loop and all updates to be applied in a second loop.
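The two schedules contrasted above can be sketched side by side. This is a minimal illustration on a toy objective, not the repository's run_iter: the names pair_grads, online_epoch, batch_epoch and the toy data are hypothetical, bias terms and AdaGrad scaling are left out, and plain gradient descent is assumed. Updating inside the loop is ordinary stochastic (online) gradient descent; accumulating first and updating afterwards is batch gradient descent. Both are valid descent schemes, and they generally follow different trajectories.

```python
import numpy as np

def pair_grads(w_main, w_ctx, log_x, weight=1.0):
    """Gradients of weight * (w_main . w_ctx - log_x)**2 (constant factor 2 dropped)."""
    inner = w_main @ w_ctx - log_x
    return weight * inner * w_ctx, weight * inner * w_main

def total_cost(W, C, data):
    """Sum of squared errors over all (main id, context id, log count) triples."""
    return sum((W[i] @ C[j] - log_x) ** 2 for i, j, log_x in data)

def online_epoch(W, C, data, lr):
    """Stochastic variant: apply each pair's update immediately."""
    for i, j, log_x in data:
        g_main, g_ctx = pair_grads(W[i], C[j], log_x)
        W[i] -= lr * g_main
        C[j] -= lr * g_ctx

def batch_epoch(W, C, data, lr):
    """Batch variant: accumulate all gradients, then update once."""
    gW, gC = np.zeros_like(W), np.zeros_like(C)
    for i, j, log_x in data:
        g_main, g_ctx = pair_grads(W[i], C[j], log_x)
        gW[i] += g_main
        gC[j] += g_ctx
    W -= lr * gW
    C -= lr * gC

# Hypothetical toy problem: 3 words, 4-dimensional vectors, 4 co-occurrence pairs.
rng = np.random.default_rng(0)
data = [(0, 1, 1.0), (1, 2, 0.5), (0, 2, 1.5), (2, 0, 0.7)]
W0 = 0.1 * rng.standard_normal((3, 4))
C0 = 0.1 * rng.standard_normal((3, 4))
```

Running either epoch function repeatedly on copies of W0 and C0 and comparing total_cost before and after shows whether each schedule reduces the toy objective.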

Should grad_bias be multiplied by learning_rate?

    # Compute gradients for bias terms
    grad_bias_main = weight * cost_inner
    grad_bias_context = weight * cost_inner
    # In the Stanford C version, should these be multiplied by learning_rate?
    # grad_bias_main = weight * cost_inner * learning_rate
    # grad_bias_context = weight * cost_inner * learning_rate
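Whether the extra factor belongs in the gradient depends on whether the update step already applies the learning rate. The sketch below assumes a plain SGD update for a single bias term, which is an assumption and not necessarily what run_iter does; the function names are hypothetical. Under that assumption the two placements give the same result, and applying the learning rate in both places would scale the step by its square.

```python
def update_lr_at_step(bias, weight, cost_inner, learning_rate):
    """Gradient without the learning rate; the rate is applied in the update."""
    grad_bias = weight * cost_inner
    return bias - learning_rate * grad_bias

def update_lr_in_grad(bias, weight, cost_inner, learning_rate):
    """Learning rate folded into the gradient; the update subtracts it directly."""
    grad_bias = weight * cost_inner * learning_rate
    return bias - grad_bias

def update_lr_twice(bias, weight, cost_inner, learning_rate):
    """Mistaken variant: the learning rate is applied in both places."""
    grad_bias = weight * cost_inner * learning_rate
    return bias - learning_rate * grad_bias
```

With an adaptive scheme such as AdaGrad, folding the rate into the gradient also changes the accumulated squared gradients, so the equivalence shown here holds only for the plain update assumed above.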

About the gradient

I have read your blog post about the code, and I am wondering whether the gradient computation is missing a factor of 2. If so, the effective learning rate would be half of the nominal one. Am I correct?
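The factor of 2 can be checked numerically. The sketch below assumes the per-pair cost has the weighted least-squares form weight * (w_main . w_ctx + b_main + b_ctx - log(x))**2; the function names are hypothetical. It compares the exact derivative with respect to the main vector against a central finite difference. A gradient written without the 2 points in the same direction at half the magnitude, which under plain SGD is the same as halving the learning rate.

```python
import numpy as np

def pair_cost(w_main, w_ctx, b_main, b_ctx, x, weight):
    """Weighted squared error for one co-occurrence pair."""
    inner = w_main @ w_ctx + b_main + b_ctx - np.log(x)
    return weight * inner ** 2

def grad_main_exact(w_main, w_ctx, b_main, b_ctx, x, weight):
    """Exact derivative of pair_cost with respect to w_main, including the 2."""
    inner = w_main @ w_ctx + b_main + b_ctx - np.log(x)
    return 2.0 * weight * inner * w_ctx

def grad_main_numeric(w_main, w_ctx, b_main, b_ctx, x, weight, eps=1e-6):
    """Central finite-difference estimate of the same derivative."""
    grad = np.zeros_like(w_main)
    for k in range(w_main.size):
        step = np.zeros_like(w_main)
        step[k] = eps
        plus = pair_cost(w_main + step, w_ctx, b_main, b_ctx, x, weight)
        minus = pair_cost(w_main - step, w_ctx, b_main, b_ctx, x, weight)
        grad[k] = (plus - minus) / (2.0 * eps)
    return grad
```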

How to read a vocabulary file

I have run glove.py on the BBC News dataset. I built the corpus as a single file with a single space between words, and a vocabulary file was generated. Can you please explain how to read it?

vocabulary.txt

I passed the following arguments at the command prompt:
C:\Users\JAYASHREE\Documents\NLP>python Glove_python_bbc.py "C:/Users/JAYASHREE/Documents/NLP/text-corpus.txt" --vocab-path C:/Users/JAYASHREE/Documents/NLP/vocabulary.txt --cooccur-path C:/Users/JAYASHREE/Documents/NLP/cooccur_matrix.txt -w 10 --min-count 10 --vector-path C:/Users/JAYASHREE/Documents/NLP/word-vector.txt -s 40 --iterations 10 --learning-rate 0.1 --save-often

text-corpus.zip
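If the script writes the vocabulary with Python's pickle module rather than as plain text (an assumption; check how the file is saved in your copy of the script), it cannot be read in a text editor but can be loaded back as follows. The structure assumed here, a dict mapping each word to an (id, count) tuple, is also an assumption, and load_vocab and words_by_frequency are hypothetical helper names. Only unpickle files you created yourself or otherwise trust.

```python
import pickle

def load_vocab(path):
    """Load a pickled vocabulary; encoding='latin1' helps with Python 2 pickles."""
    with open(path, "rb") as f:
        return pickle.load(f, encoding="latin1")

def words_by_frequency(vocab):
    """Return (word, (id, count)) pairs, most frequent first (assumed layout)."""
    return sorted(vocab.items(), key=lambda item: item[1][1], reverse=True)
```

For example, `words_by_frequency(load_vocab("vocabulary.txt"))[:10]` would list the ten most frequent words, provided both assumptions hold for the file.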

Load glove

How do I load the saved model (bin file)?
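A possible approach, assuming the vector file is a pickled NumPy array (an assumption about how the model was saved, so verify it against the script) and that the vocabulary maps each word to an (id, count) tuple whose id is the row index of the word's main vector (also an assumption). The helper names load_vectors and vector_for are hypothetical.

```python
import pickle
import numpy as np

def load_vectors(path):
    """Load a pickled array of word vectors.

    encoding='latin1' is needed when a NumPy array pickled under
    Python 2 is read from Python 3.
    """
    with open(path, "rb") as f:
        return np.asarray(pickle.load(f, encoding="latin1"))

def vector_for(word, vocab, W):
    """Look up a word's row in W, assuming vocab[word] == (row id, count)."""
    word_id = vocab[word][0]
    return W[word_id]
```

Printing `load_vectors(path).shape` is a quick way to confirm the layout before relying on it.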

Why do we maintain two sets of embeddings for one word?

Hello Jon:
Very nice implementation and a very clear tutorial. I now understand nearly all of this work except for one point: why do we maintain two sets of embeddings for one word? As you explain in the tutorial, one is used when the word appears as the main word and the other when the word appears as context. I saw a similar setup in the original GloVe code, but it is still unclear (and unintuitive) to me why we do this. What benefits do we get from it, and have you tried what happens if we use just one set of embeddings per word?
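For reference, the two sets are usually combined into a single vector per word after training; the GloVe paper uses the sum of the main and context vectors. The sketch below assumes a stacked layout in which the first half of the rows of W holds the main vectors and the second half holds the context vectors in the same word order. That layout and the function name merge_vectors are assumptions made for illustration.

```python
import numpy as np

def merge_vectors(W, how="mean"):
    """Combine main and context vectors stored as stacked halves of W."""
    vocab_size = W.shape[0] // 2
    main, context = W[:vocab_size], W[vocab_size:]
    if how == "mean":
        return (main + context) / 2.0
    if how == "sum":
        return main + context
    raise ValueError("how must be 'mean' or 'sum'")
```

The mean and the sum differ only by a constant scale, so cosine similarities between words are the same under either choice.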
