
Comments (3)

VHRanger commented on July 19, 2024

It comes down to how much extra work each step takes and how efficiently that work can be done during the walk. In Node2Vec, the transition probabilities have to be recomputed and resampled at every step of every walk, which also breaks cache locality. Plain random walks are easy to make fast on a CSRGraph because all the choices for a step fit in the CPU cache.
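To make the difference concrete, here is a minimal sketch of the two kinds of step on an unweighted, undirected graph stored as CSR arrays. It is an illustration only, not the nodevectors or CSRGraph code: the function names (`to_csr`, `uniform_step`, `node2vec_weights`, `node2vec_step`) and the toy graph are assumptions made up for this example.

```python
import numpy as np

def to_csr(n_nodes, edges):
    """Build undirected, unweighted CSR arrays (indptr, indices) from an edge list."""
    adj = [[] for _ in range(n_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    indptr = np.zeros(n_nodes + 1, dtype=np.int64)
    indptr[1:] = np.cumsum([len(a) for a in adj])
    indices = np.array([v for a in adj for v in sorted(a)], dtype=np.int64)
    return indptr, indices

def uniform_step(indptr, indices, cur, rng):
    """Plain random walk step: one contiguous slice, one random index."""
    nbrs = indices[indptr[cur]:indptr[cur + 1]]
    return int(nbrs[rng.integers(len(nbrs))])

def node2vec_weights(indptr, indices, prev, cur, p, q):
    """Node2Vec step distribution: depends on both the previous and current node."""
    nbrs = indices[indptr[cur]:indptr[cur + 1]]
    prev_nbrs = indices[indptr[prev]:indptr[prev + 1]]  # second, unrelated memory read
    w = np.full(len(nbrs), 1.0 / q)       # default: moving away from prev
    w[np.isin(nbrs, prev_nbrs)] = 1.0     # candidate is also a neighbor of prev
    w[nbrs == prev] = 1.0 / p             # candidate is prev itself (going back)
    return nbrs, w / w.sum()

def node2vec_step(indptr, indices, prev, cur, p, q, rng):
    """Rebuild the distribution and resample; this happens at every step."""
    nbrs, probs = node2vec_weights(indptr, indices, prev, cur, p, q)
    return int(rng.choice(nbrs, p=probs))

# Hypothetical toy graph: triangle 0-1-2 plus a pendant node 3 attached to 1.
indptr, indices = to_csr(4, [(0, 1), (1, 2), (0, 2), (1, 3)])
rng = np.random.default_rng(0)
nbrs, probs = node2vec_weights(indptr, indices, prev=0, cur=1, p=2.0, q=0.5)
next_node = node2vec_step(indptr, indices, prev=0, cur=1, p=2.0, q=0.5, rng=rng)
```

The uniform step touches one row of the graph, while the Node2Vec step also reads the previous node's row and builds a fresh probability vector each time, which is where the extra cost and the cache misses come from.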

There's been some research on making this faster through rejection sampling: https://louisabraham.github.io/articles/node2vec-sampling.html

We've been looking into merging that into CSRGraph: VHRanger/CSRGraph#14
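For readers unfamiliar with the idea, here is a rough sketch of how rejection sampling can avoid building the full distribution at each step on an unweighted graph. It is not the code from the linked article or from the CSRGraph issue; the names `has_edge`, `rejection_step`, `TOY_INDPTR`, and `TOY_INDICES` are hypothetical, and the sketch assumes each CSR row is sorted so that an edge lookup can be a binary search.

```python
import numpy as np

# Hypothetical toy graph in CSR form: triangle 0-1-2 plus node 3 attached to 1.
# Each row of TOY_INDICES is sorted (an assumption that has_edge relies on).
TOY_INDPTR = np.array([0, 2, 5, 7, 8], dtype=np.int64)
TOY_INDICES = np.array([1, 2, 0, 2, 3, 0, 1, 1], dtype=np.int64)

def has_edge(indptr, indices, u, v):
    """Binary search for v in the sorted neighbor row of u."""
    row = indices[indptr[u]:indptr[u + 1]]
    i = np.searchsorted(row, v)
    return bool(i < len(row) and row[i] == v)

def rejection_step(indptr, indices, prev, cur, p, q, rng):
    """Draw the next node by proposing uniform neighbors and accepting by weight."""
    nbrs = indices[indptr[cur]:indptr[cur + 1]]
    upper = max(1.0, 1.0 / p, 1.0 / q)   # upper bound on any unnormalized weight
    while True:
        cand = int(nbrs[rng.integers(len(nbrs))])  # uniform proposal
        if cand == prev:
            w = 1.0 / p                  # going back to the previous node
        elif has_edge(indptr, indices, prev, cand):
            w = 1.0                      # candidate is also a neighbor of prev
        else:
            w = 1.0 / q                  # moving away from prev
        if rng.random() * upper < w:     # accept with probability w / upper
            return cand

rng = np.random.default_rng(0)
draws = [rejection_step(TOY_INDPTR, TOY_INDICES, 0, 1, 2.0, 0.5, rng)
         for _ in range(20000)]
freqs = np.bincount(draws, minlength=4) / len(draws)
```

Each proposal costs one random index and at most one binary search, so nothing proportional to the node's degree is allocated per step. The trade-off is that extreme values of p or q lower the acceptance rate and mean more proposals per accepted step.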

That said, as noted in the README, I encourage you to try other algorithms (ProNE, GGVec) before spending a lot of resources grid-searching p and q on Node2Vec. As mentioned in this blog post, you need to grid-search p and q extensively on Node2Vec before it shows a difference, and that time is better spent grid-searching other parameters (the w2vparams, for instance) with more efficient methods.

from nodevectors.

ldorigo commented on July 19, 2024

> That said, as noted in the README, I encourage you to try other algorithms (ProNE, GGVec) before spending a lot of resources grid-searching p and q on Node2Vec. As mentioned in this blog post, you need to grid-search p and q extensively on Node2Vec before it shows a difference, and that time is better spent grid-searching other parameters (the w2vparams, for instance) with more efficient methods.

Just read it, great article. Although at this point I'm mostly stuck with Node2Vec because I would have to rewrite a major part of my thesis to use another algorithm.

FYI, since writing here I've tried Node2Vec's actual reference implementation in C++ (https://github.com/snap-stanford/snap/tree/master/examples/node2vec), and the walks are ridiculously fast to generate (a few seconds for my network, for any combination of parameters). I don't think any of the Python implementations leverage the Markov property the way the paper describes (i.e. that you can generate partial walks for many nodes at once, because the next node depends only on the current node and the previous one). Anyhow, thanks for answering. Feel free to close this issue; I'm now using the C++ implementation, so my problem is solved :-)


VHRanger commented on July 19, 2024

Thanks. Since this is tracked in the rejection sampling issue, we'll track progress on faster walks with p != 1 or q != 1 there.

