chenweize1998 / fully-hyperbolic-nn

Code for paper Fully Hyperbolic Neural Networks

Python 89.78% Shell 4.10% Perl 3.99% Smalltalk 0.17% Emacs Lisp 1.55% JavaScript 0.08% NewLisp 0.14% Ruby 0.15% Slash 0.03% SystemVerilog 0.02%

fully-hyperbolic-nn's Issues

Midpoint Definition

Hey,

Just wanted to ask: the paper defines the midpoint as

$\sqrt{1/k}\,\frac{\sum_i v_i x_i}{\left|\,\left\|\sum_i v_i x_i\right\|_{\mathcal{L}}\right|}$

but the code in kg/manifolds/lorentz uses

$\sqrt{k}\,\frac{\sum_i v_i x_i}{\left|\,\left\|\sum_i v_i x_i\right\|_{\mathcal{L}}\right|}$

where $k = -K$.

Is there any particular reason for this?
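
One possible reading, offered as an assumption and not as a confirmed answer from the authors: if the `k` stored by the manifold class is the squared radius of the hyperboloid (points satisfy $\langle x, x\rangle_{\mathcal{L}} = -k$, so $k = -1/K$) and not $-K$, then $\sqrt{k}$ in the code equals $\sqrt{1/(-K)}$ in the paper and the two expressions agree. The pure-Python sketch below uses hypothetical helper names (`lift`, `midpoint`) and only checks numerically that, under that convention, the $\sqrt{k}$ scaling puts the midpoint back on the hyperboloid.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lift(space, k):
    """Build a point with <x, x>_L = -k from its space-like coordinates."""
    time = math.sqrt(k + sum(s * s for s in space))
    return [time] + list(space)

def midpoint(points, weights, k):
    """Weighted average rescaled by sqrt(k) / sqrt(|<ave, ave>_L|)."""
    dim = len(points[0])
    ave = [sum(w * p[i] for w, p in zip(weights, points)) for i in range(dim)]
    denom = math.sqrt(abs(lorentz_inner(ave, ave)))
    return [math.sqrt(k) * a / denom for a in ave]

K = -4.0       # curvature in the paper's notation
k = -1.0 / K   # ASSUMPTION: the code's k is the squared radius, not -K

points = [lift([0.3, -0.2], k), lift([1.0, 0.5], k), lift([-0.7, 0.1], k)]
weights = [0.2, 0.5, 0.3]

mu = midpoint(points, weights, k)
residual = lorentz_inner(mu, mu) + k   # close to zero if mu is on the manifold
paper_scale = math.sqrt(1.0 / -K)      # the paper's prefactor, written with K
code_scale = math.sqrt(k)              # the code's prefactor, written with k
```

If the class instead defines `k` as $-K$, this check does not apply and the discrepancy would need an answer from the maintainers.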

Unable to reproduce results

Hi @chenweize1998, thanks for the great work. I was trying to reproduce the machine translation results, but for some reason my results are way off. It would be great if you could point me in the right direction regarding what I might be doing wrong.

  1. I am using Python 3.7.10, PyTorch 1.7.0+cu110, and geoopt 0.3.1.
  2. When I train on iwslt14 following the instructions in the README, the model gets stuck at a validation accuracy of ~6 and a validation perplexity of ~880. After 40k steps, I see a BLEU score of 0.

Let me know if I should add more details which might help you. I didn't make any changes to the code though.

about Definition 1

Hello Weize,

I ran into a mathematical question while studying your great paper. Could you help me understand why Bx lies on the hyperboloid? I tried my best to prove it, but failed. Thank you very much!

Best
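
A numerical sanity check may help here. It assumes that Definition 1 refers to the Lorentz boost in its standard form, with velocity $v$, $\|v\| < 1$ and $\gamma = 1/\sqrt{1 - \|v\|^2}$; if B denotes something else in the paper, the sketch does not apply. The usual argument is that such a B satisfies $B^\top J B = J$ with $J = \mathrm{diag}(-1, 1, \dots, 1)$, so $\langle Bx, Bx\rangle_{\mathcal{L}} = \langle x, x\rangle_{\mathcal{L}}$, and the time coordinate stays positive because $|v^\top x_s| < x_0$. The helper names below are hypothetical.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def boost_matrix(v):
    """Standard Lorentz boost for a velocity v with ||v|| < 1 (assumed form)."""
    n = len(v)
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    coef = gamma * gamma / (1.0 + gamma)
    rows = [[gamma] + [-gamma * c for c in v]]
    for i in range(n):
        row = [-gamma * v[i]]
        row += [(1.0 if i == j else 0.0) + coef * v[i] * v[j] for j in range(n)]
        rows.append(row)
    return rows

def matvec(matrix, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * c for m, c in zip(row, x)) for row in matrix]

# A point on the unit hyperboloid: <x, x>_L = -1 with positive time coordinate.
space = [0.4, -1.2, 0.7]
x = [math.sqrt(1.0 + sum(s * s for s in space))] + space

B = boost_matrix([0.3, -0.5, 0.2])
y = matvec(B, x)

norm_before = lorentz_inner(x, x)   # -1 by construction
norm_after = lorentz_inner(y, y)    # should match norm_before up to rounding
```

Comparing `norm_before` and `norm_after` for a few random choices of `v` and `x` is a quick way to convince oneself before working through the algebra.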

ROC on airport lp dataset

Hi,

Thank you for your great work. I tried to reproduce the results on the airport lp dataset, and I get 96.28 on the test set rather than the 97.3 reported in the paper. I used the same configuration parameters as given in this repository.

NameError: name 'AverageAttention' is not defined

Hi,

When I try to run the evaluation step using the command:
bash eval_iwslt.sh 4 0 ~/fully-hyperbolic-nn-main/mt/model/iwslt/64/bs=10240_ac=1_dp=0.0_attdp=0.1_gn=0.5_ws=6000_lr=5/model_step_40000.pt
this error message comes up:
NameError: name 'AverageAttention' is not defined
I'm wondering where AverageAttention is defined and where it should be imported from. Thank you so much!

Lorentz Linear result outside the manifold

I've been trying to implement a 2D convolution-like operation using this method. I take the data, project it onto the manifold, apply a few steps, and then pass the result through the Lorentz layer.
Just before I pass the data in, I call the "check_on_manifold" function and it returns True. Right after the layer, however, it returns False. Is there any specific reason for that? Should I reproject after using the layer?
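
If reprojection turns out to be needed, one common approach is to keep the space-like coordinates and recompute the time coordinate from them. The sketch below is not taken from this repository: it assumes the hyperboloid $\langle x, x\rangle_{\mathcal{L}} = -k$ with $k = 1$, and `manifold_residual` and `reproject` are hypothetical helper names. A failing check can also come from floating-point rounding against a tight tolerance, so it may be worth looking at the size of the residual before reprojecting.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def manifold_residual(x, k=1.0):
    """Distance of <x, x>_L from -k; zero for a point on the hyperboloid."""
    return abs(lorentz_inner(x, x) + k)

def reproject(x, k=1.0):
    """Keep the space-like part and recompute the time coordinate."""
    space = x[1:]
    time = math.sqrt(k + sum(s * s for s in space))
    return [time] + list(space)

# A vector that has drifted off the k = 1 hyperboloid: <x, x>_L = -0.5.
drifted = [2.0, 1.0, -0.5, 1.5]

before = manifold_residual(drifted)
fixed = reproject(drifted)
after = manifold_residual(fixed)
```

A large `before` value would point to the layer output genuinely leaving the manifold, while a value near machine precision would suggest that only the tolerance of the check is too strict.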
