
diffrec's People

Contributors

injadlu, ouxiang-li, wyuan1001, yiyanxu

diffrec's Issues

How to generate item_emb.npy

I do not see any code on how you generated the item embeddings for your datasets. How were the item embeddings created for the Autoencoders? Thanks.

How to get better results

I used the default hyperparameters: "!python main.py --cuda --dataset=ml-1m_clean --data_path=../datasets/ml-1m_clean/". The results are less than 0.1 and the loss is about 180.

betas out of range in gaussian_diffusion.py, line 35

I am running "sh run.sh amazon-book_clean 5e-4 1e-4 0 0 400 2 [300] [] 0.05 [300] 10 x0 5 0.5 0.001 0.0005 0 1 log 1 0" for L-DiffRec, but it fails with negative betas that are out of range. I traced the code and found that it uses "linear-var" as the noise_schedule. I printed the betas and they are: [ 5.00000000e-04, -6.25312656e-05, -6.25273557e-05, -6.25234463e-05, -6.25195374e-05]. Could you please help me check the problem?
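
For what it's worth, here is a minimal Python sketch (not the repo's exact code) of how betas recovered from a per-step variance schedule can go negative: with beta_t = 1 - (1 - gamma_t)/(1 - gamma_{t-1}), any step at which gamma decreases gives a negative beta. The gamma values below are only illustrative, chosen because they reproduce the printed pattern above.

    import numpy as np

    # Sketch only: recover betas from a per-step variance schedule gamma via
    # beta_t = 1 - (1 - gamma_t) / (1 - gamma_{t-1}).
    def betas_from_variance(gamma):
        alpha_bar = 1.0 - gamma
        betas = [1.0 - alpha_bar[0]]                 # beta_1 = gamma_1
        for i in range(1, len(gamma)):
            betas.append(1.0 - alpha_bar[i] / alpha_bar[i - 1])
        return np.array(betas)

    # Illustrative decreasing schedule (e.g. what happens if the scaled noise_min
    # ends up larger than the scaled noise_max); these values are assumptions.
    gamma = np.linspace(5e-4, 2.5e-4, 5)
    print(betas_from_variance(gamma))
    # -> roughly [5.0e-04, -6.25e-05, -6.25e-05, -6.25e-05, -6.25e-05],
    #    i.e. the same pattern of negative betas reported above.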

ratio of ml-1m_clean

  1. Section 4.1.1, paragraph 2 of the paper says it "splits the sorted interactions into training, validation, and testing sets with the ratio of 7:1:2", but in the downloaded ml-1m_clean dataset the numbers of train, valid, and test interactions are 403277, 110722, and 57532 respectively, which is a 7:2:1 ratio (see the quick check after this list).
  2. In DiffRec/L-DiffRec/main.py, should the third argument of the evaluate call on line 306 be mask_train rather than mask_tv?
  3. Looking at the parameter settings for ml-1m_clean in DiffRec/L-DiffRec/inference.py, the Recall and NDCG on the test set are clearly higher than on the validation set. Is this because the interactions are sorted by time before splitting (Section 4.1.1, paragraph 2), so that the training, validation, and test sets are not i.i.d.?
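
A quick arithmetic check of the ratio in point 1, using only the interaction counts quoted above:

    train, valid, test = 403277, 110722, 57532   # counts reported above for ml-1m_clean
    total = train + valid + test
    print([round(x / total, 3) for x in (train, valid, test)])
    # -> [0.706, 0.194, 0.101], i.e. roughly 7:2:1 rather than the 7:1:2 stated in the paper.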

Parameters of the LightGCN baseline

Hello, thank you for your excellent work and for open-sourcing the code. While reproducing the experimental results in the paper, I found that LightGCN's performance on ml-1m differs greatly from the results reported in the paper. Could you tell me which LightGCN hyperparameters were used in the paper, and in particular what the embedding size was?

L-DiffRec betas out of range

user num: 108822, item num: 94949, data ready.
running k-means on cuda:0..
[running kmeans]: 0it [00:00, ?it/s, center_shift=0.066783, iteration=1, ...]
[running kmeans]: 1it [00:00, 10.89it/s, center_shift=0.002020, iteration=2, ...]
[running kmeans]: 2it [00:00, 15.43it/s, center_shift=0.000370, iteration=3, ...]
[running kmeans]: 3it [00:00, 23.13it/s, center_shift=0.000044, iteration=4, ...]
category length: [9495, 85454]
Latent dims of each category: [[30], [270]]
Traceback (most recent call last):
  File "main.py", line 133, in <module>
    diffusion = gd.GaussianDiffusion(mean_type, args.noise_schedule,
  File "/media/wang/study/jhs/DiffRec-main/L-DiffRec/models/gaussian_diffusion.py", line 35, in __init__
    assert (self.betas > 0).all() and (self.betas <= 1).all(), "betas out of range"
AssertionError: betas out of range

May I ask the author: when I reproduce L-DiffRec with the default parameters, I get this error. I do not understand why; could you please explain?

Dataset loading fails

Hi, I tried to run the code with "sh run.sh amazon-book_clean 5e-5 0 400 [1000] 10 x0 5 0.0001 0.0005 0.005 0 1 log 1 0", but while loading the amazon-book dataset I got: ValueError: cannot reshape array of size 4566535 into shape (2283281,2).
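
A quick check of the numbers in that error message (values copied from the message itself):

    rows, cols, size = 2283281, 2, 4566535       # from the ValueError above
    print(rows * cols, size, rows * cols - size)
    # -> 4566562 4566535 27: the element count and the target shape disagree by 27 entries.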

A Question about Implementation of Eq.4

Thanks for sharing your code. I have a question about the implementation of Eq. 4.

For the function betas_from_linear_variance in gaussian_diffusion.py, let the argument variance be $\gamma$ (the right-hand side of Eq. 4), and alpha_bar $= 1-\gamma$. Thus, the function aims to solve for $\beta$ from $\gamma$.

From Eq. 4: $1-\bar{\alpha}_{t} = 1-\alpha_1\alpha_2\cdots\alpha_t = 1-(1-\beta_1)(1-\beta_2)\cdots(1-\beta_t) = \gamma_t$.

For $t=1$ in Eq. 4: $1-\bar{\alpha}_1 = 1-\alpha_1 = 1-(1-\beta_1) = \beta_1 = \gamma_1$ (third line of the function).

For $t=2$ in Eq. 4: $1-\bar{\alpha}_2 = 1-\alpha_1\alpha_2 = 1-(1-\beta_1)(1-\beta_2) = \gamma_2$,

thus $\beta_2 = 1-(1-\gamma_2)/(1-\beta_1) = 1-(1-\gamma_2)/(1-\gamma_1)$ (first execution of the for loop).

For $t=3$ in Eq. 4: $1-\bar{\alpha}_3 = 1-\alpha_1\alpha_2\alpha_3 = 1-(1-\beta_1)(1-\beta_2)(1-\beta_3) = \gamma_3$,

thus $\beta_3 = 1-(1-\gamma_3)/[(1-\beta_1)(1-\beta_2)]$, whereas the second execution of the for loop computes $1-(1-\gamma_3)/(1-\gamma_2)$.

However, $(1-\beta_1)(1-\beta_2) \neq 1-\gamma_2$; is a cumprod operation neglected?
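
For reference, here is a sketch of betas_from_linear_variance as described in this question (reconstructed from the discussion above, so the signature and variable names are assumptions rather than a verbatim copy), together with a numerical check that the cumulative product of $(1-\beta_t)$ recovers $1-\gamma_t$:

    import numpy as np

    # Reconstruction based on the description above; not copied verbatim from the repo.
    def betas_from_linear_variance(steps, variance, max_beta=0.999):
        alpha_bar = 1.0 - variance                   # 1 - gamma_t
        betas = [1.0 - alpha_bar[0]]                 # beta_1 = gamma_1 ("third line of the function")
        for i in range(1, steps):
            # beta_t = 1 - (1 - gamma_t) / (1 - gamma_{t-1})
            betas.append(min(1.0 - alpha_bar[i] / alpha_bar[i - 1], max_beta))
        return np.array(betas)

    gamma = np.linspace(1e-4, 1e-2, 5)               # illustrative increasing gamma_t
    betas = betas_from_linear_variance(len(gamma), gamma)
    # Telescoping check: prod_t (1 - beta_t) = alpha_bar_t = 1 - gamma_t.
    print(np.allclose(1.0 - np.cumprod(1.0 - betas), gamma))   # True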

Dataset split

Hi,

I read in the paper that the sorted interactions are split into training, validation, and testing sets with the ratio of 7:1:2. But the validation set in this repository is clearly larger than the test set, more like 7:2:1. Is there some problem here?

Best.

How to understand the linear noise schedule (Eq. 4) in paper?

Notice that the author uses a new linear noise schedule instead of the linear or cosine schedules used in DDPM. The selection in the code is noise_schedule='linear-var', which corresponds to lines 303-309 in gaussian_diffusion.py, but I do not understand the correspondence between this code and Eq. 4 in the paper. I hope the author can help me.
Looking forward very much to your reply.
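
A minimal sketch of one plausible reading of "linear-var" (the construction below and the use of noise_scale, noise_min, and noise_max are assumptions inferred from the run commands in the issues above, not a confirmed copy of lines 303-309): the per-step variance gamma_t of Eq. 4 is linearly interpolated between scaled bounds and then converted to betas, as sketched in the "A Question about Implementation of Eq.4" issue above.

    import numpy as np

    # Assumed reading of "linear-var": linearly spaced per-step variance gamma_t,
    # which is then mapped to betas via beta_t = 1 - (1 - gamma_t) / (1 - gamma_{t-1})
    # (see the betas_from_linear_variance sketch earlier in this thread).
    def linear_var_variance(steps, noise_scale, noise_min, noise_max):
        return np.linspace(noise_scale * noise_min, noise_scale * noise_max, steps)

    # Illustrative values only:
    print(linear_var_variance(steps=5, noise_scale=0.1, noise_min=0.001, noise_max=0.01))
    # -> [0.0001, 0.000325, 0.00055, 0.000775, 0.001]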

Hyperparameters

Hi YiyanXu!

Thank you for your insightful work.

Can you share the set of hyperparameters that differs from the default values in the script, needed to reproduce the results on the "ML-1M clean" dataset?

Missing item_emb.npy in amazon-book_clean dataset

Excuse me, after unpacking amazon-book_clean.rar, I found that item_emb.npy is missing. Could you please upload the dataset again?

FileNotFoundError: [Errno 2] No such file or directory: '../datasets/amazon-book_clean/item_emb.npy'

How to generate train_list.npy

I do not see any code showing how you generated train_list.npy for your datasets. Does this file record all (user_id, item_id) interaction records, or should we only retain data that has been filtered with the 5-core setting?

[Comparison of DiffRec and L-DiffRec] Which one is generally better?

As in the title, I am wondering whether L-DiffRec is generally better than DiffRec at a rather small scale.
In your paper, you have shown that L-DiffRec is better in the noisy setting. If L-DiffRec were included in Table 2, where would it rank among all the compared baselines? Would it generally surpass DiffRec?
