ustcml / recstudio
A highly-modularized and recommendation-efficient recommendation library based on PyTorch.
Home Page: http://recstudio.org.cn
License: MIT License
Entries in inter_feat with ratings lower than low_rating_thres are filtered out in the following code:
RecStudio/recstudio/data/dataset.py
Lines 486 to 487 in 2bd40a8
However, the corresponding key is misspelled as low_rating_threshold in the following dataset config files:
In SeqDataset, training speed is limited by data loading because of the padding operation. Maybe we could use more threads in the padding procedure.
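One way to picture the suggestion: if padding is done per batch inside a collate function, PyTorch's `torch.utils.data.DataLoader` can run that function in worker processes when `num_workers` is greater than 0. The sketch below is plain Python and only shows the padding step. The name `pad_batch` and the padding value 0 are assumptions for illustration, not RecStudio's actual code.

```python
def pad_batch(sequences, pad_value=0):
    """Right-pad variable-length sequences to the longest one in the batch.

    Hypothetical helper: returns the padded sequences and their true lengths.
    """
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths) if lengths else 0
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


if __name__ == "__main__":
    batch = [[3, 7, 9], [5]]
    print(pad_batch(batch))
```

A function of this shape could be passed as `collate_fn` to a `DataLoader`, so that padding runs in the loader's workers rather than in the main training process. Whether this removes the bottleneck depends on where SeqDataset currently pads.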
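The official website cannot be opened, so I cannot get any information from the official documentation.
I am using this library and some parts of it are unclear to me. I hope to get some help, or to be provided with documentation.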
I find that MultiVAE and MultiDAE both output nan recall on ml-1m, while they perform correctly on ml-100k and gowalla. BPR (an MF model) and LightGCN (a graph model) behave normally on all three datasets, so I guess the problem may lie in the AE models.
When two different tables contain a feature with the same name (e.g. one in the user information table and one in the item information table), hidden problems can occur.
For example, suppose there is a column named category in both user.csv and item.csv. When I want to get the values of both columns, the value of category is overwritten.
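The overwrite can be reproduced with plain dictionaries, independent of how RecStudio joins its tables. This is a minimal sketch: `merge_naive`, `merge_prefixed`, and the `user_`/`item_` prefixes are hypothetical names chosen for illustration, not part of the library.

```python
def merge_naive(user_row, item_row):
    """Merge two feature rows; a shared key keeps only the item value."""
    merged = dict(user_row)
    merged.update(item_row)
    return merged


def merge_prefixed(user_row, item_row):
    """Merge two feature rows, prefixing only the keys that clash."""
    clashes = set(user_row) & set(item_row)
    merged = {}
    for key, value in user_row.items():
        merged[f"user_{key}" if key in clashes else key] = value
    for key, value in item_row.items():
        merged[f"item_{key}" if key in clashes else key] = value
    return merged


if __name__ == "__main__":
    user = {"user_id": 1, "category": "student"}
    item = {"item_id": 10, "category": "book"}
    print(merge_naive(user, item))     # the user's category is lost
    print(merge_prefixed(user, item))  # both category values survive
```

Prefixing only the clashing names keeps all other field names unchanged, so existing configs that refer to non-clashing fields would not need to change.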
Sometimes we use a dataset that contains vector-type features. For example, if we want to use embeddings generated by a language model, the feature would be an embedding vector. Therefore, we may need support for a float sequence type.
The uh and uc of trn_data are added to trn_data, val_data and tst_data.
The uh and uc of val_data are added to tst_data, but why are they not added to val_data itself?
The code is at the end of _build().
In your paper (see "Table 5: Samplers in RecStudio"), you mention that LSH-based samplers have been implemented, but I cannot find them in your code.
The temperature hyperparameter seems to be missing from InfoNCE loss function in RecStudio.
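For reference, the common form of InfoNCE divides every similarity score by a temperature before the softmax. The sketch below is plain Python for a single positive and a list of negatives. The function name `info_nce` and its signature are assumptions for illustration and do not reflect RecStudio's loss classes.

```python
import math


def info_nce(pos_score, neg_scores, temperature=1.0):
    """InfoNCE loss for one positive score and a list of negative scores.

    Computes -log(exp(pos/T) / (exp(pos/T) + sum_j exp(neg_j/T))).
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    # Subtract the maximum before exponentiating for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[0]


if __name__ == "__main__":
    print(info_nce(1.0, [0.0], temperature=1.0))
    print(info_nce(1.0, [0.0], temperature=0.1))
```

With `temperature=1.0` this reduces to the version without a temperature, so exposing it as a parameter with that default would keep existing behaviour unchanged.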
Why is there a sudden CUDA memory usage increase in the validation phase?
The batch size set in the config file for the validation phase is smaller than the one for the training phase, but CUDA memory usage still increases suddenly during validation, which causes an OOM error.
Specifically, when the model runs the code in run.py, model.evaluate costs more CUDA memory than model.fit. Could you please help me solve this problem? Thanks for your attention.
It seems that there is no feature scaling in dataset.py (when the field type is float).
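As an illustration of what such scaling might look like, here is a minimal min-max scaler in plain Python. The function name `min_max_scale` is hypothetical, and min-max is only one possible choice (standardization is another). Nothing here is taken from dataset.py.

```python
def min_max_scale(values):
    """Scale a list of floats linearly into the range [0, 1].

    A constant column maps to all zeros to avoid division by zero.
    """
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]
    span = highest - lowest
    return [(v - lowest) / span for v in values]


if __name__ == "__main__":
    print(min_max_scale([2.0, 4.0, 6.0]))
```

In practice the minimum and maximum would be computed on the training split only and then reused for the validation and test splits, so that no information leaks from held-out data.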
For better visualization in TensorBoard, the hyper-parameter plugin could be used. There is a guide here.