clhchtcjj / bine
BiNE: Bipartite Network Embedding
I just had to spend a while getting versions to line up so that this code would run. For anyone else searching for this, here is a more accurate list of the required Python modules and their versions. Some were missing from the README.
python 2.7.14
datasketch 1.4.1
futures 3.2.0
networkx 2.2
numpy 1.16.0
pandas 0.23.4
scikit-learn 0.20.0
scipy 1.1.0
six 1.11.0
On line 180 of train.py:
tmp_t = sorted(test_rate[u].items(), lambda x, y: cmp(x[1], y[1]), reverse=True)[0:min(len(test_rate[u]),len(test_rate[u]))]
the upper bound of the slice is
min(len(test_rate[u]),len(test_rate[u]))
but both arguments are the same, so the min has no effect and the slice keeps every item.
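The line above also relies on the Python 2 comparator form of sorted and on cmp, neither of which exists in Python 3. A minimal sketch of an equivalent descending sort by rating is below. The helper name top_items and its limit parameter are hypothetical and not part of the repository; limit is only there to show where a real cutoff would go if one was intended.

```python
def top_items(ratings, limit=None):
    """Return (item, rating) pairs sorted by rating, highest first.

    `ratings` is a dict mapping item -> rating, like test_rate[u].
    `limit` is a hypothetical cutoff; None keeps every item, which is
    what the original slice effectively does.
    """
    # key= replaces the Python 2 cmp-style comparator.
    ranked = sorted(ratings.items(), key=lambda pair: pair[1], reverse=True)
    return ranked if limit is None else ranked[:limit]


# Small illustrative input (made-up values).
example_rate = {"u1": {"i1": 0.2, "i2": 0.9, "i3": 0.5}}
tmp_t = top_items(example_rate["u1"])
```

With limit=None this returns all items for the user, matching the behavior of the original slice.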
I need a bit more information on how to prepare my dataset. I have a .dat file containing all the ratings, placed in the model folder so that it can be read, renamed, and split into train and test sets, but it does not seem to be reading anything.
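One way to check whether a ratings file is being read at all is to parse it and split it yourself before handing it to the model. The sketch below assumes one rating per line with three fields (user, item, weight) separated by whitespace or tabs. That layout is an assumption and should be checked against the sample data shipped with the repository. The function names and the 80/20 split are hypothetical.

```python
import random


def parse_ratings(lines, delimiter=None):
    """Parse 'user item weight' lines into (user, item, weight) tuples.

    The three-field layout is an assumption about the data format.
    delimiter=None splits on any whitespace, including tabs.
    """
    records = []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(delimiter)
        if len(fields) != 3:
            raise ValueError(
                "line %d: expected 3 fields, got %d" % (line_no, len(fields))
            )
        user, item, weight = fields
        records.append((user, item, float(weight)))
    return records


def split_records(records, test_ratio=0.2, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]
```

Typical use would be to open the .dat file, pass the file object to parse_ratings, and print len(records). A count of zero or a ValueError points at a path or delimiter problem rather than at the model.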
Why was 50 chosen as the maximum number of iterations? When I run with more iterations, I get better recommendation results.
For example, assume it is 0.5.
I think this means the sample is positive 50% of the time and negative 50% of the time. Is that right?
Any idea how to solve this?
File "train.py", line 53, in walk_generator
gul.homogeneous_graph_random_walks_for_large_bipartite_graph(datafile=args.train_data, percentage=args.p, maxT=args.maxT, minT=args.minT)
File "/tmp2/cmchen/proRec/BiNE/model/graph_utils.py", line 111, in homogeneous_graph_random_walks_for_large_bipartite_graph
A,row_index,item_index= bi.biadjacency_matrix(self.G, self.node_u, self.node_v, dtype=np.float,weight='weight', format='csr')
ValueError: too many values to unpack
Is this the reason? If so, how can it be solved in this case? It seems it cannot be solved simply by replacing a function.
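In the networkx releases I am aware of, bipartite.biadjacency_matrix returns a single sparse matrix, so unpacking it into three names raises exactly this ValueError. A possible workaround is a small wrapper that builds the index mappings from the node orders you pass in. This is only a sketch: the wrapper name is hypothetical, and it assumes row_index and item_index are meant to map each node to its row or column position, which is a guess about the original code's intent.

```python
import networkx as nx
from networkx.algorithms import bipartite as bi


def biadjacency_with_index(G, row_nodes, col_nodes, weight="weight"):
    """Return (matrix, row_index, col_index) for a bipartite graph.

    row_index / col_index map node -> position in the matrix. That
    meaning is an assumption about what the calling code expects.
    """
    row_nodes = list(row_nodes)
    col_nodes = list(col_nodes)
    # The builtin float is used instead of np.float, which newer NumPy
    # releases no longer provide.
    A = bi.biadjacency_matrix(
        G,
        row_order=row_nodes,
        column_order=col_nodes,
        dtype=float,
        weight=weight,
        format="csr",
    )
    row_index = {node: i for i, node in enumerate(row_nodes)}
    col_index = {node: j for j, node in enumerate(col_nodes)}
    return A, row_index, col_index
```

Whether the rest of graph_utils.py works with these mappings unchanged would still need to be checked.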
Traceback (most recent call last):
File "D:/programming/BiNE/model/train.py", line 572, in <module>
sys.exit(main())
File "D:/programming/BiNE/model/train.py", line 569, in main
train_by_sampling(args)
File "D:/programming/BiNE/model/train.py", line 321, in train_by_sampling
walk_generator(gul,args)
File "D:/programming/BiNE/model/train.py", line 55, in walk_generator
gul.calculate_centrality()
File "D:\programming\BiNE\model\graph_utils.py", line 61, in calculate_centrality
h, a = nx.hits(self.G)
File "D:\Anaconda3.5\envs\BiNE\lib\site-packages\networkx\algorithms\link_analysis\hits_alg.py", line 111, in hits
"HITS: power iteration failed to converge in %d iterations."%(i+1))
networkx.exception.NetworkXError: HITS: power iteration failed to converge in 102 iterations.
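nx.hits takes a max_iter argument, so one thing to try is giving the power iteration a larger budget. A minimal sketch follows. The function name and the retry budgets are hypothetical, and a larger budget will not necessarily make every graph converge. The base class nx.NetworkXException is caught because the specific exception raised on non-convergence differs between networkx releases.

```python
import networkx as nx


def hits_with_retries(G, max_iters=(100, 500, 2000)):
    """Call nx.hits with increasing iteration budgets.

    Returns (hubs, authorities) from the first call that converges and
    re-raises the last error if none of the budgets is enough.
    """
    last_error = None
    for max_iter in max_iters:
        try:
            return nx.hits(G, max_iter=max_iter)
        except nx.NetworkXException as err:
            last_error = err  # remember the failure and try a larger budget
    raise last_error
```

In calculate_centrality the call would then read h, a = hits_with_retries(self.G) in place of h, a = nx.hits(self.G).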
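Thank you very much for sharing the code.
I have a few questions about the skip-gram part.
In I_z = {center: 1}, shouldn't this be computed for the context node?
And shouldn't V = np.array(node_list[contexts]['embedding_vectors']) be the embedding of the center node?
What finally gets updated is:
for z in context_u:
    tmp_z, tmp_loss = skip_gram(u, z, neg_u, node_list_u, lam, alpha)
    node_list_u[z]['embedding_vectors'] += tmp_z  ## Isn't this updating the center node's embedding?
Looking forward to your reply!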
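For comparison when reading skip_gram, here is a generic skip-gram step with negative sampling in which the center and context roles are named explicitly. It is the textbook formulation, not the repository's implementation, and every name in it is hypothetical. It computes gradients for both the center vector and the context/negative vectors, so which of them a given codebase actually updates is a design choice to check in the code itself.

```python
import numpy as np


def sgns_step(center_vec, output_vecs, labels, lr=0.025):
    """One gradient step of skip-gram with negative sampling.

    center_vec:  (d,) embedding of the center node.
    output_vecs: (k, d) context vectors of the true context node and
                 the sampled negatives.
    labels:      (k,) 1.0 for the true context, 0.0 for negatives.
    Returns the updated center vector, the updated output vectors and
    the loss measured before the update.
    """
    scores = output_vecs @ center_vec          # (k,) dot products
    probs = 1.0 / (1.0 + np.exp(-scores))      # sigmoid
    loss = -np.sum(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
    errors = probs - labels                    # (k,) dLoss/dScore
    grad_center = errors @ output_vecs         # (d,) gradient for the center
    grad_output = np.outer(errors, center_vec) # (k, d) gradients for contexts
    return center_vec - lr * grad_center, output_vecs - lr * grad_output, loss


# Tiny illustration with random vectors: one positive, three negatives.
rng = np.random.default_rng(0)
center = rng.normal(scale=0.5, size=8)
outputs = rng.normal(scale=0.5, size=(4, 8))
labels = np.array([1.0, 0.0, 0.0, 0.0])
new_center, new_outputs, loss_before = sgns_step(center, outputs, labels, lr=0.1)
_, _, loss_after = sgns_step(new_center, new_outputs, labels, lr=0.1)
```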
Hi Leihui, I am quite interested in the node visualization results in your paper. However, I cannot reproduce the t-SNE results shown there. Could you please share the node visualization code? Thanks.
I used a dataset containing 0.2 million links to learn the embeddings, but after running for 8 hours the program was still stuck in graph construction.
Is there a way to speed up the program?
I am trying to use BiNE on a user-item interaction network with 1 million users and a similar number of items. The implementation is currently stuck at graph construction. I set the "large" option to 2, and it didn't help.
Is there any way to speed up the training?
Also, about 100 GB of memory on my machine is still unused, and only one CPU is fully used.
Hoping for an answer.
Has anyone converted this code to Python 3?
Hi, I want to understand why we need to update the context vectors of user nodes and item nodes when training the skip-gram model.
This is not the case in node2vec, AFAIK (and node2vec uses one-hot vectors). Is there a reason behind it?