Comments (11)
Hello,
If I understand correctly, a simple solution for the case you describe is to add a new column (for example, dst_label) to the <ml_network.csv> file that stores the labels of the target nodes.
Then, when loading the dataset for node classification, add a line that reads the target node labels, such as dst_labels = graph_df.dst_label.values. Also remember to add a corresponding property to Data, for example self.dst_labels = dst_labels. Similarly, fetch the target node labels from that property when using them to compute the loss.
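A minimal sketch of the change described above, not the library's actual code. The Data class here is a simplified, hypothetical stand-in, and the column names u, i, ts, and label in the sample CSV are assumptions; only dst_label, dst_labels, and graph_df come from the suggestion itself.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the processed CSV file, with the extra
# dst_label column. Every column name except dst_label is an assumption.
csv_text = """u,i,ts,label,dst_label
1,4,10.0,0,1
2,5,20.0,1,0
3,4,30.0,0,1
"""


class Data:
    """Simplified, hypothetical container for interaction data."""

    def __init__(self, src_node_ids, dst_node_ids, node_interact_times, labels, dst_labels):
        self.src_node_ids = src_node_ids
        self.dst_node_ids = dst_node_ids
        self.node_interact_times = node_interact_times
        self.labels = labels          # source node labels
        self.dst_labels = dst_labels  # new property: target node labels
        self.num_interactions = len(src_node_ids)


graph_df = pd.read_csv(io.StringIO(csv_text))
labels = graph_df.label.values
dst_labels = graph_df.dst_label.values  # the added line

full_data = Data(graph_df.u.values, graph_df.i.values, graph_df.ts.values, labels, dst_labels)

# When computing the loss, a batch can fetch both label arrays by index.
batch_indices = np.array([0, 2])
batch_src_labels = full_data.labels[batch_indices]
batch_dst_labels = full_data.dst_labels[batch_indices]
```

Both label arrays are indexed with the same batch indices, so each source label stays aligned with the target label of the same interaction.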
from dyglib.
Thank you very much! I think it is clear to me now.
I have another question regarding dynamic node classification. I have an edge list CSV file with three columns: source_id, target_id, and timestamp (a list of timestamps). The third column is a list because there can be several interactions between the same source node and target node.
My question is how to modify and construct the dataset, as well as the model training and testing, to support this scenario and dataset.
First, for each line, split the interaction list into multiple lines, where each line stands for a single interaction. For example, split the first line u18839785, u266463103, "['2022-01-14 xxxx', '2021-11-30 xxxx']" into u18839785, u266463103, '2022-01-14 xxxx' and u18839785, u266463103, '2021-11-30 xxxx'.
Then, after splitting all the lines, sort the interactions in increasing order of timestamp.
The resulting data can then be processed by our code.
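The two steps above (split, then sort) could be sketched as follows, using only the Python standard library. The times of day, the second sample row (u100, u200), the timestamp format, and the function name explode_interactions are all made up for illustration, since the original example elides the full timestamps.

```python
import ast
import csv
import io
from datetime import datetime

# Hypothetical sample of the edge list: one row per node pair, with a
# list of timestamps in the third column. The times of day are made up.
raw_csv = '''source_id,target_id,timestamp
u18839785,u266463103,"['2022-01-14 08:00:00', '2021-11-30 12:30:00']"
u100,u200,"['2021-12-25 00:00:00']"
'''

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def explode_interactions(csv_text):
    """Split each timestamp list into single interactions, sorted by time."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        # The third column is a string holding a Python-style list literal.
        for ts in ast.literal_eval(record["timestamp"]):
            rows.append(
                (record["source_id"], record["target_id"], datetime.strptime(ts, TIME_FORMAT))
            )
    # Chronological order, earliest interaction first.
    rows.sort(key=lambda row: row[2])
    return rows


interactions = explode_interactions(raw_csv)
for src, dst, ts in interactions:
    print(src, dst, ts)
```

ast.literal_eval is used instead of eval so that only plain literals in the timestamp column are parsed.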
Thanks a lot! I will check it ASAP.
Closing this issue now. Feel free to reopen it if needed.
Hi,
I have a question regarding the dataset utilized in your code. Could you please specify if the dataset employed is an instance of a homogeneous graph within the context of dynamic node classification?
Hi,
Our work uses two datasets for dynamic node classification: Wikipedia and Reddit. Unlike homogeneous graphs, these two datasets are bipartite, with two types of nodes. You can refer to Section B.1 and Table 6 in our paper for more details.
Yes, I see.
However, I am uncertain as to how these models can be adapted to classify both source and target nodes in a single homogeneous graph.
In Wikipedia and Reddit, the source nodes are labelled, but in a homogeneous graph both source and target nodes must be labelled.
Could you please advise me on how to modify the code to accommodate this scenario? I followed your previous suggestion in #13 (comment), but it still does not work.
Thank you so much!
In my opinion, your case can be solved by adding a column that denotes the target node labels and loading it with slight code modifications. Could you explain in more detail why the previous suggestion does not work?
Sure!
One of the most important questions for me is how to construct the original CSV file, like wikipedia.csv.
Specifically, I have 100 million edges, each with its own timestamp, and it is not clear to me how to place the node features for both source and target nodes in the original CSV file. The node features also have about 1k dimensions.
We obtained the original files, such as wikipedia.csv, from the previous work EdgeBank.
For the case you describe, you can place the edge features directly, as the original wikipedia.csv does. Then, if the node features change over time, you can append the features of the source and target nodes after the edge features on each line. You should also record the edge and node feature dimensions so that you can split them apart again when loading. If the node features stay the same over time, you can instead save them in a separate file with shape (num_nodes, node_feat_dim), and then load them by the node index on each line.
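The two layouts described above might look like this in NumPy. This is only an illustrative sketch: the tiny dimensions, the random values, and all variable names are assumptions, and the order [edge features | source node features | target node features] is one possible convention.

```python
import numpy as np

# Hypothetical small dimensions for illustration (real ones may be ~1k).
edge_feat_dim, node_feat_dim = 2, 3

# Case 1: node features change over time, so each line stores
# [edge features | source node features | target node features].
row = np.arange(edge_feat_dim + 2 * node_feat_dim, dtype=np.float32)
edge_feat = row[:edge_feat_dim]
src_feat = row[edge_feat_dim:edge_feat_dim + node_feat_dim]
dst_feat = row[edge_feat_dim + node_feat_dim:]

# Case 2: node features are static, so they live in a separate array of
# shape (num_nodes, node_feat_dim) and are looked up by node index.
num_nodes = 5
node_features = np.random.default_rng(0).random((num_nodes, node_feat_dim)).astype(np.float32)

src_node_ids = np.array([0, 3])  # node indices read from the edge lines
dst_node_ids = np.array([1, 4])
batch_src_feat = node_features[src_node_ids]
batch_dst_feat = node_features[dst_node_ids]
```

With 100 million edges and features of about 1k dimensions, the second layout stores each node's features once instead of once per interaction, which keeps the edge file much smaller.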