Comments (11)

yule-BUAA commented on September 27, 2024

Hello,

If I understand correctly, a simple solution for the mentioned case is to add a new column (e.g., named dst_label) to the ml_network.csv file, which stores the labels of the target nodes.

Then, when loading the dataset for node classification, you can add a line below here to read the target node labels, e.g., dst_labels = graph_df.dst_label.values. Also, remember to add a new property to the Data class here and below here, for example, self.dst_labels = dst_labels. Similarly, fetch the target node labels here when computing the loss.
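A minimal sketch of those changes (the dst_label column name follows the suggestion above, but the file name and the simplified Data class here are assumptions, not DyGLib's exact code):

```python
import pandas as pd

# Read the preprocessed interaction file that now carries a dst_label column.
graph_df = pd.read_csv('ml_network.csv')

labels = graph_df.label.values          # existing source-node labels
dst_labels = graph_df.dst_label.values  # newly added target-node labels


class Data:
    # Hypothetical, simplified stand-in for DyGLib's Data container.
    def __init__(self, labels, dst_labels):
        self.labels = labels
        self.dst_labels = dst_labels    # new property for target-node labels


data = Data(labels=labels, dst_labels=dst_labels)
```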

mgao97 commented on September 27, 2024

Thank you very much! I think it is clear to me now.

I have another question regarding dynamic node classification. I have an edgelist CSV file with three columns: source_id, target_id, and timestamp (a list of timestamps). The third column being a list means there can be multiple interactions between a source node and a target node.
[screenshot: sample rows of the edgelist CSV]

My question is how to construct the dataset and modify model training and testing to support the above scenario and data.

yule-BUAA commented on September 27, 2024

Firstly, for each line, you can split the interaction list into multiple lines, where each line stands for a single interaction. For example, split the first line u18839785, u266463103, "['2022-01-14 xxxx', '2021-11-30 xxxx']" into u18839785, u266463103, '2022-01-14 xxxx' and u18839785, u266463103, '2021-11-30 xxxx'.
Then, after splitting all the lines, sort the interactions in increasing order of timestamp.
The resulting data can be processed by our code directly.
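As a concrete sketch, the splitting and sorting could be done with pandas (the column names follow the description above; the file names, and the assumption that the redacted timestamps are real parseable dates, are mine):

```python
import ast

import pandas as pd

# Each row's `timestamp` cell is a stringified list such as
# "['2022-01-14 xxxx', '2021-11-30 xxxx']".
df = pd.read_csv('edgelist.csv')

# Parse the string into a real Python list, then expand to one row per timestamp,
# so each row stands for a single interaction.
df['timestamp'] = df['timestamp'].apply(ast.literal_eval)
df = df.explode('timestamp', ignore_index=True)

# Sort all interactions in increasing chronological order.
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.sort_values('timestamp', ignore_index=True)

df.to_csv('edgelist_split.csv', index=False)
```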

mgao97 commented on September 27, 2024

Thanks a lot! I will check it ASAP.

yule-BUAA commented on September 27, 2024

Closing this issue now. Feel free to reopen it when needed.

mgao97 commented on September 27, 2024

Hi,

I have a question regarding the dataset used in your code. Could you please specify whether the dataset used for dynamic node classification is a homogeneous graph?

yule-BUAA commented on September 27, 2024

Hi,

Our work uses two datasets for dynamic node classification: Wikipedia and Reddit. Rather than homogeneous graphs, these two datasets are bipartite, with two types of nodes. You can refer to Section B.1 and Table 6 in our paper for more details.

mgao97 commented on September 27, 2024

Yes, I see.

However, I am uncertain as to how these models can be adapted to classify both source and target nodes in a single homogeneous graph.

In Wikipedia and Reddit, only the source nodes are labelled, but in a homogeneous graph, both source and target nodes must be labelled.

Could you please advise me on how to modify the code to accommodate this scenario? I followed your previous suggestion in #13 (comment), but it still does not work.

Thank you so much!

yule-BUAA commented on September 27, 2024

In my opinion, adding a column that additionally stores the target node labels and loading it with slight code modifications should handle your case. Could you explain further why the previous suggestion does not work?

mgao97 commented on September 27, 2024

Sure!

The most important question for me is how to construct the original CSV file, like wikipedia.csv.

Specifically, I have 100 million edges, each with its own timestamp, and I am not clear on how to place the node features for both source and target nodes in the original CSV file. The node features are also about 1,000-dimensional.

yule-BUAA commented on September 27, 2024

We obtained the original files such as wikipedia.csv from the previous work EdgeBank.

For the mentioned case, you can place the edge features directly, as the original wikipedia.csv does. Then, for node features: if they change over time, append the features of the source and target nodes after the edge features in each line; you should also record the edge and node feature dimensions, so that you can split them apart like this. If the node features are static regardless of time, you can instead save them in a separate file with shape (num_nodes, node_feat_dim), and then load the node features by the node index in each line.
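A rough sketch of both options (all file names and dimensions here are assumptions for illustration, not DyGLib's actual conventions):

```python
import numpy as np

EDGE_FEAT_DIM = 172   # assumed edge feature dimension
NODE_FEAT_DIM = 1000  # assumed node feature dimension

# Option 1: time-varying node features. Each CSV line stores
# [edge_feat | src_node_feat | dst_node_feat] concatenated; the recorded
# dimensions let you split the concatenation back apart:
line_feats = np.random.rand(EDGE_FEAT_DIM + 2 * NODE_FEAT_DIM)  # one CSV line
edge_feat = line_feats[:EDGE_FEAT_DIM]
src_feat = line_feats[EDGE_FEAT_DIM:EDGE_FEAT_DIM + NODE_FEAT_DIM]
dst_feat = line_feats[EDGE_FEAT_DIM + NODE_FEAT_DIM:]

# Option 2: static node features. Save a single array of shape
# (num_nodes, node_feat_dim) once, then index it by node id at load time:
num_nodes = 10_000
node_features = np.random.rand(num_nodes, NODE_FEAT_DIM).astype(np.float32)
np.save('node_features.npy', node_features)

node_features = np.load('node_features.npy')
src_node_ids = np.array([3, 7, 42])      # ids read from the CSV lines
src_feats = node_features[src_node_ids]  # shape: (3, NODE_FEAT_DIM)
```

Option 2 is usually preferable at this scale: with 100 million edges and 1,000-dimensional features, duplicating node features on every line would blow up the CSV, while a single (num_nodes, node_feat_dim) array is stored once and indexed cheaply.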
