Is this a correct interpretation of how this works:
By randomly removing edges we're simulating a 'past' state of the graph. In the paper you do this twice, so we end up with three 'snapshots' of the graph: G_train ⊂ G_ho ⊂ G_orig.
Then, in the training phase we're trying to predict the 'missing' edges E_ho \ E_train, using every non-edge of G_ho as a negative, and in the test phase we're trying to predict the 'missing' edges E_orig \ E_ho, using every non-edge of G_orig as a negative.
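To make sure we're talking about the same thing, here's a minimal sketch of my understanding in plain Python. The toy graph, the 80% retention rates, and all the names are placeholders I made up, not anything from your code:

```python
import random

random.seed(0)

# Toy edge set standing in for E_orig; in practice this comes from the real graph.
nodes = range(20)
E_orig = {(i, j) for i in nodes for j in nodes if i < j and random.random() < 0.3}

# Remove edges twice, nested: E_train ⊂ E_ho ⊂ E_orig.
E_ho = set(random.sample(sorted(E_orig), int(0.8 * len(E_orig))))
E_train = set(random.sample(sorted(E_ho), int(0.8 * len(E_ho))))

all_pairs = {(i, j) for i in nodes for j in nodes if i < j}

# Training phase: predict E_ho \ E_train; negatives are the non-edges of G_ho.
train_pos = E_ho - E_train
train_neg = all_pairs - E_ho

# Test phase: predict E_orig \ E_ho; negatives are the non-edges of G_orig.
test_pos = E_orig - E_ho
test_neg = all_pairs - E_orig
```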
If that's all correct, why is it necessary to create two snapshots? Why couldn't we just remove edges once to get G_past ⊂ G_orig, take the positives as E_orig \ E_past and the negatives as the non-edges of G_orig, and then do a random 80-20 split of those into training/test? In other words, why do training and test have to come from separate graphs?
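Continuing the sketch above, the alternative I have in mind would look something like this (again, the names and split ratio are just illustrative):

```python
# One snapshot instead of two: G_past ⊂ G_orig.
E_past = set(random.sample(sorted(E_orig), int(0.8 * len(E_orig))))

positives = E_orig - E_past     # 'missing' edges to predict
negatives = all_pairs - E_orig  # non-edges of G_orig

def split_80_20(pairs):
    # Plain random 80-20 split of a set of node pairs.
    pairs = sorted(pairs)
    random.shuffle(pairs)
    cut = int(0.8 * len(pairs))
    return pairs[:cut], pairs[cut:]

pos_train, pos_test = split_80_20(positives)
neg_train, neg_test = split_80_20(negatives)
```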
And, if we did want to create two snapshots, why does E_train have to be a strict subset of E_ho? Couldn't the random edge removals for G_train and G_ho be done independently?
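That is, something like this instead of the nested sampling above (again just a sketch with made-up rates):

```python
# Independent removal: both snapshots are sampled straight from E_orig,
# so E_train is not necessarily a subset of E_ho.
E_ho_indep = set(random.sample(sorted(E_orig), int(0.8 * len(E_orig))))
E_train_indep = set(random.sample(sorted(E_orig), int(0.64 * len(E_orig))))
```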
Thanks for any help. I'm trying out something similar to what's in your paper, so I really appreciate you publishing the repo along with it.