Diversifying Commonsense Reasoning Generation on Knowledge Graph

Introduction

-- This is the PyTorch implementation of our ACL 2022 paper "Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts" [PDF]. In this paper, we propose MoKGE, a novel method that diversifies generative commonsense reasoning via a mixture-of-experts (MoE) strategy on knowledge graphs (KG). A set of knowledge experts seek diverse reasoning on the KG to encourage various generation outputs.

Create an environment

transformers==3.3.1
torch==1.7.0
nltk==3.4.5
networkx==2.1
spacy==2.2.1
torch-scatter==2.0.5+${CUDA}
psutil==5.9.0

-- For torch-scatter, ${CUDA} should be replaced by cu101, cu102, cu110, or cu111, depending on your PyTorch installation. For more information, see the torch-scatter installation documentation.
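-- For example, assuming PyTorch 1.7.0 with CUDA 10.2 (the wheel index URL below is the PyG wheel host and is an assumption; adjust it to your setup):

pip install torch-scatter==2.0.5 -f https://data.pyg.org/whl/torch-1.7.0+cu102.html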

-- A Docker environment can be pulled from wenhaoyu97/divgen:5.0

We summarize some common environment installation problems and solutions here.

Preprocess the data

-- Extract the English subset of ConceptNet and build the graph.

cd data
wget https://s3.amazonaws.com/conceptnet/downloads/2018/edges/conceptnet-assertions-5.6.0.csv.gz
gzip -d conceptnet-assertions-5.6.0.csv.gz
cd ../preprocess
python extract_cpnet.py
python graph_construction.py

-- Preprocess multi-hop relational paths. Set $DATA to either anlg or eg.

export DATA=eg
python ground_concepts_simple.py $DATA
python find_neighbours.py $DATA
python filter_triple.py $DATA

-- Download the pre-processed data if you do not want to process it yourself. [GoogleDrive]

Run Baseline

Baseline Name                   | Run Baseline Model                      | Venue and Reference
Truncated Sampling              | bash scripts/TruncatedSampling.sh       | Fan et al., ACL 2018 [PDF]
Nucleus Sampling                | bash scripts/NucleusSampling.sh         | Holtzman et al., ICLR 2020 [PDF]
Variational AutoEncoder         | bash scripts/VariationalAutoEncoder.sh  | Gupta et al., AAAI 2018 [PDF]
Mixture of Experts (MoE-embed)  | bash scripts/MixtureOfExpertCho.sh      | Cho et al., EMNLP 2019 [PDF]
Mixture of Experts (MoE-prompt) | bash scripts/MixtureOfExpertShen.sh     | Shen et al., ICML 2019 [PDF]

Run MoKGE

-- Independently parameterizing each expert may exacerbate overfitting, since the number of parameters increases linearly with the number of experts. We follow the parameter-sharing scheme in Cho et al. (2019) and Shen et al. (2019) to avoid this issue, which requires only a negligible increase in parameters over a baseline model that does not use MoE. Specifically, Cho et al. (2019) add a unique expert embedding to each input token, while Shen et al. (2019) add an expert prefix token before the input text sequence.

-- MoKGE-embed (Cho et al., 2019): bash scripts/KGMixtureOfExpertCho.sh

-- MoKGE-prompt (Shen et al., 2019): bash scripts/KGMixtureOfExpertShen.sh
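-- The sketch below illustrates the difference between the two sharing schemes (a minimal sketch, not the repo's actual classes; all tensor names and sizes are assumptions):

import torch
from torch import nn

vocab_size, hidden, n_experts, seq_len = 1000, 64, 3, 8
tok_emb = nn.Embedding(vocab_size, hidden)
input_ids = torch.randint(vocab_size, (1, seq_len))
expert_id = torch.tensor([1])            # which expert is generating

# MoE-embed (Cho et al., 2019): add a learned expert embedding to every
# input token embedding; all remaining parameters are shared across experts.
expert_emb = nn.Embedding(n_experts, hidden)
h_embed = tok_emb(input_ids) + expert_emb(expert_id).unsqueeze(1)

# MoE-prompt (Shen et al., 2019): prepend a dedicated expert (prefix) token
# to the sequence; again, the rest of the model is fully shared.
prefix_emb = nn.Embedding(n_experts, hidden)
h_prompt = torch.cat([prefix_emb(expert_id).unsqueeze(1), tok_emb(input_ids)], dim=1)

print(h_embed.shape, h_prompt.shape)     # (1, 8, 64) and (1, 9, 64)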

Citation

@inproceedings{yu2022diversifying,
  title={Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts},
  author={Yu, Wenhao and Zhu, Chenguang and Qin, Lianhui and Zhang, Zhihan and Zhao, Tong and Jiang, Meng},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2022},
  year={2022}
}

Please kindly cite our paper if you find the paper and the code helpful.

Acknowledgements

Many thanks to the GitHub repositories of Transformers, KagNet and MultiGen.

Part of our code is adapted from theirs.

mokge's Issues

“Which one of the output evaluation metrics is Uni.C, and which one is Jaccard?”

{
"epoch": "test_metric",
"top1_bleu_1": 0.3906327852515804,
"top1_bleu_2": 0.2431810916775413,
"top1_bleu_3": 0.15490071862237648,
"top1_bleu_4": 0.09866710943157721,
"top1_rouge_l": 0.3339492423258864,
"topk_bleu_1": 0.4864532019703934,
"topk_bleu_2": 0.3468932338728804,
"topk_bleu_3": 0.24911986796542576,
"topk_bleu_4": 0.17547013509788645,
"topk_rouge_l": 0.4175594655655992,
"self_bleu_1": 0.4784498612364838,
"self_bleu_2": 0.3688739712346435,
"self_bleu_3": 0.29729560703936,
"self_bleu_4": 0.24654754205573387,
"self_rouge_l": 0.3721558822061126,
"entropy_1": 6.504218945328833,
"entropy_2": 8.760527233988983,
"entropy_3": 9.542476642245632,
"entropy_4": 9.711528680698537,
"distinct_1": 0.16952552914033447,
"distinct_2": 0.5034347035015392,
"distinct_3": 0.7436860780630318,
"distinct_4": 0.8636731970065263
}
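For reference, pairwise Jaccard similarity over a set of K generations is commonly computed on token sets, as in the minimal sketch below (a generic illustration of the metric, not necessarily this repo's exact implementation):

from itertools import combinations

def pairwise_jaccard(generations):
    """Average Jaccard similarity over all pairs of generated sentences."""
    token_sets = [set(g.lower().split()) for g in generations]
    pairs = list(combinations(token_sets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

print(pairwise_jaccard(["a dog runs fast", "a dog walks slowly", "the cat sleeps"]))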

A question about running main.py

Hello author, since reading your paper, I have wanted to ask: on a Windows system, can I also run your main.py entry point using PyCharm as the IDE?

Low Bleu Score

Hello, I ran your code with the same hyper-parameters you provided in the paper and code, but I am not able to reproduce the BLEU scores reported in the paper. The model I ran is KGMixtureOfExpertCho.sh with the eg dataset. The output scores are "topk_bleu_4": 0.1459059145584853 and "topk_rouge_l": 0.3882581646863371, which are much lower than the numbers in the paper. (The other metrics, such as distinct_2 and self_bleu, look similar to those in the paper.)

Could you guide me on what things I'm possibly missing?

Thank you,

High Self BLEU score with anlg dataset

Hello,

I ran the KGMixtureOfExpertShen model with the anlg dataset and I am getting considerably higher self-bleu-3 and self-bleu-4 scores than the paper, while the other metrics (distinct_2, entropy_4, topk_bleu_4 and topk_rouge_l) produce scores similar to those in the paper. For the Shen model, I added --weight_decay 0.01 --warmup_steps 10000 to reproduce the performance, but is there anything I need to change in the hyperparameters?

The output_pred_metric result is as follows:
{
"epoch": "test_metric",
"topk_bleu_4": 0.14420743386330465,
"topk_rouge_l": 0.3857134336425845,
"self_bleu_3": 0.330773404929831,
"self_bleu_4": 0.2793998961185776,
"entropy_4": 10.783610820453294,
"distinct_2": 0.38942323619377806,
}

About code operation, system problems and the use of script files

Hello, I am honored to have read your paper "Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts". Could you tell me which main program your code runs, and whether it needs to run on Windows or on Linux? Also, what are the script files in your repository for, and how should they be used?

A question about data preprocessing

I am honored to have read your papers. I have a question about the preprocessing procedure for ConceptNet. Could you please tell me where the file "concept.txt" comes from? Does it contain every concept in ConceptNet, or has it already been processed so that it is only suitable for your task?

comp_gcn function

Hello,

In comp_gcn function (graph_encoder.py):

def comp_gcn(...):
    ...
    # First part: aggregate head-node and relation representations into tail nodes
    o = concept_hidden.gather(1, head.unsqueeze(2).expand(bsz, mem_t, hidden_size))
    o = o.masked_fill(triple_label.unsqueeze(2) == -1, 0)
    scatter_add(o, tail, dim=1, out=update_node)
    scatter_add(-relation_hidden.masked_fill(triple_label.unsqueeze(2) == -1, 0), tail, dim=1, out=update_node)
    scatter_add(count, tail, dim=1, out=count_out)
    # => are the o, update_node, and count_out results of this part used anywhere???

    # Second part: aggregate tail-node and relation representations into head nodes
    o = concept_hidden.gather(1, tail.unsqueeze(2).expand(bsz, mem_t, hidden_size))
    o = o.masked_fill(triple_label.unsqueeze(2) == -1, 0)
    scatter_add(o, head, dim=1, out=update_node)
    scatter_add(-relation_hidden.masked_fill(triple_label.unsqueeze(2) == -1, 0), head, dim=1, out=update_node)
    scatter_add(count, head, dim=1, out=count_out)
    # => the o, update_node, and count_out results of this part are used below

    act = nn.ReLU()
    # calculate the final update_node representation
    update_node = self.W_s[layer_idx](concept_hidden) + self.W_n[layer_idx](update_node) / count_out.clamp(min=1).unsqueeze(2)
    update_node = act(update_node)

Thank you!
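For reference, torch_scatter.scatter_add accumulates src into the tensor passed via out= in place, which is why the first part's results are in fact consumed: both parts write into the same update_node and count_out buffers. A minimal check (toy shapes chosen for illustration):

import torch
from torch_scatter import scatter_add

src = torch.ones(1, 4, 2)             # (bsz, n_triples, hidden)
tail = torch.tensor([[0, 0, 1, 2]])   # target node index per triple
out = torch.zeros(1, 3, 2)            # (bsz, n_nodes, hidden)
scatter_add(src, tail, dim=1, out=out)
print(out[0, :, 0])                   # tensor([2., 1., 1.]) -- out was filled in place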

self.compute_metrics in the kmoe_trainer

Where is self.compute_metrics defined in kmoe_trainer.py?

        if self.compute_metrics is not None and preds is not None and label_ids is not None:
            metrics = self.compute_metrics(EvalPrediction(predictions=preds, label_ids=label_ids))
        else:
            metrics = {}
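For context: in transformers of this version, compute_metrics is an optional callable passed to the Trainer constructor and stored as self.compute_metrics, so a Trainer subclass inherits it rather than defining it itself (that kmoe_trainer subclasses Trainer is an assumption here). A minimal sketch with a hypothetical metric function:

from transformers import EvalPrediction

def my_compute_metrics(p: EvalPrediction) -> dict:
    # Hypothetical placeholder; the real function would compute BLEU/ROUGE etc.
    return {"n_predictions": len(p.predictions)}

# self.compute_metrics is set in Trainer.__init__ and inherited by subclasses, e.g.
# (class name assumed for illustration):
# trainer = KMoETrainer(model=model, args=training_args, compute_metrics=my_compute_metrics)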

Hello author, could you provide some suggestions about the running configuration? Does the knowledge-graph part of your method require a high-end setup? With our current hardware we cannot fit a batch size of 60 locally, and renting a server for a long time would be a large and uncertain expense. Is there any other way to get the training to work? We would appreciate your help.
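One generic workaround (not specific to this repo; whether its training scripts expose such an option is an assumption) is gradient accumulation, which simulates a large effective batch size with small per-step batches:

import torch
from torch import nn

# Hypothetical stand-ins for the real model and data loader.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = [(torch.randn(6, 10), torch.randn(6, 1)) for _ in range(20)]

accum_steps = 10                    # 6 x 10 = effective batch size 60
optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    loss = nn.functional.mse_loss(model(x), y) / accum_steps  # average over the accumulation window
    loss.backward()                 # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()            # one update per effective batch of 60
        optimizer.zero_grad()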
