
lama's Introduction

LAMA: LAnguage Model Analysis


LAMA is a probe for analyzing the factual and commonsense knowledge contained in pretrained language models.

The dataset for the LAMA probe is available at https://dl.fbaipublicfiles.com/LAMA/data.zip

LAMA contains a set of connectors to pretrained language models and exposes a single, transparent interface for using:

  • Transformer-XL (Dai et al., 2019)
  • BERT (Devlin et al., 2018)
  • ELMo (Peters et al., 2018)
  • GPT (Radford et al., 2018)
  • RoBERTa (Liu et al., 2019)

Actually, LAMA is also a beautiful animal.

References:

The LAMA probe is described in the following papers:

@inproceedings{petroni2019language,
  title={Language Models as Knowledge Bases?},
  author={F. Petroni and T. Rockt{\"{a}}schel and A. H. Miller and P. Lewis and A. Bakhtin and Y. Wu and S. Riedel},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2019}
}

@inproceedings{petroni2020how,
  title={How Context Affects Language Models' Factual Predictions},
  author={Fabio Petroni and Patrick Lewis and Aleksandra Piktus and Tim Rockt{\"a}schel and Yuxiang Wu and Alexander H. Miller and Sebastian Riedel},
  booktitle={Automated Knowledge Base Construction},
  year={2020},
  url={https://openreview.net/forum?id=025X0zPfn}
}

The LAMA probe

To reproduce our results:

1. Create a conda environment and install the requirements

(optional) It might be a good idea to use a separate conda environment. It can be created by running:

conda create -n lama37 -y python=3.7 && conda activate lama37
pip install -r requirements.txt

2. Download the data

wget https://dl.fbaipublicfiles.com/LAMA/data.zip
unzip data.zip
rm data.zip

3. Download the models

DISCLAIMER: ~55 GB on disk

Install the spaCy model

python3 -m spacy download en

Download the models

chmod +x download_models.sh
./download_models.sh

The script will create and populate a pre-trained_language_models folder. If you are interested only in a particular model, please edit the script.

4. Run the experiments

python scripts/run_experiments.py

Results will be logged in output/ and in last_results.csv.
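
For a quick look at the aggregated numbers, you can read last_results.csv from Python. This is a minimal sketch that simply prints the rows; the exact columns depend on which models and datasets were enabled, so it assumes nothing beyond a standard comma-separated layout.

import csv

# Print every row of the aggregated results file produced by the run above.
with open("last_results.csv", newline="") as f:
    for row in csv.reader(f):
        print(row)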

Other versions of LAMA

LAMA-UHN

This repository also provides a script (scripts/create_lama_uhn.py) to create the data used in (Poerner et al., 2019).

Negated-LAMA

This repository also gives the option to evaluate how pretrained language models handle negated probes (Kassner et al., 2019). To do so, set the flag use_negated_probes in scripts/run_experiments.py and use this version of the LAMA probe: https://dl.fbaipublicfiles.com/LAMA/negated_data.tar.gz

What else can you do with LAMA?

1. Encode a list of sentences

and use the vectors in your downstream task!

pip install -e git+https://github.com/facebookresearch/LAMA#egg=LAMA
import argparse
from lama.build_encoded_dataset import encode, load_encoded_dataset

PARAMETERS= {
        "lm": "bert",
        "bert_model_name": "bert-large-cased",
        "bert_model_dir":
        "pre-trained_language_models/bert/cased_L-24_H-1024_A-16",
        "bert_vocab_name": "vocab.txt",
        "batch_size": 32
        }

args = argparse.Namespace(**PARAMETERS)

sentences = [
        ["The cat is on the table ."],  # single-sentence instance
        ["The dog is sleeping on the sofa .", "He makes happy noises ."],  # two-sentence
        ]

encoded_dataset = encode(args, sentences)
print("Embedding shape: %s" % str(encoded_dataset[0].embedding.shape))
print("Tokens: %r" % encoded_dataset[0].tokens)

# save on disk the encoded dataset
encoded_dataset.save("test.pkl")

# load from disk the encoded dataset
new_encoded_dataset = load_encoded_dataset("test.pkl")
print("Embedding shape: %s" % str(new_encoded_dataset[0].embedding.shape))
print("Tokens: %r" % new_encoded_dataset[0].tokens)

2. Fill a sentence with a gap.

You should use the symbol [MASK] to specify the gap. Only a single-token gap is supported, i.e., a single [MASK].

python lama/eval_generation.py  \
--lm "bert"  \
--t "The cat is on the [MASK]."

[image: cat on the phone (source: https://commons.wikimedia.org/wiki/File:Bluebell_on_the_phone.jpg)]

Note that you could use this functionality to answer cloze-style questions, such as:

python lama/eval_generation.py  \
--lm "bert"  \
--t "The theory of relativity was developed by [MASK] ."

Install LAMA with pip

Clone the repo

git clone [email protected]:facebookresearch/LAMA.git && cd LAMA

Install as an editable package:

pip install --editable .

If you get an error on macOS, please try running this instead:

CFLAGS="-Wno-deprecated-declarations -std=c++11 -stdlib=libc++" pip install --editable .

Language Model(s) options

Option to indicate which language model(s) to use:

  • --language-models/--lm : comma separated list of language models (REQUIRED)

BERT

BERT pretrained models can be loaded in two ways: (i) by passing the name of the model and using the huggingface cached versions, or (ii) by passing the folder containing the vocabulary and the PyTorch pretrained model (look at convert_tf_checkpoint_to_pytorch in here to convert the TensorFlow model to PyTorch).

  • --bert-model-dir/--bmd : directory that contains the BERT pre-trained model and the vocabulary
  • --bert-model-name/--bmn : name of the huggingface cached versions of the BERT pre-trained model (default = 'bert-base-cased')
  • --bert-vocab-name/--bvn : name of vocabulary used to pre-train the BERT model (default = 'vocab.txt')

RoBERTa

  • --roberta-model-dir/--rmd : directory that contains the RoBERTa pre-trained model and the vocabulary (REQUIRED)
  • --roberta-model-name/--rmn : name of the RoBERTa pre-trained model (default = 'model.pt')
  • --roberta-vocab-name/--rvn : name of vocabulary used to pre-train the RoBERTa model (default = 'dict.txt')

ELMo

  • --elmo-model-dir/--emd : directory that contains the ELMo pre-trained model and the vocabulary (REQUIRED)
  • --elmo-model-name/--emn : name of the ELMo pre-trained model (default = 'elmo_2x4096_512_2048cnn_2xhighway')
  • --elmo-vocab-name/--evn : name of vocabulary used to pre-train the ELMo model (default = 'vocab-2016-09-10.txt')

Transformer-XL

  • --transformerxl-model-dir/--tmd : directory that contains the pre-trained model and the vocabulary (REQUIRED)
  • --transformerxl-model-name/--tmn : name of the pre-trained model (default = 'transfo-xl-wt103')

GPT

  • --gpt-model-dir/--gmd : directory that contains the gpt pre-trained model and the vocabulary (REQUIRED)
  • --gpt-model-name/--gmn : name of the gpt pre-trained model (default = 'openai-gpt')

Evaluate Language Model(s) Generation

Options:

  • --text/--t : text to compute the generation for
  • --i : interactive mode

One of the two is required.

example considering both BERT and ELMo:

python lama/eval_generation.py \
--lm "bert,elmo" \
--bmd "pre-trained_language_models/bert/cased_L-24_H-1024_A-16/" \
--emd "pre-trained_language_models/elmo/original/" \
--t "The cat is on the [MASK]."

example considering only BERT with the default pre-trained model, in an interactive fashion:

python lama/eval_generation.py  \
--lm "bert"  \
--i

Get Contextual Embeddings

python lama/get_contextual_embeddings.py \
--lm "bert,elmo" \
--bmn bert-base-cased \
--emd "pre-trained_language_models/elmo/original/"

Unified vocabulary

The unified vocabulary is the intersection of the vocabularies of all considered models.
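
As a rough illustration, an intersection vocabulary could be computed from plain-text vocab files (one token per line) as sketched below. The paths are illustrative, and the repository's lama/vocab_intersection.py remains the authoritative script, since it also applies model-specific tokenization checks.

from functools import reduce

# Illustrative vocab files; adjust to the models you downloaded.
vocab_files = [
    "pre-trained_language_models/bert/cased_L-24_H-1024_A-16/vocab.txt",
    "pre-trained_language_models/elmo/original/vocab-2016-09-10.txt",
]

vocabs = []
for path in vocab_files:
    with open(path, encoding="utf-8") as f:
        vocabs.append({line.strip() for line in f if line.strip()})

common_vocab = reduce(set.intersection, vocabs)
print(f"{len(common_vocab)} tokens shared by all models")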

Troubleshooting

If the module cannot be found, preface the python command with PYTHONPATH=.

If the experiments fail on GPU memory allocation, try reducing batch size.

Acknowledgements

Other References

  • (Kassner et al., 2019) Nora Kassner, Hinrich Schütze. Negated LAMA: Birds cannot fly. arXiv preprint arXiv:1911.03343, 2019.

  • (Poerner et al., 2019) Nina Poerner, Ulli Waltinger, and Hinrich Schütze. BERT is Not a Knowledge Base (Yet): Factual Knowledge vs. Name-Based Reasoning in Unsupervised QA. arXiv preprint arXiv:1911.03681, 2019.

  • (Dai et al., 2019) Zihang Dai, Zhilin Yang, Yiming Yang, Jaime G. Carbonell, Quoc V. Le, and Ruslan Salakhutdinov. Transformer-XL: Attentive language models beyond a fixed-length context. CoRR, abs/1901.02860.

  • (Peters et al., 2018) Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. NAACL-HLT 2018

  • (Devlin et al., 2018) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805.

  • (Radford et al., 2018) Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training.

  • (Liu et al., 2019) Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.

Licence

LAMA is licensed under the CC-BY-NC 4.0 license. The text of the license can be found here.

lama's People

Contributors

cwenner, darrengarvey, fabiopetroni, gabrielilharco, sanjibnarzary, thesamuel


lama's Issues

🐛Bug for common_vocab

Hi, I encounter the same problem as in #10.
I found that the reason the 2 examples are filtered is that their obj_label values are 1970s and 1990s, and the common_vocab_cased.txt generated by vocab_intersection.py contains neither 1970s nor 1990s.

236: {"masked_sentences": ["Income inequality began to increase in the US in the [MASK]."], "obj_label": "1970s", "id": "57287b322ca10214002da3bf_0", "sub_label": "Squad"}
206: {"masked_sentences": ["The perception of Genghis Khan in Mongolia brightened in the [MASK]."], "obj_label": "1990s", "id": "5727404b708984140094db59_0", "sub_label": "Squad"}

Can it be tried on subwords?

Hi, just wanted to know if this can be extended to the subword level. Can we mask like:

Sentence: The cat is sitting at highest point.
Input: The cat is sitting at high_[MASK]_ point

Should the model be able to predict highest?

Sentence selection for T-REx and GoogleRE

Hi there,

How were the sentences (evidences for T-REx and considered_sentences for GoogleRE) selected for T-REx and GoogleRE? Was it done manually or using some script?

Thanks,

The template for P27 in T-Rex is wrong

Hi,

The template for P27 (T-REx) got zero accuracy, but I found it's because the template is wrong:
[CLS] Harashima is [MASK] citizen . (right answer: Japan, model prediction: Japanese)
In this case, it's actually right for the model to predict Japanese instead of Japan.

So I changed it to [X] is a citizen of [Y]. And the accuracy for this specific relation goes up to 55%.
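
For anyone who wants to apply the same change, a hypothetical patch along those lines is sketched below (the path to relations.jsonl is an assumption; adjust it to your data layout):

import json

# Rewrite the P27 template as suggested above and save the file back.
path = "data/relations.jsonl"
with open(path) as f:
    relations = [json.loads(line) for line in f]

for relation in relations:
    if relation["relation"] == "P27":
        relation["template"] = "[X] is a citizen of [Y] ."

with open(path, "w") as f:
    for relation in relations:
        f.write(json.dumps(relation) + "\n")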

Missing relation data for LAMA dataset

I tried to replicate the results but I ran into a problem. There are some T-REx relations like P166, P69, P54 that are mentioned in the relations file but don't have data files in the downloadable dataset. Were the results in the paper achieved without such relations or are these relations missing in the dataset?

Dataset size mismatch

Hi, thank you for open-sourcing this great project!

I looked into the datasets provided in this repository (https://dl.fbaipublicfiles.com/LAMA/data.zip) and some of their sizes do not match with the sizes described in the paper.

ConceptNet: 11458 (paper) vs 29774 (dataset)
Google-RE death-place: 765 (paper) vs 766 (dataset)

Also, for the T-REx dataset, could you explain how the sentences are selected from the 'evidences' in each line of the jsonl file? There seem to be multiple 'masked_sentence' entries in 'evidences'.

Thank you.

Whether the project contains baseline code (RE_o) for the LAMA probe

I am doing some work following the reference paper and have reproduced the results for the LMs. If I want to evaluate the baseline with the LAMA probe, is there a baseline script in the project, or do we need to write that part ourselves and evaluate it with the LAMA probe data?

./download_models.sh and run_experiments.py: Torch invalid memory size - maybe an overflow?

Hi,

when I run ./download_models.sh, I get the following exception:

Building common vocab
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
Namespace(lm='transformerxl', transformerxl_model_dir='pre-trained_language_models/transformerxl/transfo-xl-wt103/')
Loading transformerxl model...
Loading Transformer XL model from pre-trained_language_models/transformerxl/transfo-xl-wt103/
Traceback (most recent call last):
  File "lama/vocab_intersection.py", line 158, in <module>
    main()
  File "lama/vocab_intersection.py", line 152, in main
    __vocab_intersection(CASED_MODELS, CASED_COMMON_VOCAB_FILENAME)
  File "lama/vocab_intersection.py", line 97, in __vocab_intersection
    model = build_model_by_name(args.lm, args)
  File "/LAMA/lama/modules/__init__.py", line 31, in build_model_by_name
    return MODEL_NAME_TO_CLASS[lm](args)
  File "/LAMA/lama/modules/transformerxl_connector.py", line 37, in __init__
    self.model = TransfoXLLMHeadModel.from_pretrained(model_name)
  File "/home/user123/anaconda3/envs/lama37/lib/python3.7/site-packages/pytorch_pretrained_bert/modeling_transfo_xl.py", line 939, in from_pretrained
    model = cls(config, *inputs, **kwargs)
  File "/home/user123/anaconda3/envs/lama37/lib/python3.7/site-packages/pytorch_pretrained_bert/modeling_transfo_xl.py", line 1312, in __init__
    self.transformer = TransfoXLModel(config)
  File "/home/user123/anaconda3/envs/lama37/lib/python3.7/site-packages/pytorch_pretrained_bert/modeling_transfo_xl.py", line 1033, in __init__
    div_val=config.div_val)
  File "/home/user123/anaconda3/envs/lama37/lib/python3.7/site-packages/pytorch_pretrained_bert/modeling_transfo_xl.py", line 780, in __init__
    self.emb_layers.append(nn.Embedding(r_idx-l_idx, d_emb_i))
  File "/home/user123/anaconda3/envs/lama37/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 100, in __init__
    self.weight = Parameter(torch.Tensor(num_embeddings, embedding_dim))
RuntimeError: $ Torch: invalid memory size -- maybe an overflow? at /pytorch/aten/src/TH/THGeneral.cpp:188

I tried different (newer) versions of torch, but that leads to the exact same dimension error that JXZe reports in issue #32:

      RuntimeError: Trying to create tensor with negative dimension -200001: [-200001, 16]

But in #32 there is no recommendation on how to fix this dimension error.

All the packages from requirements.txt are installed correctly, except that I have overrides==3.1.0 instead of overrides==6.1.0, because the import "from allennlp.modules.elmo import _ElmoBiLm" in elmo_connector.py only worked after downgrading. I also tried to skip the vocab-building part and downloaded the common_vocab.txt files provided in the README, but the same "Torch: invalid memory size -- maybe an overflow?" error occurs when running run_experiments.py.

Does anybody have an idea how to fix this?

Windows, Ubuntu, Allennlp

Hi There - I could use your help on a few questions I have:

I am on Win10 and running Ubuntu via the Linux dev environment. I was wondering why I am getting the following error when I run "python scripts/run_experiments.py":

Traceback (most recent call last):
File "scripts/run_experiments.py", line 8, in
from batch_eval_KB_completion import main as run_evaluation
File "/home/USER/LAMA/scripts/batch_eval_KB_completion.py", line 7, in
from lama.modules import build_model_by_name
File "build/bdist.linux-x86_64/egg/lama/modules/init.py", line 8, in
File "build/bdist.linux-x86_64/egg/lama/modules/elmo_connector.py", line 10, in
ImportError: No module named allennlp.modules.elmo

I have reinstalled allennlp several times and upgraded it. Same error.

Also, I was wondering if I can run LAMA on Windows without using Linux at all. Thank you!

How to load a huggingface RoBERTa checkpoint?

Hello,
Thanks for your great work.
I have one problem: I saw that in this project roberta_connector.py loads a fairseq RoBERTa checkpoint, but I want to load a Hugging Face RoBERTa. What should I do? Could you give me some advice?
Thanks a lot.

A tutorial on how to hack on LAMA with MacOS

Any chance you could add a detailed tutorial on how to work with LAMA on macOS (M1 or M2 chips)?
Many developers are using Macs, and it would be helpful to know how to start hacking on this powerful open-source project from a Mac.

Multiple Valid Objects

The "Language Models as Knowledge Bases" paper, in Section 4.4, mentions that the evaluation deals with multiple valid objects for the same subject and relation pair. Specifically, valid objects other than the one which is being tested are removed from the ranked list of answers before computing the metrics. However, the TREx data released in this repository only includes one object per tested fact, even for queries where multiple valid answers do exist (based on my browsing of Wikidata). So, does the LAMA evaluation account of multiple valid objects? If yes, how does it do that given that the multiple objects are not in the data.

ModuleNotFoundError: No module named 'lama'

I'm getting this error when trying to run ./download_models.sh

File "/home/Downloads/LAMA-main/lama/vocab_intersection.py", line 7, in
from lama.modules import build_model_by_name
ModuleNotFoundError: No module named 'lama'

Code to create the data

Hi all,

Is the code to create the sentences from KBs available? e.g. the code with the templating and the code to generate the sentences with [MASK]?

Thanks,
Arian

mac os installation issue

Hi,

I'm on Mojave, and using

CFLAGS="-Wno-deprecated-declarations -std=c++11 -stdlib=libc++" pip install --editable .

gives me

error: invalid argument '-std=c++11' not allowed with 'C'
error: command 'gcc' failed with exit status 1


Failed building wheel for cytoolz

Any solution please?

Negated templates for T-REx?

Hi,

When trying to run the experiments, I'm getting errors on T-REx regarding "template_negated" (full error below). It seems the file relations.jsonl contains a number of entries where the key "template" is present but the key "template_negated" is not, which makes this part of the code (in run_experiments.py) fail:

if "template" in relation:
    PARAMETERS["template"] = relation["template"]
    PARAMETERS["template_negated"] = relation["template_negated"]

Is the file relations.jsonl missing these keys?

Thanks!


Error + relation print:

2. T-REx
bert_base
{'description': 'most specific known '
                '(e.g. city instead of '
                'country, or hospital '
                'instead of city) birth '
                'location of a person, '
                'animal or fictional '
                'character',
 'label': 'place of birth',
 'relation': 'P19',
 'template': '[X] was born in [Y] .',
 'type': 'N-1'}
Traceback (most recent call last):
  File "scripts/run_experiments.py", line 215, in <module>
    run_all_LMs(parameters)
  File "scripts/run_experiments.py", line 204, in run_all_LMs
    run_experiments(*parameters, input_param=ip, use_negated_probes=True)
  File "scripts/run_experiments.py", line 114, in run_experiments
    PARAMETERS["template_negated"] = relation["template_negated"]
KeyError: 'template_negated'
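
A hypothetical workaround (not the maintainers' fix) is to fall back to an empty negated template when a relation does not provide one, for example:

# Example relation entry taken from the error printout above.
relation = {"relation": "P19", "template": "[X] was born in [Y] ."}
PARAMETERS = {}

if "template" in relation:
    PARAMETERS["template"] = relation["template"]
    # dict.get avoids the KeyError when "template_negated" is missing.
    PARAMETERS["template_negated"] = relation.get("template_negated", "")

print(PARAMETERS)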

Reproducing ELMo base results on Google-RE

Hi there,

Thank you for releasing the code for this wonderful project! I'm trying to reproduce the results on Google-RE (birth-date and death-place relations) but wasn't able to match the numbers reported in the paper. This is the output I get for ELMo original:

Birth-Date Google-RE
all_samples: 1825
list_of_results: 1825
global MRR: 0.00018831351372720835
global Precision at 10: 0.0
global Precision at 1: 0.0
P@1 : 0.0

Death-Place Google-RE
all_samples: 765
list_of_results: 765
global MRR: 3.727018365172404e-06
global Precision at 10: 0.0
global Precision at 1: 0.0
P@1 : 0.0

It seems like the paper reports Mean Precision@1 scores of 0.1 and 0.3 respectively for the above two relations.

Am I using the repository incorrectly? Please let me know how I can reproduce the results :)

Thank you in advance!

AttributeError: 'NoneType' object has no attribute 'ids_to_tokens'

Hello,
I followed the tutorial, but when I run ./download_models.sh I encounter this problem:

$ ./download_models.sh
BERT BASE LOWERCASED
BERT BASE CASED
Building common vocab
...
Traceback (most recent call last):
  File "lama/vocab_intersection.py", line 160, in <module>
    main()
  File "lama/vocab_intersection.py", line 154, in main
    __vocab_intersection(CASED_MODELS, CASED_COMMON_VOCAB_FILENAME)
  File "lama/vocab_intersection.py", line 99, in __vocab_intersection
    model = build_model_by_name(args.lm, args)
  File "/LAMA-master/lama/modules/__init__.py", line 33, in build_model_by_name
    return MODEL_NAME_TO_CLASS[lm](args)
  File "/LAMA-master/lama/modules/bert_connector.py", line 79, in __init__
    self.vocab = list(self.tokenizer.ids_to_tokens.values())
AttributeError: 'NoneType' object has no attribute 'ids_to_tokens'

Actually, when I try to run the experiments, I get:

$ python scripts/run_experiments.py
Model name 'pre-trained_language_models/transformerxl/transfo-xl-wt103/' was not found in model name list (transfo-xl-wt103). We assumed 'pre-trained_language_models/transformerxl/transfo-xl-wt103/' was a path or url but couldn't find files pre-trained_language_models/transformerxl/transfo-xl-wt103/vocab.bin at this path or url.
Traceback (most recent call last):
  File "scripts/run_experiments.py", line 214, in <module>
    run_all_LMs(parameters)
  File "scripts/run_experiments.py", line 207, in run_all_LMs
    run_experiments(*parameters, input_param=ip, use_negated_probes=False)
  File "scripts/run_experiments.py", line 125, in run_experiments
    model = build_model_by_name(model_type_name, args)
  File "LAMA-master/lama/modules/__init__.py", line 33, in build_model_by_name
    return MODEL_NAME_TO_CLASS[lm](args)
  File "LAMA-master/lama/modules/transformerxl_connector.py", line 33, in __init__
    self.vocab = list(self.tokenizer.idx2sym)
AttributeError: 'NoneType' object has no attribute 'idx2sym'

I wonder if this is the cause: for simplicity, I only downloaded the two BERT BASE CASED/UNCASED models and commented out all the other models in download_models.sh.

Any help will be appreciated, thanks a lot!

download the pre-trained models

Good morning,

I'll start by saying that I'm new to working on macOS; I tried to install the requirements on Windows 10 but ran into many issues installing all the prerequisites manually, so I'm trying to switch.

When I try to download the pre-trained models (in particular "BERT BASE CASED"), I get this error:

./download_models.sh: line 12: realpath: command not found

Use a fine-tuned model

Do you have examples of using a fine-tuned model for the evaluation? I have saved the checkpoint, but just editing the LM list in run_experiments.py makes the code fail.

RoBERTa evaluation on LAMA

Thank you for open-sourcing and maintaining such a great project! :)

I have an issue related to RoBERTa evaluation on LAMA.
To evaluate RoBERTa's performance on LAMA, I downloaded the RoBERTa {base,large} checkpoints from the fairseq repository. Then I slightly modified run_experiments.py to add RoBERTa to the target LMs as follows:

LMs = [
    {
        "lm": "roberta",
        "label": "roberta",
        "models_names": ["roberta"],
        "roberta_model_name": "model.pt",
        "roberta_model_dir": "pre-trained_language_models/roberta.large/",
        "roberta_vocab_name": "dict.txt"
    }, ...
]

Although the RoBERTa large model is loaded correctly, many warnings show up saying that words from vocab_subset are not in the model vocabulary.

2020-03-12 17:55:33,651 - LAMA - WARNING - word wingspan from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word woken from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word wooded from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word wrestled from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word wrinkled from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word yanked from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word yd from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word yellowish from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word yer from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word zu from vocab_subset not in model vocabulary!
2020-03-12 17:55:33,651 - LAMA - WARNING - word zur from vocab_subset not in model vocabulary!

Then, it was terminated by the error shown below:

Traceback (most recent call last):
  File "scripts/run_experiments.py", line 221, in <module>
    run_all_LMs(parameters)
  File "scripts/run_experiments.py", line 214, in run_all_LMs
    run_experiments(*parameters, input_param=ip, use_negated_probes=False)
  File "scripts/run_experiments.py", line 135, in run_experiments
    Precision1 = run_evaluation(args, shuffle_data=False, model=model)
  File "/home/akari/projects/LAMA/scripts/batch_eval_KB_completion.py", line 390, in main
    model, data, vocab_subset, args.max_sentence_length, args.template
  File "/home/akari/projects/LAMA/scripts/batch_eval_KB_completion.py", line 233, in filter_samples
    if obj_label_ids:
RuntimeError: bool value of Tensor with more than one value is ambiguous

Do you have any insights into why many words are not found in vocab_subset and how we could solve the RuntimeError?
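
For the RuntimeError itself, the message comes from evaluating a multi-element tensor as a bool. A generic PyTorch illustration of the distinction (not the repository's fix):

import torch

# A multi-element tensor cannot be used directly in `if ...:`, which is what
# triggers "bool value of Tensor with more than one value is ambiguous".
obj_label_ids = torch.tensor([8, 9])  # e.g. an object label split into two pieces

if obj_label_ids is not None and obj_label_ids.numel() > 0:
    print("object label ids:", obj_label_ids.tolist())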

Horizon Zero Dawn Nexusmods

In Nexusmods Of Horizon Zero Dawn Mods, Whenever Fireclaw Does It's Ground Lava Fountain Attack It Crashes The Game, Plz Fix It As Soon As Possible THANKS 👍🔥❤️
WE ARE WAITING!

Running setup.py install for fastBPE ... error

It seems that I'm struggling with the requirements step;
fastBPE and fairseq are the last two modules to be installed, and I run into a problem:

[screenshot of the error]

How can I fix that?

Thank you for your time,

Mathieu

Results on GPT

Hi,
Do you plan on releasing results for GPT?
Especially for GPT-2, if you ran those experiments.

Thanks for this project, great job here

import allennlp.modules.highway fails

I just installed LAMA without error on Ubuntu 20.
The run_experiments script immediately crashes because of the following import;
when I open python and just run

import allennlp.modules

I've got this error:

>>> import allennlp.modules
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/allennlp/modules/__init__.py", line 9, in <module>
    from allennlp.modules.elmo import Elmo
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/allennlp/modules/elmo.py", line 20, in <module>
    from allennlp.modules.highway import Highway
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/allennlp/modules/highway.py", line 12, in <module>
    class Highway(torch.nn.Module):
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/allennlp/modules/highway.py", line 49, in Highway
    def forward(self, inputs: torch.Tensor) -> torch.Tensor:  # pylint: disable=arguments-differ
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/overrides.py", line 88, in overrides
    return _overrides(method, check_signature, check_at_runtime)
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/overrides.py", line 114, in _overrides
    _validate_method(method, super_class, check_signature)
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/overrides.py", line 135, in _validate_method
    ensure_signature_is_compatible(super_method, method, is_static)
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/signature.py", line 104, in ensure_signature_is_compatible
    method_name,
  File "/home/xtof/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/signature.py", line 211, in ensure_all_positional_args_defined_in_sub
    raise TypeError(f"{method_name}: `{super_param.name}` must be present")
TypeError: Highway.forward: `input` must be present

Thank you !

Error in running ./download_models.sh

Hi,

When I execute ./download_models.sh, I encounter the following problem:

lowercase models
OpenAI GPT
BERT BASE LOWERCASED
BERT LARGE LOWERCASED
cased models
Transformer XL
ELMO ORIGINAL 5.5B
ELMO ORIGINAL
BERT BASE CASED
BERT LARGE CASED
Building common vocab
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
Traceback (most recent call last):
  File "lama/vocab_intersection.py", line 7, in <module>
    from lama.modules import build_model_by_name
  File "/data/datasets/yuke/LAMA/lama/modules/__init__.py", line 8, in <module>
    from .elmo_connector import Elmo
  File "/data/datasets/yuke/LAMA/lama/modules/elmo_connector.py", line 10, in <module>
    from allennlp.modules.elmo import _ElmoBiLm #, Elmo as AllenNLP_Elmo
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/allennlp/modules/__init__.py", line 9, in <module>
    from allennlp.modules.elmo import Elmo
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/allennlp/modules/elmo.py", line 20, in <module>
    from allennlp.modules.highway import Highway
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/allennlp/modules/highway.py", line 12, in <module>
    class Highway(torch.nn.Module):
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/allennlp/modules/highway.py", line 49, in Highway
    def forward(self, inputs: torch.Tensor) -> torch.Tensor:  # pylint: disable=arguments-differ
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/overrides.py", line 83, in overrides
    return _overrides(method, check_signature, check_at_runtime)
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/overrides.py", line 170, in _overrides
    _validate_method(method, super_class, check_signature)
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/overrides.py", line 189, in _validate_method
    ensure_signature_is_compatible(super_method, method, is_static)
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/signature.py", line 113, in ensure_signature_is_compatible
    method_name,
  File "/home/yuke_wang/anaconda3/envs/lama37/lib/python3.7/site-packages/overrides/signature.py", line 220, in ensure_all_positional_args_defined_in_sub
    raise TypeError(f"{method_name}: `{super_param.name}` must be present")
TypeError: Highway.forward: `input` must be present

Is there any solution to this? Thanks!

RuntimeError: invalid argument 2: k not in range for dimension.

Hi,
I followed the requirements to install the environment.
But when I run scripts/run_experiments.py, there is a bug in __max_probs_values_indices:
RuntimeError: invalid argument 2: k not in range for dimension at /pytorch/aten/src/TH/generic/THTensorMoreMath.cpp:1190

I checked: log_probs has size 28 but k is 10000, which exceeds it.
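
A sketch of the usual guard for this kind of error (an assumption, not necessarily the repository's fix) is to clamp k to the size of the scored dimension before calling topk:

import torch

# Only 28 candidates survive filtering here, so k=10000 would be out of range.
log_probs = torch.randn(28)
k = min(10000, log_probs.size(-1))

values, indices = torch.topk(log_probs, k)
print(values.shape, indices.shape)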

KG evaluation datasets?

Nice work on the EMNLP paper "Language Models as Knowledge Bases?", and thanks for making the code available.

Are the datasets available to reproduce the results from the paper, namely Table 2?

Duplicates in data?

Hi,

The following is most likely a misunderstanding on my part, but I notice that there are many duplicates and pseudo-duplicates in the jsonl files.

For instance, this line in lama/TREx/P17.jsonl:

{
	"uuid": "df10f035-6269-4cdf-88df-26395e0dc3b4",
	"obj_uri": "Q16",
	"obj_label": "Canada",
	"sub_uri": "Q7517499",
	"sub_label": "Simcoe Composite School",
	"predicate_id": "P17",
	"evidences": [{
		"sub_surface": "Simcoe Composite School",
		"obj_surface": "Canada",
		"masked_sentence": "Simcoe Composite School is a high school in Simcoe, Ontario, [MASK]."
	}, {
		"sub_surface": "Simcoe Composite School",
		"obj_surface": "Canada",
		"masked_sentence": "Simcoe Composite School is a high school in Simcoe, Ontario, [MASK]."
	}]
}            

has two evidences, both of which are the same. This is not always the case; in many other cases the evidences are different sentences.

Further, in the ConceptNet corpus, apparently every UUID appears twice. As an example, here are two instances with the same UUID:

{
	"sub": "alive",
	"obj": "think",
	"pred": "HasSubevent",
	"masked_sentences": ["One of the things you do when you are alive is [MASK]."],
	"obj_label": "think",
	"uuid": "d4f11631dde8a43beda613ec845ff7d1"
}

and

{
	"pred": "HasSubevent",
	"masked_sentences": ["One of the things you do when you are alive is [MASK]."],
	"obj_label": "think",
	"uuid": "d4f11631dde8a43beda613ec845ff7d1",
	"sub_label": "alive"
}

Here, the second instance does not have the sub and obj fields but otherwise seems unchanged.


So, based on this, my question is:

Are the duplicates intentional? For instance, when computing metrics of my model over the probe, should I treat the task as-is and, if need be, make predictions twice over the same instance?

Alternatively, should I simply remove the duplicates when processing the files? Have others done that?

I know for a fact that LAMA on HuggingFace datasets (https://huggingface.co/datasets/lama) contains these duplicates.
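
If one does choose to deduplicate, a hypothetical pass that keeps the first occurrence of each uuid could look like this (the file path is an assumption; whether deduplicating is appropriate is exactly the question raised above):

import json

# Keep only the first sample seen for each uuid in a LAMA jsonl file.
seen, unique = set(), []
with open("data/ConceptNet/test.jsonl") as f:
    for line in f:
        sample = json.loads(line)
        if sample["uuid"] not in seen:
            seen.add(sample["uuid"])
            unique.append(sample)

print(f"kept {len(unique)} unique samples")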

ModuleNotFoundError: No module named 'lama'

I got the following error while executing ./download_dataset.sh for the first time. The same problem appears when I try to execute the script again.

Building common vocab
Traceback (most recent call last):
  File "lama/vocab_intersection.py", line 7, in <module>
    from lama.modules import build_model_by_name
ModuleNotFoundError: No module named 'lama'

Reproduce results in the paper

Hi,

I was trying to reproduce results by running your code, and couldn't get exactly the same precision on SQuAD.
Here is what I got for bert_large model on SQuAD:
all_samples: 303
list_of_results: 303
global MRR: 0.3018861233236291
global Precision at 10: 0.5676567656765676
global Precision at 1: 0.16831683168316833

However, the table in the paper shows that there should be 305 samples and that the precision should be 17.4%.

At first I guessed this was because 2 samples were excluded since their object labels are out of the common vocabulary, but even after testing without the common vocabulary I got global Precision at 1: 0.1704918, which is still different from the results in the paper.

Is there a way to reproduce the same results in the paper?
Please correct me if I made any mistakes! Thanks!
