nelson-liu / contextual-repr-analysis

A toolkit for evaluating the linguistic knowledge and transferability of contextual representations. Code for "Linguistic Knowledge and Transferability of Contextual Representations" (NAACL 2019).

Home Page: http://nelsonliu.me/papers/liu+gardner+belinkov+peters+smith.naacl2019.pdf

Languages: Dockerfile 0.05%, Shell 5.33%, Python 75.55%, Perl 18.02%, C 1.05%

Contributors: nelson-liu

contextual-repr-analysis's Issues

How to get the hdf5 file?

How do I obtain contextualizers/elmo_original/ewt_pos.hdf5 (referenced in ewt_pos_tagging.json) so I can run the example? What should I run in Step 1 (Precomputing the Word Representations)?
Thanks!

Errors during installation

Hi Liu,

When I run `pip install -r requirements.txt` in step 5 of the installation, I get the following errors/warnings:

```
WARNING: Built wheel for allennlp is invalid: Metadata 1.2 mandates PEP 440 version, but '0.7.2-unreleased' is not

DEPRECATION: allennlp was installed using the legacy 'setup.py install' method, because a wheel could not be built for it. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.
```
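
For reference, pip's complaint is about PEP 440 compliance; a quick check with the `packaging` library (assuming it is available) reproduces the rejection of this version string:

```python
# Illustrative check (assuming the `packaging` library is installed) showing
# why pip rejects the version: '0.7.2-unreleased' is not PEP 440 compliant.
from packaging.version import Version, InvalidVersion

try:
    Version("0.7.2-unreleased")
except InvalidVersion as err:
    print(err)  # Invalid version: '0.7.2-unreleased'
```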

Any idea how to fix this?

torch shape error when training probing models

I'm trying to train tagging probes using precomputed representations from the encoder of my Transformer-based machine translation model, but am getting the following error when training attempts to begin:

```
2020-07-29 05:23:27,936 - INFO - allennlp.training.trainer - Training
...
  File "/raj-learn/contextual-repr-analysis/contexteval/models/tagger.py", line 340, in forward
    average=self.loss_average)
  File "/raj-learn/envs/contextual_repr_analysis/lib/python3.6/site-packages/allennlp/nn/util.py", line 633, in sequence_cross_entropy_with_logits
    negative_log_likelihood_flat = - torch.gather(log_probs_flat, dim=1, index=targets_flat)
RuntimeError: invalid argument 2: Input tensor must have same size as output tensor apart from the specified dimension at /pytorch/aten/src/THC/generic/THCTensorScatterGather.cu:29
[INFO/MainProcess] process shutting down
```
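
For what it's worth, the RuntimeError is a shape mismatch inside `torch.gather`: the index tensor must match the input tensor in every dimension except the one being gathered over. A minimal standalone reproduction (my own illustration, not the repo's code):

```python
# Standalone reproduction of the gather shape error (illustrative only).
import torch

log_probs_flat = torch.randn(4, 10)      # (batch * seq_len, num_classes)
targets_flat = torch.zeros(5, 1).long()  # wrong row count: 5 instead of 4
torch.gather(log_probs_flat, dim=1, index=targets_flat)  # raises RuntimeError
```

In the probing setup, this suggests the number of target labels and the number of precomputed representation rows disagree for some batch.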

I've computed these representations and saved them in a correctly formatted HDF5 file using the make_hdf5_file function from the README. I then confirmed that this HDF5 file looks as expected: the keys are string IDs that map to (N_tokens, 1024) numpy arrays, where N_tokens is the length of the sentence and 1024 is my encoder representation dim (which I correctly pass into my config file). sentence_to_index also looks fine, mapping newline-stripped sentences to string IDs.
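
For concreteness, the layout I'm describing looks like the following sketch (my own illustrative helper modeled on the README's make_hdf5_file; names and dimensions are placeholders):

```python
# Illustrative sketch of the HDF5 layout described above: one dataset per
# sentence ID holding a (num_tokens, representation_dim) array, plus a
# "sentence_to_index" dataset holding a JSON mapping. Not the repo's code.
import json
import h5py
import numpy as np

def write_representations(sentence_to_reps, output_path):
    sentence_to_index = {sent: str(i) for i, sent in enumerate(sentence_to_reps)}
    with h5py.File(output_path, "w") as f:
        for sent, idx in sentence_to_index.items():
            f.create_dataset(idx, data=np.asarray(sentence_to_reps[sent]))
        f.create_dataset("sentence_to_index", data=json.dumps(sentence_to_index))
```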

I get this error when training begins with both the CCG supertagging task and the PTB POS tagging task (which both use the WSJ sentences dataset).

Here's my config file for CCG supertagging (modeled on yours for ELMo original):

```json
{
    "dataset_reader": {
        "type": "ccg_supertagging",
        "contextualizer": {
            "type": "precomputed_contextualizer",
            "representations_path": "../data/precomputed_reps/ccg_wsj_sentences_all.hdf5"
        }
    },
    "validation_dataset_reader": {
        "type": "ccg_supertagging",
        "contextualizer": {
            "type": "precomputed_contextualizer",
            "representations_path": "../data/precomputed_reps/ccg_wsj_sentences_all.hdf5"
        }
    },
    "train_data_path": "../data/probing_task_data/ccg/train.txt",
    "validation_data_path": "../data/probing_task_data/ccg/dev.txt",
    "test_data_path": "../data/probing_task_data/ccg/test.txt",
    "evaluate_on_test": true,
    "model": {
        "type": "tagger",
        "token_representation_dim": 1024
    },
    "iterator": {
        "type": "basic",
        "batch_size": 80
    },
    "trainer": {
        "num_epochs": 50,
        "patience": 3,
        "cuda_device": 0,
        "validation_metric": "+accuracy",
        "optimizer": {
            "type": "adam",
            "lr": 0.001
        }
    }
}
```

I'm going to try looking into this error now and will update if I figure out what's going on. I cloned the contextual-repr-analysis repo and installed dependencies from requirements.txt, so I imagine I should have the versions of allennlp and conllu that were used at the time this package was written.

Thanks again for your help with this excellent work!

While Testing: ImportError: No module named 'allennlp'

Hi! First off, I think your paper is awesome. Anyway, onto the issue:

I'm following your 'getting started' guide, and while running `py.test -v` I get this error for every single task (I'm only showing the error for `word_conditional_conllu_pos_tagging_test.py`):

```
_______ ERROR collecting tests/models/word_conditional_majority_tagger/word_conditional_conllu_pos_tagging_test.py _______
ImportError while importing test module '/home/atreyee/contextual-repr-analysis/tests/models/word_conditional_majority_tagger/word_conditional_conllu_pos_tagging_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/models/word_conditional_majority_tagger/word_conditional_conllu_pos_tagging_test.py:3: in <module>
    from contexteval.common.model_test_case import ModelTestCase
contexteval/__init__.py:1: in <module>
    from contexteval.contextualizers import *  # noqa: F401,F403
contexteval/contextualizers/__init__.py:1: in <module>
    from contexteval.contextualizers.contextualizer import Contextualizer
contexteval/contextualizers/contextualizer.py:4: in <module>
    from allennlp.common.registrable import Registrable
E   ImportError: No module named 'allennlp'
```

The tests are run within a conda environment, set up as you recommended, with one minor difference: I used the https:// link instead of the git:// link for cloning allennlp, because I can't access git:// links on my network.

Nonetheless, allennlp is installed and can be accessed by the code: I tested this by printing Registrable from within the above contextualizer.py module, and it successfully prints <class 'allennlp.common.registrable.Registrable'>. So the code can clearly access the module, but while running the tests, it can't...?
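
In case it helps diagnose this, one thing worth checking (my suggestion, not something from the repo docs) is whether py.test runs under the same interpreter as the environment where allennlp is installed:

```python
# Diagnostic sketch: compare where the interpreter, pytest, and allennlp live.
# Running `python -m pytest` instead of `py.test` forces pytest to use the
# interpreter of the active environment.
import sys
import pytest
import allennlp

print(sys.executable)     # the interpreter actually running this code
print(pytest.__file__)    # where pytest resolves from
print(allennlp.__file__)  # where allennlp resolves from
```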

ELMo Transformer model

Hi @nelson-liu,

thanks for sharing the code for the nice paper :)

I would really like to know if you could share your trained ELMo Transformer model (`model.tar.gz`). I would like to integrate it into the flair library to do more experiments with it :)

Thanks in advance,

Stefan

Poor results for OpenAI GPT model (NER)

Hi @nelson-liu,

I just read the appendix of the paper and noticed the poor performance of the GPT model on NER.

I ran some experiments with the pytorch_pretrained_bert implementation (in combination with flair) to get embeddings for each token in a sentence and train a model (see this PR).

The results are pretty good (81.67%) on the CoNLL-2003 dataset.

I found the generate_openai_transformer_embeddings.py script that uses the internal GPT model implementation of allennlp.

Is there any chance you could try using the pytorch_pretrained_bert implementation to retrieve the embeddings? I'm just wondering why there's a difference of ~30% compared to the other models.
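
For concreteness, my extraction looks roughly like this (a sketch from memory of the pytorch_pretrained_bert API; the subword-to-token alignment that flair performs is omitted):

```python
# Rough sketch of per-subword GPT embeddings via pytorch_pretrained_bert.
# API names follow that (now-legacy) library; aligning BPE subwords back to
# the original tokens is omitted for brevity.
import torch
from pytorch_pretrained_bert import OpenAIGPTModel, OpenAIGPTTokenizer

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
model = OpenAIGPTModel.from_pretrained("openai-gpt")
model.eval()

subwords = tokenizer.tokenize("Peter Blackburn BRUSSELS 1996-08-22")
ids = torch.tensor([tokenizer.convert_tokens_to_ids(subwords)])
with torch.no_grad():
    hidden_states = model(ids)  # (1, num_subwords, 768)
```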

Preparing data for (GC)Parent and CCG Tasks

Hey!

I'm trying to run these tasks on some embeddings, but I've run into two problems with dataset prep:

  1. I'm not sure how I'd go about linearizing the PTB trees for the (GC)Parent tasks. The code depends on wsj.{train,test,dev}.trees. Do I just remove the newlines in the .prd files from the original dataset to get these files (see the sketch below)?
  2. For the CCG tasks, do I just concatenate the .auto files (from the original dataset) using the same PTB splits for train/dev/test to get the ccg/{train,test,dev}.txt files?
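
To clarify what I mean in (1), here's the kind of preprocessing I'm imagining (a hypothetical sketch using NLTK; the paths and file pattern are placeholders):

```python
# Hypothetical sketch: flatten bracketed PTB parse trees to one tree per line.
# Paths and the file pattern are placeholders, not the repo's expected layout.
from nltk.corpus.reader import BracketParseCorpusReader

reader = BracketParseCorpusReader("ptb/wsj/train", r".*\.prd")
with open("wsj.train.trees", "w") as out:
    for tree in reader.parsed_sents():
        # collapse the pretty-printed tree onto a single line
        out.write(" ".join(str(tree).split()) + "\n")
```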

Thanks a ton for this btw!
Omar
