nelson-liu / contextual-repr-analysis

A toolkit for evaluating the linguistic knowledge and transferability of contextual representations. Code for "Linguistic Knowledge and Transferability of Contextual Representations" (NAACL 2019).

Home Page: http://nelsonliu.me/papers/liu+gardner+belinkov+peters+smith.naacl2019.pdf

Languages: Dockerfile 0.05%, Shell 5.33%, Python 75.55%, Perl 18.02%, C 1.05%

Contributors: nelson-liu

contextual-repr-analysis's Issues

How to get the hdf5 file?

How do I obtain contextualizers/elmo_original/ewt_pos.hdf5 (referenced in ewt_pos_tagging.json) so I can run the example? What should I run in Step 1 (Precomputing the Word Representations)?
Thanks!

Errors during installation

Hi Liu,

When I run `pip install -r requirements.txt` in step 5 of the installation, I get the following errors/warnings:

```
WARNING: Built wheel for allennlp is invalid: Metadata 1.2 mandates PEP 440 version, but '0.7.2-unreleased' is not

DEPRECATION: allennlp was installed using the legacy 'setup.py install' method, because a wheel could not be built for it. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.
```
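
For reference, pip's complaint is about PEP 440 compliance; a quick check with the `packaging` library (assuming it is available) reproduces the rejection of this version string:

```python
# Illustrative check (assuming the `packaging` library is installed) showing
# why pip rejects the version: '0.7.2-unreleased' is not PEP 440 compliant.
from packaging.version import Version, InvalidVersion

try:
    Version("0.7.2-unreleased")
except InvalidVersion as err:
    print(err)  # Invalid version: '0.7.2-unreleased'
```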

Any idea how to fix this?

torch shape error when training probing models

I'm trying to train tagging probes using precomputed representations from the encoder of my Transformer-based machine translation model, but am getting the following error when training attempts to begin:

```
2020-07-29 05:23:27,936 - INFO - allennlp.training.trainer - Training
...
  File "/raj-learn/contextual-repr-analysis/contexteval/models/tagger.py", line 340, in forward
    average=self.loss_average)
  File "/raj-learn/envs/contextual_repr_analysis/lib/python3.6/site-packages/allennlp/nn/util.py", line 633, in sequence_cross_entropy_with_logits
    negative_log_likelihood_flat = - torch.gather(log_probs_flat, dim=1, index=targets_flat)
RuntimeError: invalid argument 2: Input tensor must have same size as output tensor apart from the specified dimension at /pytorch/aten/src/THC/generic/THCTensorScatterGather.cu:29
[INFO/MainProcess] process shutting down
```
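
For what it's worth, the RuntimeError is a shape mismatch inside `torch.gather`: the index tensor must match the input tensor in every dimension except the one being gathered over. A minimal standalone reproduction (my own illustration, not the repo's code):

```python
# Standalone reproduction of the gather shape error (illustrative only).
import torch

log_probs_flat = torch.randn(4, 10)      # (batch * seq_len, num_classes)
targets_flat = torch.zeros(5, 1).long()  # wrong row count: 5 instead of 4
torch.gather(log_probs_flat, dim=1, index=targets_flat)  # raises RuntimeError
```

In the probing setup, this suggests the number of target labels and the number of precomputed representation rows disagree for some batch.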

I've computed these representations and saved them in a correctly formatted HDF5 file using the make_hdf5_file function from the README. I then confirmed that this HDF5 file looks as expected: the keys are string IDs that map to (N_tokens, 1024) numpy arrays, where N_tokens is the length of the sentence and 1024 is my encoder representation dim (which I correctly pass into my config file). sentence_to_index also looks fine, mapping newline-stripped sentences to string IDs.
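
For concreteness, the layout I'm describing looks like the following sketch (my own illustrative helper modeled on the README's make_hdf5_file; names and dimensions are placeholders):

```python
# Illustrative sketch of the HDF5 layout described above: one dataset per
# sentence ID holding a (num_tokens, representation_dim) array, plus a
# "sentence_to_index" dataset holding a JSON mapping. Not the repo's code.
import json
import h5py
import numpy as np

def write_representations(sentence_to_reps, output_path):
    sentence_to_index = {sent: str(i) for i, sent in enumerate(sentence_to_reps)}
    with h5py.File(output_path, "w") as f:
        for sent, idx in sentence_to_index.items():
            f.create_dataset(idx, data=np.asarray(sentence_to_reps[sent]))
        f.create_dataset("sentence_to_index", data=json.dumps(sentence_to_index))
```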

I get this error when training begins with both the CCG supertagging task and the PTB POS tagging task (which both use the WSJ sentences dataset).

Here's my config file for CCG supertagging (modeled on yours for ELMo original):

```json
{
    "dataset_reader": {
        "type": "ccg_supertagging",
        "contextualizer": {
            "type": "precomputed_contextualizer",
            "representations_path": "../data/precomputed_reps/ccg_wsj_sentences_all.hdf5"
        }
    },
    "validation_dataset_reader": {
        "type": "ccg_supertagging",
        "contextualizer": {
            "type": "precomputed_contextualizer",
            "representations_path": "../data/precomputed_reps/ccg_wsj_sentences_all.hdf5"
        }
    },
    "train_data_path": "../data/probing_task_data/ccg/train.txt",
    "validation_data_path": "../data/probing_task_data/ccg/dev.txt",
    "test_data_path": "../data/probing_task_data/ccg/test.txt",
    "evaluate_on_test": true,
    "model": {
        "type": "tagger",
        "token_representation_dim": 1024
    },
    "iterator": {
        "type": "basic",
        "batch_size": 80
    },
    "trainer": {
        "num_epochs": 50,
        "patience": 3,
        "cuda_device": 0,
        "validation_metric": "+accuracy",
        "optimizer": {
            "type": "adam",
            "lr": 0.001
        }
    }
}
```

I'm going to try looking into this error now and will update if I figure out what's going on. I cloned the contextual-repr-analysis repo and installed dependencies from requirements.txt, so I imagine I should have the versions of allennlp and conllu that were used at the time this package was written.

Thanks again for your help with this excellent work!

While Testing: ImportError: No module named 'allennlp'

Hi! First off, I think your paper is awesome. Anyway, onto the issue:

I'm following your 'getting started' guide, and while running `py.test -v` I get this error for every single task (I'm only showing the error for `word_conditional_conllu_pos_tagging_test.py`):

```
_______ ERROR collecting tests/models/word_conditional_majority_tagger/word_conditional_conllu_pos_tagging_test.py _______
ImportError while importing test module '/home/atreyee/contextual-repr-analysis/tests/models/word_conditional_majority_tagger/word_conditional_conllu_pos_tagging_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/models/word_conditional_majority_tagger/word_conditional_conllu_pos_tagging_test.py:3: in <module>
    from contexteval.common.model_test_case import ModelTestCase
contexteval/__init__.py:1: in <module>
    from contexteval.contextualizers import *  # noqa: F401,F403
contexteval/contextualizers/__init__.py:1: in <module>
    from contexteval.contextualizers.contextualizer import Contextualizer
contexteval/contextualizers/contextualizer.py:4: in <module>
    from allennlp.common.registrable import Registrable
E   ImportError: No module named 'allennlp'
```

The tests are run within a conda environment, set up as you recommended, with one minor difference: I used the https:// link instead of the git:// link for cloning allennlp, because I can't access git:// links on my network.

Nonetheless, allennlp is installed and can be accessed by the code: I tested this by printing Registrable from within the above contextualizer.py module, and it successfully prints <class 'allennlp.common.registrable.Registrable'>. So the code can clearly access the module, but while running the tests, it can't...?
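
In case it helps diagnose this, one thing worth checking (my suggestion, not something from the repo docs) is whether py.test runs under the same interpreter as the environment where allennlp is installed:

```python
# Diagnostic sketch: compare where the interpreter, pytest, and allennlp live.
# Running `python -m pytest` instead of `py.test` forces pytest to use the
# interpreter of the active environment.
import sys
import pytest
import allennlp

print(sys.executable)     # the interpreter actually running this code
print(pytest.__file__)    # where pytest resolves from
print(allennlp.__file__)  # where allennlp resolves from
```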

ELMo Transformer model

Hi @nelson-liu,

thanks for sharing the code for the nice paper :)

I would really like to know if you could share your trained ELMo Transformer model (`model.tar.gz`). I would like to integrate it into the flair library to do more experiments with it :)

Thanks in advance,

Stefan

Poor results for OpenAI GPT model (NER)

Hi @nelson-liu,

I just read the appendix of the paper and noticed the poor performance of the GPT model on NER.

I ran some experiments with the pytorch_pretrained_bert implementation (in combination with flair) to get embeddings for each token in a sentence and train a model (see this PR).

The results are pretty good (81.67%) on the CoNLL-2003 dataset.

I found the generate_openai_transformer_embeddings.py script that uses the internal GPT model implementation of allennlp.

Is there any chance you could try using the pytorch_pretrained_bert implementation to retrieve the embeddings? I'm just wondering why there's a difference of ~30% compared to the other models.
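
For concreteness, my extraction looks roughly like this (a sketch from memory of the pytorch_pretrained_bert API; the subword-to-token alignment that flair performs is omitted):

```python
# Rough sketch of per-subword GPT embeddings via pytorch_pretrained_bert.
# API names follow that (now-legacy) library; aligning BPE subwords back to
# the original tokens is omitted for brevity.
import torch
from pytorch_pretrained_bert import OpenAIGPTModel, OpenAIGPTTokenizer

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
model = OpenAIGPTModel.from_pretrained("openai-gpt")
model.eval()

subwords = tokenizer.tokenize("Peter Blackburn BRUSSELS 1996-08-22")
ids = torch.tensor([tokenizer.convert_tokens_to_ids(subwords)])
with torch.no_grad():
    hidden_states = model(ids)  # (1, num_subwords, 768)
```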

Preparing data for (GC)Parent and CCG Tasks

Hey!

I'm trying to run these tasks on some embeddings, but I've run into two problems with dataset prep:

  1. I'm not sure how I'd go about linearizing the PTB trees for the (GC)Parent tasks. The code depends on wsj.{train,test,dev}.trees. Do I just remove the newlines in the .prd files from the original dataset to get these files (see the sketch below)?
  2. For the CCG tasks, do I just concatenate the .auto files (from the original dataset) using the same PTB splits for train/dev/test to get the ccg/{train,test,dev}.txt files?
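
To clarify what I mean in (1), here's the kind of preprocessing I'm imagining (a hypothetical sketch using NLTK; the paths and file pattern are placeholders):

```python
# Hypothetical sketch: flatten bracketed PTB parse trees to one tree per line.
# Paths and the file pattern are placeholders, not the repo's expected layout.
from nltk.corpus.reader import BracketParseCorpusReader

reader = BracketParseCorpusReader("ptb/wsj/train", r".*\.prd")
with open("wsj.train.trees", "w") as out:
    for tree in reader.parsed_sents():
        # collapse the pretty-printed tree onto a single line
        out.write(" ".join(str(tree).split()) + "\n")
```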

Thanks a ton for this btw!
Omar
