

Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis (VLP-MABSA)

Code and datasets for our ACL 2022 paper: Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis

Author

Yan Ling

[email protected]

Data Processing

The pre-training dataset we use is MVSA-Multi, which you can obtain from its repository. First, you need to apply the judgment rules provided there to remove the samples with inconsistent labels.

Text Processing

For the texts in the MVSA-Multi dataset, we first use NLTK to perform tokenization.
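A minimal sketch of this step, assuming the raw tweets are stored one per line in a plain-text file (the file name is an assumption):

# Minimal tokenization sketch; the input file name is an assumption.
import nltk

nltk.download("punkt")  # tokenizer models used by word_tokenize

with open("mvsa_texts.txt", encoding="utf-8") as f:
    tokenized = [nltk.word_tokenize(line.strip()) for line in f]

print(tokenized[0])
# e.g. ['It', 'is', 'unbelievable', '!', 'Stephen', 'Curry', 'won', 'the', 'game', '!']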

  • How to obtain aspects

We use twitter_nlp to perform Named Entity Recognition (NER) in order to find the aspects. For example, given the text "It is unbelievable ! Stephen Curry won the game !", the NER result from twitter_nlp is

It/O is/O unbelievable/O !/O Stephen/B-ENTITY Curry/I-ENTITY won/O the/O game/O !/O

We save the result in a dict with the following format

{"text_id":{"aspect_spans":[list of aspect spans],"aspect_texts":[list of aspect texts]},...}
{"100":{"aspect_spans":[[4,5]],"aspect_texts":[["Stephen","Curry"]},...}
  • How to obtain opinions

We use the sentiment lexicon SentiWordNet to match the opinion words. The lexicon is used as a dictionary: words in the text that belong to the lexicon are considered opinion terms. Using the text above as an example, the word "unbelievable" belongs to the lexicon. We save this information in the same format as the aspect spans.

{"text_id":{"opinion_spans":[list of opinion spans],"opinion_texts":[list of opinion texts]},...}
{"100":{"opinion_spans":[[2,2]],"opinion_texts":[["unbelievable"]},...}

The dicts of aspect spans and opinion spans are used for the AOE pre-training task.

Image Processing

  • How to obtain the features of images

For the images in the MVSA-Multi dataset, we run Faster R-CNN to extract region features (only the 36 regions with the highest confidence are retained) as the input features; the dimension of each region feature is 2048. For details on how to run Faster R-CNN, you can refer to the Faster-RCNN repository.
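A minimal sketch of what the resulting visual input looks like, assuming Faster R-CNN has already produced per-region features and confidence scores for one image (variable names are assumptions):

import numpy as np

all_feats = np.random.rand(100, 2048).astype("float32")  # stand-in for extracted region features
scores = np.random.rand(100)                              # stand-in for detection confidences

order = np.argsort(-scores)[:36]     # keep the 36 regions with the highest confidence
region_feats = all_feats[order]      # shape (36, 2048), used as the visual input feature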

  • How to obtain the ANP of each image

We employ an ANP extractor to predict the ANP distribution of each image. To run the code, you need to provide a list of image paths, e.g.

/home/MVSA/data/2499.jpg
/home/MVSA/data/2500.jpg
...

The result of the ANP extractor is a dict:

{
    "numbers": 100,           # the number of images
    "images": [
        {
            "bi-concepts": {
                "handsome_guy": 0.13,   # probability of each ANP, in descending order
                "cute_boy": 0.08,
                ...
            },
            "features": [
                ...
            ]
        },
        ...
    ]
}

The ANP with the highest probability is chosen as the output text of the AOG pre-training task.
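A hedged sketch of selecting that ANP from the extractor output (the result is assumed to already be loaded as a Python dict like the one above; replacing the underscore with a space is an assumption about how the target text is rendered):

def top_anp(image_entry):
    bi_concepts = image_entry["bi-concepts"]        # ANP -> probability
    best = max(bi_concepts, key=bi_concepts.get)    # e.g. "handsome_guy"
    return best.replace("_", " ")                   # rendered as text for the AOG target

result = {"numbers": 1,
          "images": [{"bi-concepts": {"handsome_guy": 0.13, "cute_boy": 0.08}}]}
print(top_anp(result["images"][0]))   # -> "handsome guy"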

Sentiment Processing

As described in the MVSA repository, many tweets have inconsistent text and image labels. First, you need to apply the judgment rule defined by the authors to remove the inconsistent data. Then we save the sentiment labels in the following format:

{"data_id": sentiment label}
{"13357": 2, "13356": 0,...} # 0,1,2 denote negtive, neutral and positive, respectively

For more details, we provide a description of our pre-training data files in src/data/jsons/MVSA_descriptions.txt, which explains the files defined in src/data/jsons/MVSA.json.

Data Download

Because the processed pre-training dataset is very large, we only provide the downstream datasets. You can download the downstream datasets and our pre-trained model via Baidu Netdisk (code: d0tn) or Google Drive.

Pre-Training

If you have done all the processing above, you can perform pre-training by running:

sh MVSA_pretrain.sh

Downstream Task Training

To train the downstream JMASA task on the two Twitter datasets, you can simply run the following commands. Note that you need to change all the file paths in src/data/jsons/twitter15_info.json and src/data/jsons/twitter17_info.json to your own paths (a hedged sketch for doing this follows the commands below).

sh 15_pretrain_full.sh
sh 17_pretrain_full.sh
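
As noted above, the file paths inside the two info JSON files must point to your own data. A hedged sketch of rewriting them (no key names are assumed; every string value that starts with the old prefix is rewritten):

import json

def rewrite_paths(info_json, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in every string value of the JSON file."""
    def fix(value):
        if isinstance(value, str) and value.startswith(old_prefix):
            return new_prefix + value[len(old_prefix):]
        if isinstance(value, dict):
            return {k: fix(v) for k, v in value.items()}
        if isinstance(value, list):
            return [fix(v) for v in value]
        return value

    with open(info_json, encoding="utf-8") as f:
        info = json.load(f)
    with open(info_json, "w", encoding="utf-8") as f:
        json.dump(fix(info), f, indent=2)

rewrite_paths("src/data/jsons/twitter15_info.json", "/old/data/root", "/your/data/root")
rewrite_paths("src/data/jsons/twitter17_info.json", "/old/data/root", "/your/data/root")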

The following describes some of the parameters used in the above shell scripts.

--dataset           includes the dataset name and the path of the info JSON file
--checkpoint_dir    path to save your trained model
--log_dir           path to save the training log
--checkpoint        path of the pre-trained model

We also provide our training logs on the two datasets in the folder ./log.

Acknowledgements
