xinhen / mfm

This project forked from xmu-xiaoma666/mfm

An official implementation for "Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image Captioning"

Home Page: https://ieeexplore.ieee.org/document/9802801/

License: MIT License

Python 100.00%

mfm's Introduction

Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image Captioning

Motivation

img1

Overview

img2

Environment setup

Please refer to meshed-memory-transformer for environment setup instructions.

Data preparation

  • Annotation. Download the annotation file annotation.zip. Extract it and put it in the project root directory.
  • Feature. You can download our ResNeXt-101 feature (hdf5 file) here. Access code: jcj6.
  • Evaluation. Download the evaluation tools here. Access code: jcj6. Extract them and put them in the project root directory.
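
The extraction step for the annotation archive can be scripted. The following is a minimal sketch, assuming the archive was saved as `annotation.zip` in the directory you run it from; the helper name `extract_archive` is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a zip archive into dest_dir and return the sorted member names."""
    archive = Path(archive_path)
    if not archive.is_file():
        raise FileNotFoundError(f"archive not found: {archive}")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())
        zf.extractall(dest)
    return names


# Example call (assumed file name; adjust to where the download was saved):
#   extract_archive("annotation.zip", ".")
```

Returning the member names makes it easy to confirm what landed in the project root before starting a training run.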

Training

### CIDEr-Based MFM Training on Region Feature

python region_cmfm.py --exp_name Region_CMFM --batch_size 50 --rl_batch_size 100 --workers 4 --head 8 --warmup 10000 --features_path your_region_feature_path --annotation /home/data/m2_annotations --logs_folder tensorboard_logs


### ECIDEr-Based MFM Training on Region Feature

python region_emfm.py --exp_name Region_EMFM --batch_size 50 --rl_batch_size 100 --workers 4 --head 8 --warmup 10000 --features_path your_region_feature_path --annotation /home/data/m2_annotations --logs_folder tensorboard_logs


### CIDEr-Based MFM Training on Grid Feature

python grid_cmfm.py --exp_name Grid_CMFM --batch_size 50 --rl_batch_size 100 --workers 4 --head 8 --warmup 10000 --features_path your_grid_feature_path --annotation /home/data/m2_annotations --logs_folder tensorboard_logs


### ECIDEr-Based MFM Training on Grid Feature

python grid_emfm.py --exp_name Grid_EMFM --batch_size 50 --rl_batch_size 100 --workers 4 --head 8 --warmup 10000 --features_path your_grid_feature_path --annotation /home/data/m2_annotations --logs_folder tensorboard_logs
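
The four training commands differ only in the script name, the experiment name, and the feature path. The sketch below builds the argument list for any of the four variants; the script names, experiment names, and flag values are copied from the commands above, while `VARIANTS` and `build_train_command` are hypothetical names introduced for illustration and are not part of this repository.

```python
import shlex

# Script and experiment names are taken from the four commands above.
VARIANTS = {
    ("region", "cider"): ("region_cmfm.py", "Region_CMFM"),
    ("region", "ecider"): ("region_emfm.py", "Region_EMFM"),
    ("grid", "cider"): ("grid_cmfm.py", "Grid_CMFM"),
    ("grid", "ecider"): ("grid_emfm.py", "Grid_EMFM"),
}


def build_train_command(feature, metric, features_path, annotation,
                        logs_folder="tensorboard_logs"):
    """Return the argv list for one of the four training variants."""
    try:
        script, exp_name = VARIANTS[(feature, metric)]
    except KeyError:
        raise ValueError(f"unknown variant: {feature!r}, {metric!r}") from None
    return [
        "python", script,
        "--exp_name", exp_name,
        "--batch_size", "50",
        "--rl_batch_size", "100",
        "--workers", "4",
        "--head", "8",
        "--warmup", "10000",
        "--features_path", features_path,
        "--annotation", annotation,
        "--logs_folder", logs_folder,
    ]


cmd = build_train_command("grid", "ecider",
                          "your_grid_feature_path", "/home/data/m2_annotations")
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the project root, which avoids shell quoting issues when a path contains spaces.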

Evaluation

python eval.py --batch_size 50 --exp_name MFM --features_path your_feature_path --annotation /home/data/m2_annotations
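
Before launching evaluation, it can help to confirm that the two path arguments point at something real. This is a minimal sketch, assuming `--annotation` refers to a directory (as the example path `/home/data/m2_annotations` suggests) and `--features_path` to an existing file or folder; `missing_inputs` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def missing_inputs(features_path, annotation):
    """Return a list of problems found; the list is empty if both paths look usable."""
    problems = []
    if not Path(features_path).exists():
        problems.append(f"features_path does not exist: {features_path}")
    # Assumption: the annotation argument is a directory of annotation files.
    if not Path(annotation).is_dir():
        problems.append(f"annotation is not a directory: {annotation}")
    return problems


for problem in missing_inputs("your_feature_path", "/home/data/m2_annotations"):
    print(problem)
```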

Citation

@ARTICLE{ji2022koniwing,
    author={Ji, Jiayi and Ma, Yiwei and Sun, Xiaoshuai and Zhou, Yiyi and Wu, Yongjian and Ji, Rongrong},
    journal={IEEE Transactions on Image Processing},
    title={Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image Captioning},
    year={2022},
    volume={31},
    number={},
    pages={4321-4335},
    doi={10.1109/TIP.2022.3183434}
}
