Giter Site home page Giter Site logo

coranholmes / swinbert Goto Github PK

View Code? Open in Web Editor NEW

This project forked from microsoft/swinbert

0.0 0.0 2.0 20.6 MB

Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"

Home Page: https://arxiv.org/abs/2111.13196

License: MIT License

Shell 0.18% Python 99.82%

swinbert's Introduction

Dense caption generation using SwinBERT

The code provided here was used in the TEVAD paper to generate text features. Please note that these codes are not actively maintained and should be used at your own risk. For instructions on setting up the environment, please refer to the README_orig.md file.

Requirements

pre-requisites:

sudo apt-get install libopenmpi-dev
pip install -r requirements.txt

To install apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir ./

Download

Refer to the Download section in the original README_orig.md. For pretrained model, I am currently using VATEX only.

Dense caption generation

Take the ucf-crime dataset as an example, set the paths in the below command and run accordingly

python ./src/tasks/dense_caption_mass.py \
--resume_checkpoint path/to/SwinBERT/models/table1/vatex/best-checkpoint/model.bin \
--eval_model_dir path/to/SwinBERT/models/table1/vatex/best-checkpoint/ \
--dataset_path path/to/SwinBERT/datasets/Crime/data/ \
--caption_file path/to/SwinBERT/datasets/Crime/RTFM_train_caption/vatex_all_captions.txt \
--file_type video \
--file_format mp4 \
--do_lower_case \
--dense_caption \
--do_test

Generate sentence embeddings based on captions

Run

python src/tasks/generate_caption_se.py --dataset ucf \
--caption_path path/to/SwinBERT/datasets/Crime/RTFM_train_caption/vatex_all_captions.txt \
--output_path path/to/SwinBERT/datasets/Crime/RTFM_train_caption/sent_emb_n/

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.