yzc526,github

a-pytorch-tutorial-to-image-captioning

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

aoanet

Code for paper "Attention on Attention for Image Captioning". ICCV 2019

arctic-captions

attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".

bottom-up-attention-vqa

An updated PyTorch implementation of hengyuan-hu's version for 'Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering'

butd_model

A PyTorch implementation of "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering" for image captioning.

coco-caption-3

coco-caption2.7

cv-backbones

CV backbones including GhostNet, TinyNet and TNT, developed by Huawei Noah's Ark Lab.

deep-learning-for-image-processing

Deep learning for image processing, including classification, object detection, etc.

deep-tutorials-for-pytorch

In-depth tutorials for implementing deep learning models on your own with PyTorch.

im2txt_demo

im2txt + pretrained models + docker = easy tryout

knowing-when-to-look-adaptive-attention

PyTorch implementation of "Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning"

learning_to_execute

Learning to Execute

lstm

m3ae

[MICCAI-2022] This is the official implementation of Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training.

m3ae_public

Multimodal Masked Autoencoders (M3AE): A JAX/Flax Implementation

mae

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

mae-pytorch

Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

meshed-memory-transformer

Meshed-Memory Transformer for Image Captioning. CVPR 2020

models

Models and examples built with TensorFlow

oscar

Oscar and VinVL

pytorch-beginner

PyTorch tutorial for beginners

pytorch-image-models

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more

r2

[ACL-2021] The official implementation of Cross-modal Memory Networks for Radiology Report Generation.

self-critical.pytorch

Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.

subword-nmt

Unsupervised Word Segmentation for Neural Machine Translation and Text Generation

swin-transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

videocaption

Text summarization (annotation) of video: given an input video, a deep learning network identifies the main meaning the video expresses (input a video, output text describing the video).

