woshiawang
Type: User
First experiment
Coarse-to-Fine Reasoning for Visual Question Answering
This is the official implementation of the Video Dialog as Conversation about Objects Living in Space-Time paper
An elegant dashboard
Code and released pre-trained model for our ACL 2022 paper: "DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation"
DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog
We ranked 1st in the DSTC8 Audio-Visual Scene-Aware Dialog competition. This is the source code for our IEEE/ACM TASLP (AAAI2020-DSTC8-AVSD) paper "Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog".
evalai-test
Distillation from Heterogeneous Models for Top-K Recommendation (WWW'23)
A Toolbox for MultiModal Recommendation. Integrating 10+ Models...
[WWW'2023] Multi-Modal Self-Supervised Learning for Recommendation
A curated list of awesome resources about multimodal recommender systems.
OVSegmentor, CVPR23
🍿Positioning tooltips and popovers is difficult. Popper is here to help!
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
Code for "State Graph Reasoning for Multimodal Conversational Recommendation"
Code for Paper "Selecting Stickers in Open-Domain Dialogue through Multitask Learning"
Spatio-Temporal Two-Stage Fusion
The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog"
Source code for paper "VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution"
Multi-model video-to-text by combining embeddings from Flan-T5 + CLIP + Whisper + SceneGraph. The 'backbone LLM' is pre-trained from scratch on YouTube (YT-1B dataset).
Implementation for "Large-scale Pretraining for Visual Dialog" https://arxiv.org/abs/1912.02379