Comments (2)
Q1: We follow the original AVA paper, which chooses 0.4. See https://arxiv.org/pdf/1901.01342.pdf, page 7. A similar setting can be found in this paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Alcazar_Active_Speakers_in_Context_CVPR_2020_paper.pdf, page 5, where they choose 1.0.
Q2: As mentioned in Q1, many existing works already use the 3 losses together. Adding this to the figure would make it harder to understand, so we left it out.
In my experiments, this setting improves performance by a small margin (about 0.5% mAP).
from talknet-asd.
Also, for prediction, our work uses outsAV only. As far as I remember, the three losses are used only for training.
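To make the two replies above concrete, here is a minimal sketch of how three losses might be combined during training while only the audio-visual output is used at prediction time. It assumes that the 0.4 from Q1 is the weight applied to the two auxiliary losses; the replies do not state this explicitly. The function names, the dictionary keys other than `outsAV`, and the exact form of the sum are hypothetical and not taken from the repository.

```python
def combine_losses(loss_av, loss_a, loss_v, aux_weight=0.4):
    """Weighted sum of three training losses (assumed form, not the repo's code).

    loss_av: loss on the audio-visual output
    loss_a, loss_v: auxiliary losses, assumed here to be audio-only and visual-only
    aux_weight: assumed meaning of the 0.4 mentioned in Q1
    """
    return loss_av + aux_weight * (loss_a + loss_v)


def select_prediction_output(outputs):
    """At prediction time, keep only the audio-visual output.

    `outputs` is a hypothetical dict; only the name "outsAV" comes from the thread.
    """
    return outputs["outsAV"]


# Training step: all three losses contribute to the total.
total = combine_losses(loss_av=1.0, loss_a=0.5, loss_v=0.5)

# Prediction step: the auxiliary outputs are ignored.
scores = select_prediction_output(
    {"outsAV": [0.9, 0.1], "outsA": [0.6, 0.4], "outsV": [0.7, 0.3]}
)
```

Setting `aux_weight=1.0` would correspond to the alternative value cited from the second paper in Q1, under the same assumption about what the weight controls.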
Related Issues (20)
- Bug in audio loss computation?
- Question about window length and hop size for spectrogram
- How long does it take to train from scratch on Talkset and AVA datasets?
- Minimum length of the audio and video feature
- Question about repeated calls to the model by using same duration multiple times in parameter durationset
- About ColumbiaASD dataset
- Identifying speaker change positions
- No video attached to the video_out.avi
- Demo with Visualization
- Is it possible to run this on CPU only, without CUDA?
- Can I use a lower FPS to get things done?
- How to annotate the AVA dataset?
- Can I change the ffmpeg commands to OpenCV?
- Question about porting the code to Windows
- Auto Cropping using TalkNet (like Opus.pro)
- Update to PySceneDetect 0.6 - change from VideoManager to open_video
- About the calculation of the speaker probability
- Question about video FPS
- Extract face region, timestamp of each unique face appearance, and active speaker or not from a video, in JSON or any format
- About the ablation experiments