[AAAI 2023] AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work

Home Page: https://www.pritamsarkar.com/

License: Other

avcaffe's Introduction

AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work

AAAI 2023

This is the official code repository of AVCAffe. Please check the project website: https://pritamqu.github.io/AVCAffe/ for additional details. Before you download and/or use the AVCAffe Dataset, please make sure you have read the Database License Agreement available here: DLA.

Updates

  • Paper
  • Database license agreement
  • Instruction to download
  • Dataloader code
  • Database release
    • Audio-visual recordings, ground truths, and additional meta data
    • Face-crops (released on 20 July 2023; please see the note below if you already downloaded the old version)

Overview

AVCAffe is hosted on borealisdata.ca under the Queen's University Dataverse. Please follow the instructions below to download the dataset.

The directory structure of the dataset is as follows.

    ├── ..                              # total size = 34.7 GB
    ├── ground_truths                   # size = 192 KB
    ├── info                            # size = 552 KB
    ├── videos                          
    │   ├── per_participant_per_task    # size = 13.9 GB; 950 files
    │   │   ├── aiim001
    │   │   ├── aiim002
    │   │   ├── ...
    │   │   ├── ...
    │   │   └── aiim108
    │   └── shorter_segments            # size = 20.3 GB; 58112 files
    │       ├── aiim001 
    │       ├── aiim002
    │       ├── ...
    │       ├── ...
    │       └── aiim108
    └── face_crops                       
        └── shorter_segments_face    # size = 41.7 GB
           ├── aiim001_task_1.zip
           ├── aiim001_task_2.zip
           ├── ...
           ├── ...
           └── aiim108_task_9.zip

  • ground_truths contains the self-reported ground truths for affect and cognitive load.
  • info contains additional metadata, e.g., the train-val split, pre-study responses, etc. Please find details below.
  • videos/per_participant_per_task contains the full-length videos of each participant for each task. Video length is 2.5-10 minutes, resolution is 640x360 pixels, and the format is .mp4.
  • videos/shorter_segments contains segmented clips of the same videos as in per_participant_per_task. Each clip is approximately 6 seconds long, resized so that the shorter side is 256 pixels, in .avi format. Note that the shorter clips are prepared so that they are easy and efficient to use for training deep learning models.
  • face_crops/shorter_segments_face contains the face crops of the participants. To train the baseline models we use just the face crops instead of the full frames, which works better (at least for simple models).
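After downloading, it can be useful to confirm that the folders listed above are present and roughly populated. The following is a minimal sketch, not part of the official repository: the helper name summarize_layout and the constant EXPECTED_DIRS are hypothetical, and the directory names are taken from the tree above.

```python
from pathlib import Path

# Sub-directories named in the tree above, relative to the dataset root.
EXPECTED_DIRS = (
    "ground_truths",
    "info",
    "videos/per_participant_per_task",
    "videos/shorter_segments",
    "face_crops/shorter_segments_face",
)


def summarize_layout(root):
    """Return {relative_dir: file_count}; the count is None if the directory is missing."""
    root = Path(root)
    summary = {}
    for rel in EXPECTED_DIRS:
        folder = root / rel
        if folder.is_dir():
            # Count regular files at any depth below this folder.
            summary[rel] = sum(1 for p in folder.rglob("*") if p.is_file())
        else:
            summary[rel] = None
    return summary
```

Calling summarize_layout on the dataset root and comparing the counts with the file counts quoted in the tree gives a quick check for an incomplete download.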

How to Request Access?

Step 1: To access this dataset you must have an account in the Dataverse using an institutional email address. Please go to this link to create an account: https://borealisdata.ca/loginpage.xhtml.

Step 2: Once you have an account and are signed in, open this link in the same browser: https://borealisdata.ca/dataverseuser.xhtml?selectTab=apiTokenTab. It will show your API Token. Alternatively, you can click on the drop-down list shown under your name (top right corner) and click API Token. Please note down the API Token, as it will be required in a later step.

Step 3: Once the account is created, please go to this link: https://borealisdata.ca/dataset.xhtml?persistentId=doi:10.5683/SP3/PSWY62. Select all 59960 files in this dataset and click Request Access. Please read the Terms of Use and Terms of Access, and accept them to submit the request. Please see the screenshot below. This screenshot was taken before the face crops were uploaded; the total number of files and the total size have since changed.

[Screenshot: step 3]

Step 4: Your request to access this dataset is now submitted. It may take 5-7 days to approve the request. If your access is not granted by that time, please write us an email at [email protected] and/or [email protected]. Once your request is approved, please proceed to the next step.

How to Download?

Step 1: We provide a script to download the dataset. Create a directory where you want to download the dataset and go to that directory, for example by running mkdir avcaffe and then cd avcaffe from a terminal.

Step 2: Please run the following commands to clone the GitHub repository into your current location.

git clone https://github.com/pritamqu/AVCAffe.git
cd AVCAffe

Step 3: Please open the codes/downloader/downloader.py using a text editor. Please update the API_TOKEN variable with your API Token noted earlier.

Step 4: Next, please ensure you have the required packages installed or you can install them by running

pip install -r codes/downloader/requirements.txt

Step 5: You can download the entire dataset by simply running the following command:

python codes/downloader/downloader.py

Step 6: Congratulations! The dataset is downloaded.
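For readers curious about what a token-authenticated download with retries might look like, here is a minimal sketch. It is not the code in codes/downloader/downloader.py. It assumes the standard Dataverse Data Access API (GET /api/access/datafile/<id> with the X-Dataverse-key header); the function names build_file_request, with_retries, and download_file are hypothetical, and the numeric file IDs would have to come from the dataset listing.

```python
import shutil
import time
import urllib.request

BASE_URL = "https://borealisdata.ca"  # host named in the steps above


def build_file_request(file_id, api_token, base_url=BASE_URL):
    """Build (but do not send) a request for one Dataverse data file."""
    url = f"{base_url}/api/access/datafile/{file_id}"
    return urllib.request.Request(url, headers={"X-Dataverse-key": api_token})


def with_retries(action, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call action(); on OSError wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return action()
        except OSError:  # URLError and socket timeouts are OSError subclasses
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def download_file(file_id, api_token, destination, timeout=60):
    """Stream one file to `destination`, retrying on network errors."""
    request = build_file_request(file_id, api_token)

    def fetch():
        with urllib.request.urlopen(request, timeout=timeout) as response, \
                open(destination, "wb") as handle:
            shutil.copyfileobj(response, handle)
        return destination

    return with_retries(fetch)
```

The explicit timeout and the exponential back-off are design choices aimed at long downloads of many files, where a single stalled request would otherwise block the whole run.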

Updates on 20 July 2023

We have now released the face crops. If you already downloaded this dataset before 20 July 2023, you can download just the face crops by following the instructions given here. If you encounter an error, please create an issue and I will look into it.

Dataloader

We provide supporting code for easy access to the dataset. A dataloader written in PyTorch is available in avcaffe/codes/dataloader/. A minimal usage example is shown below, and detailed usage is presented here:

db = AVCAffe(ROOT,
             subset='train',
             return_video=True,
             video_clip_duration=2,
             video_fps=16.,
             return_audio=True,
             audio_clip_duration=2,
             audio_fps=16000,
             return_labels=True,
             class_name='mental_demand',
             mode='clip',
             clips_per_video=1,
             )

sample = db.__getitem__(1)
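To make the duration and rate arguments concrete, the sketch below works out how many video frames and audio samples a clip of the requested length contains, and how a random start time could be chosen inside a roughly 6-second segment. This is only an illustration of the arithmetic: the helper names clip_lengths and random_clip_start are hypothetical, and the actual sampling logic of the dataloader may differ.

```python
import random


def clip_lengths(video_clip_duration, video_fps, audio_clip_duration, audio_fps):
    """Return (video frames, audio samples) for the requested clip durations."""
    frames = int(round(video_clip_duration * video_fps))
    samples = int(round(audio_clip_duration * audio_fps))
    return frames, samples


def random_clip_start(total_duration, clip_duration, rng=random):
    """Pick a start time in seconds so that the clip fits inside the segment."""
    latest_start = max(0.0, total_duration - clip_duration)
    return rng.uniform(0.0, latest_start)


# Values from the example call above: 2 s of video at 16 fps, 2 s of audio at 16 kHz.
frames, samples = clip_lengths(2, 16.0, 2, 16000)
print(frames, samples)  # 32 32000
```

With these settings, each sample would therefore hold 32 video frames and 32000 audio samples, assuming the loader returns exactly the requested duration.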

Additional IMPORTANT Details

We list some of the additional details here:

  • The list of participants who agreed to the use of their faces/images/videos in articles or accompanying media content is available in info/public_face_ids.txt.
  • The participants used for the training and validation splits are listed in info/train.txt and info/val.txt, respectively.
  • The outcomes of the pre-study questionnaire are available in info/prestudy_response.csv.
  • Some of the clips in shorter_segments contain no speech, because the participant was listening to the other participant, thinking, or trying to solve the task. The file IDs of such clips are available in info/no_audio_files.txt.
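The text files above can be combined to select, for example, only the clips that contain speech. The following is a rough sketch under two assumptions that are not confirmed here: each .txt file holds one ID per line, and the IDs in info/no_audio_files.txt use the same form as the clip IDs you pass in. The helper names read_ids, load_split, and clips_with_speech are hypothetical.

```python
from pathlib import Path


def read_ids(path):
    """Return the non-empty, stripped lines of a text file as a set."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def load_split(info_dir):
    """Read the participant IDs of the train and val splits (assumed one per line)."""
    info_dir = Path(info_dir)
    return {
        "train": read_ids(info_dir / "train.txt"),
        "val": read_ids(info_dir / "val.txt"),
    }


def clips_with_speech(clip_ids, no_audio_ids):
    """Keep the clip IDs that are not listed as having no speech, preserving order."""
    return [clip_id for clip_id in clip_ids if clip_id not in no_audio_ids]
```

A typical use would be to call read_ids on info/no_audio_files.txt once and pass the resulting set to clips_with_speech when building an audio-only training list.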

Citation

If you find this repository useful, please consider giving a star ⭐ and citation using the given BibTeX entry:

@misc{sarkar2022avcaffe,
title={AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work},
author={Pritam Sarkar and Aaron Posen and Ali Etemad},
year={2022},
eprint={2205.06887},
archivePrefix={arXiv},
primaryClass={cs.HC}} 

Question

You may directly contact me at [email protected] or connect with me on LinkedIn.

avcaffe's Issues

AttributeError in dataloader

Hi, I face AttributeError running this line of code. Though the error is handled, the data loader seems not to read data properly on my laptop.

np_snd = audio_resampler.resample(frame).to_ndarray()

audio_resampler.resample(frame) returns a list containing one audio frame, and a list does not provide a to_ndarray() method. Changing the line to np_snd = audio_resampler.resample(frame)[0].to_ndarray() solves the issue on my laptop. I am unsure whether anyone else faces the same issue, but I am reporting it here for reference.
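A version-tolerant variant of this workaround could normalize whatever resample returns before calling to_ndarray(). The sketch below assumes, as this report suggests, that depending on the installed PyAV version the call may return a single frame, a list of frames, or None; the helper name resampled_frames is hypothetical and is not part of the repository or of PyAV.

```python
def resampled_frames(result):
    """Normalize the return value of a resample call to a list of frames."""
    if result is None:
        return []
    if isinstance(result, (list, tuple)):
        return list(result)
    return [result]


# Hypothetical usage inside the audio-loading loop:
# for out_frame in resampled_frames(audio_resampler.resample(frame)):
#     chunks.append(out_frame.to_ndarray())
```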

dataset please!

Dear all:
Thanks for publishing such insightful work. I am looking forward to downloading the dataset and reproducing your work. Please release the dataset as soon as possible.

The dataset download is slow and often gets stuck

Hi,

When I use the official code to download all 1918 files, request.get() often gets stuck, probably because of too many frequent requests from the same IP address.

Could you package the whole dataset to a few zip files, so that we can download them manually?

Thanks very much!
