AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work

Under review.

Pritam Sarkar   Aaron Posen   Ali Etemad



We introduce AVCAffe, the first Audio-Visual dataset consisting of Cognitive load and Affect attributes. We record AVCAffe by simulating remote work scenarios over a video-conferencing platform, where subjects collaborate to complete a number of cognitively engaging tasks. AVCAffe is the largest originally collected (not collected from the Internet) affective dataset in the English language. We recruit 106 participants from 18 different countries of origin, spanning an age range of 18 to 57 years, with a balanced male-female ratio. AVCAffe comprises a total of 108 hours of video, equivalent to more than 58,000 clips, along with task-based self-reported ground truth labels for arousal, valence, and cognitive load attributes such as mental demand, temporal demand, effort, and a few others. We believe AVCAffe would be a challenging benchmark for the deep learning research community, given the inherent difficulty of classifying affect and, in particular, cognitive load. Moreover, our dataset fills a timely gap by facilitating the creation of learning systems for better self-management of remote work meetings, and by enabling further study of hypotheses regarding the impact of remote work on cognitive load and affective states.

AVCAffe Statistics

| Attribute | Details |
|---|---|
| Total participants | 106 |
| Gender | Male: 52 (49%); Female: 53 (50%); Non-Binary: 1 (1%) |
| Age (years) | 18 to 20: 8; 21 to 30: 75; 31 to 40: 17; 41 to 50: 2; 51 to 60: 4 |
| Countries of origin | Bangladesh, Brazil, Canada, China, Ecuador, Egypt, Germany, Hong Kong, India, Iran, Ireland, Jordan, Mexico, Nigeria, Pakistan, Sweden, USA, Vietnam |
| Total duration | 108 hours |
| Total clips | 58,118 |
| Modalities | Audio, Video |
| Ground truths | Arousal, Valence, Mental Demand, Temporal Demand, Effort, Physical Demand, Frustration, Performance |
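
As a quick sanity check on the figures above, the short Python sketch below recomputes the gender shares, the age-bucket total, and a rough mean clip length. The numbers are copied from the statistics table; the function and variable names are illustrative only, and the mean clip length assumes that the 108 hours is simply the summed duration of all 58,118 clips, which this page does not state explicitly.

```python
# Figures copied from the AVCAffe statistics table.
TOTAL_PARTICIPANTS = 106
GENDER_COUNTS = {"Male": 52, "Female": 53, "Non-Binary": 1}
AGE_COUNTS = {
    "18 to 20": 8,
    "21 to 30": 75,
    "31 to 40": 17,
    "41 to 50": 2,
    "51 to 60": 4,
}
TOTAL_HOURS = 108
TOTAL_CLIPS = 58118


def percentage_shares(counts):
    """Return each group's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def mean_clip_seconds(total_hours, total_clips):
    """Rough mean clip length in seconds.

    Assumption: the total duration equals the sum of all clip durations.
    """
    return total_hours * 3600 / total_clips


gender_shares = percentage_shares(GENDER_COUNTS)
age_total = sum(AGE_COUNTS.values())
avg_clip = mean_clip_seconds(TOTAL_HOURS, TOTAL_CLIPS)

if __name__ == "__main__":
    for name, share in gender_shares.items():
        print(f"{name}: {share:.2f}%")
    print(f"Participants across age buckets: {age_total}")
    print(f"Approximate mean clip length: {avg_clip:.2f} s")
```

Under that assumption, the mean clip length comes out at a little under 7 seconds, and the age buckets sum to the stated 106 participants.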

Request for Access and Download

This dataset is freely available for non-commercial research purposes only. Please see the download and installation instructions in README.md.


Please cite our paper using the given BibTeX entry.

title={AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work},
author={Pritam Sarkar and Aaron Posen and Ali Etemad},


We are grateful to Bank of Montreal and Mitacs for funding this research. We are also thankful to SciNet HPC Consortium for providing computational resources. We thank Shuvendu Roy, Dept. of Electrical and Computer Engineering, Queen's University, for his collaboration during this study. We would further like to thank Prof. Kevin Munhall, Dept. of Psychology, Queen's University, for valuable discussions at the study design stage.


You may directly contact me at pritam.sarkar@queensu.ca or connect with me on LinkedIn.