Facial Emotion Recognition with Inter-Modality-Attention-Transformer-Based Self-Supervised Learning

Chaudhari, Aayushi; Bhatt, Chintan; Adiraju, Achyut Krishna Sai; Krishna, Achyut; Travieso-González, Carlos M.

Published in

MDPI, Electronics, 12(2), p. 288, 2023

DOI: 10.3390/electronics12020288

Journal article published in 2023 by Aayushi Chaudhari, Chintan Bhatt, Achyut Krishna Sai Adiraju, Achyut Krishna, Carlos M. Travieso-González

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

Emotion recognition is a challenging research field because people express cognitive–emotional cues in many different ways, including language, expressions, and speech. Video input provides a wealth of data for analyzing human emotions. In this research, we combine the text, audio (speech), and visual modalities using features derived from separately pretrained self-supervised learning models. Fusing these features and representations is the biggest challenge in multimodal emotion classification. Because self-supervised learning features are high-dimensional, we present a novel transformer- and attention-based fusion method for incorporating multimodal self-supervised learning features, which achieved an accuracy of 86.40% for multimodal emotion classification.
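The abstract does not spell out the fusion architecture, so the following is only a minimal sketch of the general idea of inter-modality (cross-modal) attention, not the authors' model. It assumes each modality has already been encoded into a sequence of feature vectors by its own pretrained model. The function names, the feature dimensions, and the random projection matrices (stand-ins for learned weights) are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Scaled dot-product attention with queries from one modality
    and keys/values from another.

    query_feats:   (Tq, Dq) features of the attending modality
    context_feats: (Tc, Dc) features of the attended modality
    Returns the attended features (Tq, d) and the weights (Tq, Tc).
    """
    q = query_feats @ w_q
    k = context_feats @ w_k
    v = context_feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


def fuse(text, audio, video, d_model=32, seed=0):
    """Attend every modality to every other one, mean-pool over time,
    and concatenate the six pooled vectors into one fused vector.

    The projections are random here purely for illustration; in a real
    model they would be trained parameters.
    """
    rng = np.random.default_rng(seed)
    feats = {"text": text, "audio": audio, "video": video}
    pooled = []
    for q_name, q_feats in feats.items():
        for c_name, c_feats in feats.items():
            if q_name == c_name:
                continue
            dq, dc = q_feats.shape[1], c_feats.shape[1]
            w_q = rng.normal(scale=dq ** -0.5, size=(dq, d_model))
            w_k = rng.normal(scale=dc ** -0.5, size=(dc, d_model))
            w_v = rng.normal(scale=dc ** -0.5, size=(dc, d_model))
            attended, _ = cross_attention(q_feats, c_feats, w_q, w_k, w_v)
            pooled.append(attended.mean(axis=0))
    return np.concatenate(pooled)
```

In a trained system the fused vector would feed a classification head over the emotion labels. Projecting each modality to a shared, smaller `d_model` before attention is one common way to cope with high-dimensional self-supervised features.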
