RECOGNITION OF AUDIO-VISUAL EMOTIONS USING VIDEO CLIPS

Main Author: Pragya Singh Tomar* & Brahma Datta Shukla
Format: Article Journal
Published: 2018
Subjects: Multimodal Emotion Recognition; Classifier Fusion; Data Fusion; Convolutional Neural Networks
Online Access: https://zenodo.org/record/5184752
ctrlnum 5184752
fullrecord <?xml version="1.0"?> <dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><creator>Pragya Singh Tomar* &amp; Brahma Datta Shukla</creator><date>2018-05-31</date><description>This paper describes a multimodal emotion recognition system that uses audio and visual inputs. Mel-Frequency Cepstral Coefficients, Filter Bank Energies, and prosodic features are extracted from the audio channel. Two techniques are investigated for the visual channel. First, geometric relationships between facial landmarks, such as distances and angles, are calculated. Second, each emotional video is condensed into a smaller set of key-frames that can be used to visually distinguish between emotions. To do so, the key-frame summaries of the videos are fed into a convolutional neural network. Finally, in a late fusion/stacking approach, the confidence outputs of all classifiers from all modalities are used to build a new feature space, on which a final classifier is trained to predict the emotion label. 
Experiments on the SAVEE, eNTERFACE'05, and RML databases show that the proposed solution performs significantly better than existing methods, setting the state of the art on all three databases.</description><identifier>https://zenodo.org/record/5184752</identifier><identifier>10.5281/zenodo.5184752</identifier><identifier>oai:zenodo.org:5184752</identifier><relation>doi:10.5281/zenodo.5184751</relation><rights>info:eu-repo/semantics/openAccess</rights><rights>https://creativecommons.org/licenses/by/4.0/legalcode</rights><source>INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES &amp; RESEARCH TECHNOLOGY 7(5) 582-584</source><subject>Multimodal Emotion Recognition, Classifier Fusion, Data Fusion, Convolutional Neural Networks.</subject><title>RECOGNITION OF AUDIO-VISUAL EMOTIONS USING VIDEO CLIPS</title><type>Journal:Article</type><type>Journal:Article</type><recordID>5184752</recordID></dc>
format Journal:Article
Journal
Journal:Journal
author Pragya Singh Tomar* & Brahma Datta Shukla
title RECOGNITION OF AUDIO-VISUAL EMOTIONS USING VIDEO CLIPS
publishDate 2018
topic Multimodal Emotion Recognition
Classifier Fusion
Data Fusion
Convolutional Neural Networks
url https://zenodo.org/record/5184752
contents This paper describes a multimodal emotion recognition system that uses audio and visual inputs. Mel-Frequency Cepstral Coefficients, Filter Bank Energies, and prosodic features are extracted from the audio channel. Two techniques are investigated for the visual channel. First, geometric relationships between facial landmarks, such as distances and angles, are calculated. Second, each emotional video is condensed into a smaller set of key-frames that can be used to visually distinguish between emotions. To do so, the key-frame summaries of the videos are fed into a convolutional neural network. Finally, in a late fusion/stacking approach, the confidence outputs of all classifiers from all modalities are used to build a new feature space, on which a final classifier is trained to predict the emotion label. Experiments on the SAVEE, eNTERFACE'05, and RML databases show that the proposed solution performs significantly better than existing methods, setting the state of the art on all three databases.
id IOS16997.5184752
institution ZAIN Publications
institution_id 7213
institution_type library:special
library Cognizance Journal of Multidisciplinary Studies
library_id 5267
collection Cognizance Journal of Multidisciplinary Studies
repository_id 16997
subject_area Multidisciplinary
city Stockholm
province INTERNATIONAL
shared_to_ipusnas_str 1
repoId IOS16997
first_indexed 2022-06-06T02:32:34Z
last_indexed 2022-06-06T02:32:34Z
recordtype dc
_version_ 1734895149478051840
score 17.610363