Transfer learning from speech to music: towards language-sensitive emotion recognition models
Main Authors: | Gómez-Cañón, Juan Sebastián; Cano, Estefanía; Herrera, Perfecto; Gómez, Emilia |
---|---|
Format: | Proceeding |
Language: | eng |
Published: | 2020 |
Subjects: | |
Online Access: |
https://zenodo.org/record/4076791 |
ctrlnum |
4076791 |
---|---|
fullrecord |
<?xml version="1.0"?>
<dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><creator>Gómez-Cañón, Juan Sebastián</creator><creator>Cano, Estefanía</creator><creator>Herrera, Perfecto</creator><creator>Gómez, Emilia</creator><date>2020-10-05</date><description>In this study, we address emotion recognition using unsupervised feature learning from speech data, and test its transferability to music. Our approach is to pre-train models using speech in English and Mandarin, and then fine-tune them with excerpts of music labeled with categories of emotion.
Our initial hypothesis is that features automatically learned from speech should be transferable to music. Namely, we expect the intra-linguistic setting (e.g., pre-training on speech in English and fine-tuning on music in English) should result in improved performance over the cross-linguistic setting (e.g., pre-training on speech in English and fine-tuning on music in Mandarin). Our results confirm previous research on cross-domain transferability, and encourage research towards language-sensitive Music Emotion Recognition (MER) models.</description><identifier>https://zenodo.org/record/4076791</identifier><identifier>10.5281/zenodo.4076791</identifier><identifier>oai:zenodo.org:4076791</identifier><language>eng</language><relation>info:eu-repo/grantAgreement/EC/H2020/770376/</relation><relation>doi:10.5281/zenodo.4076790</relation><rights>info:eu-repo/semantics/openAccess</rights><rights>https://creativecommons.org/licenses/by/4.0/legalcode</rights><subject>sparse convolutional autoencoder</subject><subject>speech emotion recognition</subject><subject>music emotion recognition</subject><subject>unsupervised learning</subject><subject>multi-task learning</subject><title>Transfer learning from speech to music: towards language-sensitive emotion recognition models</title><type>Journal:Proceeding</type><type>Journal:Proceeding</type><recordID>4076791</recordID></dc>
|
language |
eng |
format |
Journal:Proceeding Journal |
author |
Gómez-Cañón, Juan Sebastián; Cano, Estefanía; Herrera, Perfecto; Gómez, Emilia |
title |
Transfer learning from speech to music: towards language-sensitive emotion recognition models |
publishDate |
2020 |
topic |
sparse convolutional autoencoder; speech emotion recognition; music emotion recognition; unsupervised learning; multi-task learning |
url |
https://zenodo.org/record/4076791 |
contents |
In this study, we address emotion recognition using unsupervised feature learning from speech data, and test its transferability to music. Our approach is to pre-train models using speech in English and Mandarin, and then fine-tune them with excerpts of music labeled with categories of emotion.
Our initial hypothesis is that features automatically learned from speech should be transferable to music. Namely, we expect the intra-linguistic setting (e.g., pre-training on speech in English and fine-tuning on music in English) should result in improved performance over the cross-linguistic setting (e.g., pre-training on speech in English and fine-tuning on music in Mandarin). Our results confirm previous research on cross-domain transferability, and encourage research towards language-sensitive Music Emotion Recognition (MER) models. |
id |
IOS16997.4076791 |
institution |
DEFAULT |
institution_type |
library:public library |
repoId |
IOS16997 |
first_indexed |
2022-06-06T04:56:03Z |
last_indexed |
2022-06-06T04:56:03Z |
recordtype |
dc |