
dc.contributor.author   Zubiaga Amar, Irune
dc.contributor.author   Justo Blanco, Raquel
dc.contributor.author   De Velasco Vázquez, Mikel
dc.contributor.author   Torres Barañano, María Inés
dc.date.accessioned   2023-01-24T18:10:09Z
dc.date.available   2023-01-24T18:10:09Z
dc.date.issued   2022
dc.identifier.citation   Proceedings of IberSPEECH: 186-190 (2022)   es_ES
dc.identifier.uri   http://hdl.handle.net/10810/59459
dc.description.abstract   Emotion recognition from speech is an active field of study that can help build more natural human-machine interaction systems. Even though advances in deep learning technology have brought improvements to this task, it remains very challenging. For instance, in real-life scenarios, factors such as the tendency toward neutrality or the ambiguous definition of emotion can make labeling difficult, causing the dataset to be severely imbalanced and not very representative. In this work we considered a real-life scenario to carry out a series of emotion classification experiments. Specifically, we worked with a labeled corpus consisting of a set of audios from Spanish TV debates and their respective transcriptions. First, an analysis of the emotional information within the corpus was conducted. Then, different data representations were analyzed in order to choose the best one for our task: spectrograms and UniSpeech-SAT were used for audio representation, and DistilBERT for text representation. As a final step, multimodal machine learning was used to improve the classification results by combining acoustic and textual information.   es_ES
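For readers who want to reproduce the general pipeline the abstract describes, the following is a minimal sketch (not the authors' code) of feature-level fusion of UniSpeech-SAT audio embeddings and DistilBERT text embeddings. The checkpoint names (microsoft/unispeech-sat-base-plus, distilbert-base-multilingual-cased), the mean/CLS pooling, the concatenation fusion, and the four-class linear head are all assumptions for illustration; the paper's actual architecture may differ.

    # Minimal sketch: fuse acoustic (UniSpeech-SAT) and textual (DistilBERT)
    # utterance embeddings by concatenation, as outlined in the abstract.
    # Checkpoints, pooling, and the classifier head are assumptions.
    import numpy as np
    import torch
    import torch.nn as nn
    from transformers import (AutoFeatureExtractor, AutoModel, AutoTokenizer,
                              UniSpeechSatModel)

    audio_fe = AutoFeatureExtractor.from_pretrained("microsoft/unispeech-sat-base-plus")
    audio_enc = UniSpeechSatModel.from_pretrained("microsoft/unispeech-sat-base-plus")
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")
    text_enc = AutoModel.from_pretrained("distilbert-base-multilingual-cased")

    def fused_embedding(waveform: np.ndarray, transcript: str) -> torch.Tensor:
        """Concatenate mean-pooled audio states with the [CLS] text state."""
        with torch.no_grad():
            a_in = audio_fe(waveform, sampling_rate=16000, return_tensors="pt")
            a_vec = audio_enc(**a_in).last_hidden_state.mean(dim=1)   # (1, 768)
            t_in = tokenizer(transcript, truncation=True, return_tensors="pt")
            t_vec = text_enc(**t_in).last_hidden_state[:, 0]          # (1, 768)
        return torch.cat([a_vec, t_vec], dim=-1)                      # (1, 1536)

    # Hypothetical linear head; the number of emotion classes is an assumption.
    classifier = nn.Linear(1536, 4)
    logits = classifier(fused_embedding(np.zeros(16000, dtype=np.float32), "hola"))

The spectrogram features and any more elaborate fusion strategy from the paper are omitted here for brevity; this sketch only illustrates the concatenation of acoustic and textual vectors before classification.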
dc.description.sponsorship   The research presented in this paper was conducted as part of the AMIC PdC project, which received funding from the Spanish Ministry of Science under grants TIN2017-85854-C4-3-R, PID2021-126061OB-C42 and PDC2021-120846-C43, and was also partially funded by the European Union's Horizon 2020 research and innovation program under grant agreement No. 823907 (MENHIR).   es_ES
dc.language.iso   eng   es_ES
dc.publisher   ISCA   es_ES
dc.relation   info:eu-repo/grantAgreement/EC/H2020/823907   es_ES
dc.relation   info:eu-repo/grantAgreement/MINECO/TIN2017-85854-C4-3-R   es_ES
dc.rights   info:eu-repo/semantics/openAccess   es_ES
dc.subject   Acoustic Signal   es_ES
dc.subject   Textual Information   es_ES
dc.subject   Multimodal Machine Learning   es_ES
dc.subject   Emotion Recognition   es_ES
dc.title   Speech emotion recognition in Spanish TV Debates   es_ES
dc.type   info:eu-repo/semantics/conferenceObject   es_ES
dc.rights.holder   (c) 2022 ISCA   es_ES
dc.relation.publisherversion   https://www.isca-speech.org/archive/smm_2022/zubiaga22_smm.html   es_ES
dc.identifier.doi   10.21437/IberSPEECH.2022-38
dc.contributor.funder   European Commission
dc.departamentoes   Ciencia de la computación e inteligencia artificial   es_ES
dc.departamentoeu   Konputazio zientziak eta adimen artifiziala   es_ES

