Basque and Spanish Multilingual TTS Model for Speech-to-Speech Translation

De Zuazo Oteiza, Xabier

dc.contributor.advisor	Navas Cordón, Eva
dc.contributor.advisor	Saratxaga Couceiro, Ibon
dc.contributor.author	De Zuazo Oteiza, Xabier
dc.date.accessioned	2023-06-30T14:44:38Z
dc.date.available	2023-06-30T14:44:38Z
dc.date.issued	2023-06-30
dc.identifier.uri	http://hdl.handle.net/10810/61815
dc.description.abstract	[EN] Recently, multiple Text-to-Speech models have emerged that use deep neural networks to synthesize audio from text. In this work, a state-of-the-art multilingual, multi-speaker Text-to-Speech model has been trained in Basque, Spanish, Catalan, and Galician. The research consisted of gathering the datasets, pre-processing their audio and text data, training the model on the languages in several steps, and evaluating the results at each step. Training used a transfer learning approach, starting from a model already trained in three languages: English, Portuguese, and French. The final model created here therefore supports a total of seven languages. Moreover, these models also support zero-shot voice conversion, using an input audio file as a reference. Finally, a prototype application for Speech-to-Speech Translation has been created by combining the models trained here with other models from the community. Along the way, Deep Speech Speech-to-Text models have also been generated for Basque and Galician.	es_ES
dc.description.abstract	[EU] Recently, many Text-to-Speech models have been created that use deep neural networks to synthesize audio from text. In this work, a state-of-the-art multilingual, multi-speaker Text-to-Speech model has been developed for Basque, Spanish, Catalan, and Galician. In this research, the datasets were gathered, their audio and text data were pre-processed, the model was trained on the languages in different steps, and the results were evaluated at each point. For the training step, a transfer learning technique was used, starting from a model already trained in three languages: English, Portuguese, and French. The final model created here therefore supports seven languages in total. In addition, these models also perform zero-shot voice conversion, using an input audio file as a reference. Finally, a prototype application for Speech-to-Speech Translation has been created by combining the models trained here with other models from the community. Along the way, some Deep Speech Speech-to-Text models have been created for Basque and Galician.	es_ES
dc.language.iso	eng	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-sa/4.0/
dc.subject	multilingual multi-speaker text-to-speech	es_ES
dc.subject	speech-to-text	es_ES
dc.subject	machine translation	es_ES
dc.subject	speech-to-speech translation	es_ES
dc.subject	cross-lingual zero-shot voice conversion	es_ES
dc.subject	Basque	es_ES
dc.subject	Spanish	es_ES
dc.title	Basque and Spanish Multilingual TTS Model for Speech-to-Speech Translation	es_ES
dc.type	info:eu-repo/semantics/masterThesis
dc.date.updated	2023-02-09T11:17:24Z
dc.language.rfc3066	es
dc.rights.holder	Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.contributor.degree	Máster Universitario en Análisis y Procesamiento del Lenguaje
dc.contributor.degree	Hizkuntzaren Azterketa eta Prozesamendua Unibertsitate Masterra
dc.identifier.gaurregister	128824-383364-05	es_ES
dc.identifier.gaurassign	142572-383364	es_ES

Files in this item

Name: MT_Xabier_DeZuazo.pdf
Size: 6.478Mb
Format: PDF
Description: MasterThesis_Main_article

This item appears in the following Collection(s)

Máster Universitario en Análisis y Procesamiento del Lenguaje

Except where otherwise noted, this item's license is described as Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)