dc.contributor.advisor | Navas Cordón, Eva | |
dc.contributor.advisor | Saratxaga Couceiro, Ibon | |
dc.contributor.author | De Zuazo Oteiza, Xabier | |
dc.date.accessioned | 2023-06-30T14:44:38Z | |
dc.date.available | 2023-06-30T14:44:38Z | |
dc.date.issued | 2023-06-30 | |
dc.identifier.uri | http://hdl.handle.net/10810/61815 | |
dc.description.abstract | [EN] Many Text-to-Speech models that use deep neural networks to synthesize audio
from text have emerged recently. In this work, a state-of-the-art multilingual and
multi-speaker Text-to-Speech model has been trained on Basque, Spanish, Catalan, and
Galician. The research consisted of gathering the datasets, pre-processing their audio and
text data, training the model on the languages in successive steps, and evaluating the
results at each step. Training used a transfer learning approach, starting
from a model already trained on three languages: English, Portuguese, and French.
The final model created here therefore supports a total of seven languages. Moreover,
these models also support zero-shot voice conversion, using an input audio file as a
reference. Finally, a prototype Speech-to-Speech Translation application has been
created by combining the models trained here with other models from the
community. Along the way, Deep Speech Speech-to-Text models were also
created for Basque and Galician. | es_ES |
dc.description.abstract | [EU] Azkenaldian, Text-to-Speech eredu anitz sortu dira sare neuronal sakonak erabiliz, testutik audioa sintetizatzeko. Lan honetan, state-of-the-art Text-to-Speech eredu
eleaniztun eta hiztun anitzeko eredua landu da euskaraz, gaztelaniaz, katalanez eta
galegoz. Ikerketa honetan datu-multzoak bildu, haien audio- eta testu-datuak aldez
aurretik prozesatu, eredua hizkuntzetan entrenatu da urrats desberdinetan eta emaitzak
puntu bakoitzean ebaluatu dira. Entrenatze-urratserako, ikaskuntza-transferentzia
teknika erabili da dagoeneko hiru hizkuntzatan trebatutako eredu batetik abiatuta:
ingelesa, portugesa eta frantsesa. Beraz, hemen sortutako azken ereduak zazpi hizkuntza
onartzen ditu guztira. Gainera, eredu hauek zero-shot ahots bihurketa ere egiten dute,
sarrerako audio fitxategi bat erreferentzia gisa erabiliz. Azkenik, Speech-to-Speech
Translation egiteko prototipo aplikazio bat sortu da hemen entrenatutako ereduak eta
komunitateko beste eredu batzuk elkartuz. Bide horretan, Deep Speech Speech-to-Text
eredu batzuk sortu dira euskararako eta galegorako. | es_ES |
dc.language.iso | eng | es_ES |
dc.rights.uri | http://creativecommons.org/licenses/by-sa/4.0/ | |
dc.subject | multilingual multi-speaker text-to-speech | es_ES |
dc.subject | speech-to-text | es_ES |
dc.subject | machine translation | es_ES |
dc.subject | speech-to-speech translation | es_ES |
dc.subject | cross-lingual zero-shot voice conversion | es_ES |
dc.subject | Basque | es_ES |
dc.subject | Spanish | es_ES |
dc.title | Basque and Spanish Multilingual TTS Model for Speech-to-Speech Translation | es_ES |
dc.type | info:eu-repo/semantics/masterThesis | |
dc.date.updated | 2023-02-09T11:17:24Z | |
dc.language.rfc3066 | es | |
dc.rights.holder | Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) | |
dc.contributor.degree | Máster Universitario en Análisis y Procesamiento del Lenguaje | |
dc.contributor.degree | Hizkuntzaren Azterketa eta Prozesamendua Unibertsitate Masterra | |
dc.identifier.gaurregister | 128824-383364-05 | es_ES |
dc.identifier.gaurassign | 142572-383364 | es_ES |