Hizkuntza Anitzeko Erlazio Semantikoen Erauzketa Medikuntzaren Domeinuan

Sainz Jiménez, Oscar

dc.contributor.advisor	López de Lacalle Lecuona, Oier
dc.contributor.advisor	Labaka Intxauspe, Gorka
dc.contributor.author	Sainz Jiménez, Oscar
dc.date.accessioned	2020-11-26T17:24:45Z
dc.date.available	2020-11-26T17:24:45Z
dc.date.issued	2020-11-26
dc.date.submitted	2020-10-15
dc.identifier.uri	http://hdl.handle.net/10810/48627
dc.description.abstract	Aro digital honetan datu kopuru handiena testu gordin formatuan aurkitzen da. Datu horiekin lan egiteko Informazio Erauzketa (IE) bihurtzen da oinarri gaur egungo aplikazioetan. Hizkuntzaren prozesaketa automatikoko ataza gehientxuenetan gertatu den bezala ikasketa sakonak artearen egoera ezarri du, baita IEn ere. Jakina da teknika hauek datu kopuru handiak behar dituztela errendimendu ona lortzeko. Badira hainbat domeinu eta testuinguru, datu anotatu gutxikoak, zailtasunak dituztenak ikasketa sakoneko tekniken aurrerapenak modu eraginkorrean erabiltzeko. Anotazio berriak egitea garestia izaten da orokorrean, batez ere eredu berri hauek behar duten kopuruetara iristeko. Lan honen helburu nagusia domeinu eta testuinguru hauentzako modu merke batean ikasketa sakoneko sistemen errendimendua hobetzeko teknikak esploratzea da. Zehatzago esanda, ezagutza-transferentzia eta datuen-gehikuntza automatikoa paradigmetan ikertuko dugu helburua lortzeko. Azkenik, teknika hauek baliabide urrikoa den medikuntzako domeinuko eHealth-KD 2020 ataza-partekatuan aplikatuko eta ebaluatuko dira, uneko artearen egoera hobetzeko helburuarekin.	es_ES
dc.description.abstract	In this digital age, most data is found in raw text format. Information Extraction (IE) has therefore become a cornerstone of today's applications for working with this data. As in most natural language processing tasks, deep learning has established the state of the art in IE as well. These techniques are known to require large amounts of data to achieve good performance. Several domains and contexts with little annotated data have difficulty making effective use of advances in deep learning. Creating new annotations is generally expensive, especially at the scale these new models need. The main goal of this work is to explore techniques that improve the performance of deep learning systems in these domains and contexts in a cost-effective way. More specifically, we will investigate the transfer learning and automatic data augmentation paradigms to achieve this goal. Finally, these techniques will be applied and evaluated in the eHealth-KD 2020 shared task, in the low-resource medical domain, with the goal of improving the current state of the art.	es_ES
dc.language.iso	eus	es_ES
dc.rights	info:eu-repo/semantics/openAccess	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/es/	*
dc.title	Hizkuntza Anitzeko Erlazio Semantikoen Erauzketa Medikuntzaren Domeinuan	es_ES
dc.type	info:eu-repo/semantics/masterThesis	es_ES
dc.rights.holder	Atribución-NoComercial-CompartirIgual 3.0 España	*
dc.departamentoes	Lenguajes y sistemas informáticos	es_ES
dc.departamentoeu	Hizkuntza eta sistema informatikoak	es_ES

Files in this item

Name: license_rdf
Size: 1.012Kb
Format: application/rdf+xml

Name: MAL-Oscar_Sainz.pdf
Size: 2.761Mb
Format: PDF

This item appears in the following Collection(s)

Máster Universitario en Análisis y Procesamiento del Lenguaje

Except where otherwise noted, this item's license is described as Atribución-NoComercial-CompartirIgual 3.0 España