Study of a Metric for Measuring Gender Bias in BERT Language Models

González Hernández, Elvira

dc.contributor.advisor	Pérez de Viñaspre Garralda, Olatz
dc.contributor.advisor	Arbelaiz Gallego, Olatz
dc.contributor.author	González Hernández, Elvira
dc.date.accessioned	2023-06-30T15:14:25Z
dc.date.available	2023-06-30T15:14:25Z
dc.date.issued	2023-06-30
dc.identifier.uri	http://hdl.handle.net/10810/61835
dc.description.abstract	[EN] Since their creation, language models such as BERT have been widely deployed as services on platforms that serve millions of users. With their increasing popularity, the fairness of NLP systems and algorithms has become a subject of great interest, given the harm an unethical system can cause. For this reason, researchers have taken an interest in developing techniques for detecting and mitigating bias. This work analyses a previously proposed metric that measures gender bias by studying associations between gender-denoting referents and names of professions. The accuracy of previous results is questioned, and a deeper analysis demonstrates that the metric has flaws and limitations in the way it represents gender bias. The experiments are carried out for three languages, English, Basque and Spanish, and their corresponding monolingual BERT models: BERT base, BERTeus and BETO. Because the three languages differ considerably, especially regarding grammatical gender, comparing them alongside a thorough analysis of the metric yields conclusions about the metric's limitations.	es_ES
dc.description.abstract	[ES] Desde su creación, los modelos del lenguaje BERT están siendo implementados en multitud de plataformas que dan servicio a millones de usuarios. Debido a su creciente popularidad, se empezó a dar importancia al hecho de crear sistemas éticos dentro del campo del procesamiento del lenguaje natural, sobre todo teniendo en cuenta los perjuicios que un sistema que no sea justo e imparcial puede producir en algunos grupos de la sociedad. Por este motivo, se está investigando cada vez más sobre técnicas para la detección y reducción del sesgo de género. En este trabajo se va a analizar una métrica para la medición del sesgo de género estudiando la asociación entre referentes con marca de género y profesiones. Se cuestionarán los resultados obtenidos con la misma métrica en trabajos anteriores y, llevando a cabo un análisis más exhaustivo se examinarán las limitaciones que tiene la métrica. Los experimentos se llevarán a cabo en tres idiomas, inglés, euskera y español; en los tres modelos BERT monolingües correspondientes a cada uno de ellos: BERT base, BERTeus y BETO. La variedad lingüística de los tres idiomas en cuanto al género gramatical, junto con el análisis más detallado de la métrica ayudará a obtener interesantes conclusiones sobre las limitaciones de la métrica.	es_ES
dc.language.iso	eng	es_ES
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	gender bias	es_ES
dc.subject	BERT
dc.subject	metric
dc.subject	language models
dc.subject	sesgo de género
dc.subject	métrica
dc.subject	modelos del lenguaje
dc.title	Study of a Metric for Measuring Gender Bias in BERT Language Models	es_ES
dc.type	info:eu-repo/semantics/masterThesis
dc.date.updated	2022-09-08T07:38:07Z
dc.language.rfc3066	es
dc.rights.holder	© 2022, la autora
dc.contributor.degree	Máster Universitario en Análisis y Procesamiento del Lenguaje
dc.contributor.degree	Hizkuntzaren Azterketa eta Prozesamendua Unibertsitate Masterra
dc.identifier.gaurregister	126771-1020687-11	es_ES
dc.identifier.gaurassign	138484-1020687	es_ES

Files in this item

Name: MAL_Elvira_Gonzalez.pdf
Size: 1.506Mb
Format: PDF

This item appears in the following Collection(s)

Máster Universitario en Análisis y Procesamiento del Lenguaje
