Query-by-Example Spoken Term Detection ALBAYZIN 2012 evaluation: overview, systems, results, and discussion

Tejedor, Javier; Toledano, Doroteo T.; Anguera, Xavier; Varona Fernández, Amparo; Hurtado, Lluís F.; Miguel, Antonio; Colás, José

dc.contributor.author	Tejedor, Javier
dc.contributor.author	Toledano, Doroteo T.
dc.contributor.author	Anguera, Xavier
dc.contributor.author	Varona Fernández, Amparo
dc.contributor.author	Hurtado, Lluís F.
dc.contributor.author	Miguel, Antonio
dc.contributor.author	Colás, José
dc.date.accessioned	2014-02-06T19:29:50Z
dc.date.available	2014-02-06T19:29:50Z
dc.date.issued	2013-09
dc.identifier.citation	EURASIP Journal on Audio, Speech, and Music Processing 2013 (2013), Article No. 23	es
dc.identifier.issn	1687-4722
dc.identifier.uri	http://hdl.handle.net/10810/11377
dc.description.abstract	Query-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. It has been receiving much interest due to the high volume of information stored in audio or audiovisual format. QbE STD differs from automatic speech recognition (ASR) and keyword spotting (KWS)/spoken term detection (STD): ASR is interested in all the terms/words that appear in the speech signal, and KWS/STD relies on a textual transcription of the search term to retrieve the speech data. This paper presents the systems submitted to the ALBAYZIN 2012 QbE STD evaluation, held as part of the ALBAYZIN 2012 evaluation campaign within the context of the IberSPEECH 2012 Conference. The evaluation consists of retrieving the speech files that contain the input queries, indicating their start and end timestamps within the appropriate speech file. Evaluation is conducted on a Spanish spontaneous speech database containing a set of talks from MAVIR workshops, which amount to about 7 h of speech in total. We present the database, the metric, and the systems submitted, along with all results and some discussion. Four different research groups took part in the evaluation. Evaluation results show the difficulty of this task, and the limited performance indicates there is still a lot of room for improvement. The best result is achieved by a dynamic time warping-based search over Gaussian posteriorgrams/posterior phoneme probabilities. This paper also compares the systems, aiming to establish the best technique for this difficult task and to define promising directions for this relatively novel task.	es
dc.language.iso	eng	es
dc.publisher	Springer	es
dc.rights	info:eu-repo/semantics/openAccess	es
dc.subject	query-by-example	es
dc.subject	spoken term detection	es
dc.subject	international evaluation	es
dc.subject	search on spontaneous speech	es
dc.title	Query-by-Example Spoken Term Detection ALBAYZIN 2012 evaluation: overview, systems, results, and discussion	es
dc.type	info:eu-repo/semantics/article	es
dc.rights.holder	© 2013 Tejedor et al.; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.	es
dc.relation.publisherversion	http://asmp.eurasipjournals.com/content/2013/1/23	es
dc.identifier.doi	10.1186/1687-4722-2013-23
dc.departamentoes	Electricidad y electrónica	es_ES
dc.departamentoeu	Elektrizitatea eta elektronika	es_ES
dc.subject.categoria	ACOUSTICS
dc.subject.categoria	ELECTRICAL AND ELECTRONIC ENGINEERING

Files in this item

Name: 1687-4722-2013-23.pdf
Size: 810.6Kb
Format: PDF


This item appears in the following Collection(s)

Artículos
