Show simple item record

dc.contributor.authorIsasa, Imanol
dc.contributor.authorHernández, Mikel
dc.contributor.authorEpelde Unanue, Gorka
dc.contributor.authorLondoño, Francisco
dc.contributor.authorBeristain Iraola, Andoni
dc.contributor.authorLarrea, Xabat
dc.contributor.authorAlberdi Aramendi, Ane
dc.contributor.authorBamidis, Panagiotis D.
dc.contributor.authorKonstantinidis, Evdokimos I.
dc.date.accessioned2024-04-30T17:01:18Z
dc.date.available2024-04-30T17:01:18Z
dc.date.issued2024
dc.identifier.citationBMC Medical Informatics and Decision Making 24 : (2024) // Article ID 27es_ES
dc.identifier.issn1472-6947
dc.identifier.urihttp://hdl.handle.net/10810/66964
dc.description.abstractBackground Synthetic data is an emerging approach for addressing legal and regulatory concerns in biomedical research that deals with personal and clinical data, whether as a single tool or through its combination with other privacy enhancing technologies. Generating uncompromised synthetic data could significantly benefit external researchers performing secondary analyses by providing unlimited access to information while fulfilling pertinent regulations. However, the original data to be synthesized (e.g., data acquired in Living Labs) may consist of subjects’ metadata (static) and a longitudinal component (set of time-dependent measurements), making it challenging to produce coherent synthetic counterparts. Methods Three synthetic time series generation approaches were defined and compared in this work: only generating the metadata and coupling it with the real time series from the original data (A1), generating both metadata and time series separately to join them afterwards (A2), and jointly generating both metadata and time series (A3). The comparative assessment of the three approaches was carried out using two different synthetic data generation models: the Wasserstein GAN with Gradient Penalty (WGAN-GP) and the DöppelGANger (DGAN). The experiments were performed with three different healthcare-related longitudinal datasets: Treadmill Maximal Effort Test (TMET) measurements from the University of Malaga (1), a hypotension subset derived from the MIMIC-III v1.4 database (2), and a lifelogging dataset named PMData (3). Results Three pivotal dimensions were assessed on the generated synthetic data: resemblance to the original data (1), utility (2), and privacy level (3). The optimal approach fluctuates based on the assessed dimension and metric. Conclusion The initial characteristics of the datasets to be synthesized play a crucial role in determining the best approach. Coupling synthetic metadata with real time series (A1), as well as jointly generating synthetic time series and metadata (A3), are both competitive methods, while separately generating time series and metadata (A2) appears to perform more poorly overall.es_ES
dc.description.sponsorshipThis research was partly funded by the VITALISE (VIrtual healTh And weLlbeing Living Lab InfraStructurE) project, funded by the Horizon 2020 Framework Program of the European Union for Research Innovation (grant agreement 101007990). Ane Alberdi is part of the Intelligent Systems for Industrial Systems research group of Mondragon Unibertsitatea (IT1676-22), supported by the Department of Education, Universities and Research of the Basque Country.es_ES
dc.language.isoenges_ES
dc.publisherBMCes_ES
dc.relationinfo:eu-repo/grantAgreement/EC/H2020/101007990es_ES
dc.rightsinfo:eu-repo/semantics/openAccesses_ES
dc.rights.urihttp://creativecommons.org/licenses/by/3.0/es/*
dc.subjecttime series
dc.subjectsynthetic data
dc.subjectprivacy-preserving data sharing
dc.subjecthealth data
dc.titleComparative assessment of synthetic time series generation approaches in healthcare: leveraging patient metadata for accurate data synthesises_ES
dc.typeinfo:eu-repo/semantics/articlees_ES
dc.rights.holder© The Author(s) 2024. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.es_ES
dc.rights.holderAtribución 3.0 España*
dc.relation.publisherversionhttps://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-024-02427-0es_ES
dc.identifier.doi10.1186/s12911-024-02427-0
dc.contributor.funderEuropean Commission
dc.departamentoesCiencia de la computación e inteligencia artificiales_ES
dc.departamentoeuKonputazio zientziak eta adimen artifizialaes_ES


Files in this item

Thumbnail
Thumbnail

This item appears in the following Collection(s)

Show simple item record

© The Author(s) 2024. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Except where otherwise noted, this item's license is described as © The Author(s) 2024. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.