dc.description.abstract | Deep learning has spurred remarkable advancements in vision tasks. Particularly, large-scale models with a high number of trainable parameters can significantly boost performance. However, as the size and complexity of models increase, the required training data often escalates in tandem, especially for Vision Transformers. Yet, collecting and annotating data is often time-consuming, costly, or impractical. Overfitting, a common occurrence in the face of data scarcity, poses a significant challenge for deep learning. This challenge is particularly pronounced in medical image segmentation tasks. Addressing the issue of data scarcity and mitigating the resulting overfitting phenomenon are crucial endeavors in the field of deep learning, especially when dealing with large-scale models. To address the challenge of data scarcity, data augmentation strategies stand out as the most widely recognized solution, known for their effectiveness and efficiency. These strategies aim to combat data scarcity and overfitting by generating additional training samples. Various methodologies exist, including basic data augmentation, Generative Adversarial Network (GAN)-based data augmentation, automatic augmentation, and regional dropout regularization data augmentation methods. Basic data augmentation is straightforward but offers limited diversity in the augmented space. GAN-based data augmentation, on the other hand, can produce high-quality and plausible samples, although it heavily relies on GAN models that pose challenges due to their substantial overhead. Automatic augmentation, which achieves superior performance through automated augmentation policy search, often requires trade-offs between complexity, cost, and performance. Meanwhile, regional dropout regularization data augmentation has demonstrated effectiveness, but existing methods have some shortcomings.
These include: (i) Existing methods perform cutting and pasting with square-shaped or rectangle-shaped regions, resulting in incomplete object-part information for classification and loss of contour information for segmentation. (ii) Current regional dropout regularization data augmentation techniques are primarily designed for classification tasks, with limited research conducted on the effectiveness of the cut-and-paste strategy in segmentation. (iii) Most methods only utilize global semantics along with image-level constraints and overlook local context constraints. (iv) For the classification task, generated labels are often inconsistent with the augmented images. (v) For the segmentation task, existing regional dropout augmentations do not fully utilize prior knowledge from segmentation masks or images. In this thesis, we propose a range of innovative data augmentation methods for image classification and segmentation. The primary objective of this thesis is to propose contour-aware and local-aware regional dropout regularization data augmentation approaches for vision tasks with superpixel-grid-based mixing. Our motivation is to alleviate the data scarcity and resulting overfitting issues, as well as to address the limitations of existing dropout regularization data augmentation methods. Several data augmentation approaches have been proposed, focusing on either image classification or image segmentation. Additionally, numerous loss functions have been developed to improve the efficacy of the proposed data augmentation techniques. The main contributions of the thesis are outlined below. (1) We present contour-aware regional dropout data augmentation techniques for both image classification and segmentation tasks, employing superpixel-grid-based mixing. (2) We introduce local-aware regional dropout regularization data augmentation methods and incorporate local constraints to encourage the model to prioritize local regions.
The associated loss functions have been shown to significantly improve the effectiveness of data augmentation techniques. (3) We present efficient attention-guided superpixel-based data augmentation methods for classification tasks, ensuring consistency between augmented images and generated labels. Our approach utilizes attention mechanisms in both image space and label space. (4) We propose regional dropout regularization data augmentation methods tailored for medical image segmentation. To our knowledge, this represents the first instance of contour-aware superpixel-mixing-based data augmentation specifically designed for segmentation tasks. (5) We extensively leverage the prior segmentation mask knowledge of the training samples and investigate loss functions that can enhance the training process. In particular, this thesis introduces the novel concepts of superpixel-wise adaptive focal margin classification loss and reconstruction loss on mixed images. (6) Comprehensive experiments have shown the superior performance of the proposed methods across various image datasets and benchmarks, encompassing both classification and semantic segmentation tasks. Keywords: Deep learning, data augmentation, superpixel, image classification, image segmentation | es_ES |