ACTIVE SMOTE for Imbalanced Medical Data Classification

Raul Sena; Sana Ben Hamida

doi:10.1007/978-3-031-51664-1_6

Conference paper, Year: 2024


Raul Sena

Role: Author

Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision

Université Paris Nanterre

Sana Ben Hamida

Role: Author
PersonId: 177299
IdHAL: sana-ben-hamida
ORCID: 0000-0003-4202-613X

Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision

Abstract

Classifying imbalanced data is a major challenge for machine learning techniques, especially for medical data. Many solutions have been proposed to address it. The best-known methods are based on the Synthetic Minority Over-sampling Technique (SMOTE), which creates new synthetic instances in the minority class. In this paper, we study the effectiveness of SMOTE-based methods on several imbalanced data sets. We then propose extending these techniques with Active Learning to better control the evolution of the minority class. Active Learning uses uncertainty and diversity sampling to select the data points from which the synthetic samples are generated. To evaluate our approach, we conduct comprehensive experiments on two medical data sets, one for diabetes diagnosis and one for breast cancer diagnosis.
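The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes SMOTE-style linear interpolation between a minority point and one of its nearest minority neighbors, and it assumes a simple uncertainty score based on how close a classifier's predicted probability is to 0.5. The function names, the toy data, and the probabilities are all hypothetical, and diversity sampling is left out for brevity.

```python
import math
import random


def uncertainty(p):
    """Score in [0, 1]: highest when the predicted probability p is near 0.5."""
    return 1.0 - abs(2.0 * p - 1.0)


def select_uncertain_seeds(probs, n_seeds):
    """Indices of the n_seeds minority points the classifier is least sure about."""
    ranked = sorted(range(len(probs)), key=lambda i: uncertainty(probs[i]), reverse=True)
    return ranked[:n_seeds]


def smote_from_seeds(minority, seed_ids, n_new, k=2, random_state=0):
    """SMOTE-style interpolation that only starts from the selected seeds."""
    rng = random.Random(random_state)
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(seed_ids)
        # k nearest minority neighbors of the seed (Euclidean distance).
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(minority[i], minority[j]),
        )
        j = rng.choice(others[:k])
        gap = rng.random()  # position along the segment seed -> neighbor
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


# Toy minority class and hypothetical classifier probabilities for it.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
probs = [0.95, 0.55, 0.40, 0.10]

seed_ids = select_uncertain_seeds(probs, n_seeds=2)
new_points = smote_from_seeds(minority, seed_ids, n_new=5)
```

In this toy run the two points with probabilities 0.55 and 0.40 are chosen as seeds, so every synthetic sample falls on a segment between a seed and one of its two nearest minority neighbors, and the distant point at (5.0, 5.0) is never used.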

Keywords

Imbalanced medical data; Machine Learning; SMOTE; Active Learning; Diversity Sampling; Uncertainty Sampling; Diabetes Diagnosis; Breast Cancer Detection

Domains

Computer Science [cs]

Main file

Active_Smote_paper_ICIKS2023 (2).pdf (1.02 MB)

Origin: Files produced by the author(s)


https://hal.parisnanterre.fr/hal-04462505

Submitted on: Friday, February 16, 2024, 15:28:29

Last modified on: Friday, October 4, 2024, 14:24:57

Dates and versions

hal-04462505, version 1 (16-02-2024)

Identifiers

HAL Id: hal-04462505, version 1
DOI: 10.1007/978-3-031-51664-1_6

Cite

Raul Sena, Sana Ben Hamida. ACTIVE SMOTE for Imbalanced Medical Data Classification. International Conference on Information and Knowledge Systems, Inès Saad, Camille Rosenthal-Sabroux, Faiez Gargouri, Salem Chakhar, Nigel Williams, Ella Haig, Jun 2023, Portsmouth, United Kingdom. pp.81-97, ⟨10.1007/978-3-031-51664-1_6⟩. ⟨hal-04462505⟩


Collections

CNRS UNIV-DAUPHINE LAMSADE-DAUPHINE PSL UNIV-PARIS-LUMIERES UNIV-PARIS-NANTERRE
