

Adlès Francis Kouassi¹, Tanon Lambert Kadjo², K. Yablé Didier³, and Olivier Asseu⁴
1 ESATIC, Côte d’Ivoire
2 INPHB, Côte d’Ivoire
3 ESATIC, Côte d’Ivoire
4 ESATIC, Côte d’Ivoire
Original language: English
Copyright © 2025 ISSR Journals. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Early detection of type 2 diabetes is a public health priority because of its high prevalence and the severe complications it can cause. However, traditional machine learning approaches remain limited in three respects: model optimization, handling of class imbalance, and clinical interpretability. To address these limitations, we propose an optimized machine learning approach that combines advanced preprocessing, optimization, and modeling techniques. The methodology rests on four components: (i) feature engineering guided by medical knowledge (e.g., Glucose/BMI, Age×BMI), (ii) adaptive class rebalancing using SMOTEENN, (iii) Bayesian hyperparameter optimization with Optuna for the XGBoost and MLP (Multilayer Perceptron) models, and (iv) an ensemble stacking strategy that integrates Random Forest, XGBoost, and MLP as base learners, with logistic regression as the meta-learner. The approach was validated on the PIMA Indians and Frankfurt Hospital datasets. It achieves an accuracy of 94.05% on PIMA, 99.27% on Frankfurt, and 99.71% on the merged data, with an AUC reaching 99.99%. SHAP analysis highlights the greater importance of insulin in PIMA and of the Age×BMI interaction in Frankfurt, while confirming the stability of universal markers such as glucose and BMI. The approach thus combines strong predictive performance with dataset-specific interpretability, paving the way for more personalized and equitable predictive medicine.
Author Keywords: Machine Learning, Diabetes, Stacking Ensemble, Bayesian Optimization, Feature Engineering, SHAP, Medical Prediction.
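As an illustration of the first component named in the abstract, the two example features (Glucose/BMI and Age×BMI) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `add_interaction_features`, the output keys `Glucose_BMI_ratio` and `Age_x_BMI`, the input keys `Glucose`, `BMI`, and `Age`, and the choice to treat a BMI of 0 as missing are all assumptions made for this example, and the patient values are illustrative rather than taken from the paper.

```python
def add_interaction_features(record: dict) -> dict:
    """Return a copy of `record` with two engineered features added.

    Assumed input keys (hypothetical column names): "Glucose", "BMI", "Age".
    """
    out = dict(record)  # copy so the caller's record is left unchanged
    glucose = record["Glucose"]
    bmi = record["BMI"]
    age = record["Age"]

    # Assumption: a BMI of 0 is physiologically impossible, so it is treated
    # as a missing value and both derived features are set to None.
    if bmi > 0:
        out["Glucose_BMI_ratio"] = glucose / bmi  # Glucose/BMI ratio
        out["Age_x_BMI"] = age * bmi              # Age×BMI interaction
    else:
        out["Glucose_BMI_ratio"] = None
        out["Age_x_BMI"] = None
    return out


# Illustrative records only; these are not results or data from the paper.
patients = [
    {"Glucose": 148.0, "BMI": 33.6, "Age": 50},
    {"Glucose": 85.0, "BMI": 0.0, "Age": 31},
]

engineered = [add_interaction_features(p) for p in patients]
for row in engineered:
    print(row)
```

Returning a copy instead of mutating the input keeps the raw measurements available for later steps, such as rebalancing or model fitting, which would have to handle the `None` placeholders (for example by imputation) before training.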