An Optimized and Robust Machine Learning Framework for Early Parkinson's Disease Prediction Using Speech Signals

Authors

    Ghadeer Aqil Ali, Department of Electrical and Computer Engineering, Urmia University, Urmia, Iran.
    Leila Sharifi, Assistant Professor, Department of Electrical and Computer Engineering, Urmia University, Urmia, Iran.
    Parviz Rashidi-Khazaee*, Assistant Professor, Department of Information Technology and Computer Engineering, Urmia University of Technology, Urmia, Iran. p.rashidi@uut.ac.ir
    Hossein Nahid-Titkanlue, Assistant Professor, Department of Industrial Engineering, Payame Noor University, Tehran, Iran.

Keywords:

XGBoost, Tree-structured Parzen Estimator, Data Augmentation, SMOTE, Decision Support System

Abstract

Predicting Parkinson's disease (PD) early with non-invasive, low-cost methods, such as speech analysis with machine learning (ML) tools, remains a challenging task, and existing tools do not yet give healthcare providers enough confidence to use them in daily practice. This study therefore presents an optimized early PD prediction tool and investigates its stability and robustness with a rigorous evaluation procedure. For early PD prediction from speech signal data, an eXtreme Gradient Boosting (XGB) model is optimized with the Tree-structured Parzen Estimator (TPE) method, and the Synthetic Minority Oversampling Technique (SMOTE) is used to address class imbalance in the dataset. Its performance was evaluated rigorously to ensure reliability and to earn the trust of clinicians for real-world operational use. To validate the model's trustworthiness and predictive capability, it was evaluated over 10 different runs of Stratified 10-Fold Cross Validation (SCV). The average results were 96.76% accuracy, 97.70% precision, 95.91% recall, 96.70% F1-score, and 98.72% ROC-AUC, which compare favorably with similar works. Performance and stability were also assessed under many different conditions, and the results indicate that the proposed model is stable and robust enough to serve as a practical tool in daily medical care. The tool could be offered through a website as a decision support system that detects PD early from a patient's voice signal, at low cost, non-invasively, and remotely.
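The evaluation protocol described in the abstract (10 runs of stratified 10-fold cross-validation, with metrics averaged over all folds) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names `repeated_stratified_folds` and `binary_metrics`, the toy label vector, and the majority-class placeholder predictor are all assumptions introduced here, and the XGB model, TPE tuning, and SMOTE steps are deliberately left out.

```python
import random
from collections import defaultdict
from statistics import mean


def repeated_stratified_folds(labels, n_splits=10, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated stratified k-fold CV."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        offset = 0
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold keeps roughly
            # the overall class ratio.
            for pos, idx in enumerate(shuffled):
                folds[(pos + offset) % n_splits].append(idx)
            offset += len(shuffled)
        for k in range(n_splits):
            test_idx = sorted(folds[k])
            train_idx = sorted(
                i for j, fold in enumerate(folds) if j != k for i in fold
            )
            yield train_idx, test_idx


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced toy labels (1 = PD, 0 = healthy); not study data.
toy_labels = [1] * 30 + [0] * 10
fold_scores = []
for train_idx, test_idx in repeated_stratified_folds(toy_labels):
    # Placeholder "model": always predict the majority class of the
    # training fold. A real pipeline would fit a classifier here.
    train_labels = [toy_labels[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    y_true = [toy_labels[i] for i in test_idx]
    y_pred = [majority] * len(test_idx)
    fold_scores.append(binary_metrics(y_true, y_pred))

# Average each metric over all 100 folds (10 runs x 10 folds).
averaged = {name: mean(s[name] for s in fold_scores) for name in fold_scores[0]}
```

In a pipeline that oversamples the minority class, the oversampling would normally be applied to the training indices of each fold only, so that synthetic samples never leak into the held-out fold. Whether the study follows exactly this ordering is not stated in the abstract.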

Published: 2026-01-01

Submitted: 2025-06-18

Revised: 2025-09-07

Accepted: 2025-10-19

Section

Articles

How to Cite

Aqil Ali, G., Sharifi, L., Rashidi-Khazaee, P., & Nahid-Titkanlue, H. (2026). An Optimized and Robust Machine Learning Framework for Early Parkinson’s Disease Prediction Using Speech Signals. Journal of Resource Management and Decision Engineering, 1-11. https://journalrmde.com/index.php/jrmde/article/view/187
