Prediction of main components in clove essential oil using optimized machine learning models

Uzun, Yusuf; SALTAN, FATMA

doi:10.1080/0972060x.2026.2614372

Uzun Y., SALTAN F. Z.

Journal of Essential Oil-Bearing Plants, vol.29, no.1, pp.242-257, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 29, Issue: 1
Publication Date: 2026
DOI: 10.1080/0972060x.2026.2614372
Journal Name: Journal of Essential Oil-Bearing Plants
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp.242-257
Keywords: Artificial intelligence, Essential oil, Eugenia caryophyllata, Machine learning, Pharmacognosy
Affiliated with Anadolu University: Yes

Abstract

Gas chromatography-mass spectrometry (GC-MS) is a crucial method for analyzing essential oils, but it remains time-consuming and resource-intensive. This study proposes a machine learning (ML)-based framework as a rapid pre-screening tool to predict the dominant chemical component of clove (Eugenia caryophyllata L.) essential oil from readily available metadata. The goal is not to replace GC-MS but to complement it by enabling faster preliminary assessments and guiding targeted analyses. Five ML models (Random Forest, SVM, XGBoost, KNN, and Decision Tree) were optimized via hyperparameter tuning and compared on accuracy, R², MSE, RMSE, and MAE. The XGBoost model, evaluated via 10-fold cross-validation, performed best, with a test accuracy of 0.9565 and an R² score of 0.9718, while the KNN model performed worst, with an accuracy of 0.5652. These results indicate that XGBoost, with its ensemble learning approach and tuned hyperparameters, was the most suitable of the five models for predicting the primary component of clove essential oil. Future research could explore deep learning approaches trained on larger datasets to improve prediction accuracy.
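
The evaluation described in the abstract can be illustrated with a minimal sketch of the five reported metrics and a 10-fold split. It uses only the Python standard library and is not the authors' pipeline. It assumes the component classes are integer-encoded so that R², MSE, RMSE, and MAE can be computed on the labels, which the abstract does not state. The function names and the toy labels are hypothetical and invented for illustration; they do not reproduce the study's data.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical integer-encoded labels (assumption: 0, 1, 2 stand for three
# possible dominant components; these values are made up for illustration).
y_true = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
y_pred = [0, 0, 1, 1, 2, 2, 0, 1, 2, 2]

results = {
    "accuracy": accuracy(y_true, y_pred),
    "r2": r2(y_true, y_pred),
    "mse": mse(y_true, y_pred),
    "rmse": rmse(y_true, y_pred),
    "mae": mae(y_true, y_pred),
}
folds = list(k_fold_indices(25, k=10))

for name, value in results.items():
    print(f"{name}: {value:.4f}")
print("folds:", len(folds))
```

In practice the folds would usually be shuffled and stratified by class, and each model would be refit on every training fold. The contiguous split here only shows how ten held-out partitions cover every sample exactly once.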