Uncovering the Efficiency of Phishing Detection: An In-depth Comparative Examination of Classification Algorithms

Dwi Sugianto
Rilliandi Arindra Putawa
Calvina Izumi
Soeltan Abdul Ghaffar

Abstract

This research investigates the security risks posed by phishing email attacks and compares the performance of three classification approaches: random forest, support vector machine (SVM), and XGBoost evaluated with k-fold cross-validation. The dataset consists of 18,634 emails, of which 7,312 are labeled as phishing and 11,322 as safe. In the experiments, XGBoost with k-fold cross-validation performed best, reaching an accuracy of 0.9713 (approximately 97.1%). An email classification graph visualizes the distribution of classification results and helps reveal patterns and trends in phishing detection. Analysis of the ROC curves shows that XGBoost with k-fold cross-validation achieves a higher AUC than random forest and SVM, indicating a better ability to separate phishing emails from safe ones. The study concludes that XGBoost combined with k-fold cross-validation is a strong option for improving email security, and notes that further parameter tuning may improve accuracy.
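The evaluation procedure named in the abstract (k-fold cross-validation, accuracy, and ROC AUC) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the single numeric feature is made up, and `fit_midpoint` is a hypothetical stand-in for a real classifier such as random forest, SVM, or XGBoost. Only the class ratio (7,312 of 18,634 emails) is taken from the abstract.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def roc_auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_midpoint(xs, ys):
    """Hypothetical stand-in classifier: logistic score centered on the midpoint
    of the two class means. Assumes positives tend to have larger feature values."""
    mean_pos = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    mean_neg = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mid = (mean_pos + mean_neg) / 2
    return lambda x: 1 / (1 + math.exp(-(x - mid)))


def cross_validate(xs, ys, fit, k=5):
    """Return mean accuracy and mean AUC over k held-out folds."""
    accs, aucs = [], []
    for fold in k_fold_indices(len(ys), k):
        held_out = set(fold)
        train = [i for i in range(len(ys)) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores = [model(xs[i]) for i in fold]
        truth = [ys[i] for i in fold]
        preds = [int(s >= 0.5) for s in scores]
        accs.append(sum(p == t for p, t in zip(preds, truth)) / len(truth))
        aucs.append(roc_auc(truth, scores))
    return sum(accs) / k, sum(aucs) / k


# Synthetic data only: one made-up feature, about 39% positives (7,312 / 18,634).
rng = random.Random(42)
ys = [1 if rng.random() < 7312 / 18634 else 0 for _ in range(500)]
xs = [rng.gauss(3.0 if y == 1 else 1.0, 1.0) for y in ys]

mean_acc, mean_auc = cross_validate(xs, ys, fit_midpoint)
print(f"mean accuracy={mean_acc:.3f}, mean AUC={mean_auc:.3f}")
```

Each model under comparison would be passed as `fit` and scored on the same folds, so that differences in mean accuracy and mean AUC reflect the model rather than the data split.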

How to Cite
[1] D. Sugianto, R. A. Putawa, C. Izumi, and S. A. Ghaffar, “Uncovering the Efficiency of Phishing Detection: An In-depth Comparative Examination of Classification Algorithms,” Int. J. Appl. Inf. Manag., vol. 4, no. 1, pp. 22–29, Apr. 2024.