Comparative Study of Classification Algorithms for Website Phishing Detection on Multiple Datasets
Author
Abstract

Phishing has become a prominent method of data theft among hackers, and it continues to develop. In recent years, many strategies have been developed to identify phishing website attempts using machine learning particularly. However, the algorithms and classification criteria that have been used are highly different from the real issues and need to be compared. This paper provides a detailed comparison and evaluation of the performance of several machine learning algorithms across multiple datasets. Two phishing website datasets were used for the experiments: the Phishing Websites Dataset from UCI (2016) and the Phishing Websites Dataset from Mendeley (2018). Because these datasets include different types of class labels, the comparison algorithms can be applied in a variety of situations. The tests showed that Random Forest was better than other classification methods, with an accuracy of 88.92% for the UCI dataset and 97.50% for the Mendeley dataset.

Year of Publication
2022
Conference Name
2022 International Seminar on Application for Technology of Information and Communication (iSemantic)
Google Scholar | BibTeX