Phishing Detection Using Hybrid Machine Learning Techniques
Abstract
Cybersecurity has become a crucial component of the digital age: there were more than 820 million internet users in 2023, and social media users are expected to account for 82.3% of all internet users by 2024. These figures point to the need for security systems that shield the public from phishing scams, which harm not only victims' finances but also their mental health by making them afraid to use the internet. This motivates the search for effective solutions. Because phishing attack patterns change rapidly, existing detection systems must be improved continually to counter new and emerging phishing attempts.
This research aims to identify common characteristics of phishing websites and to build a model that detects them. The dataset was used to train a number of models, including the Random Forest (RF) classifier, Artificial Neural Networks, and Principal Component Analysis. Feature selection and a clustering technique were also integrated to detect unknown attacks. The dataset was collected from Kaggle and contains 549,346 entries. RF attained the highest accuracy, 94%.
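As an illustration of what "common characteristics" of phishing URLs might look like in practice, the following is a minimal sketch of lexical feature extraction using only the Python standard library. The abstract does not list the features the authors used, so the feature names, the `SUSPICIOUS_WORDS` list, and the `extract_url_features` helper are all hypothetical. It also assumes that URLs may arrive without a scheme.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword list; not taken from the paper.
SUSPICIOUS_WORDS = ("login", "verify", "secure", "account", "update")

# Matches a dotted-quad host such as 192.168.0.1 (no range check on octets).
IPV4_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def extract_url_features(url: str) -> dict:
    """Return simple lexical features for one URL string."""
    # Assumption: URLs may lack a scheme, so add one so urlparse finds the host.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    host_is_ip = bool(IPV4_PATTERN.fullmatch(host))
    lowered = url.lower()
    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "host_is_ip": int(host_is_ip),
        # Labels before the registered domain; set to 0 for raw IP hosts.
        "num_subdomains": 0 if host_is_ip else max(host.count(".") - 1, 0),
        "suspicious_word_count": sum(word in lowered for word in SUSPICIOUS_WORDS),
    }


if __name__ == "__main__":
    for sample in ("www.example.com", "192.168.0.1/login/verify.php"):
        print(sample, extract_url_features(sample))
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to a classifier such as Random Forest, optionally after dimensionality reduction. Which features, and in what order the steps are combined, would depend on the authors' actual pipeline.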