COMPARISON OF CLASSIFICATION ALGORITHMS TO DETECT PHISHING WEB PAGES USING FEATURE SELECTION AND EXTRACTION

Rajendra Gupta

doi:10.29121/granthaalayah.v4.i8.2016.2570

Authors

Dr. Rajendra Gupta, Assistant Professor, BSSS Autonomous College, Barkatullah University, Bhopal - 462 024, INDIA

Keywords:

Phishing, Anti-Phishing, Add-on for Web Browser, Data mining classification algorithms

Abstract [English]

Phishing is a kind of e-commerce lure that tries to steal a web user's confidential information by creating a near-identical copy of a legitimate website, in which the content and images remain almost the same as the original, with only small changes. Another phishing technique is to make minor changes to the URL or the domain of the legitimate website. This paper discusses a number of anti-phishing toolbars and proposes a system model to tackle phishing attacks. The proposed anti-phishing system is built as a plug-in tool for the web browser. Its performance is studied with three data mining classification algorithms: Random Forest, Nearest Neighbour Classification (NNC), and Bayesian Classifier (BC). To evaluate how well the proposed system detects phishing websites, 7690 legitimate websites and 2280 phishing websites were collected from authorised sources such as the APWG database and PhishTank. After analyzing the data mining algorithms on phishing web pages, it is found that the Bayesian algorithm responds quickly and gives more accurate results than the other algorithms.
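As a rough illustration of the approach summarised above, the following Python sketch extracts a few binary features from a URL and feeds them to a small Bernoulli naive Bayes classifier. It is not the paper's implementation: the five features, the class name `BernoulliNaiveBayes`, the helper `url_features`, and the eight hand-written training URLs (all on reserved example domains and documentation IP ranges) are assumptions made for illustration only, and the paper's actual feature set and its APWG/PhishTank data are not reproduced here.

```python
import math
from urllib.parse import urlparse


def url_features(url):
    """Return five binary lexical features of a URL (an assumed feature set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return (
        int(host.replace(".", "").isdigit()),  # host looks like a raw IP address
        int("@" in url),                       # '@' can disguise the real host
        int(host.count(".") >= 4),             # unusually deep subdomain nesting
        int("-" in host),                      # hyphenated look-alike domain
        int(parsed.scheme != "https"),         # page not served over HTTPS
    )


class BernoulliNaiveBayes:
    """Naive Bayes over binary features with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        n_features = len(rows[0])
        self.log_prior = {}
        self.prob = {}
        for c in self.classes:
            members = [r for r, y in zip(rows, labels) if y == c]
            self.log_prior[c] = math.log(len(members) / len(rows))
            # Smoothed estimate of P(feature_j = 1 | class c)
            self.prob[c] = [
                (sum(r[j] for r in members) + 1) / (len(members) + 2)
                for j in range(n_features)
            ]
        return self

    def predict(self, row):
        def score(c):
            total = self.log_prior[c]
            for x, p in zip(row, self.prob[c]):
                total += math.log(p if x else 1 - p)
            return total
        return max(self.classes, key=score)


# Hypothetical toy training set; these URLs are made up for the example.
TRAINING = [
    ("https://www.example.com/login", "legitimate"),
    ("https://accounts.example.org/signin", "legitimate"),
    ("https://shop.example.net/cart", "legitimate"),
    ("http://news.example.com/today", "legitimate"),
    ("http://192.0.2.10/secure/login", "phishing"),
    ("http://example.com@login.example.invalid/verify", "phishing"),
    ("http://secure-example.com.account.update.example.invalid/", "phishing"),
    ("https://example-login.invalid/", "phishing"),
]

model = BernoulliNaiveBayes().fit(
    [url_features(u) for u, _ in TRAINING],
    [label for _, label in TRAINING],
)


def classify(url):
    """Label a URL as 'legitimate' or 'phishing' with the toy model."""
    return model.predict(url_features(url))


if __name__ == "__main__":
    for candidate in ("https://www.example.com/account",
                      "http://203.0.113.5/login"):
        print(candidate, "->", classify(candidate))
```

A browser plug-in along these lines would call something like `classify` on each visited URL before the page is rendered. With only eight training examples the predictions here are merely illustrative; a realistic evaluation would need a labelled corpus of the size described in the abstract.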
