Expanding Phish-NET: Detecting Phishing Emails Using Natural Language Processing
Project Members
Students: Bryan R. Lee & Gilchan Park / Advisor: Julia M. Taylor
Abstract
Phishing is potentially one of the most disruptive actions that can be taken on the internet. Stealing a user’s account information through a phishing scam can be an easy way to gain access to a business network, putting intellectual property and other sensitive business information at risk. One of the most popular ways of carrying out a phishing attack is through email. Many businesses rely on typical spam filters, such as blacklist-based filtering or URL analysis, to protect users from some potentially malicious emails, but these alone are not enough. There have been quite a few attempts to create reliable, robust phishing email detection systems based on analyzing the content of emails; CANTINA, phishGILLNET, and Phish-Net, for example, are proposed methods for content-based phishing detection. Phish-Net is a phishing detection utility that analyzes three parts of an email to determine whether it contains a phishing attack: the header, the text, and any links the email contains. The purpose of this research is to expand the text analysis portion of Phish-Net in order to determine whether its email analysis capabilities can be improved. The text analysis portion of Phish-Net takes into account actionable verbs that tempt the user into performing an action. In this study, the new algorithm considers not only actionable verbs but also other parts of speech, so that it can catch other actionable words in phishing emails.
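The contrast between a verbs-only check and one that also considers other parts of speech can be illustrated with a minimal Python sketch. This is not the Phish-Net algorithm or the algorithm developed in this study: the word lists, the part-of-speech labels, and the function names (`tokenize`, `actionable_words`, `action_score`) are hypothetical and chosen only for illustration. A real system would presumably use a part-of-speech tagger rather than fixed lexicons.

```python
import re

# Hypothetical lexicons, for illustration only; not taken from Phish-Net.
ACTION_VERBS = {"click", "verify", "confirm", "update", "login", "submit"}

# Assumed examples of non-verb words that can still push a reader to act.
ACTION_OTHER = {
    "urgent": "adjective",
    "required": "adjective",
    "immediately": "adverb",
    "verification": "noun",
    "confirmation": "noun",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def actionable_words(text, include_other_pos=True):
    """Return (token, part_of_speech) pairs for actionable words, in order."""
    hits = []
    for token in tokenize(text):
        if token in ACTION_VERBS:
            hits.append((token, "verb"))
        elif include_other_pos and token in ACTION_OTHER:
            hits.append((token, ACTION_OTHER[token]))
    return hits


def action_score(text, include_other_pos=True):
    """Fraction of tokens that are actionable; 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(actionable_words(text, include_other_pos)) / len(tokens)


email = "Urgent: account verification required. Click the link immediately."
verbs_only = actionable_words(email, include_other_pos=False)
expanded = actionable_words(email)
print(verbs_only)  # only the verb "click" is found
print(expanded)    # also finds "urgent", "verification", "required", "immediately"
```

On this made-up sentence, the verbs-only pass flags a single word, while the expanded pass flags five of the eight tokens, which is the kind of additional signal the expanded text analysis aims to capture.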