Identifying types of Phishing Websites Based on Web Source Code and URL

Lemon Kazi
Published in Oceanize Lab Geeks
Nov 30, 2017

Phishing is a major security issue for banking and financial institutions. It is a web-based attack: an illegitimate attempt to steal a user's personal information, such as bank details, social security numbers, and credit card details, by posing as a trustworthy entity on a public network.

When users provide confidential information, they are often unaware that the website they are using is a phishing website.

This article presents a technique for detecting phishing website attacks.

The United States Computer Emergency Readiness Team (US-CERT) defines phishing as a form of social engineering that uses email or malicious websites (among other channels) to solicit personal information from an individual or company by posing as a trustworthy organization or entity.

Phishing attacks often use email as a vehicle, sending messages that appear to come from a user or company the individual conducts business with, such as a banking or financial institution, or a web service with which the individual has an account.

The goal of a phishing attempt is to trick the recipient into taking the attacker’s desired action, such as providing login credentials or other sensitive information.

For instance, a phishing email appearing to come from a bank may warn the recipient that their account information has been compromised, directing the individual to a website where their username and/or password can be reset. This website is also fraudulent, designed to look legitimate, but exists solely to collect login information from phishing victims.

These fraudulent websites may also contain malicious code which executes on the user’s local machine when a link is clicked from a phishing email to open the website.

The most common purposes of phishing scams include:

Theft of login credentials:

Typically credentials for accessing online services such as eBay and Hotmail. More recently, the growth of online share trading services has meant that a customer’s trading credentials provide an easy route for international money transfers.

Theft of banking credentials:

Typically the online login credentials of popular high-street banking organizations, which give access to funds ready for transfer.

Observation of credit card details:

Access to a steady stream of credit card details (i.e. card number, expiry and issue dates, cardholder’s name and credit card validation (CCV) number) has immediate value to most criminals.

Capture of address and other personal information:

Any personal information, particularly address information, is in constant demand by direct marketing companies.

Distribution of botnet and DDoS agents:

Criminals use phishing scams to install special bot and DDoS agents on unsuspecting computers and add them to their distributed networks. These agents can be rented to other criminals.

HOW TO IDENTIFY PHISHING ATTACKS

Phishing is most often initiated through email, but there are ways to distinguish suspicious emails from legitimate messages. Training employees to recognize these malicious emails is a must for enterprises that wish to prevent sensitive data loss.

Often, these data leaks occur because employees were not armed with the knowledge they need to help protect critical company data.

  • Emails with generic greetings. Phishing emails often include generic greetings, such as “Hello Bank Customer” rather than using the recipient’s actual name. This is an obvious tell for phishing attacks that are launched in bulk, whereas spear phishing attacks will typically be personalized.
  • Emails requesting personal information. Most legitimate companies will never email customers and ask them to enter login credentials or other private information by clicking on a link to a website. This is a safety measure to help protect consumers and help customers distinguish fraudulent emails from legitimate ones.
  • Emails requesting an urgent response. Most phishing emails attempt to create a sense of urgency, leading recipients to fear that their account is in jeopardy or they will lose access to important information if they don’t act immediately.
  • Emails with spoofed links. Does a hyperlink in the message body actually lead to the page it claims? Never click on these links to find out; instead, hover over the link to see its actual destination. Also, look for URLs beginning with HTTPS. The “S” indicates that the website encrypts traffic between the browser and the server, although encryption alone does not prove that a site is legitimate.

When in doubt, call. If the content of an email is concerning, call the company in question to find out if the email was sent legitimately.

If not, the company is now aware and can take action to warn other customers and users of potential phishing attempts appearing to come from their company.
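The spoofed-link check from the list above can be partly automated. The following is a minimal sketch, not a complete detector: it uses only Python's standard library, and the names LinkCollector, host_of and find_spoofed_links are hypothetical helpers introduced here for illustration. It flags a link only when the visible text looks like a URL or domain and names a different host than the real destination; the exact hostname comparison is a simplifying assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only record text while we are inside a link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(url):
    """Return the lowercase hostname of a URL-like string, or None."""
    if "://" not in url:
        url = "http://" + url  # lets urlparse handle bare domains
    return urlparse(url).hostname


def find_spoofed_links(html):
    """Return (text, href) pairs where the text names a different host."""
    collector = LinkCollector()
    collector.feed(html)
    suspicious = []
    for href, text in collector.links:
        # Skip ordinary link text such as "Click here" (assumption:
        # URL-like text has a dot and no spaces).
        if " " in text or "." not in text:
            continue
        shown, actual = host_of(text), host_of(href)
        if shown and actual and shown != actual:
            suspicious.append((text, href))
    return suspicious
```

For example, a link that displays www.mybank.example but points to http://evil.example.net/login would be returned by find_spoofed_links, while a link whose text is simply "Help page" would be ignored.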

Scripting in the source code:

An ordinary web user has no way of knowing whether a website is malicious. The detection approach consists of the following steps:

a) Web parsing:

Web parsing is the process of parsing all the HTML in the source of the web page.

Each and every HTML tag in the page source is parsed, and markup such as <>, html, br, textbox, regular expressions, etc., is eliminated in this step.
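As an illustration of the parsing step, here is a minimal sketch using Python's standard html.parser module. The article does not say which parser is used, so this choice, the class name SourceParser and the function parse_source are all assumptions. The sketch walks every tag, discards the markup, and keeps the visible text, the inline script bodies, and the addresses of external scripts.

```python
from html.parser import HTMLParser


class SourceParser(HTMLParser):
    """Walks every tag in a page and keeps only text and script content."""

    def __init__(self):
        super().__init__()
        self.text_parts = []      # visible text with tags removed
        self.script_parts = []    # bodies of inline <script> blocks
        self.script_sources = []  # src attributes of external scripts
        self._raw_tag = None      # "script" or "style" while inside one

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._raw_tag = tag
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.script_sources.append(src)

    def handle_endtag(self, tag):
        if tag == self._raw_tag:
            self._raw_tag = None

    def handle_data(self, data):
        data = data.strip()
        if not data:
            return
        if self._raw_tag == "script":
            self.script_parts.append(data)
        elif self._raw_tag is None:
            # Style sheet content is dropped; everything else is page text.
            self.text_parts.append(data)


def parse_source(html):
    """Return (visible text, inline scripts, external script URLs)."""
    parser = SourceParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.script_parts, parser.script_sources
```

Keeping script content separate from page text is a design choice made for this sketch, so that later steps can look at scripting on its own.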

b) Separating the Required Tokens:

Once the source of the web page has been parsed, only the data and information remain; the unwanted links and tags are no longer shown. The required tokens are then separated from this parsed content. A token could be a keyword, an operator, or a punctuation mark.
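Token separation can be sketched with a small regular-expression tokenizer. This is an assumption about how the step might be carried out, not the article's own implementation: the keyword list is an illustrative subset of JavaScript keywords, and the token categories beyond keyword, operator and punctuation (string, number, identifier) are added here only so that every piece of script text has somewhere to go.

```python
import re

# Illustrative subset of JavaScript keywords (assumption, not exhaustive).
KEYWORDS = {"var", "function", "return", "if", "else", "new", "for", "while"}

TOKEN_RE = re.compile(r"""
    (?P<string>"[^"]*"|'[^']*')
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<word>[A-Za-z_$][\w$]*)
  | (?P<operator>===|!==|==|!=|<=|>=|&&|\|\||[+\-*/%=<>!])
  | (?P<punctuation>[()\[\]{};,.:])
""", re.VERBOSE)


def tokenize(source):
    """Split script text into (kind, value) tokens, skipping whitespace."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        value = match.group()
        if kind == "word":
            # Words are either reserved keywords or ordinary identifiers.
            kind = "keyword" if value in KEYWORDS else "identifier"
        tokens.append((kind, value))
    return tokens
```

Calling tokenize on a line such as var u = document.getElementById("user"); yields a keyword token for var, an operator token for =, punctuation tokens for the dot, parentheses and semicolon, and a string token for "user".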

c) Classification of Scripting Tokens:

If any external tokens are found while parsing, they must be classified. These external tokens are created by attackers, in what is generally known as a man-in-the-middle attack. Finally, based on the text identified in the scripts and on the weights assigned to the tokens, the site is classified as either a phishing site or a legitimate site.
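The weight-based decision could look something like the sketch below. The article does not give the actual weights, features or threshold, so every value here (WEIGHTS, SENSITIVE_WORDS, THRESHOLD) is hypothetical and chosen only for illustration, as is the interpretation of an "external token" as a URL whose host differs from the page's own host.

```python
from urllib.parse import urlparse

# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {
    "external_url": 2.0,     # URL token pointing at a host other than the page's
    "ip_address_host": 3.0,  # external URL whose host is a raw IPv4 address
    "sensitive_word": 1.0,   # token such as "password" in script or form text
}
SENSITIVE_WORDS = {"password", "passwd", "login", "ssn", "cvv"}
THRESHOLD = 4.0


def is_ipv4(host):
    """Return True if host looks like a dotted-quad IPv4 address."""
    parts = host.split(".")
    return len(parts) == 4 and all(p.isdecimal() and int(p) <= 255 for p in parts)


def score_tokens(page_url, tokens):
    """Add up the weights of all suspicious tokens found on a page."""
    page_host = urlparse(page_url).hostname
    score = 0.0
    for token in tokens:
        value = token.strip("\"'")  # drop quotes around string tokens
        host = urlparse(value).hostname if "://" in value else None
        if host and host != page_host:
            score += WEIGHTS["external_url"]
            if is_ipv4(host):
                score += WEIGHTS["ip_address_host"]
        elif value.lower() in SENSITIVE_WORDS:
            score += WEIGHTS["sensitive_word"]
    return score


def classify(page_url, tokens):
    """Label a page 'phishing' when its score reaches the threshold."""
    return "phishing" if score_tokens(page_url, tokens) >= THRESHOLD else "legitimate"
```

In practice the weights and threshold would have to be tuned against known phishing and legitimate pages; a fixed hand-picked threshold like the one above is only a starting point.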

