OWASP Phishycat Project
OWASP Phishycat is a phishing detection framework. It detects a phishing page and shows an alert to the user. Currently, it supports only the Chrome browser. The project also includes a plugin called "Phishblocker" that communicates between the browser and the back-end server. It can be extended to other platforms as well.
Description
OWASP Phishycat is a phishing detection framework. The main idea is to guess the original domain that the attacker is trying to phish, and then to test the suspect page with real-time image comparison and DOM analysis of both web pages (original domain and phishing domain). Original domains must be registered in the project database before testing.
Once the framework guesses the domain name, it compares real-time images of both web pages (the phishing site and the original website). An attacker will try to make the page design look as similar to the original website as possible. If both images are similar, the next step is to compare the DOM of both pages. If the DOM does not match, there is a high chance that it is a phishing site, because two web pages are the same only when all of their elements are identical. In this case, we have two websites that look alike but whose DOM differs.
In summary, we take the phishing domain and try to match it against all the registered domain names. If a similar-looking domain name exists in the project database, we run the real-time image check on both web pages. If the image check passes, we proceed with real-time DOM analysis of both pages. Here we assume that the phishing domain name is similar to the original domain, which may not always be the case. In the future, we can add services such as text analysis of the web page to find its key terms, which would help identify the original domain.
Licensing
OWASP Phishycat is free to use. It is licensed under the Apache License 2.0: https://apache.org/licenses/LICENSE-2.0.html
Project Resources
Current GitHub: https://github.com/abhijitio/
How can I participate in your project?
All you have to do is make the Project Leaders aware of the time you have available to contribute to the project. It is also important to let the Leaders know how you would like to contribute and pitch in to help the project meet its goals and milestones. There are many different ways you can contribute to an OWASP Project, but communication with the leads is key.
If I am not a programmer can I participate in your project?
Yes, you can certainly participate in the project if you are not a programmer or not technical. The project needs different skills and expertise at different times during its development. Currently, we are looking for researchers, writers, graphic designers, and a project administrator.
Volunteers
The OWASP Security Principles project is developed by a worldwide team of volunteers. A live update of project contributors is found here.
The first contributors to the project were:
- Abhijit Chatterjee, who created the MVP.
The main idea behind this framework is to guess the original domain that the attacker is trying to phish and then compare the two pages.
Original domains must be registered in our database before testing. Once the framework guesses the domain name, it compares real-time images of both web pages (the phishing site and the original website). An attacker will try to make the page design look as similar to the original website as possible. If both images are similar, the next step is to compare the DOM of both pages. If the DOM does not match, there is a high chance that it is a phishing site, because two web pages are the same only when all of their elements are identical. In this case, we have two websites that look alike but whose DOM differs.
Currently, this project also includes a Chrome browser plugin, “PhishBlocker”, that communicates between the browser and the back-end server. It sends the URL to the back-end Python Flask server. The server performs the phishing test and responds. The user gets a JavaScript alert box in the browser with the text “Phishing Detected”.
It can be integrated with any other platform: you just need to send a POST request to the running server and receive the response.
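As a rough illustration of such an integration, the sketch below builds and sends the POST request using only the Python standard library. The server address, the `/check` route, and the JSON field name `url` are assumptions made for this example; the page does not document the actual route or payload format.

```python
import json
import urllib.request

# Hypothetical values: the real server address, route, and payload
# format are not documented on this page.
SERVER_URL = "http://127.0.0.1:5000/check"


def build_check_request(page_url, server_url=SERVER_URL):
    """Build (but do not send) a POST request carrying the URL to test."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        server_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_url(page_url, server_url=SERVER_URL, timeout=10):
    """Send the request and return the server's response body as text."""
    request = build_check_request(page_url, server_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `check_url("http://ifaceboook.com")` against a running server would return whatever verdict that server sends back; how the caller displays it is up to the integrating platform.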
Here are the steps that Phishycat follows:
1. Suppose an organization has already registered its domain (in most cases, the relevant login page or another important URL) in the Phishycat database. Let’s say the website domain is facebook.com.
2. An attacker uses the domain “ifaceboook.com” for phishing, and this domain looks like the original “facebook.com”. We extract just the word “ifaceboook” from the domain name, without the domain extension, and send it to the server. Removing the extension is tricky: we only handle TLDs, and it is difficult to strip an extension that is not a plain TLD, so we keep adding new extensions to the array whenever we find one. The database already holds the keyword “facebook” along with the keywords of other domains. We use a Bayesian approach to find the closest matching keyword for “ifaceboook”. The keyword “ifaceboook” returns “facebook”, so we now know that the page is trying to phish the Facebook domain.
This will not always work, because the attacker will not necessarily host the phishing page at the index of the domain, and the phishing domain does not have to look similar to the original domain. In the future, we can analyze the text or images in the web page to work out which original domain the attacker is trying to phish. One option is to find the most frequently used words and then reach a conclusion with a machine learning algorithm.
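The word-frequency idea mentioned above could start from something as small as the following sketch. It is not part of the current project: the stop-word list and the function name `top_keywords` are hypothetical, and a real implementation would need a much larger stop-word list and some model on top of the counts.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "and", "for", "you", "your", "with", "into", "that"}


def top_keywords(page_text, count=3):
    """Return the most frequent words in the visible text of a page."""
    words = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter(
        word for word in words if len(word) > 2 and word not in STOP_WORDS
    )
    return [word for word, _ in counts.most_common(count)]
```

For a login page whose text mentions one brand name far more often than any other word, that name would come out first and could then be looked up among the registered keywords.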
Database integration is not done yet. We are storing keywords in a single text file. After database integration, we can map the registered domain URLs by keyword. In our experimental model, we append the “.com” extension to the matched keyword and forward it to the next step. For example, suppose our data text file contains three words: “facebook”, “google” and “yahoo”. Here, “facebook” is the only keyword that is similar to “ifaceboook”, so the match returns “facebook”, and appending “.com” gives “facebook.com”.
So, we have found that “facebook.com” is the domain the attacker is trying to copy. Domains must be registered in our database. This is an area we have to improve, because we cannot be sure that “facebook.com” is the only important page of Facebook; it might be “facebook.com/login”. Once database integration is done, we can point the returned keyword “facebook” to “facebook.com/login”.
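Step 2 can be sketched roughly as follows. This is not the project's Bayesian matcher: it swaps in `difflib.get_close_matches` from the Python standard library as a simple stand-in for the similarity search. The extension list, the 0.8 cutoff, and the function names are assumptions made for this example; the keyword list mirrors the three-word text file described above.

```python
import difflib

# Illustrative extension list; the project keeps adding entries to its array.
KNOWN_EXTENSIONS = [".co.uk", ".com", ".net", ".org"]
# Mirrors the example keyword file: one registered keyword per entry.
REGISTERED_KEYWORDS = ["facebook", "google", "yahoo"]


def extract_keyword(hostname):
    """Strip "www." and a known extension, then keep the last label."""
    host = hostname.lower().strip(".")
    if host.startswith("www."):
        host = host[4:]
    # Try longer extensions first so ".co.uk" wins over a shorter suffix.
    for ext in sorted(KNOWN_EXTENSIONS, key=len, reverse=True):
        if host.endswith(ext):
            host = host[: -len(ext)]
            break
    return host.split(".")[-1]


def guess_original_domain(hostname, keywords=REGISTERED_KEYWORDS, cutoff=0.8):
    """Return the registered domain the hostname imitates, or None."""
    keyword = extract_keyword(hostname)
    matches = difflib.get_close_matches(keyword, keywords, n=1, cutoff=cutoff)
    if not matches:
        return None
    # As in the experimental model, append ".com" to the matched keyword.
    return matches[0] + ".com"
```

With these assumptions, `guess_original_domain("ifaceboook.com")` yields `"facebook.com"`, while a hostname with no close keyword yields `None` and the later checks are skipped.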
3. Domain name analysis alone will not give us a correct result. To avoid false positives, we have to do more. Next, we compare real-time images of the two URLs: the phishing URL (“ifaceboook.com”) and the original URL (“facebook.com/login”). We use Selenium to render the web pages and take a real-time screenshot of each. We then compare both images and calculate the average norm. If both images look exactly the same, their norm should be zero, and their real-time DOM should be the same as well. If the images match but the DOM is different, there is a chance that the page's functionality has been modified while keeping the design the same as the original website. Hence, it is a phishing website.
Here is the logic:
    norm = "comparing images"  # some function here
    if norm == 0:
        "perform DOM check here"  # some function here
    else:
        return "Both images are different. We are skipping the test"
We have challenges here too. An attacker can bypass the image check by making a very slight change to the design. Also, even the original web page may contain animation that changes every time the page loads. We have to find a threshold value so that we can still detect the similarity and continue the test when the design differs only slightly.
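A thresholded version of the image check might look like the sketch below. It assumes the screenshots have already been captured and converted to grayscale grids of equal size (lists of rows of 0-255 values), and it takes "average norm" to mean the mean absolute per-pixel difference, which is one plausible reading rather than the project's confirmed formula. The threshold of 5.0 is an arbitrary placeholder that would need tuning.

```python
def average_norm(image_a, image_b):
    """Mean absolute per-pixel difference between two grayscale images.

    Each image is a list of rows; each row is a list of 0-255 intensities.
    """
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0
    pixels = 0
    for row_a, row_b in zip(image_a, image_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            pixels += 1
    if pixels == 0:
        raise ValueError("images must not be empty")
    return total / pixels


def images_look_alike(image_a, image_b, threshold=5.0):
    """Treat the pages as visually the same when the norm is small enough."""
    return average_norm(image_a, image_b) <= threshold
```

Replacing the strict `norm == 0` test with `images_look_alike(...)` lets the pipeline continue to the DOM check even when an animation or a small design tweak changes a few pixels.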
4. Next, we parse the DOM of the two web pages: the phishing page (“ifaceboook.com”) and the original page (“facebook.com/login”). We then calculate a hash value for each of them using Simhash. If the hashes do not match, the pages are not the same even though they look similar in design. That means something phishy is going on.
Even a very small change in the web page can give us different hash values. There could be elements or scripts in the web page that are not the same on every render, and these would give us a false result. We need to apply a threshold value here too, so that we can ignore differences up to a certain range.
Here is the logic:
    norm = "comparing images"  # some function here
    if norm == 0:
        "perform DOM hash value check here"  # some function here
        if "phishing web page DOM hash value" == "original web page DOM hash value":
            return "No issues, it should be same"
        else:
            return "Something Phishy is going on"
    else:
        return "Both images are different. We are skipping the test"
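The DOM comparison with a tolerance can be sketched as below. This is a minimal, self-contained Simhash written for illustration, not the library the project uses: it tokenizes the serialized DOM into words, hashes each token with MD5, and compares fingerprints by Hamming distance. The tokenizer, the 64-bit width, and the `max_distance` of 3 are all assumptions.

```python
import hashlib
import re


def simhash(text, bits=64):
    """Compute a Simhash fingerprint (bits must be a multiple of 8, <= 128)."""
    weights = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        value = int.from_bytes(digest[: bits // 8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            weights[i] += 1 if (value >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def dom_matches(phishing_dom, original_dom, max_distance=3):
    """True when the two DOM fingerprints differ by at most max_distance bits."""
    distance = hamming_distance(simhash(phishing_dom), simhash(original_dom))
    return distance <= max_distance
```

Under this sketch, a page that passes the image check but fails `dom_matches` would be reported as "Something Phishy is going on", while small render-to-render differences that flip only a few bits would be tolerated.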
Roadmap
As of November, 2013, the highest priorities for the next 6 months are:
- Complete the first draft of the Code Project Template
- Get other people to review the Code Project Template and provide feedback
- Incorporate feedback into changes in the Code Project Template
- Finalize the Code Project template and have it reviewed to be promoted from an Incubator Project to a Lab Project
Subsequent Releases will add
- Internationalization Support
- Additional Unit Tests
- Automated Regression tests
Getting Involved
Involvement in the development and promotion of Code Project Template is actively encouraged! You do not have to be a security expert or a programmer to contribute. Some of the ways you can help are as follows:
Coding
We could implement some of the later items on the roadmap sooner if someone wanted to help out with unit or automated regression tests.
Localization
Are you fluent in another language? Can you help translate the text strings in the Code Project Template into that language?
Testing
Do you have a flair for finding bugs in software? We want to produce a high-quality product, so any help with Quality Assurance would be greatly appreciated. Let us know if you can offer your help.
Feedback
Please use the Code Project Template project mailing list for feedback about:
- What do you like?
- What don't you like?
- What features would you like to see prioritized on the roadmap?
This page is where you should indicate the minimum set of functionality required to make this a useful product that addresses your core security concern. Defining this information helps the project leader think about the critical functionality a user needs for the project to be useful, which in turn helps determine the priorities on the roadmap. It also helps reviewers who are evaluating the project decide whether it provides enough of that critical functionality to be promoted to the next project category.
The Code Project Template must specify the minimum set of tabs a project should have, provide an example layout on each tab, provide instructional text on how a project leader should modify the tab, and give some example text that illustrates how to create an actual project.
It would also be ideal if the sample text was translated into different languages.