
Difference between revisions of "OWASP Phishycat Project"

 
(8 intermediate revisions by the same user not shown)
Line 9: Line 9:
  
 
==Description==
 
==Description==
OWASP Phishycat is a phishing detection framework. The main idea is to guess the original domain that the attacker is trying to phish. It then tests the suspected page by performing real-time image comparison and DOM analysis of both web pages (the original domain and the phishing domain).
+
[[File:P-1.jpg|thumb]]OWASP Phishycat is a phishing detection framework. The main idea is to guess the original domain that the attacker is trying to phish. It then tests the suspected page by performing real-time image comparison and DOM analysis of both web pages (the original domain and the phishing domain).
  
 
Original domains must be registered in the project database before testing. Once Phishycat has guessed the domain name, it compares real-time images of both web pages (the phishing site and the original website). An attacker will try to make the web page design look as similar to the original website as possible. If both images are similar, the next step is to compare the DOM of both web pages. If the DOMs do not match, there is a high chance that it is a phishing site, because two web pages can be considered the same only when all of their elements are identical; in this case, we have two similar-looking websites whose DOMs are different.
 
Original domains must be registered in the project database before testing. Once Phishycat has guessed the domain name, it compares real-time images of both web pages (the phishing site and the original website). An attacker will try to make the web page design look as similar to the original website as possible. If both images are similar, the next step is to compare the DOM of both web pages. If the DOMs do not match, there is a high chance that it is a phishing site, because two web pages can be considered the same only when all of their elements are identical; in this case, we have two similar-looking websites whose DOMs are different.
  
We take the phishing domain and try to match it against all the registered domain names. If a similar-looking domain name exists in the project database, we perform a real-time image check of both web pages. If the image check passes, we proceed with real-time DOM analysis of both web pages.
+
We take the phishing domain and try to match it against all the registered domain names. If a similar-looking domain name exists in the project database, we perform a real-time image comparison of both web pages. If the images match, we proceed with real-time DOM analysis of both web pages.
  
 
Here we are assuming that the phishing domain name is similar to the original domain, which may not always be the case. In the future, we can implement services such as text analysis of the web page to find its focused keywords, which will help us identify the original domain.
 
Here we are assuming that the phishing domain name is similar to the original domain, which may not always be the case. In the future, we can implement services such as text analysis of the web page to find its focused keywords, which will help us identify the original domain.
 +
 +
In our MVP, we do not apply thresholds to image comparison or DOM analysis, so a small change in the image or DOM can give a wrong result. A web page may contain elements that are designed to load dynamically every time the page is refreshed. To avoid this problem, threshold values will be added in the next release of OWASP Phishycat.
  
 
==Licensing==
 
==Licensing==
Line 23: Line 25:
  
 
== Project Resources ==
 
== Project Resources ==
<span style="color:#ff0000">
 
This is where you can link to the key locations for project files, including setup programs, the source code repository, online documentation, a Wiki Home Page, threaded discussions about the project, and Issue Tracking system, etc.
 
</span>
 
  
Current github: https://github.com/abhijitio/
+
Current github: [https://github.com/abhijitio/OWASP-Phishycat https://github.com/abhijitio/]
  
[https://github.com/SamanthaGroves Compiled DLLs]
+
[https://github.com/abhijitio/OWASP-Phishycat Source Code]
  
[https://github.com/SamanthaGroves Source Code]
+
[https://github.com/abhijitio/OWASP-Phishycat/wiki/About-OWASP-PhishyCat Documentation]
  
[https://github.com/SamanthaGroves Documentation]
+
[https://github.com/abhijitio/OWASP-Phishycat/wiki Wiki Home Page]
  
[https://github.com/SamanthaGroves Wiki Home Page]
+
[https://github.com/abhijitio/OWASP-Phishycat/issues Issue Tracker]
  
[https://github.com/SamanthaGroves Issue Tracker]
+
[https://www.owasp.org/index.php/Kolkata OWASP Kolkata]
 
 
[https://github.com/SamanthaGroves Slide Presentation]
 
 
 
[https://github.com/SamanthaGroves Video]
 
  
 
== Project Leader ==
 
== Project Leader ==
<span style="color:#ff0000">
 
A project leader is the individual who decides to lead the project throughout its lifecycle. The project leader is responsible for communicating the project’s progress to the OWASP Foundation, and he/she is ultimately responsible for the project’s deliverables. The project leader must provide OWASP with his/her real name and contact e-mail address for his/her project application to be accepted, as OWASP prides itself on the openness of its products, operations, and members.
 
</span>
 
  
Project leader's name
+
[[User:Abhijitio|Abhijit Chatterjee]]
  
 
== Related Projects ==
 
== Related Projects ==
<span style="color:#ff0000">
+
* [https://www.owasp.org/index.php/Phishing Phishing]
This is where you can link to other OWASP Projects that are similar to yours.  
+
* [https://www.owasp.org/index.php/Content_Spoofing Content_Spoofing]
</span>
 
 
 
* [[OWASP_Code_Tool_Template]]
 
* [[OWASP_Documentation_Project_Template]]
 
  
 
==Classifications==
 
==Classifications==
Line 75: Line 63:
  
 
== News and Events ==
 
== News and Events ==
<span style="color:#ff0000">
+
 
This is where you can provide project updates, links to any events like conference presentations, Project Leader interviews, case studies on successful project implementations, and articles written about your project.
+
* [18 May 2017] 1.0 Release Candidate is available for download. Any feedback (good or bad) in the next few weeks would be greatly appreciated.
</span>
 
* [18 Dec 2013] 1.0 Release Candidate is available for download. This release provides final bug fixes and product stabilization.  Any feedback (good or bad) in the next few weeks would be greatly appreciated.
 
* [20 Nov 2013] 1.0 Beta 2 Release is available for download. This release offers several bug fixes, a few performance improvements, and addressed all outstanding issues from a security audit of the code.
 
* [30 Sep 2013] 1.0 Beta 1 Release is available for download.  This release offers the first version with all of the functionality for a minimum viable product.   
 
  
 
|}
 
|}
Line 103: Line 87:
 
= Road Map and Getting Involved =
 
= Road Map and Getting Involved =
  
<!-- Instructions are in RED and should be removed from your document by deleting the text with the span tags.-->
+
'''WHAT'S COMING'''
The main idea behind this framework is to guess the original domain that the attacker is trying to imitate, and then compare the suspected phishing page against it.
 
 
 
Original domains must be registered in our database before testing. Once Phishycat has guessed the original domain, it compares real-time images of both web pages (the phishing site and the original website). An attacker will try to make the web page design look as similar to the original website as possible. If both images are similar, the next step is to compare the DOM of both web pages. If the DOMs do not match, there is a high chance that it is a phishing site, because two web pages can be considered the same only when all of their elements are identical; in this case, we have two similar-looking websites whose DOMs are different.
 
 
 
Currently, this project also includes a Chrome browser plugin, “PhishBlocker”, which communicates between the browser and the back-end server. It sends the URL to the back-end Python Flask server. The server performs the phishing test and responds. When phishing is detected, the user gets a JavaScript alert box in the browser with the text “Phishing Detected”.
 
 
 
It can be integrated with any other platform: you just need to send a POST request to the running server and read the response.
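
As a rough illustration of such an integration, the sketch below builds and sends the POST request from Python using only the standard library. The endpoint path `/check`, the port, and the JSON field name `url` are assumptions made for this example; the text above does not specify the server's actual route or payload format.

```python
import json
import urllib.request

# Hypothetical endpoint and payload field; the real server's route and
# request format are not documented here and may differ.
SERVER_URL = "http://127.0.0.1:5000/check"


def build_check_request(page_url, server_url=SERVER_URL):
    """Build (but do not send) a POST request carrying the URL to test."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        server_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_url(page_url, server_url=SERVER_URL, timeout=10):
    """Send the request and return the server's response body as text."""
    request = build_check_request(page_url, server_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `check_url("http://ifaceboook.com")` would require a running server; `build_check_request` can be inspected without one.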
 
 
 
Here are the steps that Phishycat follows:
 
 
 
1. Suppose an organization has already registered its domain (in most cases, an appropriate login page or another important URL) in the Phishycat database. Let’s say your website domain is facebook.com.
 
 
 
2. An attacker uses the domain “ifaceboook.com” to phish, and this domain looks like the original “facebook.com”. We extract just the word “ifaceboook” from the domain name, without the domain extension, and send it to the server. Removing the extension is tricky: we only handle the TLDs we already know about, and it is difficult to strip extensions that are not plain TLDs, so we keep adding domain extensions to an array whenever we find a new one. In the database, we already have the keyword “facebook” along with the keywords of other domains. We use a Bayesian approach to find the keyword most similar to “ifaceboook”, and the keyword “ifaceboook” returns “facebook”. So now we know that the page is trying to phish the Facebook domain.
 
 
 
This will not always work, because the attacker will not necessarily host the phishing page at the index of the domain, and the phishing domain does not have to look similar to the original domain. In the future, we can analyze the text or images in the web page to work out which original domain the attacker is trying to phish. Finding the most frequently used words and then reaching a conclusion with a machine learning algorithm would probably help.
 
 
 
Database integration is not done yet; we are storing keywords in a single text file. After database integration, we can map the registered domain URLs by keyword. In our experimental model, we append the “.com” extension to the matched keyword and forward it to the next step. For example, suppose our text file contains three words: “facebook”, “google” and “yahoo”. Here, “facebook” is the only keyword that is similar to “ifaceboook”, so it is returned, and we then append “.com” to get “facebook.com”.
 
 
 
So we have found that “facebook.com” is the domain that the attacker is trying to copy. Domains should be registered in our database, and this is an area we have to improve: we cannot be sure that “facebook.com” is the only important page of Facebook; it might be “facebook.com/login”. Once database integration is done, we can point the returned keyword “facebook” to “facebook.com/login”.
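
The keyword step can be sketched in a few lines of Python. Note that this sketch does not use the Bayesian approach mentioned above: it swaps in `difflib.get_close_matches` from the standard library as a simple stand-in for fuzzy matching. The extension list, the keyword list, the function names, and the `cutoff` value are all assumptions made for illustration.

```python
import difflib

# Assumed list of known extensions; the text says new ones are appended
# to an array whenever they are encountered.
KNOWN_EXTENSIONS = [".co.uk", ".com", ".org", ".net"]

# Stand-in for the keyword text file described above.
REGISTERED_KEYWORDS = ["facebook", "google", "yahoo"]


def extract_keyword(hostname, extensions=KNOWN_EXTENSIONS):
    """Strip a known extension and return the label just before it.

    Returns None when the hostname ends in an extension we do not know.
    """
    host = hostname.lower().rstrip(".")
    # Try longer extensions first so ".co.uk" wins over shorter suffixes.
    for ext in sorted(extensions, key=len, reverse=True):
        if host.endswith(ext):
            return host[: -len(ext)].split(".")[-1]
    return None


def guess_original_domain(hostname, keywords=REGISTERED_KEYWORDS, cutoff=0.6):
    """Return the guessed original domain, or None if nothing is similar."""
    keyword = extract_keyword(hostname)
    if keyword is None:
        return None
    matches = difflib.get_close_matches(keyword, keywords, n=1, cutoff=cutoff)
    # As in the experimental model, ".com" is appended to the matched keyword.
    return matches[0] + ".com" if matches else None
```

With the three sample keywords, `guess_original_domain("ifaceboook.com")` yields `"facebook.com"`, while a hostname with no similar keyword yields `None`.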
 
 
 
3. Domain name analysis alone will not give us a correct result. To avoid false positives, we have to do something more. Next, we compare real-time images of the two URLs, the phishing URL (“ifaceboook.com”) and the original URL (“facebook.com/login”). We use Selenium to render the web pages and take a real-time screenshot of each. We then compare both images and calculate the average norm. If both images look exactly the same, their norm should be equal to zero, and their real-time DOM should be the same as well. If the images match but the DOM is different, there is a chance that the functionality of the web page has been modified while keeping the design the same as the original website. Hence, it is a phishing website.
 
 
 
Here is the logic:

 norm = compare_images()      # some function here
 if norm == 0:
     perform_dom_check()      # some function here
 else:
     return "Both images are different. We are skipping the test"
 
 
 
We have challenges here too. An attacker can bypass the image check by making a very slight change to the design. Also, even the original web page may contain animation that changes every time you load the page. We have to find a threshold value so that we can still detect a match and continue the test when there are only a few changes in the website design.
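
The thresholding idea can be illustrated with a small, dependency-free sketch. The text does not define the “average norm” precisely; this example assumes it means the mean absolute per-pixel difference between two grayscale images of equal size, represented here as nested lists of integers. In practice the screenshots would first have to be decoded into pixel arrays with an imaging library. The function names and the threshold values are hypothetical.

```python
def average_norm(image_a, image_b):
    """Mean absolute per-pixel difference of two equally sized images.

    Images are lists of rows; each row is a list of grayscale values.
    """
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same dimensions")
    total = sum(
        abs(pixel_a - pixel_b)
        for row_a, row_b in zip(image_a, image_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )
    pixels = sum(len(row) for row in image_a)
    return total / pixels if pixels else 0.0


def images_match(image_a, image_b, threshold=0.0):
    """Treat the images as matching when the norm is within the threshold.

    threshold=0.0 reproduces the strict "Norm == 0" check of the MVP.
    """
    return average_norm(image_a, image_b) <= threshold
```

A suitable threshold would have to be found experimentally; too high a value lets unrelated pages through, too low a value lets an attacker evade the check with a slight design change.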
 
 
 
4. Next, we parse the DOM of the two web pages, the phishing page (“ifaceboook.com”) and the original page (“facebook.com/login”). Then we calculate a hash value for each of them using Simhash. If the hashes do not match, they are not the same page even though they look similar in design. That means something phishy is going on.
 
 
 
Even a very small change in the web page can change the hash value, so an exact equality check fails. Some elements or scripts in the web page are not the same in each render, which gives us a false result. We need to apply a threshold value here too, so that differences up to a certain range are ignored.
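
One way such a threshold could work is to compare Simhash fingerprints by Hamming distance instead of strict equality. The sketch below is a simplified, from-scratch Simhash written for illustration; it is not the Simhash library the project uses, and the tokenizer (opening tag names only), the 64-bit width, and the `max_distance` default are assumptions.

```python
import hashlib
import re


def dom_tokens(html):
    """Crude tokenizer: the names of opening tags, in document order."""
    return re.findall(r"<\s*([a-zA-Z][a-zA-Z0-9]*)", html)


def simhash(tokens, bits=64):
    """Simplified Simhash over a token list (bits: multiple of 8, <= 128)."""
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        value = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            weights[i] += 1 if (value >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two fingerprints differ."""
    return bin(hash_a ^ hash_b).count("1")


def dom_matches(html_a, html_b, max_distance=3):
    """Treat two pages as the same when their fingerprints are close."""
    distance = hamming_distance(
        simhash(dom_tokens(html_a)), simhash(dom_tokens(html_b))
    )
    return distance <= max_distance
```

Setting `max_distance=0` reproduces the strict equality check; a small positive value tolerates minor per-render differences.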
 
 
 
Here is the logic:

 norm = compare_images()                          # some function here
 if norm == 0:
     phishing_hash, original_hash = dom_hashes()  # some function here
     if phishing_hash == original_hash:
         return "No issues, it should be same"
     else:
         return "Something Phishy is going on"
 else:
     return "Both images are different. We are skipping the test"
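
The decision logic above can also be written as a small runnable Python function. This is only a sketch: it takes the already computed image norm and DOM hashes as inputs, and the two threshold parameters are hypothetical additions reflecting the planned threshold values. With the default values it behaves like the strict checks described above.

```python
def phishing_test(image_norm, phishing_dom_hash, original_dom_hash,
                  norm_threshold=0.0, hash_distance_threshold=0):
    """Combine the image check and the DOM hash check into one verdict.

    image_norm is assumed to be non-negative; the hashes are integers.
    """
    if image_norm > norm_threshold:
        # The pages do not look alike, so the DOM check is not performed.
        return "Both images are different. We are skipping the test"
    # Count differing bits so that a tolerance can be applied later.
    distance = bin(phishing_dom_hash ^ original_dom_hash).count("1")
    if distance <= hash_distance_threshold:
        return "No issues, it should be same"
    return "Something Phishy is going on"
```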
 
  
 
==Roadmap==
 
==Roadmap==
As of <strong>November, 2013, the highest priorities for the next 6 months</strong> are:
+
As of '''June''' <strong>2017, the highest priorities for the next 3 months</strong> are:
 
<strong>
 
<strong>
* Complete the first draft of the Code Project Template
+
* Complete the text analysis that finds the focused keywords, reducing the dependency on domain name analysis.
* Get other people to review the Code Project Template and provide feedback
+
* Database integration to properly maintain the registered domains.
* Incorporate feedback into changes in the Code Project Template
+
* Get other people to review the Code Project Template and provide feedback.
* Finalize the Code Project template and have it reviewed to be promoted from an Incubator Project to a Lab Project
+
* Develop a plugin for Firefox.
 +
* Mobile Application support.
 
</strong>
 
</strong>
  
 
Subsequent Releases will add
 
Subsequent Releases will add
 
<strong>
 
<strong>
* Internationalization Support
+
* Threshold value for image checking to avoid false positives.
* Additional Unit Tests
+
* Threshold value for DOM analysis to avoid false positives.
* Automated Regression tests
 
 
</strong>
 
</strong>
  
Line 202: Line 125:
  
 
=Minimum Viable Product=
 
=Minimum Viable Product=
<span style="color:#ff0000">
+
The Minimum Viable Product of the OWASP Phishycat Project can be found here: https://github.com/abhijitio/OWASP-Phishycat
This page is where you should indicate what is the minimum set of functionality that is required to make this a useful product that addresses your core security concern.
 
Defining this information helps the project leader to think about what is the critical functionality that a user needs for this project to be useful, thereby helping determine what the priorities should be on the roadmap. And it also helps reviewers who are evaluating the project to determine if the functionality sufficiently provides the critical functionality to determine if the project should be promoted to the next project category. 
 
</span>
 
 
 
The Code Project Template must specify the minimum set of tabs a project should have, provide some an example layout on each tab, provide instructional text on how a project leader should modify the tab, and give some example text that illustrates how to create an actual project.
 
  
It would also be ideal if the sample text was translated into different languages.
+
Wiki Home: https://github.com/abhijitio/OWASP-Phishycat/wiki
 
__NOTOC__ <headertabs></headertabs>  
 
__NOTOC__ <headertabs></headertabs>  
  

Latest revision as of 20:55, 5 June 2017


OWASP Phishycat Project

OWASP Phishycat is a phishing detection framework. It detects a phishing page and shows an alert to the user. Currently, it supports only the Chrome browser. The project also includes a plugin called "Phishblocker" that communicates between the browser and the back-end server. It can be extended to other platforms as well.

Description

OWASP Phishycat is a phishing detection framework. The main idea is to guess the original domain that the attacker is trying to phish. It then tests the suspected page by performing real-time image comparison and DOM analysis of both web pages (the original domain and the phishing domain).

Original domains must be registered in the project database before testing. Once Phishycat has guessed the domain name, it compares real-time images of both web pages (the phishing site and the original website). An attacker will try to make the web page design look as similar to the original website as possible. If both images are similar, the next step is to compare the DOM of both web pages. If the DOMs do not match, there is a high chance that it is a phishing site, because two web pages can be considered the same only when all of their elements are identical; in this case, we have two similar-looking websites whose DOMs are different.

We take the phishing domain and try to match it against all the registered domain names. If a similar-looking domain name exists in the project database, we perform a real-time image comparison of both web pages. If the images match, we proceed with real-time DOM analysis of both web pages.

Here we are assuming that the phishing domain name is similar to the original domain, which may not always be the case. In the future, we can implement services such as text analysis of the web page to find its focused keywords, which will help us identify the original domain.

In our MVP, we do not apply thresholds to image comparison or DOM analysis, so a small change in the image or DOM can give a wrong result. A web page may contain elements that are designed to load dynamically every time the page is refreshed. To avoid this problem, threshold values will be added in the next release of OWASP Phishycat.

Licensing

OWASP Phishycat is free to use. It is licensed under the Apache License 2.0 https://apache.org/licenses/LICENSE-2.0.html

Project Resources

Current github: https://github.com/abhijitio/

Source Code

Documentation

Wiki Home Page

Issue Tracker

OWASP Kolkata

Project Leader

Abhijit Chatterjee

Related Projects

Classifications

Project Type: Code
Incubator Project
Builders
Defenders
Affero General Public License 3.0

News and Events

  • [18 May 2017] 1.0 Release Candidate is available for download. Any feedback (good or bad) in the next few weeks would be greatly appreciated.

How can I participate in your project?

All you have to do is make the Project Leader aware of your available time to contribute to the project. It is also important to let the Leader know how you would like to contribute and pitch in to help the project meet its goals and milestones. There are many different ways you can contribute to an OWASP Project, but communication with the leads is key.

If I am not a programmer can I participate in your project?

Yes, you can certainly participate in the project if you are not a programmer or not technical. The project needs different skills and expertise at different times during its development. Currently, we are looking for researchers, writers, graphic designers, and a project administrator.

Volunteers

The OWASP Security Principles project is developed by a worldwide team of volunteers. A live update of project contributors is found here.

The first contributors to the project were:

WHAT'S COMING

Roadmap

As of June 2017, the highest priorities for the next 3 months are:

  • Complete the text analysis that finds the focused keywords, reducing the dependency on domain name analysis.
  • Database integration to properly maintain the registered domains.
  • Get other people to review the Code Project Template and provide feedback.
  • Develop a plugin for Firefox.
  • Mobile Application support.

Subsequent Releases will add

  • Threshold value for image checking to avoid false positives.
  • Threshold value for DOM analysis to avoid false positives.

Getting Involved

Involvement in the development and promotion of Code Project Template is actively encouraged! You do not have to be a security expert or a programmer to contribute. Some of the ways you can help are as follows:

Coding

We could implement some of the later items on the roadmap sooner if someone wanted to help out with unit or automated regression tests.

Localization

Are you fluent in another language? Can you help translate the text strings in the Code Project Template into that language?

Testing

Do you have a flair for finding bugs in software? We want to produce a high-quality product, so any help with Quality Assurance would be greatly appreciated. Let us know if you can offer your help.

Feedback

Please use the Code Project Template project mailing list for feedback about:

  • What do you like?
  • What don't you like?
  • What features would you like to see prioritized on the roadmap?

The Minimum Viable Product of the OWASP Phishycat Project can be found here: https://github.com/abhijitio/OWASP-Phishycat

Wiki Home: https://github.com/abhijitio/OWASP-Phishycat/wiki