This site is the archived OWASP Foundation Wiki and is no longer accepting Account Requests.
To view the new OWASP Foundation website, please visit https://owasp.org

Difference between revisions of "Benchmark"

From OWASP
m
(Added Kiuwan to the tool scanning tips section)
 
(13 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
= Main =  
 
= Main =  
  <div style="width:100%;height:100px;border:0,margin:0;overflow: hidden;">[[File:Incubator_big.jpg|link=OWASP_Project_Stages#tab.3DLab_Projects]]</div>
+
  <div style="width:100%;height:100px;border:0,margin:0;overflow: hidden;">[[File:Lab_big.jpg|link=OWASP_Project_Stages#tab.3DLab_Projects]]</div>
 
{| style="padding: 0;margin:0;margin-top:10px;text-align:left;" |-
 
{| style="padding: 0;margin:0;margin-top:10px;text-align:left;" |-
 
| valign="top"  style="border-right: 1px dotted gray;padding-right:25px;" |
 
| valign="top"  style="border-right: 1px dotted gray;padding-right:25px;" |
Line 7: Line 7:
 
The OWASP Benchmark for Security Automation (OWASP Benchmark) is a free and open test suite designed to evaluate the speed, coverage, and accuracy of automated software vulnerability detection tools and services (henceforth simply referred to as 'tools'). Without the ability to measure these tools, it is difficult to understand their strengths and weaknesses, and compare them to each other. Each version of the OWASP Benchmark contains thousands of test cases that are fully runnable and exploitable, each of which maps to the appropriate CWE number for that vulnerability.
 
The OWASP Benchmark for Security Automation (OWASP Benchmark) is a free and open test suite designed to evaluate the speed, coverage, and accuracy of automated software vulnerability detection tools and services (henceforth simply referred to as 'tools'). Without the ability to measure these tools, it is difficult to understand their strengths and weaknesses, and compare them to each other. Each version of the OWASP Benchmark contains thousands of test cases that are fully runnable and exploitable, each of which maps to the appropriate CWE number for that vulnerability.
  
You can use the OWASP Benchmark with [[Source_Code_Analysis_Tools | Static Application Security Testing (SAST)]] tools, [[:Category:Vulnerability_Scanning_Tools | Dynamic Application Security Testing (DAST)]] tools like OWASP [[ZAP]] and Interactive Application Security Testing (IAST) tools. The current version of the Benchmark is implemented in Java.  Future versions may expand to include other languages.
+
You can use the OWASP Benchmark with [[Source_Code_Analysis_Tools | Static Application Security Testing (SAST)]] tools, [[:Category:Vulnerability_Scanning_Tools | Dynamic Application Security Testing (DAST)]] tools like OWASP [[ZAP]] and Interactive Application Security Testing (IAST) tools. Benchmark is implemented in Java.  Future versions may expand to include other languages.
  
 
==Benchmark Project Scoring Philosophy==
 
==Benchmark Project Scoring Philosophy==
Line 51: Line 51:
  
 
Anyone can use this Benchmark to evaluate vulnerability detection tools. The basic steps are:
 
Anyone can use this Benchmark to evaluate vulnerability detection tools. The basic steps are:
# Download the Benchmark from github
+
# Download the Benchmark from GitHub
 
# Run your tools against the Benchmark
 
# Run your tools against the Benchmark
 
# Run the BenchmarkScore tool on the reports from your tools
 
# Run the BenchmarkScore tool on the reports from your tools
Line 393: Line 393:
 
'''Free Static Application Security Testing (SAST) Tools:'''
 
'''Free Static Application Security Testing (SAST) Tools:'''
  
* [http://pmd.sourceforge.net/ PMD] (which really has no security rules) - .xml results file
+
* [https://pmd.github.io/ PMD] (which really has no security rules) - .xml results file
* [http://findbugs.sourceforge.net/ Findbugs] - .xml results file
+
* [http://findbugs.sourceforge.net/ FindBugs] - .xml results file (Note: FindBugs hasn't been updated since 2015. Use SpotBugs instead (see below))
* FindBugs with the [http://h3xstream.github.io/find-sec-bugs/ FindSecurityBugs plugin] - .xml results file
+
* [https://www.sonarqube.org/downloads/ SonarQube] - .xml results file
* [http://www.sonarqube.org/downloads/ SonarQube] - .xml results file
+
* [https://spotbugs.github.io/ SpotBugs] - .xml results file. This is the successor to FindBugs.
* [http://www.rigs-it.net/index.php/product.html XANITIZER] - (Requires registration to download) - xml results file
+
* SpotBugs with the [http://find-sec-bugs.github.io/ FindSecurityBugs plugin] - .xml results file
  
 
Note: We looked into supporting [http://checkstyle.sourceforge.net/ Checkstyle] but it has no security rules, just like PMD. The [http://fb-contrib.sourceforge.net/ fb-contrib] FindBugs plugin doesn't have any security rules either. We did test [http://errorprone.info/ Error Prone], and found that it does report some use of [http://errorprone.info/bugpattern/InsecureCipherMode) insecure ciphers (CWE-327)], but that's it.
 
Note: We looked into supporting [http://checkstyle.sourceforge.net/ Checkstyle] but it has no security rules, just like PMD. The [http://fb-contrib.sourceforge.net/ fb-contrib] FindBugs plugin doesn't have any security rules either. We did test [http://errorprone.info/ Error Prone], and found that it does report some use of [http://errorprone.info/bugpattern/InsecureCipherMode) insecure ciphers (CWE-327)], but that's it.
Line 403: Line 403:
 
'''Commercial SAST Tools:'''
 
'''Commercial SAST Tools:'''
  
* [http://www.castsoftware.com/products/application-intelligence-platform CAST Application Intelligence Platform (AIP)] - .xml results file
+
* [https://www.castsoftware.com/products/application-intelligence-platform CAST Application Intelligence Platform (AIP)] - .xml results file
* [https://www.checkmarx.com/technology/static-code-analysis-sca/ Checkmarx CxSAST] - .xml results file
+
* [https://www.checkmarx.com/products/static-application-security-testing/ Checkmarx CxSAST] - .xml results file
* [https://www.synopsys.com/software-integrity/resources/datasheets/coverity.html Synopsys Static Analysis (Formerly Coverity Code Advisor) (On-Demand and stand-alone versions)] - .json results file
+
* [https://www.ibm.com/us-en/marketplace/ibm-appscan-source IBM AppScan Source (Standalone and Cloud)] - .ozasmt or .xml results file
* [https://software.microfocus.com/en-us/software/sca Micro Focus (Formerly HPE) Fortify (On-Demand and stand-alone versions)] - .fpr results file
+
* [https://juliasoft.com/solutions/julia-for-security/ Julia Analyzer] - .xml results file
* [http://www-03.ibm.com/software/products/en/appscan-source IBM AppScan Source (Standalone and Cloud)] - .ozasmt or .xml results file
+
* [https://www.kiuwan.com/code-security-sast/ Kiuwan Code Security] - .threadfix results file
* [https://www.juliasoft.com/eng/solutions/overview Julia Analyzer] - .xml results file
+
* [https://software.microfocus.com/en-us/products/static-code-analysis-sast/overview Micro Focus (Formerly HPE) Fortify (On-Demand and stand-alone versions)] - .fpr results file
* [https://www.parasoft.com/product/jtest/ Parasoft Jtest] - .xml results file
+
* [https://www.parasoft.com/products/jtest/ Parasoft Jtest] - .xml results file
 +
* [https://semmle.com/lgtm Semmle LGTM] - .sarif results file
 +
* [https://www.shiftleft.io/product/ ShiftLeft SAST] - .sl results file (Benchmark specific format. Ask vendor how to generate this)
 +
* [https://snappycodeaudit.com/category/static-code-analysis Snappycode Audit's SnappyTick Source Edition (SAST)] - .xml results file
 
* [https://www.sourcemeter.com/features/ SourceMeter] - .txt results file of ALL results from VulnerabilityHunter
 
* [https://www.sourcemeter.com/features/ SourceMeter] - .txt results file of ALL results from VulnerabilityHunter
* [http://www.veracode.com/products/binary-static-analysis-sast Veracode SAST] - .xml results file
+
* [https://www.synopsys.com/content/dam/synopsys/sig-assets/datasheets/SAST-Coverity-datasheet.pdf Synopsys Static Analysis (Formerly Coverity Code Advisor) (On-Demand and stand-alone versions)] - .json results file (You can scan Benchmark w/Coverity for free. See: https://scan.coverity.com/)
 +
* [https://www.defensecode.com/thunderscan.php Thunderscan SAST] - .xml results file
 +
* [https://www.veracode.com/products/binary-static-analysis-sast Veracode SAST] - .xml results file
 +
* [https://www.rigs-it.com/xanitizer/ XANITIZER] - .xml results file ([https://www.rigs-it.com/wp-content/uploads/2018/03/howtosetupxanitizerforowaspbenchmarkproject.pdf Their white paper on how to set up Xanitizer to scan Benchmark.]) (Free trial available)
  
We are looking for results for other commercial static analysis tools like: [http://www.grammatech.com/codesonar Grammatech CodeSonar], [http://www.klocwork.com/products-services/klocwork Klocwork], etc. If you have a license for any static analysis tool not already listed above and can run it on the Benchmark and send us the results file that would be very helpful.  
+
We are looking for results for other commercial static analysis tools like: [https://www.grammatech.com/products/codesonar Grammatech CodeSonar], [https://www.roguewave.com/products-services/klocwork RogueWave's Klocwork], etc. If you have a license for any static analysis tool not already listed above and can run it on the Benchmark and send us the results file that would be very helpful.  
  
 
The free SAST tools come bundled with the Benchmark so you can run them yourselves. If you have a license for any commercial SAST tool, you can also run them against the Benchmark. Just put your results files in the /results folder of the project, and then run the BenchmarkScore script for your platform (.sh / .bat) and it will generate a scorecard in the /scorecard directory for all the tools you have results for that are currently supported.
 
The free SAST tools come bundled with the Benchmark so you can run them yourselves. If you have a license for any commercial SAST tool, you can also run them against the Benchmark. Just put your results files in the /results folder of the project, and then run the BenchmarkScore script for your platform (.sh / .bat) and it will generate a scorecard in the /scorecard directory for all the tools you have results for that are currently supported.
Line 429: Line 435:
 
'''Commercial DAST Tools:'''
 
'''Commercial DAST Tools:'''
  
* [https://www.acunetix.com/vulnerability-scanner/ Acunetix Web Vulnerability Scanner (WVS)] - .xml results file (Generated using [http://www.acunetix.com/blog/docs/acunetix-wvs-cli-operation/ command line interface] /ExportXML switch)
+
* [https://www.acunetix.com/vulnerability-scanner/ Acunetix Web Vulnerability Scanner (WVS)] - .xml results file (Generated using [https://www.acunetix.com/resources/wvs7manual.pdf command line interface (see Chapter 10.)] /ExportXML switch)
* [https://portswigger.net/burp/ Burp Pro] - .xml results file
+
* [https://portswigger.net/burp Burp Pro] - .xml results file
**You must use Burp Pro v1.6.30+ to scan the Benchmark due to a limitation fixed in v1.6.30.
+
* [https://www.ibm.com/us-en/marketplace/appscan-standard IBM AppScan] - .xml results file
* [https://software.microfocus.com/en-us/software/webinspect Micro Focus (Formerly HPE) WebInspect] - .xml results file
+
* [https://software.microfocus.com/en-us/products/webinspect-dynamic-analysis-dast/overview Micro Focus (Formerly HPE) WebInspect] - .xml results file
* [http://www-03.ibm.com/software/products/en/appscan IBM AppScan] - .xml results file
 
 
* [https://www.netsparker.com/web-vulnerability-scanner/ Netsparker] - .xml results file
 
* [https://www.netsparker.com/web-vulnerability-scanner/ Netsparker] - .xml results file
 +
* [https://www.qualys.com/apps/web-app-scanning/ Qualys Web App Scanner] - .xml results file
 
* [https://www.rapid7.com/products/appspider/ Rapid7 AppSpider] - .xml results file
 
* [https://www.rapid7.com/products/appspider/ Rapid7 AppSpider] - .xml results file
 
* Qualys - We ran Qualys against v1.2 of the Benchmark and it found none of the vulnerabilities we test for as far as we could tell. So we haven't implemented a scorecard generator for it. If you get results where you think it does find some real issues, send us the results file and, if confirmed, we'll produce a scorecard generator for it.
 
  
 
If you have access to other DAST Tools, PLEASE RUN THEM FOR US against the Benchmark, and send us the results file so we can build a scorecard generator for that tool.
 
If you have access to other DAST Tools, PLEASE RUN THEM FOR US against the Benchmark, and send us the results file so we can build a scorecard generator for that tool.
Line 443: Line 447:
 
'''Commercial Interactive Application Security Testing (IAST) Tools:'''
 
'''Commercial Interactive Application Security Testing (IAST) Tools:'''
  
* [https://www.contrastsecurity.com/interactive-application-security-testing-iast Contrast Assess] - .zip results file
+
* [https://www.contrastsecurity.com/interactive-application-security-testing-iast Contrast Assess] - .zip results file (You can scan Benchmark w/Contrast for free. See: https://www.contrastsecurity.com/contrast-community-edition)
 
* [https://hdivsecurity.com/interactive-application-security-testing-iast Hdiv Detection (IAST)] - .hlg results file
 
* [https://hdivsecurity.com/interactive-application-security-testing-iast Hdiv Detection (IAST)] - .hlg results file
 +
* [https://www.synopsys.com/software-integrity/security-testing/interactive-application-security-testing.html Seeker IAST] - .csv results file
  
 
'''Commercial Hybrid Analysis Application Security Testing Tools:'''
 
'''Commercial Hybrid Analysis Application Security Testing Tools:'''
Line 494: Line 499:
  
 
  GIT: http://git-scm.com/ or https://github.com/
 
  GIT: http://git-scm.com/ or https://github.com/
  Maven: https://maven.apache.org/  (Version: 3.2.3 or newer works. We heard that 3.0.5 throws an error.)
+
  Maven: https://maven.apache.org/  (Version: 3.2.3 or newer works.)
  Java: http://www.oracle.com/technetwork/java/javase/downloads/index.html (Java 7 or 8) (64-bit) - Takes ALOT of memory to compile the Benchmark.
+
  Java: http://www.oracle.com/technetwork/java/javase/downloads/index.html (Java 7 or 8) (64-bit)
  
 
==Getting, Building, and Running the Benchmark==
 
==Getting, Building, and Running the Benchmark==
Line 513: Line 518:
 
We have several preconstructed VMs or instructions on how to build one that you can use instead:
 
We have several preconstructed VMs or instructions on how to build one that you can use instead:
  
* Docker: A Dockerfile is checked into the project [https://github.com/OWASP/Benchmark/blob/master/VMs/Dockerfile here]. This Docker file should automatically produce a Docker VM that has the latest and greatest version of the Benchmark project files. After you have Docker installed, navigate to this directory and do the following:  
+
* Docker: A Dockerfile is checked into the project [https://github.com/OWASP/Benchmark/blob/master/VMs/Dockerfile here]. This Dockerfile should automatically produce a Docker VM with the latest Benchmark project files. After you have Docker installed, cd to /VMs, then run: 
  docker build -t benchmark:v1.2 .   --> This builds the Docker Benchmark VM (This will take a WHILE)
+
  ./buildDockerImage.sh --> This builds the Docker Benchmark VM (This will take a WHILE)
  docker images   --> You should see this new image in the list provided
+
  docker images --> You should see the new benchmark:latest image in the list provided
  # The above 2 steps only have to be done once. Then, to run the Benchmark in your Docker VM, just do this:
+
  # The Benchmark Docker Image only has to be created once.  
docker run -p 8443:8443 -it benchmark:v1.2 /benchmark/bench.sh  --> Clones Benchmark from github, builds everything, and starts a remotely accessible Benchmark web app.
+
 
 +
To run the Benchmark in your Docker VM, just run:
 +
  ./runDockerImage.sh  --> This pulls in any updates to Benchmark since the Image was built, builds everything, and starts a remotely accessible Benchmark web app.
 
  If successful, you should see this at the end:
 
  If successful, you should see this at the end:
 
   [INFO] [talledLocalContainer] Tomcat 8.x started on port [8443]
 
   [INFO] [talledLocalContainer] Tomcat 8.x started on port [8443]
 
   [INFO] Press Ctrl-C to stop the container...
 
   [INFO] Press Ctrl-C to stop the container...
  docker-machine ls (in a different window) --> To get IP Docker VM is exporting (e.g., tcp://192.168.99.100:2376)
+
  Then simply navigate to: https://localhost:8443/benchmark from the machine on which you are running Docker
In a browser, navigate to: https://192.168.99.100:8443/benchmark (using the above IP as an example)
+
 +
Or if you want to access from a different machine:
 +
  docker-machine ls (in a different terminal) --> To get IP Docker VM is exporting (e.g., tcp://192.168.99.100:2376)
 +
  Navigate to: https://192.168.99.100:8443/benchmark in your browser (using the above IP as an example)
 +
 
 
* Amazon Web Services (AWS) - Here's how you set up the Benchmark on an AWS VM:
 
* Amazon Web Services (AWS) - Here's how you set up the Benchmark on an AWS VM:
 
 
  sudo yum install git
 
  sudo yum install git
 
  sudo yum install maven
 
  sudo yum install maven
Line 627: Line 637:
  
 
[http://h3xstream.github.io/find-sec-bugs/ FindSecurityBugs] is a great plugin for FindBugs that significantly increases the ability of FindBugs to find security issues. We include this free tool in the Benchmark and it's all dialed in. Simply run the script: ./script/runFindSecBugs.(sh or bat). If you want to run a different version of FindSecBugs, just change the version number of the findsecbugs-plugin artifact in the Benchmark pom.xml file.
 
[http://h3xstream.github.io/find-sec-bugs/ FindSecurityBugs] is a great plugin for FindBugs that significantly increases the ability of FindBugs to find security issues. We include this free tool in the Benchmark and it's all dialed in. Simply run the script: ./script/runFindSecBugs.(sh or bat). If you want to run a different version of FindSecBugs, just change the version number of the findsecbugs-plugin artifact in the Benchmark pom.xml file.
 +
 +
=== Kiuwan Code Security ===
 +
 +
Kiuwan Code Security includes a predefined model for executing the OWASP benchmark. Refer to the [https://www.kiuwan.com/blog/owasp-benchmark-diy/ step-by-step instructions] on the Kiuwan website.
  
 
=== Micro Focus (Formerly HP) Fortify ===
 
=== Micro Focus (Formerly HP) Fortify ===
Line 726: Line 740:
  
 
* In Terminal 1, launch the Benchmark application and wait until it starts
 
* In Terminal 1, launch the Benchmark application and wait until it starts
   '''$  ./runBenchmark_wContrast.sh''' (.bat on Windows)
+
   '''$ cd tools/Contrast  
 +
  '''$ ./runBenchmark_wContrast.sh''' (.bat on Windows)
 
   '''[INFO] Scanning for projects...
 
   '''[INFO] Scanning for projects...
 
   '''[INFO]                                                                         
 
   '''[INFO]                                                                         
Line 754: Line 769:
 
   '''Copying Contrast report to results directory'''
 
   '''Copying Contrast report to results directory'''
  
* Generate scorecards in /Benchmark/scorecard
+
* In Terminal 2, generate scorecards in /Benchmark/scorecard
 
   '''$ ./createScorecards.sh''' (.bat on Windows)
 
   '''$ ./createScorecards.sh''' (.bat on Windows)
 
   '''Analyzing results from Benchmark_1.2-Contrast.log
 
   '''Analyzing results from Benchmark_1.2-Contrast.log
Line 783: Line 798:
 
While we don't have hard and fast rules of exactly what we are going to do next, enhancements in the following areas are planned for the next release:
 
While we don't have hard and fast rules of exactly what we are going to do next, enhancements in the following areas are planned for the next release:
  
* Add new vulnerability categories (e.g., Hibernate Injection)
+
* Add new vulnerability categories (e.g., XXE, Hibernate Injection)
 
* Add support for popular server side Java frameworks (e.g., Spring)
 
* Add support for popular server side Java frameworks (e.g., Spring)
 
* Add web services test cases
 
* Add web services test cases

Latest revision as of 03:05, 20 December 2019


OWASP Benchmark Project

The OWASP Benchmark for Security Automation (OWASP Benchmark) is a free and open test suite designed to evaluate the speed, coverage, and accuracy of automated software vulnerability detection tools and services (henceforth simply referred to as 'tools'). Without the ability to measure these tools, it is difficult to understand their strengths and weaknesses, and compare them to each other. Each version of the OWASP Benchmark contains thousands of test cases that are fully runnable and exploitable, each of which maps to the appropriate CWE number for that vulnerability.

You can use the OWASP Benchmark with Static Application Security Testing (SAST) tools, Dynamic Application Security Testing (DAST) tools like OWASP ZAP and Interactive Application Security Testing (IAST) tools. Benchmark is implemented in Java. Future versions may expand to include other languages.

Benchmark Project Scoring Philosophy

Security tools (SAST, DAST, and IAST) are amazing when they find a complex vulnerability in your code. But with widespread misunderstanding of the specific vulnerabilities automated tools cover, end users are often left with a false sense of security.

We are on a quest to measure just how good these tools are at discovering and properly diagnosing security problems in applications. We rely on the long history of military and medical evaluation of detection technology as a foundation for our research. Therefore, the test suite tests both real and fake vulnerabilities.

There are four possible test outcomes in the Benchmark:

  1. Tool correctly identifies a real vulnerability (True Positive - TP)
  2. Tool fails to identify a real vulnerability (False Negative - FN)
  3. Tool correctly ignores a false alarm (True Negative - TN)
  4. Tool fails to ignore a false alarm (False Positive - FP)
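
As a minimal sketch of how these four outcomes could be tallied, the snippet below classifies each test case from two facts: whether the test case contains a real vulnerability, and whether the tool reported one. The function names and the sample data are illustrative assumptions and are not taken from the Benchmark's own scoring code.

```python
from collections import Counter


def classify(is_real_vulnerability: bool, tool_reported: bool) -> str:
    """Map one test case to TP, FN, TN, or FP (hypothetical helper)."""
    if is_real_vulnerability:
        return "TP" if tool_reported else "FN"
    return "FP" if tool_reported else "TN"


def tally(cases):
    """Count outcomes over (is_real_vulnerability, tool_reported) pairs."""
    return Counter(classify(real, reported) for real, reported in cases)


# Made-up sample: three real vulnerabilities and two fake ones.
sample_cases = [
    (True, True),    # real, reported     -> TP
    (True, False),   # real, missed       -> FN
    (False, False),  # fake, ignored      -> TN
    (False, True),   # fake, reported     -> FP
    (True, True),    # real, reported     -> TP
]
counts = tally(sample_cases)
print(dict(counts))
```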

We can learn a lot about a tool from these four metrics. Consider a tool that simply flags every line of code as vulnerable. This tool will perfectly identify all vulnerabilities! But it will also have 100% false positives and thus adds no value. Similarly, consider a tool that reports absolutely nothing. This tool will have zero false positives, but will also identify zero real vulnerabilities and is also worthless. You can even imagine a tool that flips a coin to decide whether to report each test case as vulnerable. The result would be a 50% true positive rate and a 50% false positive rate. We need a way to distinguish valuable security tools from these trivial ones.

If you imagine the line that connects all these points, from 0,0 to 100,100, it roughly translates to "random guessing." The ultimate measure of a security tool is how much better it can do than this line. The diagram below shows how we will evaluate security tools against the Benchmark.
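
To make the "random guessing" line concrete, here is a small sketch that computes the True Positive Rate and False Positive Rate for the three trivial tools described above. It assumes a hypothetical suite of 100 real and 100 fake vulnerabilities, and the coin-flip counts are the idealized expected split, not measured results. The `rates` helper is an illustrative name.

```python
def rates(tp: int, fn: int, tn: int, fp: int):
    """Return (true positive rate, false positive rate) from outcome counts."""
    tpr = tp / (tp + fn)  # share of real vulnerabilities that were reported
    fpr = fp / (fp + tn)  # share of fake vulnerabilities that were reported
    return tpr, fpr


# Hypothetical suite: 100 real vulnerabilities, 100 fake ones.
flag_everything = rates(tp=100, fn=0, tn=0, fp=100)
report_nothing = rates(tp=0, fn=100, tn=100, fp=0)
coin_flip = rates(tp=50, fn=50, tn=50, fp=50)  # idealized expectation

for name, (tpr, fpr) in [
    ("flag everything", flag_everything),
    ("report nothing", report_nothing),
    ("coin flip", coin_flip),
]:
    # A tool sits on the guessing line whenever TPR equals FPR.
    print(name, tpr, fpr, "on guessing line:", tpr == fpr)
```

In this sketch all three tools land on the diagonal because their true positive rate equals their false positive rate, which is why none of them is better than guessing.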

[Figure: Wbe guide.png]

A point plotted on this chart provides a visual indication of how well a tool did considering both the True Positives the tool reported, as well as the False Positives it reported. We also want to compute an individual score for that point in the range 0 - 100, which we call the Benchmark Accuracy Score.

The Benchmark Accuracy Score is essentially a Youden Index, which is a standard way of summarizing the accuracy of a set of tests. Youden's index is one of the oldest measures of diagnostic accuracy. It is also a global measure of test performance, used to evaluate the overall discriminative power of a diagnostic procedure and to compare one test with other tests. Youden's index is calculated by deducting 1 from the sum of a test’s sensitivity and specificity, expressed not as percentages but as fractions: (sensitivity + specificity) – 1. For a test with poor diagnostic accuracy, Youden's index equals 0, and in a perfect test Youden's index equals 1.

 So for example, if a tool has a True Positive Rate (TPR) of .98 (i.e., 98%) 
   and False Positive Rate (FPR) of .05 (i.e., 5%)
 Sensitivity = TPR (.98)
 Specificity = 1-FPR (.95)
 So the Youden Index is (.98+.95) - 1 = .93
 
 And this would equate to a Benchmark score of 93 (since we normalize this to the range 0 - 100)

On the graph, the Benchmark Score is the length of the line from the point down to the diagonal “guessing” line. Note that a Benchmark score can actually be negative if the point is below the line. This happens when the False Positive Rate is higher than the True Positive Rate.
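
The calculation above can be sketched in a few lines. This is an illustration of the formula as described on this page, not the BenchmarkScore tool's actual implementation; the function name `benchmark_score` and the second input pair are assumptions chosen for the example.

```python
def benchmark_score(tpr: float, fpr: float) -> float:
    """Youden index scaled to the 0 - 100 range, as described in the text."""
    sensitivity = tpr
    specificity = 1.0 - fpr
    youden_index = (sensitivity + specificity) - 1.0
    return youden_index * 100.0


# The worked example from the text: TPR of .98 and FPR of .05.
example = benchmark_score(tpr=0.98, fpr=0.05)

# A made-up tool that reports more false alarms than real findings,
# which places it below the guessing line and gives a negative score.
worse_than_guessing = benchmark_score(tpr=0.20, fpr=0.50)

# Rounded for display, since floating-point arithmetic is not exact.
print(round(example, 2), round(worse_than_guessing, 2))
```

Because sensitivity + specificity - 1 simplifies to TPR - FPR, any tool with equal rates scores 0, matching the trivial tools that sit on the guessing line.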

Benchmark Validity

The Benchmark tests are not exactly like real applications. The tests are derived from coding patterns observed in real applications, but the majority of them are considerably simpler than real applications. That is, most real world applications will be considerably harder to successfully analyze than the OWASP Benchmark Test Suite. Although the tests are based on real code, it is possible that some tests may have coding patterns that don't occur frequently in real code.

Remember, we are trying to test the capabilities of the tools and make them explicit, so that users can make informed decisions about what tools to use, how to use them, and what results to expect. This is exactly aligned with the OWASP mission to make application security visible.

Generating Benchmark Scores

Anyone can use this Benchmark to evaluate vulnerability detection tools. The basic steps are:

  1. Download the Benchmark from GitHub
  2. Run your tools against the Benchmark
  3. Run the BenchmarkScore tool on the reports from your tools

That's it!

Full details on how to do this are at the bottom of the page on the Quick_Start tab.

We encourage vendors, open source tool developers, and end users to verify their application security tools against the Benchmark. To ensure that the results are fair and useful, we ask that you follow a few simple rules when publishing results. We won't recognize any results that aren't easily reproducible, so published results should include:

  1. A description of the default “out-of-the-box” installation, version numbers, etc…
  2. Any and all configuration, tailoring, onboarding, etc… performed to make the tool run
  3. Any and all changes to default security rules, tests, or checks used to achieve the results
  4. Easily reproducible steps to run the tool

Reporting Format

The Benchmark includes tools to interpret raw tool output, compare it to the expected results, and generate summary charts and graphs. We use the following table format in order to capture all the information generated during the evaluation.

Code Repo and Build/Run Instructions

See the Getting Started and Getting, Building, and Running the Benchmark sections on the Quick Start tab.

Licensing

The OWASP Benchmark is free to use under the GNU General Public License v2.0.

Mailing List

OWASP Benchmark Mailing List

Project Leaders

Dave Wichers

Project References

Related Projects

Quick Download

All test code and project files can be downloaded from OWASP GitHub.

Project Intro Video

BenchmarkPodcastTitlePage.jpg

News and Events

  • LOOKING FOR VOLUNTEERS!! - We are looking for individuals and organizations to join and make this a much more community driven project, including additional coleaders to help take this project to the next level. Contributors could work on things like new test cases, additional tool scorecard generators, adding support for languages beyond Java, and a host of other improvements. Please contact me if you are interested in contributing at any level.
  • June 5, 2016 - Benchmark Version 1.2 Released
  • Sep 24, 2015 - Benchmark introduced to broader OWASP community at AppSec USA
  • Aug 27, 2015 - U.S. Dept. of Homeland Security (DHS) is financially supporting the Benchmark project.
  • Aug 15, 2015 - Benchmark Version 1.2beta Released with full DAST Support. Checkmarx and ZAP scorecard generators also released.
  • July 10, 2015 - Benchmark Scorecard generator and open source scorecards released
  • May 23, 2015 - Benchmark Version 1.1 Released
  • April 15, 2015 - Benchmark Version 1.0 Released

Classifications

Owasp-incubator-trans-85.png Owasp-builders-small.png
Owasp-defenders-small.png
GNU General Public License v2.0
Project Type Files CODE.jpg