(62 intermediate revisions by 4 users not shown) |
| <div style="width:100%;height:160px;border:0,margin:0;overflow: hidden;">[[File:Cheatsheets-header.jpg|link=]]</div> | | <div style="width:100%;height:160px;border:0,margin:0;overflow: hidden;">[[File:Cheatsheets-header.jpg|link=]]</div> |
| | | |
− | {| style="padding: 0;margin:0;margin-top:10px;text-align:left;" |-
| + | The Cheat Sheet Series project has been moved to [https://github.com/OWASP/CheatSheetSeries GitHub]! |
− | | valign="top" style="border-right: 1px dotted gray;padding-right:25px;" |
| |
− | Last revision (mm/dd/yy): '''{{REVISIONMONTH}}/{{REVISIONDAY}}/{{REVISIONYEAR}}'''
| |
− | <br/>
| |
− | __TOC__{{TOC hidden}}
| |
− | = Introduction =
| |
| | | |
− | The rationale for third-party vendor JavaScript ('''tags''') is to provide data from the user's browser DOM to the vendor for some form of marketing analysis. This data can be anything available in the DOM. It is used for user navigation and clickstream analysis, for identifying the user to determine further content to display, and for various other marketing analysis functions.
| + | Please visit [https://cheatsheetseries.owasp.org/cheatsheets/Third_Party_Javascript_Management_Cheat_Sheet.html 3rd Party Javascript Management Cheat Sheet] to see the |
− | | |
− | The term '''host''' refers to the original site the user visits, such as a shopping or news site, that contains, or retrieves and executes, third-party JavaScript tags for marketing analysis of the user's actions.
| |
− | | |
− | <nowiki><!-- Some host, e.g. foobar.com, HTML code here --></nowiki>
| |
− | <html>
| |
− | <head></head>
| |
− | <body>
| |
− | ...
| |
− | <nowiki><!-- 3rd party vendor javascript --> </nowiki>
| |
− | <script src="https://analytics.vendor.com/v1.1/script.js"></script>
| |
− | <nowiki><!-- /3rd party vendor javascript --> </nowiki>
| |
− | </body>
| |
− | </html>
| |
− | | |
− | = Major risks =
| |
− | | |
− | The invocation of 3rd party JS code in a web application requires consideration for 3 risks in particular:
| |
− | # The loss of control over changes to the client application,
| |
− | # The execution of arbitrary code on client systems,
| |
− | # The disclosure or leakage of sensitive information to 3rd parties.
| |
− | | |
− | == Risk 1: Loss of control over changes to the client application ==
| |
− | This risk arises from the fact that there is usually no guarantee that the code hosted at the 3rd party will remain the same as the code seen by the developers and testers: new features may be pushed in the 3rd party code at any time, potentially breaking the interface or data flows and jeopardizing the availability of your application to its users/customers.
| |
− | | |
− | Typical defenses include, but are not restricted to: in-house script mirroring (to prevent alterations by 3rd parties), sub-resource integrity (to enable browser-level interception) and secure transmission of the 3rd party code (to prevent modifications while in-transit). See below for more details.
| |
− | | |
− | == Risk 2: Execution of arbitrary code on client systems ==
| |
− | This risk arises from the fact that 3rd party JavaScript code is rarely reviewed by the invoking party prior to its integration into a website/application. As the client reaches the hosting website/application, this 3rd party code gets executed, thus granting the 3rd party the exact same privileges that were granted to the user (similar to [[Cross-site_Scripting_(XSS)|XSS attacks]]).
| |
− | | |
− | Any testing performed prior to entering production loses some of its validity, including *AST testing (IAST, RAST, SAST, DAST, etc.). While it is widely accepted that the probability of having rogue code intentionally injected by the 3rd party is low, there are still cases of malicious injections in 3rd party code after the organization's servers were compromised (Yahoo, January 2014). This risk should therefore still be evaluated, in particular when the 3rd party does not show any documentation that it is enforcing better security measures than the invoking organization itself, or at least equivalent. Another example is that the domain hosting the 3rd party JavaScript code expires because the company maintaining it is bankrupt or the developers have abandoned the project. A malicious actor can then re-register the domain and publish malicious code.
| |
− | | |
− | Typical defenses include, but are not restricted to: in-house script mirroring (to prevent alterations by 3rd parties), sub-resource integrity (to enable browser-level interception), secure transmission of the 3rd party code (to prevent modifications while in-transit) and various types of sandboxing. See below for more details.
| |
− | | |
− | == Risk 3: Disclosure of sensitive information to 3rd parties ==
| |
− | When a 3rd party script is invoked in a website/application, the browser directly contacts the 3rd party servers. By default, the request includes all regular HTTP headers. In addition to the originating IP address of the browser, the 3rd party also obtains other data such as the referrer (in non-https requests) and any cookies previously set by the 3rd party, for example when visiting another organization's website that also invokes the 3rd party script.
| |
− | | |
− | In many cases, this grants the 3rd party primary access to information on the organization's users / customers / clients. Additionally, if the 3rd party is sharing the script with other entities, it also collects secondary data from all the other entities, thus knowing who the organization's visitors are but also what other organizations they interact with.
| |
− | | |
− | A typical case is the current situation with major news/press sites that invoke 3rd party code (typically for ad engines, statistics and JavaScript APIs): any user visiting any of these websites also informs the 3rd parties of the visit. In many cases, the 3rd party also gets to know what news articles each individual user is clicking specifically (leakage occurs through the HTTP referrer field) and thus can establish deeper personality profiles.
| |
− | | |
− | Typical defenses include, but are not restricted to: in-house script mirroring (to prevent leakage of HTTP requests to 3rd parties). Users can reduce profiling by randomly clicking links on leaking websites/applications (such as press/news websites). See below for more details.
| |
− | | |
− | = 3rd Party JavaScript Deployment Architectures =
| |
− | | |
− | There are three basic deployment mechanisms for third party vendor javascript, or 'tags'.
| |
− | | |
− | == Vendor Javascript on page ==
| |
− | | |
− | This is where the vendor provides the host with the JavaScript and the host puts it on the host page. To be secure, the host company must review the code for vulnerabilities such as cross-site scripting and for malicious actions such as sending sensitive data from the DOM to a malicious site. This is often difficult because the JavaScript is commonly obfuscated.
| |
− | | |
− | == Javascript Request to Vendor ==
| |
− | | |
− | This is where one or a few lines of code on the host page each request a javascript file or URL directly from the vendor site. When the host page is being created, the developer includes the lines of code provided by the vendor that will request the vendor javascript.
| |
− | Each time the page is accessed the requests are made to the vendor site for the javascript, which then executes on the user browser.
| |
− | | |
− | == Indirect request to Vendor through Tag Manager ==
| |
− | | |
− | This is where one or a few lines of code on the host page each request a JavaScript file or URL from a tag aggregator or tag manager site, not from the JavaScript vendor site. The tag aggregator or tag manager site returns whatever third-party JavaScript files the host company has configured to be returned. Each file or URL request to the tag manager site can return many other JavaScript files from multiple vendors.
| |
− | | |
− | The actual content returned from the aggregator or manager (i.e. the specific JavaScript files as well as exactly what they do) can be changed dynamically by host site employees. They use a graphical development interface, hosted on the tag manager site, that is designed for non-technical users such as the marketing part of the business. The changes can (1) fetch a different JavaScript file from the 3rd party vendor for the same request, or (2) change what DOM object data is read, and when, to send to the vendor.
| |
− | | |
− | The tag manager developer user interface generates code that does what the marketing functionality requires, basically determining what data to get from the browser DOM and when to get it. The tag manager always returns a 'container' JavaScript file to the browser, which is basically a set of JavaScript functions used by the code generated by the user interface to implement the required functionality. Similar to Java frameworks that provide functions and global data to the developer, the container JavaScript executes in the browser and lets the business user specify high-level functionality through the tag manager developer user interface without needing to know JavaScript.
| |
− | | |
− | === Security Problems with requesting Tags ===
| |
− | | |
− | The mechanisms described above are difficult to secure because you can only see the code if you proxy the requests or if you get access to the GUI and see what is configured. The JavaScript is generally obfuscated, so even seeing it is usually not useful. Changes are also deployed instantly, because each new page request from a browser triggers the requests to the aggregator, which gets the JavaScript from the third-party vendor. As soon as any JavaScript file is changed on the vendor, or modified on the aggregator, the next call for it from any browser will get the changed JavaScript. This risk can be managed with the <i>Subresource Integrity</i> standard described below.
| |
− | | |
− | == Server Direct ==
| |
− | | |
− | The tag manager developer user interface can be used to get data from anywhere in the browser DOM. This can allow vulnerabilities because the interface can be used to generate code to get unvalidated data from the DOM (e.g. URL parameters) and store it in some page location that would execute javascript.
| |
− | | |
− | The best way to constrain the generated code is to confine it to getting DOM data from a host defined data layer.
| |
− | | |
− | With this mechanism, [[only the javascript you (the host site owner) generate]] using the tag manager developer user interface will get and send data values to the tag manager or tag aggregator site, which then sends the data to vendors. '''This is the most secure technique''' because only your JavaScript executes in your users' browsers, and only the data you decide on is sent to the vendor.
| |
− | | |
− | This requires cooperation between the host, the aggregator or tag manager and the vendors.
| |
− | | |
− | The host developers have to work with the vendor in order to know what type of data the vendor needs to do their analysis. Then the host programmer determines what DOM element will have that data.
| |
− | | |
− | The host developers have to work with the tag manager or aggregator to agree on the protocol to send the data to the aggregator: what URL, parameters, format etc.
| |
− | | |
− | The tag manager or aggregator has to work with the vendor to agree on the protocol to send the data to the vendor: what URL, parameters, format etc. Does the vendor have an API?
| |
− | | |
− | = Security Defense Considerations =
| |
− | | |
− | == Server Direct Data Layer ==
| |
− | | |
− | The server direct mechanism is a good security standard for third party javascript management, deployment and execution. A good practice for the host page is to create a data layer of DOM objects. The data layer is either (1) a DIV object with attribute values that have the marketing or user behavior data that the 3rd party wants or (2) a set of JSON objects with the same data. Each variable or attribute contains the value of some DOM element. The data layer is the complete set of values that all vendors need for that page.
| |
− | | |
− | The data layer can also perform any validation of the values, especially values from DOM objects exposed to the user like URL parameters and input fields, if these are required for the marketing analysis.
| |
− | | |
− | An example statement for a corporate standard document is 'The tag JavaScript can only access values in the host data layer. The tag JavaScript can never access a URL parameter.'
| |
− | | |
− | You, the host page developer, have to agree with the 3rd party vendors or the tag manager on which attribute in the data layer will hold which value, so they can create the JavaScript to read that value.
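
The data layer described above can be sketched as a small JavaScript object built by the host page. This is only an illustration under stated assumptions: the field names (<code>pageCategory</code>, <code>productId</code>, <code>campaign</code>), the validation patterns and the function name <code>buildDataLayer</code> are hypothetical and are not part of this cheat sheet.

```javascript
// Hypothetical allow-list: data layer field name -> pattern the value must match.
const DATA_LAYER_RULES = {
  pageCategory: /^[a-z-]{1,30}$/,
  productId: /^[0-9]{1,10}$/,
  campaign: /^[A-Za-z0-9_]{1,40}$/,
};

// Build a read-only data layer from raw values collected by the host page.
// Fields that are not on the allow-list, or whose value fails validation,
// are dropped instead of being exposed to tag JavaScript.
function buildDataLayer(rawValues) {
  const layer = {};
  for (const [name, pattern] of Object.entries(DATA_LAYER_RULES)) {
    const value = rawValues[name];
    if (typeof value === "string" && pattern.test(value)) {
      layer[name] = value;
    }
  }
  return Object.freeze(layer);
}

// Example: the script payload in "campaign" and the unknown field are dropped.
const dataLayer = buildDataLayer({
  pageCategory: "shoes",
  productId: "123",
  campaign: "<script>alert(1)</script>",
  internalNote: "not for vendors",
});
console.log(Object.keys(dataLayer));
```

Freezing the object is one possible design choice: it keeps tag JavaScript from adding or altering values after the host has validated them.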
| |
− | | |
− | == Indirect Requests ==
| |
− | | |
− | Tag manager/aggregator sites that offer a GUI to configure the JavaScript may also implement technical controls, such as only allowing the JavaScript to access the data layer values and no other DOM element. The host company should also verify the security practices of the tag manager site, such as access controls to the tag configuration for the host company.
| |
− | | |
− | Letting the marketing folks decide where to get the data they want can result in XSS because they may get it from a URL parameter and put it into a variable that is in a scriptable location on the page.
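
One way to avoid the problem described above is to validate a URL parameter before it is stored anywhere a tag can read it. The sketch below is an assumption-laden illustration: the parameter name <code>utm_campaign</code>, the allowed pattern and the function name <code>campaignFromUrl</code> are hypothetical.

```javascript
// Hypothetical pattern for an acceptable campaign identifier.
const CAMPAIGN_PATTERN = /^[A-Za-z0-9_-]{1,40}$/;

// Read the campaign parameter from a page URL and return it only if it
// matches the expected pattern; otherwise return null.
function campaignFromUrl(pageUrl) {
  const value = new URL(pageUrl).searchParams.get("utm_campaign");
  if (value === null || !CAMPAIGN_PATTERN.test(value)) {
    return null;
  }
  return value;
}

console.log(campaignFromUrl("https://foobar.com/?utm_campaign=spring_sale"));
console.log(campaignFromUrl("https://foobar.com/?utm_campaign=%3Cscript%3Ealert(1)%3C/script%3E"));
```

Even a validated value is safer when written to the page as text (for example through <code>textContent</code>) rather than placed in a scriptable location.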
| |
− | | |
− | == Sandboxing Content ==
| |
− | | |
− | Both of these tools can be used by sites to sandbox/clean DOM data.
| |
− | | |
− | * [https://github.com/cure53/DOMPurify DOMPurify] is a fast, tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks.
| |
− | * [https://github.com/hackvertor/MentalJS MentalJS] is a JavaScript parser and sandbox. It whitelists JavaScript code by adding a "$" suffix to variables and accessors.
| |
− | | |
− | == Subresource Integrity ==
| |
− | | |
− | [https://www.w3.org/TR/SRI/ Subresource Integrity] will ensure that only the code that has been reviewed is executed. The developer generates integrity metadata for the vendor javascript, and adds it to the script element like this:
| |
− | | |
− | <script src="https://analytics.vendor.com/v1.1/script.js"
| |
− | integrity="sha384-MBO5IDfYaE6c6Aao94oZrIOiC7CGiSNE64QUbHNPhzk8Xhm0djE6QqTpL0HzTUxk"
| |
− | crossorigin="anonymous"></script>
| |
− | | |
− | It is important to know that in order for SRI to work, the vendor host needs [https://www.w3.org/TR/cors/ CORS] enabled.
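
The integrity metadata is the name of the hash algorithm followed by the base64-encoded digest of the script file. A minimal Node.js sketch of how it could be generated is shown here; the function name <code>sriHash</code> and the sample script body are hypothetical, and in practice the input would be the exact bytes of the vendor file that was reviewed.

```javascript
const crypto = require("crypto");

// Compute SRI integrity metadata ("<algorithm>-<base64 digest>") for the
// exact content of a reviewed script.
function sriHash(content, algorithm = "sha384") {
  const digest = crypto.createHash(algorithm).update(content).digest("base64");
  return `${algorithm}-${digest}`;
}

// Hypothetical script body standing in for the reviewed vendor file.
const reviewedScript = "console.log('vendor analytics');";
console.log(sriHash(reviewedScript));
```

The printed value is what goes into the <code>integrity</code> attribute. If the vendor changes even one byte of the file, the browser computes a different digest and refuses to execute the script.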
| |
− | | |
− | == Keeping JavaScript libraries updated ==
| |
− | | |
− | [https://www.owasp.org/index.php/Top_10_2013-A9-Using_Components_with_Known_Vulnerabilities OWASP Top 10 2013 A9] describes the problem of using components with known vulnerabilities. This includes JavaScript libraries. JavaScript libraries must be kept up to date, as previous versions can have known vulnerabilities, which typically leave the site vulnerable to Cross Site Scripting. There are several tools that can help identify such libraries. One such tool is the free open source tool [https://retirejs.github.io RetireJS].
| |
− | | |
− | == Sandboxing with iframe ==
| |
− | | |
− | You can also put vendor JavaScript into an iframe served from a different domain (e.g. a static data host). It will work as a "jail"
| |
− | and vendor JavaScript will not have direct access to the host page DOM and cookies. The host main page and the sandbox iframe can communicate with each other via the [https://developer.mozilla.org/en-US/docs/Web/API/Window/postMessage postMessage mechanism].
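
On the host page, messages arriving from the sandbox iframe should be checked before they are used. The following is a minimal sketch, assuming a hypothetical sandbox origin (<code>https://sandbox.static-host.example</code>) and a hypothetical message shape with a string <code>type</code> field; neither comes from this cheat sheet.

```javascript
// Hypothetical origin of the iframe that hosts the vendor JavaScript.
const SANDBOX_ORIGIN = "https://sandbox.static-host.example";

// Return the message payload only if the event comes from the sandbox origin
// and carries an object with a string "type" field; otherwise return null.
function acceptSandboxMessage(event) {
  if (event.origin !== SANDBOX_ORIGIN) {
    return null;
  }
  const data = event.data;
  if (typeof data !== "object" || data === null || typeof data.type !== "string") {
    return null;
  }
  return data;
}

// Browser wiring; skipped when the code runs outside a browser.
if (typeof window !== "undefined" && typeof window.addEventListener === "function") {
  window.addEventListener("message", (event) => {
    const message = acceptSandboxMessage(event);
    if (message !== null) {
      // Handle only the message types the host page expects.
    }
  });
}
```

Checking <code>event.origin</code> matters because any window that holds a reference to the host page can post messages to it, not only the sandbox iframe.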
| |
− | | |
− | == Vendor Agreements ==
| |
− | | |
− | Your agreement with the 3rd parties can require them to implement, and provide evidence of, secure coding practices and general corporate security.
| |
− | | |
− | = References =
| |
− | | |
− | https://randywestergren.com/widespread-xss-vulnerabilities-ad-network-code-affecting-top-tier-publishers-retailers/
| |
− | | |
− | = Authors and Primary Editors =
| |
− | | |
− | | |
− | | |
− | = Other Cheatsheets =
| |
− | | |
− | {{Cheatsheet_Navigation_Body}}
| |
− | [[Category:Cheatsheets]]
| |