Buffer Overflows

This Page has been flagged for review. Please help OWASP and review this Page to FixME.
Comment: Not edited for over a year, may need SME review

Development Guide Table of Contents

Objective

To ensure that:

Applications do not expose themselves to faulty components.

Applications create as few buffer overflows as possible.

Developers are encouraged to use languages and frameworks that are relatively immune to buffer overflows.

Platforms Affected

Almost every platform, with the following notable exceptions:

Java/J2EE – as long as native methods or system calls are not invoked.

.NET – as long as unsafe or unmanaged code is not invoked (such as the use of P/Invoke or COM Interop).

PHP, Python, Perl – as long as external programs or vulnerable extensions are not used.

Relevant COBIT Topics

DS11.9 – Data processing integrity.

Description

Attackers generally use buffer overflows to corrupt the execution stack of a web application. By sending carefully crafted input to a web application, an attacker can cause the web application to execute arbitrary code, possibly taking over the machine. Attackers have managed to identify buffer overflows in a staggering array of products and components.

Buffer overflow flaws can be present in both the web server and application server products that serve the static and dynamic portions of a site, or in the web application itself. Buffer overflows found in commonly-used server products are likely to become widely known and can pose a significant risk to users of these products. When web applications use libraries, such as a graphics library to generate images or a communications library to send e-mail, they open themselves to potential buffer overflow attacks. Literature detailing buffer overflow attacks against commonly-used products is readily available, and newly discovered vulnerabilities are reported almost daily.

Buffer overflows can also be found in custom web application code, and may even be more likely, given the lack of scrutiny that web applications typically go through. Buffer overflow attacks against customized web applications can sometimes lead to interesting results. In some cases, we have discovered that sending large inputs can cause the web application or the back-end database to malfunction. It is possible to cause a denial of service attack against the web site, depending on the severity and specific nature of the flaw. Overly large inputs could cause the application to display a detailed error message, potentially leading to a successful attack on the system.

Buffer overflow attacks generally rely upon two techniques (and usually the combination):

Writing data to particular memory addresses

Having the operating system mishandle data types

This means that strongly-typed programming languages (and environments) that disallow direct memory access usually prevent buffer overflows from happening.

Language/Environment	Compiled or Interpreted	Strongly Typed	Direct Memory Access	Safe or Unsafe
Java, Java Virtual Machine (JVM)	Both	Yes	No	Safe
.NET	Both	Yes	No	Safe
Perl	Both	Yes	No	Safe
Python - interpreted	Intepreted	Yes	No	Safe
Ruby	Interpreted	Yes	No	Safe
C/C++	Compiled	No	Yes	Unsafe
Assembly	Compiled	No	Yes	Unsafe
COBOL	Compiled	Yes	No	Safe

Table 8.1: Language descriptions

General Prevention Techniques

A number of general techniques to prevent buffer overflows include:

Code auditing (automated or manual)

Developer training – bounds checking, use of unsafe functions, and group standards

Non-executable stacks – many operating systems have at least some support for this

Compiler tools – StackShield, StackGuard, and Libsafe, among others

Safe functions – use strncat instead of strcat, strncpy instead of strcpy, etc

Patches – Be sure to keep your web and application servers fully patched, and be aware of bug reports relating to applications upon which your code is dependent.

Periodically scan your application with one or more of the commonly available scanners that look for buffer overflow flaws in your server products and your custom web applications.

Stack Overflow

Stack overflows are the best understood and the most common form of buffer overflows. The basics of a stack overflow is simple:

There are two buffers, a source buffer containing arbitrary input (presumably from the attacker), and a destination buffer that is too small for the attack input. The second buffer resides on the stack and somewhat adjacent to the function return address on the stack.

The faulty code does not check that the source buffer is too large to fit in the destination buffer. It copies the attack input to the destination buffer, overwriting additional information on the stack (such as the function return address).

When the function returns, the CPU unwinds the stack frame and pops the (now modified) return address from the stack.

Control does not return to the function as it should. Instead, arbitrary code (chosen by the attacker when crafting the initial input) is executed.

The following example, written in C, demonstrates a stack overflow exploit.

#include <string.h>

void f(char* s) {
    char buffer[10];
    strcpy(buffer, s);
}

void main(void) {
    f("01234567890123456789");
}

[root /tmp]# ./stacktest

Segmentation fault

How to determine if you are vulnerable

If your program:

is written in a language (or depends upon a program that is written in a language) that allows buffer overflows to be created (see Table 8.1) AND

copies data from one buffer on the stack to another without checking sizes first AND

does not use techniques such as canary values or non-executable stacks to prevent buffer overflows THEN

it is likely that the application is vulnerable to attack.

How to protect yourself

Deploy on systems capable of using non-executable stacks, such as:
1. AMD and Intel x86-64 chips with associated 64-bit operating systems
2. Windows XP SP2 (both 32- and 64-bit)
3. Windows 2003 SP1 (both 32- and 64-bit)
4. Linux after 2.6.8 on AMD and x86-64 processors in 32- and 64-bit mode
5. OpenBSD (w^x on Intel, AMD, SPARC, Alpha and PowerPC)
6. Solaris 2.6 and later with the “noexec_user_stack” flag enabled
Use higher-level programming languages that are strongly typed and that disallow direct memory access.
Validate input to prevent unexpected data from being processed, such as being too long, of the wrong data type, containing "junk" characters, etc.
If relying upon operating system functions or utilities written in a vulnerable language, ensure that they:
1. use the principle of least privilege
2. use compilers that protect against stack and heap overflows
3. are current in terms of patches

Heap Overflow

Heap overflows are problematic in that they are not necessarily protected by CPUs capable of using non-executable stacks. A heap is an area of memory allocated by the application at run-time to store data. The following example, written in C, shows a heap overflow exploit.

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <string.h>

 #define BSIZE 16
 #define OVERSIZE 8 /* overflow buf2 by OVERSIZE bytes */

 void main(void) {
    u_long b_diff;
    char *buf0 = (char*)malloc(BSIZE);		// create two buffers
    char *buf1 = (char*)malloc(BSIZE);

    b_diff = (u_long)buf1 - (u_long)buf0;	// difference between locations
    printf("Initial values:  ");
    printf("buf0=%p, buf1=%p, b_diff=0x%x bytes\n", buf0, buf1, b_diff);

    memset(buf1, 'A', BUFSIZE-1), buf1[BUFSIZE-1] = '\0';
    printf("Before overflow: buf1=%s\n", buf1);

    memset(buf0, 'B', (u_int)(diff + OVERSIZE));
    printf("After overflow:  buf1=%s\n", buf1);
}

[root /tmp]# ./heaptest

Initial values:  buf0=0x9322008, buf1=0x9322020, diff=0xff0 bytes
Before overflow: buf1=AAAAAAAAAAAAAAA
After overflow:  buf1=BBBBBBBBAAAAAAA

The simple program above shows two buffers being allocated on the heap, with the first buffer being overflowed to overwrite the contents of the second buffer.

How to determine if you are vulnerable

If your program:

is written in a language (or depends upon a program that is written in a language) that allows buffer overflows to be created (see Table 8.1) AND

copies data from one buffer on the stack to another without checking sizes first AND

does not use techniques such as canary values to prevent buffer overflows THEN

it is likely that the application is vulnerable to attack.

How to protect yourself

Use higher-level programming languages that are strongly typed and that disallow direct memory access.
Validate input to prevent unexpected data from being processed, such as being too long, of the wrong data type, containing "junk" characters, etc.
If relying upon operating system functions or utilities written in a vulnerable language, ensure that they:
1. use the principle of least privilege
2. use compilers that protect against stack and heap overflows
3. are current in terms of patches

Format String

Format string buffer overflows (usually called "format string vulnerabilities") are highly specialized buffer overflows that can have the same effects as other buffer overflow attacks. Basically, format string vulnerabilities take advantage of the mixture of data and control information in certain functions, such as C/C++'s printf. The easiest way to understand this class of vulnerability is with an example:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

void main(void) {
    char str[100] = scanf("%s");
    printf("%s", str);
}

This simple program takes input from the user and displays it back on the screen. The string %s means that the other parameter, str, should be displayed as a string. This example is not vulnerable to a format string attack, but if one changes the last line, it becomes exploitable:

   printf(str);

To see how, consider the user entering the special input:

%08x.%08x.%08x.%08x.%08x

By constructing input as such, the program can be exploited to print the first five entries from the stack.

How to determine if you are vulnerable

If your program:

uses functions such as printf, snprintf directly, or indirectly through system services (such as syslog) or other AND

the use of such functions allows input from the user to contain control information interpreted by the function itself

it is highly likely that the application is vulnerable to attack.

How to protect yourself

Use higher-level programming languages that are strongly typed and that disallow direct memory access.
Validate input to prevent unexpected data from being processed, such as being too long, of the wrong data type, containing "junk" characters, etc. Specifically check for control information (meta-characters like '%')
Avoid the use of functions like printf that allow user input to contain control information
If relying upon operating system functions or utilities written in a vulnerable language, ensure that they:
1. use the principle of least privilege
2. use compilers that protect against stack and heap overflows
3. are current in terms of patches

Unicode Overflow

Unicode exploits are a bit more difficult to do than typical buffer overflows as demonstrated in Anley’s 2002 paper, but it is wrong to assume that by using Unicode, you are protected against buffer overflows. Examples of Unicode overflows include Code Red, a devastating Trojan with an estimated economic cost in the billions of dollars.

How to determine if you are vulnerable

If your program:

is written in a language (or depends upon a program that is written in a language) that allows buffer overflows to be created (see Table 8.1) AND

takes Unicode input from a user AND

fails to sanitize the input AND

does not use techniques such as canary values to prevent buffer overflows THEN

it is highly likely that the application is vulnerable to attack.

How to protect yourself

Deploy on systems capable of using non-executable stacks, such as:
1. AMD and Intel x86-64 chips with associated 64-bit operating systems
2. Windows XP SP2 (both 32- and 64-bit)
3. Windows 2003 SP1 (both 32- and 64-bit)
4. Linux after 2.6.8 on AMD and x86-64 processors in 32- and 64-bit mode
5. OpenBSD (w^x on Intel, AMD, SPARC, Alpha and PowerPC)
6. Solaris 2.6 and later with the “noexec_user_stack” flag enabled
Use higher-level programming languages that are strongly typed and that disallow direct memory access.
Validate input to prevent unexpected data from being processed, such as being too long, of the wrong data type, containing "junk" characters, etc.
If relying upon operating system functions or utilities written in a vulnerable language, ensure that they:
1. use the principle of least privilege
2. use compilers that protect against stack and heap overflows
3. are current in terms of patches

Integer Overflow

When an application takes two numbers of fixed word size and perform an operation with them, the result may not fit within the same word size. For example, if the two 8-bit numbers 192 and 208 are added together and stored into another 8-bit byte, the result will not fit into an 8-bit result:

1100 0000

+ 1101 0000

= 0001 1001 0000

Although such an operation will usually cause some type of exception, your application must be coded to check for such an exception and take proper action. Otherwise, your application would report that 192 + 208 equals 144.

The following code demonstrates a buffer overflow, and was adapted from Blexim's Phrack article:

#include <stdio.h>
#include <string.h>

void main(int argc, char *argv[]) {
    int i = atoi(argv[1]);         // input from user
    unsigned short s = i;          // truncate to a short
    char buf[50];                  // large buffer

    if (s > 10) {                  // check we're not greater than 10
        return;
    }

    memcpy(buf, argv[2], i);       // copy i bytes to the buffer
    buf[i] = '\0';                 // add a null byte to the buffer
    printf("%s\n", buf);           // output the buffer contents

    return;
} 

[root /tmp]# ./inttest 65580 foobar
Segmentation fault

The above code is exploitable because the validation does not occur on the input value (65580), but rather the value after it has been converted to an unsigned short (45).

Integer overflows can be a problem in any language and can be exploited when integers are used in array indices and implicit short math operations.

How to determine if you are vulnerable

Examine use of signed integers, bytes, and shorts.

Are there cases where these values are used as array indices after performing an arithmetic operation (+, -, *, /, or % (modulo))?

How would your program react to a negative or zero value for integer values, particular during array lookups?

How to protect yourself

If using .NET, use David LeBlanc’s SafeInt class or a similar construct. Otherwise, use a "BigInteger" or "BigDecimal" implementation in cases where it would be hard to validate input yourself.

If your compiler supports the option, change the default for integers to be unsigned unless otherwise explicitly stated. Use unsigned integers whenever you don't need negative values.

Use range checking if your language or framework supports it, or be sure to implement range checking yourself after all arithmetic operations.

Be sure to check for exceptions if your language supports it.

Buffer Overflows

Objective

Platforms Affected

Relevant COBIT Topics

Description

General Prevention Techniques

Stack Overflow

How to determine if you are vulnerable

How to protect yourself

Heap Overflow

How to determine if you are vulnerable

How to protect yourself

Format String

How to determine if you are vulnerable

How to protect yourself

Unicode Overflow

How to determine if you are vulnerable

How to protect yourself

Integer Overflow

How to determine if you are vulnerable

How to protect yourself

Further reading

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Reference

Tools