When executed, the program first prompts for a username:
Enter user name: rcs
The program then prompts for a password, which is read into the password variable declared on line 11 as an automatic array of 10 characters. In Example 1, if you examine the data starting at the address of the array on the stack (0x002DF3D4, in this example) before the password is read from standard input, you can see the storage allocated for password (shown in bold) and the return address on the stack (shown in red). The return address is stored in little-endian byte order, so the bytes f9 0f 0a 02 represent the address 0x020A0FF9.
002DF3D4 <b>00 00 00 00 04 f4 2d 00 a0 1b</b> e7 79 80 63 54 00 ......-....y.cT.
002DF3E4 04 f4 2d 00 <font color="#FF0000">f9 0f 0a 02</font> 01 00 00 00 79 3a 4e 00 ..-.........y:N.
002DF3F4 a8 2b 2f 00 38 f4 2d 00 da c4 fc 79 78 f4 2d 00 .+/.8.-....yx.-.
002DF404 48 f4 2d 00 60 13 40 00 01 00 00 00 50 53 54 00 H.-.`.@.....PST.
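A dump in this layout is straightforward to produce when inspecting memory yourself. The following is a minimal sketch, not the tool used to generate the examples here; the function name `format_dump_row` and its parameters are hypothetical. It formats one row as an 8-digit address, up to 16 hex bytes, and an ASCII column in which non-printable bytes appear as a period.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format up to 16 bytes as one dump row in the
 * layout used above (address, hex bytes, ASCII column).
 * Bytes outside the printable ASCII range 0x20..0x7e are shown as '.'.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if `out` is too small. */
int format_dump_row(unsigned long addr, const unsigned char *bytes,
                    size_t n, char *out, size_t out_size)
{
    size_t pos;
    int w;

    if (n > 16) {
        n = 16;                     /* one row holds at most 16 bytes */
    }

    /* Address column: eight uppercase hex digits. */
    w = snprintf(out, out_size, "%08lX", addr);
    if (w < 0 || (size_t)w >= out_size) {
        return -1;
    }
    pos = (size_t)w;

    /* Hex column: each byte as two lowercase hex digits. */
    for (size_t i = 0; i < n; i++) {
        w = snprintf(out + pos, out_size - pos, " %02x", bytes[i]);
        if (w < 0 || (size_t)w >= out_size - pos) {
            return -1;
        }
        pos += (size_t)w;
    }

    /* ASCII column: a space, n characters, and the terminator. */
    if (out_size - pos < n + 2) {
        return -1;
    }
    out[pos++] = ' ';
    for (size_t i = 0; i < n; i++) {
        unsigned char b = bytes[i];
        out[pos++] = (b >= 0x20 && b <= 0x7e) ? (char)b : '.';
    }
    out[pos] = '\0';
    return (int)pos;
}
```

Each call produces a single row, so a caller would loop over the region of interest in 16-byte steps, passing the address of each step as `addr`.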
By providing more characters than can fit in the storage allocated for the password character array, an attacker can overflow the buffer and overwrite the return address with the address of shellcode (arbitrary code inserted into memory by the attacker). For the purpose of this example, we'll assume this code has already been injected in a separate operation (that does not need to involve a buffer overflow or other vulnerability) and is located at 0x00408130. To execute arbitrary code, the attacker need only enter the following string as the password:
Enter 8 character password: 123456789012345678900|@
This input string is copied into the password character array, overflowing the buffer and overwriting memory up to and including the return address. The three characters in the 0|@ string overwrite the first three bytes of the return address. The last byte of the return address is overwritten by the null character that gets() appends to terminate the string. Notice that if the null byte were located anywhere but in the last byte, it would be impossible to copy the entire string because the gets() function interprets the null character as the end of the string.
So why these three characters? Their byte values are the values required to represent the address in memory: "0" is 0x30, the character shown here as "|" is the byte 0x81, and "@" is 0x40. (The value 0x81 lies outside 7-bit ASCII, so the glyph displayed for it depends on the console code page; it is not the ASCII vertical bar, which is 0x7C.) Entered in the order { '0', '|', '@' }, these bytes write the little-endian representation of the shellcode address (0x00408130) into memory, with the final 0x00 byte supplied by the string's null terminator (see Example 2).
002DF3D4 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 1234567890123456
002DF3E4 37 38 39 30 <font color="#FF0000">30 81 40 00</font> 01 00 00 00 79 3a 4e 00 78900.@.....y:N.
002DF3F4 a8 2b 2f 00 38 f4 2d 00 da c4 fc 79 78 f4 2d 00 .+/.8.-....yx.-.
002DF404 48 f4 2d 00 60 13 40 00 01 00 00 00 50 53 54 00 H.-.`.@.....PST.
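The byte arithmetic behind this dump can be checked with a small model. The sketch below does not touch a real stack: it treats the 24 bytes from the start of password through the return address as a plain array, with the return address at offset 20 as the dumps indicate. All function and macro names are hypothetical, and the copy routine, unlike gets(), refuses to write past the simulated region so that the model itself stays well defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RET_OFFSET 20u  /* bytes from the start of password to the return address */
#define FRAME_SIZE 24u  /* simulated region: 20 bytes plus the 4-byte return address */

/* Store a 32-bit value least significant byte first (little-endian). */
void encode_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xffu);
    out[1] = (unsigned char)((value >> 8) & 0xffu);
    out[2] = (unsigned char)((value >> 16) & 0xffu);
    out[3] = (unsigned char)((value >> 24) & 0xffu);
}

/* Read a 32-bit little-endian value back from four bytes. */
uint32_t decode_le32(const unsigned char in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Build the input string: 20 filler characters followed by the first
 * three bytes of the target address. The string terminator becomes the
 * fourth byte, so the address must end in 0x00 and must not contain a
 * zero byte earlier, which would end the C string too soon.
 * Returns 0 on success, -1 if the address or buffer is unsuitable. */
int build_input(uint32_t target, char *out, size_t out_size)
{
    unsigned char addr[4];

    if (out_size < RET_OFFSET + 4) {
        return -1;
    }
    encode_le32(target, addr);
    if (addr[0] == 0 || addr[1] == 0 || addr[2] == 0 || addr[3] != 0) {
        return -1;
    }
    memcpy(out, "12345678901234567890", RET_OFFSET);
    memcpy(out + RET_OFFSET, addr, 3);
    out[RET_OFFSET + 3] = '\0';
    return 0;
}

/* Model the effect of the unbounded read on the simulated region:
 * copy the string and its terminator starting at offset 0.
 * Returns 0 on success, -1 if the input exceeds the simulated region. */
int simulate_unbounded_read(unsigned char frame[FRAME_SIZE], const char *input)
{
    size_t n = strlen(input) + 1;   /* include the null terminator */

    if (n > FRAME_SIZE) {
        return -1;
    }
    memcpy(frame, input, n);
    return 0;
}
```

Building the input for 0x00408130 yields the 23-character string from the example. Copying it into the simulated region leaves the bytes 30 81 40 00 at offset 20, which `decode_le32` reads back as 0x00408130.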
When the checkpassword() function returns, control is passed to the shellcode instead of returning to the original return address in the main() function.
To simplify the exploit, buffer security checks using the /GS option were disabled. If this option had not been disabled, the compiler would have inserted a canary after any array (buffer) declared on the stack; see Figure 1.
To overwrite the saved return address (EIP), the saved base pointer (EBP), or other values in a stack frame protected by a canary using an unbounded string copy operation such as gets() or strcpy(), an attacker would first need to overwrite the canary. If the canary is modified, an error is generated when the function returns and the attack fails (unless the goal of the attack is a denial of service). A canary can still be defeated by brute force, by predicting its value, or by other means, but it increases the difficulty of launching a successful exploit.
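The idea behind the canary check can be illustrated with a simulated frame. This is a sketch under stated assumptions, not the code the compiler emits for /GS: the layout (a 20-byte buffer, then a 4-byte canary, then a 4-byte return address), the fixed canary value chosen by the caller, and every name below are hypothetical, and real implementations typically use a value that is hard to predict.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the simulated frame (byte offsets). */
#define SIM_CANARY_OFFSET 20u   /* canary sits directly after the buffer */
#define SIM_RET_OFFSET    24u   /* return address follows the canary */
#define SIM_FRAME_SIZE    28u

/* Store and load 32-bit values in little-endian byte order. */
void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xffu);
    p[1] = (unsigned char)((v >> 8) & 0xffu);
    p[2] = (unsigned char)((v >> 16) & 0xffu);
    p[3] = (unsigned char)((v >> 24) & 0xffu);
}

uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Function entry: clear the buffer, then place the canary and the
 * return address in the simulated frame. */
void frame_enter(unsigned char *frame, uint32_t canary, uint32_t ret)
{
    memset(frame, 0, SIM_FRAME_SIZE);
    put_le32(frame + SIM_CANARY_OFFSET, canary);
    put_le32(frame + SIM_RET_OFFSET, ret);
}

/* Model an unbounded copy into the buffer: nothing stops it at the end
 * of the buffer, only at the end of the simulated frame (to keep the
 * model well defined). Returns 0 on success, -1 if n is too large. */
int simulated_copy(unsigned char *frame, const void *data, size_t n)
{
    if (n > SIM_FRAME_SIZE) {
        return -1;
    }
    memcpy(frame, data, n);
    return 0;
}

/* Function exit: verify the canary before using the return address.
 * Returns 0 and stores the return address if the canary is intact,
 * or -1 (leaving *ret_out untouched) if the canary was modified. */
int frame_leave(const unsigned char *frame, uint32_t expected_canary,
                uint32_t *ret_out)
{
    if (get_le32(frame + SIM_CANARY_OFFSET) != expected_canary) {
        return -1;
    }
    *ret_out = get_le32(frame + SIM_RET_OFFSET);
    return 0;
}
```

In this model, a copy that reaches the return address has to pass through the canary first, so `frame_leave` reports the corruption unless the copied data happens to contain the correct canary value. That exception is the brute-force or prediction case mentioned above.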