Lecture 4: Memory Safety Vulnerabilities
Integer Memory Safety Vulnerabilities
Bad form: [code listing missing]
Good form: [code listing missing]
PAGE_SIZE, size, and len are unsigned.
- If size is larger than PAGE_SIZE, then PAGE_SIZE - 2 - size cannot go negative. It wraps around to a very large unsigned value (close to 0xFFFFFFFF).
- Result: An attacker can bypass the length check and write data into the kernel
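The wraparound described above can be reproduced in a few lines. This is a minimal sketch, not the kernel code from the lecture: the constant DEMO_PAGE_SIZE, its value of 4096, the 32-bit types, and both function names are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size, for illustration only. */
#define DEMO_PAGE_SIZE ((uint32_t)4096)

/* Buggy check: when size exceeds DEMO_PAGE_SIZE - 2, the unsigned
 * subtraction wraps around to a huge value, so almost any len passes. */
bool fits_buggy(uint32_t size, uint32_t len) {
    uint32_t room = DEMO_PAGE_SIZE - 2u - size;
    return len <= room;
}

/* Safer check: reject an oversized size first, so the subtraction
 * below can no longer wrap. */
bool fits_checked(uint32_t size, uint32_t len) {
    if (size > DEMO_PAGE_SIZE - 2u) {
        return false;
    }
    return len <= DEMO_PAGE_SIZE - 2u - size;
}
```

With size = 4095, the buggy version computes 4094 - 4095, which wraps to 0xFFFFFFFF, so fits_buggy(4095, 1000000) returns true while fits_checked(4095, 1000000) returns false.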
Format String Vulnerabilities
printf accepts a variable number of arguments
- How does it know how many arguments it received?
- It infers it from the first argument: the format string!
- Example: printf("One %s costs %d", fruit, price);
What happens if the arguments are mismatched?
Normal Arguments - OK
printf assumes that there is 1 more argument because there is one format specifier (%d), so it will look 4 bytes up the stack for the argument.
No Arguments - Problem
Because the format string contains %d, printf will still look 4 bytes up the stack and print the value of secret!
No Format String - Problem
This part focuses on the problems that arise when the programmer supplies no format string of their own and user input is passed to printf as the format string.
Basic Elements
Similar to how the %d format modifier simply makes the printf() function print the value located at the expected address, various format string modifiers have different uses. Here are a few examples that might be useful:
- %s → Treat the argument as an address and print the string at that address up until the first null byte
- %n → Treat the argument as an address and write the number of characters that have been printed so far to that address
- %c → Treat the argument as a value and print it out as a character
- %x → Treat the argument as an unsigned integer and print it in hexadecimal
- %[b]u → Treat the argument as an unsigned integer and print it in decimal, padded to at least [b] characters
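The specifiers listed above can be exercised with correctly matched arguments. This is a small sketch under assumptions: the function name demo_specifiers and the sample values are made up for illustration, and snprintf is used instead of printf so the result can be inspected. Some platforms restrict or disable %n, so the count may not be available everywhere.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats one value per specifier into out and returns the character
 * count that %n recorded at the end of the format string. */
int demo_specifiers(char *out, size_t cap) {
    int count = 0;
    snprintf(out, cap, "%s|%c|%x|%5u|%n",
             "hi",   /* %s: pointer to a null-terminated string */
             'A',    /* %c: value printed as a character */
             255u,   /* %x: unsigned value printed in hexadecimal */
             42u,    /* %5u: unsigned decimal padded to 5 characters */
             &count  /* %n: receives the number of characters so far */);
    return count;
}
```

Here every specifier has a matching argument, so the call is safe. The vulnerabilities in this section all come from specifiers that have no matching argument.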
What is Arg[0]
Note that strings are passed by reference in C, so the argument to printf is actually a pointer to buf, which is in static memory.
How to Attack
When user input is used as the format string, the attack is straightforward:
Let buf = "%d%s". Then printf will print the value of secret_number and the string after it.
In effect, we are calling printf("%d%s"). printf reads its first argument (arg0), sees two format specifiers, and expects two more arguments (arg1 and arg2).
- The first format specifier, %d, says to treat the next argument (arg1) as an integer and print it out.
- %s will dereference the pointer at arg2 and print until it sees a null byte ('\0').
In this situation, it will print the value of secret_number followed by the contents of secret_string.
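To see how many extra arguments printf will go looking for, it helps to count the specifiers in the attacker's string. The helper below is a hypothetical, simplified sketch: count_specifiers is not a library function, and it ignores details such as `*` widths, which consume an additional argument.

```c
#include <stddef.h>

/* Counts conversion specifiers in fmt, treating "%%" as a literal
 * percent sign. Simplified: one '%' is assumed to consume one argument. */
size_t count_specifiers(const char *fmt) {
    size_t n = 0;
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] != '%') {
            continue;
        }
        if (fmt[i + 1] == '\0') {
            break;              /* trailing lone '%': nothing to convert */
        }
        if (fmt[i + 1] == '%') {
            i++;                /* "%%" prints '%' and takes no argument */
            continue;
        }
        n++;
    }
    return n;
}
```

For the input "%d%s" the count is 2, so printf reads two values from the stack even though the caller passed none.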
Utilize %n - Attack
printf can also write values using the %n specifier
- %n treats the next argument as a pointer and writes the number of bytes printed so far to the address pointed to (usually used to calculate output spacing)
- printf("item %d:%n", 3, &val) stores 7 in val.
  - %d is just a placeholder. It is replaced by 3, which prints as 1 character.
  - "item"(4) + space(1) + "3"(1) + ":"(1) = 7.
- printf("item %d:%n", 987, &val) stores 9 in val.
  - %d is replaced by 987, which prints as 3 characters.
- printf("000%n") writes the value 3 to the memory location pointed to by the address located 8 bytes above the RIP of printf
  - Why 3? Because three characters (0 0 0) are printed before %n.
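The counting rules above can be checked directly. This is a minimal sketch with hypothetical helper names (count_item, count_zeros). It uses snprintf with correctly matched arguments, so nothing is read from or written to an unintended location. Some platforms restrict %n, in which case the stored value may differ.

```c
#include <stdio.h>

/* Returns the value that %n stores after formatting "item <n>:". */
int count_item(int n) {
    char out[64];
    int val = -1;
    snprintf(out, sizeof out, "item %d:%n", n, &val);
    return val;
}

/* Returns the value that %n stores after printing "000". */
int count_zeros(void) {
    char out[64];
    int val = -1;
    snprintf(out, sizeof out, "000%n", &val);
    return val;
}
```

count_item(3) yields 7, count_item(987) yields 9, and count_zeros() yields 3, matching the arithmetic in the list.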
In this part, the attack combines an attacker-controlled format string with %n.
Input: %d%n
In effect, we are calling printf("%d%n"). printf reads its first argument (arg0), sees two format specifiers, and expects two more arguments (arg1 and arg2).
- The first format specifier, %d, says to treat the next argument (arg1) as an integer and print it out.
- %n says to treat the next argument (arg2) as a pointer, and write the number of bytes printed so far to the address at arg2.
- We've printed 2 bytes so far, so the number 2 gets written to secret_string.
Why were 2 bytes printed?
Because the input is "%d%n", %d prints the value of secret_number, which is 42. That is two characters ('4' and '2'), so 2 bytes have been printed when %n is reached.
Format Strings: Stack Diagram
How to attack?
Attack scenario: Write the number 100 to memory address 0xdeadbeef.
What input should the attacker supply?
When printf sees the %n, two things need to be true:
- Control where we write: The next unused argument on the stack should be 0xdeadbeef.
- Control what we write: The number of bytes printed so far should be 100.
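The second condition, controlling the byte count, is usually met with a field width. The sketch below is an illustration under assumptions and is not the lecture's attack string: "AAAA" stands in for 4 address bytes, the helper name bytes_before_n is hypothetical, and the arguments are matched correctly so the write goes to a local variable.

```c
#include <stdio.h>

/* Returns the count %n stores after 4 placeholder bytes plus one
 * character padded to a width of 96. */
int bytes_before_n(void) {
    char out[128];
    int written = -1;
    /* 4 literal bytes + 96 padded characters = 100 bytes before %n. */
    snprintf(out, sizeof out, "AAAA%96c%n", 'x', &written);
    return written;
}
```

A width such as %96c lets a short input string produce a large printed count, which is why the attacker does not need to type 100 literal characters.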
Attack String
Format String Vulnerabilities: Defense
Now the attacker can't make the number of arguments mismatched!
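A common way to get this guarantee is to keep the format string a fixed literal and pass untrusted input only as an argument. The sketch below assumes that approach. The function name format_untrusted is hypothetical, and snprintf stands in for printf so the result can be checked.

```c
#include <stddef.h>
#include <stdio.h>

/* Copies untrusted text into out. The format string is the constant
 * "%s", so any '%' characters in the input are printed literally. */
int format_untrusted(char *out, size_t cap, const char *untrusted) {
    return snprintf(out, cap, "%s", untrusted);
}
```

Passing "%d%n" as the input produces the four characters %d%n in the output. Nothing extra is read from the stack and nothing is written through %n.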
Heap Vulnerabilities
C++ vtable
Heap Overflow
Heap overflow
- Objects are allocated in the heap (using malloc in C or new in C++)
- A write to a buffer in the heap is not checked
- The attacker overflows the buffer and overwrites the vtable pointer of the next object to point to a malicious vtable, with pointers to malicious code
- The next object’s function is called, accessing the vtable pointer
Use-after-free
- An object is deallocated too early (using free in C or delete in C++)
- The attacker allocates memory, which returns the memory freed by the object
- The attacker overwrites a vtable pointer under the attacker’s control to point to a malicious vtable, with pointers to malicious code
- The deallocated object’s function is called, accessing the vtable pointer
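The effect of overwriting a code pointer that sits next to a buffer can be modelled without real memory corruption. This is a simplified sketch under assumptions: struct object, its layout, and all function names are hypothetical, a single function pointer stands in for a vtable pointer, and the copy covers the whole object so the behaviour stays well defined.

```c
#include <stddef.h>
#include <string.h>

typedef int (*method_fn)(void);

static int benign_method(void)    { return 0; } /* intended behaviour */
static int malicious_method(void) { return 1; } /* stands in for attacker code */

/* Hypothetical layout: a data buffer followed by a code pointer that
 * plays the role of the next object's vtable pointer. */
struct object {
    char      name[8];
    method_fn method;
};

/* Returns the result of calling obj.method after an unchecked copy. */
int simulate_overflow(void) {
    struct object obj;
    memset(obj.name, 0, sizeof obj.name);
    obj.method = benign_method;

    /* Attacker-controlled bytes: filler for name, then a chosen pointer. */
    unsigned char payload[sizeof(struct object)];
    method_fn target = malicious_method;
    memset(payload, 'A', sizeof payload);
    memcpy(payload + offsetof(struct object, method), &target, sizeof target);

    /* Models a copy whose length was never checked against sizeof name. */
    memcpy(&obj, payload, sizeof obj);

    return obj.method(); /* now dispatches to malicious_method */
}
```

The call at the end returns 1 instead of 0: the program still "calls the object's method", but the pointer it follows was chosen by the attacker.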
Off-by-One Exploit
Goal: Execute shellcode located at 0xdeadbeef.
What parts of memory is an attacker able to overwrite in this piece of code?
If the attacker can change where the SFP of vulnerable points to, how can they use this to execute shellcode?
The C program now thinks that the SFP of main and the RIP of main are inside name.
The attacker controls these values, so the attacker can now overwrite where the program thinks the RIP of main is.
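A single extra byte is enough to redirect the SFP because it lands on the pointer's least significant byte. The sketch below simulates this on a byte array rather than a real stack. It assumes a 32-bit little-endian layout with a 64-byte name buffer directly below the SFP, and the addresses and helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define NAME_LEN 64

/* Stores a 32-bit value in little-endian byte order. */
static void store_le32(unsigned char *p, uint32_t v) {
    for (int i = 0; i < 4; i++) {
        p[i] = (unsigned char)(v >> (8 * i));
    }
}

/* Loads a 32-bit value stored in little-endian byte order. */
static uint32_t load_le32(const unsigned char *p) {
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        v |= (uint32_t)p[i] << (8 * i);
    }
    return v;
}

/* Simulated frame: name[NAME_LEN] followed by a 4-byte SFP.
 * input must hold at least NAME_LEN + 1 bytes. */
uint32_t sfp_after_off_by_one(uint32_t original_sfp, const unsigned char *input) {
    unsigned char frame[NAME_LEN + 4];
    memset(frame, 0, sizeof frame);
    store_le32(frame + NAME_LEN, original_sfp);

    /* The bug being modelled: "<=" copies NAME_LEN + 1 bytes. */
    for (int i = 0; i <= NAME_LEN; i++) {
        frame[i] = input[i];
    }
    return load_le32(frame + NAME_LEN);
}
```

If the 65th input byte is 0x10 and the original SFP is 0xbffffd48, the SFP becomes 0xbffffd10. Only the low byte changes, which can be enough to move the saved frame pointer down into the attacker-controlled buffer.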