Lecture 5: Mitigating Memory Safety Vulnerabilities

Writing Memory-Safe Code

Defensive programming: Always add checks in your code just in case

Example: Always check a pointer is not null before dereferencing it, even if you’re sure the pointer is going to be valid
Before writing to an array or buffer, check that the write operation will be in-bounds
Relies on programmer discipline and tracking the length of every array/buffer/memory region
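
A minimal sketch of these checks, assuming a hypothetical helper named checked_write (not part of any library) that the caller gives the true size of the destination buffer:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `len` bytes from `src` into `dst` at `offset`,
 * but only after checking every assumption the write depends on.
 * Returns 0 on success and -1 if any check fails. */
int checked_write(char *dst, size_t dst_size, size_t offset,
                  const char *src, size_t len) {
    if (dst == NULL || src == NULL) {
        return -1;              /* never dereference a null pointer */
    }
    if (offset > dst_size || len > dst_size - offset) {
        return -1;              /* the write would run past the end of dst */
    }
    memcpy(dst + offset, src, len);
    return 0;
}
```

The bounds check is written as two comparisons so that offset + len is never computed, since that sum could itself overflow. The helper is only as good as the dst_size the caller passes in, which is the programmer-discipline problem noted above.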

Use safe libraries (that do these checks for you)

Use functions that check bounds
Example: Use fgets instead of gets
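Example: Use strncpy or strlcpy instead of strcpy (note: strncpy does not null-terminate the destination when the source is too long, and strlcpy is not part of standard C)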
Example: Use snprintf instead of sprintf
Relies on programmer discipline or tools that check your program
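
As a rough illustration of the bounded functions, here is a sketch that wraps fgets and snprintf. The helper names read_line and make_greeting are made up for this example; only fgets, snprintf, and strcspn are real library calls.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `line` (at most
 * size - 1 characters) and strips the trailing newline, if any.
 * Returns 0 on success, -1 on bad size, end of input, or error. */
int read_line(FILE *in, char *line, size_t size) {
    if (size == 0 || size > INT_MAX) {
        return -1;                          /* fgets takes an int size */
    }
    if (fgets(line, (int)size, in) == NULL) {
        return -1;                          /* nothing was read */
    }
    line[strcspn(line, "\n")] = '\0';
    return 0;
}

/* Hypothetical helper: formats a greeting into `out`.
 * Returns 0 if it fit, -1 if snprintf had to truncate the output. */
int make_greeting(char *out, size_t size, const char *name) {
    int needed = snprintf(out, size, "Hello, %s!", name);
    if (needed < 0 || (size_t)needed >= size) {
        return -1;                          /* truncated, but still terminated */
    }
    return 0;
}
```

Unlike gets and sprintf, both calls take the destination size, so an oversized input is cut short instead of overwriting adjacent memory. Checking the return value still matters, because silent truncation can be a bug of its own.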

Building Secure Software

Run-time checks
- Automatic bounds-checking
- May involve performance overhead
Monitor code for run-time misbehavior
- Look for illegal calling sequences
- Example: Your code never calls execve, but you notice that your code is executing execve
Contain potential damage
- Run system components in sandboxes or virtual machines (VMs)
- Think about privilege separation
Bug-finding tools
Code review
Vulnerability scanning
Penetration testing (“pen-testing”)

Exploit Mitigation

Aim: Add mitigations that make it harder to exploit common vulnerabilities.

Exploit mitigations (code hardening)

Compiler and runtime defenses that make common exploits harder

Find ways to turn attempted exploits into program crashes
Crashing is safer than exploitation: The attacker can crash our system, but at least they can’t execute arbitrary code
Mitigations are cheap (low overhead) but not free (some costs associated with them)

Mitigation: Non-Executable Pages

Idea

Most programs don’t need memory that is both written to and executed, so make portions of memory either executable or writable but not both

Stack, heap, and static data: Writable but not executable
Code: Executable but not writable

Page table entries have a writable bit and an executable bit that can be set to achieve this behavior

Page tables convert virtual addresses to physical addresses
Implemented in hardware, so effectively 0 overhead!

Also known as:

W^X (write XOR execute)
DEP (Data Execution Prevention, name used by Windows)
No-execute bit (the name of the bit itself)
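
The real enforcement happens in the page tables, but the policy itself can be sketched at user level. The following assumes a POSIX system with mmap; the names violates_wx and map_pages are hypothetical wrappers invented for this example.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <stddef.h>
#include <sys/mman.h>

/* Returns 1 if `prot` asks for memory that is both writable and executable. */
int violates_wx(int prot) {
    return (prot & PROT_WRITE) && (prot & PROT_EXEC);
}

/* Hypothetical wrapper around mmap that refuses W+X mappings.
 * Returns NULL on a policy violation or if mmap itself fails. */
void *map_pages(size_t len, int prot) {
    if (violates_wx(prot)) {
        return NULL;
    }
    void *p = mmap(NULL, len, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A data region would be requested with PROT_READ | PROT_WRITE and a code region with PROT_READ | PROT_EXEC; a request that combines PROT_WRITE and PROT_EXEC is rejected before it ever reaches the kernel.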

How to Attack

Attackers are clever: even though they cannot inject their own code, they can reuse code that is already in memory :)

Problem

Non-executable pages do not prevent an attacker from leveraging existing code in memory as part of the exploit

Most programs have many functions loaded into memory that can be used for malicious behavior

Return-to-libc:
- An exploit technique that overwrites the RIP to jump to a function in the standard C library (libc) or a common operating system function.
Return-oriented programming (ROP):
- Constructing custom shellcode using pieces of code that already exist in memory.

Return-to-libc

Recall: Per the x86 calling convention, each function expects its arguments to be placed directly above the RIP, and the RIP sits directly above the SFP (at SFP + 4)

Consider the system function, which executes a shell command. We want to execute it like this:

C
char cmd[] = "rm -rf /"; 
system(cmd);

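Consider this vulnerable program:

C
#include <stdio.h>   // gets (removed in C11, so an older language standard may be needed to build this)
#include <stdlib.h>  // declares: int system(const char *command);

void vulnerable(void) {
    char name[20];
    gets(name);      // no bounds check: input past 20 bytes overwrites the SFP and RIP
}

int main(void) {
    vulnerable();
    return 0;
}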

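Initial stack layout (from higher to lower addresses): the RIP of vulnerable, the SFP of vulnerable, then the 20-byte name buffer.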

Step 1: call gets (EIP now points to addl $4, %esp)

Step 2: addl $4, %esp (EIP now points to movl %ebp, %esp)

Step 3: movl %ebp, %esp (EIP now points to popl %ebp)

Step 4: popl %ebp (EIP now points to ret)

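Review: popl %ebp

This instruction runs when the callee returns to the caller. It does two things:

- Restores %ebp: pops the saved frame pointer (SFP) off the stack into %ebp. Here the SFP was overwritten, so %ebp receives attacker-controlled garbage.
- Adds 4 to %esp: %esp now points at the RIP, which the attacker overwrote with the address of system.

The ret instruction then pops the overwritten RIP into EIP, so we jump into the system function. At this point system treats the word at the ESP as its own RIP and expects its first argument 4 bytes above the ESP: a pointer to the string rm -rf /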
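
To make the stack layout concrete, here is a sketch that only assembles the bytes such an input would contain; it does not run anything. It assumes 32-bit x86, no padding between name and the SFP, and no other mitigations. The addresses passed in are hypothetical values the attacker would first have to learn, and build_payload is a name invented for this example.

```c
#include <stdint.h>
#include <string.h>

#define NAME_SIZE 20   /* size of name[] in vulnerable() */
#define WORD 4         /* 32-bit x86 word size */

/* Writes a 32-bit value in little-endian byte order, as x86 stores it. */
static void put_word(unsigned char *dst, uint32_t value) {
    for (int i = 0; i < WORD; i++) {
        dst[i] = (unsigned char)(value >> (8 * i));
    }
}

/* Fills `out` with the bytes that would be fed to gets().
 * `system_addr` and `name_addr` are hypothetical addresses; `cmd` is the
 * command string for system(). Returns the payload length, or 0 if `out`
 * is too small. */
size_t build_payload(unsigned char *out, size_t out_size,
                     uint32_t system_addr, uint32_t name_addr,
                     const char *cmd) {
    size_t cmd_offset = NAME_SIZE + 4 * WORD;        /* where cmd is stored */
    size_t total = cmd_offset + strlen(cmd) + 1;
    if (total > out_size) {
        return 0;
    }
    memset(out, 'A', NAME_SIZE + WORD);              /* name[] and SFP: garbage */
    put_word(out + NAME_SIZE + WORD, system_addr);   /* RIP -> system */
    memset(out + NAME_SIZE + 2 * WORD, 'B', WORD);   /* system's own RIP: garbage */
    put_word(out + NAME_SIZE + 3 * WORD,
             name_addr + (uint32_t)cmd_offset);      /* argument: pointer to cmd */
    memcpy(out + cmd_offset, cmd, strlen(cmd) + 1);  /* the command string */
    return total;
}
```

Reading the buffer from low to high addresses gives exactly the picture described above: garbage over name and the SFP, the address of system in place of the RIP, one dummy word where system expects its own RIP, and then the argument. Because gets stops at a newline, a real input could not contain the byte 0x0a anywhere in these fields.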

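ROP

[Figure: disassembly of foo and bar. As Steps 5 to 8 show, the instruction at <bar+25> is movl $1, %eax (followed by a ret) and the instruction at <foo+10> is xorl %eax, %ebx.]

If we jump to 25 bytes after the start of bar and then to 10 bytes after the start of foo, we get the result we want!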

Step 1: call gets

Step 2: addl $4, %esp

Step 3: movl %ebp, %esp

Step 4: popl %ebp

Step 5: return into <bar+25>

Step 6: movl $1, %eax

Step 7: return into <foo+10>

Step 8: xorl %eax, %ebx

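What ret does

The ret instruction always pops the value at the bottom of the stack (the word that the ESP points to), so execution continues along the chain of addresses!

Each ret has two effects:

- Moves EIP to the address it popped, so execution jumps to the code at <address>
- Adds 4 to %esp, so the next ret pops the next address in the chain

Limitation of ROP

Every gadget must end in a ret: only the instructions immediately before a ret are usable, otherwise control never returns to the stack and we cannot form a chain

That's a big limitation!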
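
The chain from the steps above can be mimicked with a small simulator. This is a toy model, not real exploitation: the gadget "addresses" are made-up enum values, the registers live in a struct, and each loop iteration plays the role of one ret.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated registers. */
typedef struct {
    uint32_t eax;
    uint32_t ebx;
} Regs;

/* Hypothetical "addresses" of the two gadgets used in the walkthrough. */
enum { BAR_PLUS_25 = 1, FOO_PLUS_10 = 2 };

/* Runs the gadget at `addr`; returns 0 if `addr` is not a known gadget. */
static int run_gadget(uint32_t addr, Regs *r) {
    switch (addr) {
    case BAR_PLUS_25: r->eax = 1;       return 1;  /* movl $1, %eax ; ret */
    case FOO_PLUS_10: r->ebx ^= r->eax; return 1;  /* xorl %eax, %ebx ; ret */
    default:          return 0;
    }
}

/* Follows the chain of addresses on the simulated stack.
 * Returns the number of gadgets that were executed. */
size_t run_chain(const uint32_t *chain, size_t len, Regs *r) {
    size_t esp = 0;        /* index of the next word on the stack */
    size_t executed = 0;
    while (esp < len) {
        uint32_t target = chain[esp++];  /* ret: pop an address, ESP moves up */
        if (!run_gadget(target, r)) {
            break;                       /* jumped somewhere useless: chain ends */
        }
        executed++;
    }
    return executed;
}
```

Running the chain { BAR_PLUS_25, FOO_PLUS_10 } sets eax to 1 and then XORs it into ebx, which matches Steps 5 to 8. An address that does not lead to a gadget ending in ret breaks the chain, which is the limitation described above.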

Mitigation: Stack Canaries

Idea: Add a sacrificial value on the stack, and check if it has been changed

When the program runs, generate a random secret value and save it in the canary storage
In the function prologue, place the canary value on the stack right below the SFP/RIP
In the function epilogue, check the value on the stack and compare it against the value in canary storage

The canary value is never actually used by the function, so if it changes, somebody is probably attacking our system!
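
In practice the compiler inserts the prologue and epilogue code. The sketch below only simulates the idea on a byte array that stands in for a stack frame; the layout, the Frame type, and the function name run_with_canary are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 16
#define CANARY_OFFSET BUF_SIZE   /* the canary sits directly above the buffer */

/* Simulated stack frame, from low to high addresses:
 * buf (16 bytes) | canary (4) | SFP (4) | RIP (4) */
typedef struct {
    unsigned char bytes[BUF_SIZE + 12];
} Frame;

/* Simulates one call of a canary-protected function.
 * Prologue: store `canary` above the buffer. Body: an unchecked copy that
 * stands in for gets/strcpy. Epilogue: compare the stored canary.
 * Returns 1 if the canary is intact, 0 if an overwrite was detected. */
int run_with_canary(uint32_t canary, const unsigned char *input, size_t len) {
    Frame f;
    uint32_t check;
    memset(&f, 0, sizeof f);
    memcpy(f.bytes + CANARY_OFFSET, &canary, sizeof canary);  /* prologue */

    if (len > sizeof f.bytes) {
        len = sizeof f.bytes;      /* keep the simulation itself in bounds */
    }
    memcpy(f.bytes, input, len);   /* "vulnerable" write into buf */

    memcpy(&check, f.bytes + CANARY_OFFSET, sizeof check);    /* epilogue */
    return check == canary;
}
```

A real program would crash when the check fails; here the function returns 0 instead. An input that reaches the SFP or RIP must first pass over the canary, so the overwrite is detected unless the attacker already knows the canary value and writes it back unchanged.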

Properties

[Figure: stack canary properties]

Example

[Figure: stack canary example]

Mitigation: Pointer Authentication

32-Bit and 64-Bit Processors

32-bit processor: integers and pointers are 32 bits long
- Can address $2^{32}$ bytes ≈ 4 GB of memory
64-bit processor: integers and pointers are 64 bits long
- Can address $2^{64}$ bytes ≈ 18 exabytes ≈ 18 billion GB of memory
- No modern computer can support this much memory
- At most 42 bits are needed to address all of memory
- 22 bits are left unused (the top 22 bits in the address are always 0)
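
The arithmetic behind the unused bits can be written out directly. This sketch takes the 42-bit figure from these notes as given; the function names are invented for the example.

```c
#include <stdint.h>

#define POINTER_BITS 64
#define ADDRESS_BITS 42   /* figure used in these notes */

/* Number of high-order pointer bits that are always 0. */
int unused_bits(void) {
    return POINTER_BITS - ADDRESS_BITS;
}

/* Mask that selects exactly those unused high-order bits. */
uint64_t unused_mask(void) {
    return ~((UINT64_C(1) << ADDRESS_BITS) - 1);
}
```

With these values unused_bits() is 22 and unused_mask() is 0xFFFFFC0000000000, so ANDing any valid address with the mask gives 0.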

Design Idea

Recall stack canaries: A secret value stored in memory

If the secret value changes, detect an attack
One canary per function on the stack

Idea: Instead of placing the secret value below the pointer, store a value in the pointer itself!

When storing a pointer in memory, replace the unused bits with a pointer authentication code (PAC)

Before using the pointer in memory, check if the PAC is still valid

If the PAC is invalid, crash the program
If the PAC is valid, restore the unused bits and use the address normally

This applies to the RIP, the SFP, any other pointers on the stack, and any pointers outside of the stack (e.g. on the heap)

Each possible address has its own PAC

Example: The PAC for the address 0x000000007ffffec0 is different from the PAC for 0x000000007ffffec4
If an attacker changes the address without changing the PAC, the PAC will no longer be valid

Only someone who knows the CPU’s master secret can generate a PAC for an address

An attacker cannot generate a PAC for their malicious pointer without the master secret
An attacker cannot generate a PAC using a PAC for a different address

CPU’s master secret is not accessible to the program

Leaking program memory will not leak the master secret
Contrast with canaries, which can be leaked
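
The sign-then-check flow can be sketched in software, with two loud caveats: a real CPU does this in hardware with a cryptographic keyed function and a key the program cannot read, whereas toy_mac below is a simple bit-mixing stand-in that is NOT secure, and the key is an ordinary argument. All names here are hypothetical, and the 42-bit address width is the figure used in these notes.

```c
#include <stdint.h>

#define ADDRESS_BITS 42
#define ADDRESS_MASK ((UINT64_C(1) << ADDRESS_BITS) - 1)

/* Toy keyed mixing function standing in for the CPU's real keyed MAC.
 * It is NOT cryptographically secure. */
static uint64_t toy_mac(uint64_t addr, uint64_t key) {
    uint64_t x = addr ^ key;
    x ^= x >> 33;  x *= UINT64_C(0xff51afd7ed558ccd);
    x ^= x >> 33;  x *= UINT64_C(0xc4ceb9fe1a85ec53);
    x ^= x >> 33;
    return x;
}

/* Returns the pointer with a 22-bit PAC stored in its unused upper bits. */
uint64_t pac_sign(uint64_t addr, uint64_t key) {
    uint64_t pac = toy_mac(addr & ADDRESS_MASK, key) & ~ADDRESS_MASK;
    return (addr & ADDRESS_MASK) | pac;
}

/* Checks the PAC. On success stores the plain address in *out and returns 1;
 * on failure returns 0 (a real CPU would make the program crash). */
int pac_auth(uint64_t signed_ptr, uint64_t key, uint64_t *out) {
    uint64_t addr = signed_ptr & ADDRESS_MASK;
    if (pac_sign(addr, key) != signed_ptr) {
        return 0;
    }
    *out = addr;
    return 1;
}
```

Signing happens when a pointer is stored and authentication when it is loaded back. Flipping any PAC bit makes pac_auth fail, and changing the address while keeping the old PAC fails unless the two 22-bit PACs happen to collide, which is what a brute-force guess relies on.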

How to Attack

Find a vulnerability that tricks the program into generating a PAC for any address
Learn the master secret
Guess a PAC: Brute-force
Pointer reuse: If the CPU already generated a PAC for some address, we can copy the signed pointer and use it elsewhere

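Why PACs can be reused

In the PAC (pointer authentication code) mechanism, a pointer's signature is computed from the address, a key, and a context. Normally a PAC is tied to one specific address and context. In some situations, however, an attacker may still be able to reuse a PAC. Whether reuse is possible depends on the following factors:

Same context and key:
- If two pointers are signed with the same key under the same context, an attacker may try to substitute one signed pointer for the other.
- This usually requires a specific vulnerability or misconfiguration.
Signing gadgets:
- The attacker may exploit existing code in the system (a signing gadget) to generate or reuse a PAC.
- Such gadgets may let the attacker produce a valid PAC under certain conditions.
Shared contexts and keys:
- In some implementations, different contexts or privilege levels may use the same key, which makes PAC reuse possible.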

Mitigation: Address Space Layout Randomization (ASLR)

[Figure: address space layout randomization]