Lecture 5: Mitigating Memory Safety Vulnerabilities

Writing Memory-Safe Code

Defensive programming: Always add checks in your code just in case

  • Example: Always check a pointer is not null before dereferencing it, even if you’re sure the pointer is going to be valid
  • Before writing to an array or buffer, check that the write operation will be in-bounds
  • Relies on programmer discipline and tracking the length of every array/buffer/memory region
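A minimal sketch of the two checks above, assuming a hypothetical helper named checked_store that receives the buffer length from the caller (the name and signature are illustrative, not from the lecture):

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: stores value at buf[index] only when it is safe to do so.
// The caller is responsible for passing the true length of buf.
bool checked_store(int *buf, size_t len, size_t index, int value) {
    if (buf == NULL) {
        return false;   // null check before dereferencing
    }
    if (index >= len) {
        return false;   // bounds check before writing
    }
    buf[index] = value;
    return true;
}
```

The caller still has to track len correctly, which is exactly the discipline the last bullet refers to.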

Use safe libraries (that do these checks for you)

  • Use functions that check bounds
  • Example: Use fgets instead of gets
  • Example: Use strncpy or strlcpy instead of strcpy
  • Example: Use snprintf instead of sprintf
  • Relies on programmer discipline or tools that check your program
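As a rough illustration of the bounds-checked functions listed above, the sketch below wraps fgets and snprintf in two hypothetical helpers (read_line and format_greeting are names made up for this example). Both standard functions take the destination size, so they cannot write past the end of the buffer.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: reads one line into dst, storing at most dst_len - 1
// characters plus the terminating null byte (unlike gets, which has no limit).
bool read_line(FILE *in, char *dst, size_t dst_len) {
    if (fgets(dst, (int)dst_len, in) == NULL) {
        return false;
    }
    dst[strcspn(dst, "\n")] = '\0';   // drop the trailing newline, if any
    return true;
}

// Hypothetical helper: formats a greeting into dst and reports whether the
// whole string fit. snprintf never writes more than dst_len bytes.
bool format_greeting(char *dst, size_t dst_len, const char *name) {
    int needed = snprintf(dst, dst_len, "Hello, %s!", name);
    return needed >= 0 && (size_t)needed < dst_len;
}
```

Checking the return value matters: snprintf truncates silently, so a caller that ignores the result may keep working with an incomplete string.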

Building Secure Software

  1. Run-time checks
    • Automatic bounds-checking
    • May involve performance overhead
  2. Monitor code for run-time misbehavior
    • Look for illegal calling sequences
    • Example: Your code never calls execve, but you notice that your code is executing execve
  3. Contain potential damage
    • Run system components in sandboxes or virtual machines (VMs)
    • Think about privilege separation
  4. Bug-finding tools
  5. Code review
  6. Vulnerability scanning
  7. Penetration testing (“pen-testing”)

Exploit Mitigation

Aim: Add mitigations that make it harder to exploit common vulnerabilities.

Exploit mitigations (code hardening)

Compiler and runtime defenses that make common exploits harder

  • Find ways to turn attempted exploits into program crashes
  • Crashing is safer than exploitation: The attacker can crash our system, but at least they can’t execute arbitrary code
  • Mitigations are cheap (low overhead) but not free (some costs associated with them)

Mitigation: Non-Executable Pages

Idea

Most programs don’t need memory that is both written to and executed, so make portions of memory either executable or writable but not both

  • Stack, heap, and static data: Writable but not executable
  • Code: Executable but not writable

Page table entries have a writable bit and an executable bit that can be set to achieve this behavior

  1. Page tables convert virtual addresses to physical addresses
  2. Implemented in hardware, so effectively 0 overhead!
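The policy can be modeled in a few lines. This is a simplified sketch, not a real page table layout: the page_perms struct, the access_kind enum, and both function names are assumptions made for illustration.

```c
#include <stdbool.h>

// Simplified, hypothetical view of the two permission bits in a page table
// entry; real entries hold more fields in a hardware-defined layout.
typedef struct {
    bool writable;
    bool executable;
} page_perms;

typedef enum { ACCESS_WRITE, ACCESS_EXECUTE } access_kind;

// W^X policy: a page may be writable or executable, but never both.
bool satisfies_wx(page_perms p) {
    return !(p.writable && p.executable);
}

// Models the check the hardware performs on each access. A denied access
// would raise a fault, which crashes the program.
bool access_allowed(page_perms p, access_kind kind) {
    return kind == ACCESS_WRITE ? p.writable : p.executable;
}
```

Under this model a stack page is { writable, not executable }, so shellcode written to the stack can be stored but any attempt to execute it is denied.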

Also known as:

  • W^X (write XOR execute)
  • DEP (Data Execution Prevention, name used by Windows)
  • No-execute bit (the name of the bit itself)

How to Attack

Attackers adapt: even though they can no longer inject their own code, they can reuse code that already exists in memory.

Problem

Non-executable pages don’t prevent an attacker from leveraging existing code in memory as part of the exploit

Most programs have many functions loaded into memory that can be used for malicious behavior

  1. Return-to-libc:

    • An exploit technique that overwrites the RIP to jump to a function in the standard C library (libc) or a common operating system function.
  2. Return-oriented programming (ROP):

    • Constructing custom shellcode using pieces of code that already exist in memory.

Return-to-libc

Recall: Per the x86 calling convention, each function expects its arguments to be placed directly above the RIP (the RIP itself sits 4 bytes above the SFP)

Consider the system function, which executes a shell command. We want to execute it like this:

C
char cmd[] = "rm -rf /";
system(cmd);

We utilize this scenario:

C
int system(char *command);

void vulnerable(void) {
    char name[20];
    gets(name);  /* no bounds check: input longer than 19 characters overflows name */
}

int main(void) {
    vulnerable();
    return 0;
}

Initial stack layout:

[Figure omitted]

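Each line below shows the instruction that just executed; the comment gives the instruction the EIP points to next.

call gets // EIP: addl $4, %esp

[Figure omitted]

addl $4, %esp // EIP: movl %ebp, %esp

[Figure omitted]

movl %ebp, %esp // EIP: popl %ebp

[Figure omitted]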

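popl %ebp // EIP: ret

Review: popl %ebp

This is part of what happens when the callee returns to the caller:

  1. %ebp takes the value popped from the stack (the SFP), which restores the caller's frame pointer.
  2. %esp increases by 4, so it now points at the RIP. The ret that follows pops the RIP and increases %esp by another 4.

[Figure omitted]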

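After ret, execution is inside the system function, which expects its own return address at the ESP and its first argument 4 bytes above the ESP. The attacker has placed a pointer to the string "rm -rf /" there.

[Figure omitted]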
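The stack layout this attack relies on can be sketched as a function that lays out the input bytes. The sketch assumes a 32-bit little-endian x86 target, no padding between name and the SFP, and no other mitigations. The helper names and both address parameters are hypothetical; real addresses depend on the target binary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NAME_LEN 20   // size of name[] in vulnerable()

// Writes a 32-bit value in little-endian byte order, as x86 stores it.
static void put_le32(uint8_t *dst, uint32_t v) {
    dst[0] = (uint8_t)(v);
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
    dst[3] = (uint8_t)(v >> 24);
}

// Hypothetical helper: lays out the bytes for the return-to-libc input.
// Returns the number of bytes written, or 0 if out is too small.
size_t build_payload(uint8_t *out, size_t out_len,
                     uint32_t system_addr, uint32_t cmd_addr) {
    const size_t total = NAME_LEN + 4 + 4 + 4 + 4;
    if (out == NULL || out_len < total) {
        return 0;
    }
    memset(out, 'A', NAME_LEN + 4);              // fills name[] and overwrites the SFP
    put_le32(out + NAME_LEN + 4, system_addr);   // RIP now points at system
    memset(out + NAME_LEN + 8, 'B', 4);          // slot where system expects its own RIP
    put_le32(out + NAME_LEN + 12, cmd_addr);     // first argument: pointer to the command string
    return total;
}
```

The 4 filler bytes after the overwritten RIP are what place the argument 4 bytes above the ESP once system starts running.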

ROP

[Figure omitted]

If we jump to 25 bytes after the start of bar, then to 10 bytes after the start of foo, we get the result we want!

Step 1: call gets

[Figure omitted]

Step 2: addl $4, %esp

[Figure omitted]

Step 3: movl %ebp, %esp

[Figure omitted]

Step 4: popl %ebp

[Figure omitted]

Step 5: return into <bar+25>

[Figure omitted]

Step 6: movl $1, %eax

[Figure omitted]

Step 7: return into <foo+10>

[Figure omitted]

Step 8: xorl %eax, %ebx

[Figure omitted]

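What ret does

The ret instruction always pops off the bottom (lowest address) of the stack, so execution continues based on the chain of addresses!

Each ret does two things:

  1. Sets the EIP to the address it popped, so execution jumps to the instruction at that address
  2. Increases %esp by 4, so the ESP points at the next address in the chain

Limitation of ROP

Each piece of code you use must be the instruction(s) immediately before a ret (otherwise control never comes back to the stack and we cannot form a chain)

That's a big limitation!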
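The chaining behavior can be imitated with a toy interpreter. This is only a model: the gadget addresses are made up, the toy_cpu struct is an assumption, and each gadget is reduced to the single instruction named in the steps above.

```c
#include <stddef.h>
#include <stdint.h>

// Made-up addresses standing in for <bar+25> and <foo+10>.
enum { GADGET_BAR_25 = 0x1000, GADGET_FOO_10 = 0x2000 };

typedef struct {
    uint32_t eax;
    uint32_t ebx;
    size_t sp;   // index of the next stack slot; one slot models 4 bytes of ESP
} toy_cpu;

// Runs the chain of return addresses stored in stack[] and returns the number
// of gadgets executed. Each loop iteration models one ret.
size_t run_chain(toy_cpu *cpu, const uint32_t *stack, size_t slots) {
    size_t executed = 0;
    while (cpu->sp < slots) {
        uint32_t target = stack[cpu->sp];   // ret: pop the next address into the EIP
        cpu->sp += 1;                       // ret: ESP moves up by one slot (4 bytes)
        if (target == GADGET_BAR_25) {
            cpu->eax = 1;                   // movl $1, %eax
        } else if (target == GADGET_FOO_10) {
            cpu->ebx ^= cpu->eax;           // xorl %eax, %ebx
        } else {
            break;                          // unknown address: a real program would crash
        }
        executed++;
    }
    return executed;
}
```

With the chain { GADGET_BAR_25, GADGET_FOO_10 } and both registers starting at 0, the model ends with eax and ebx both equal to 1.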

Mitigation: Stack Canaries

Idea: Add a sacrificial value on the stack, and check if it has been changed

  • When the program runs, generate a random secret value and save it in the canary storage
  • In the function prologue, place the canary value on the stack right below the SFP/RIP
  • In the function epilogue, check the value on the stack and compare it against the value in canary storage

The canary value is never actually used by the function, so if it changes, somebody is probably attacking our system!
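The prologue and epilogue steps can be imitated on a byte array that stands in for a stack frame. In this sketch the frame layout (16-byte buffer, then canary, SFP, RIP), the function name, and passing the secret as a parameter are all assumptions. A real compiler inserts these checks automatically and the secret is generated randomly when the program starts.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 16
#define FRAME_LEN (BUF_LEN + 4 + 4 + 4)   // buffer | canary | SFP | RIP

// Hypothetical model: copies input into a simulated frame without checking
// BUF_LEN, then reports whether the canary still holds the secret value.
bool frame_survives(const uint8_t *input, size_t input_len, uint32_t secret) {
    uint8_t frame[FRAME_LEN] = {0};

    // Prologue: place the canary right above the buffer, below the SFP/RIP.
    memcpy(frame + BUF_LEN, &secret, sizeof secret);

    // Unchecked write into the buffer. It is clamped to the simulated frame
    // so that the model itself stays memory-safe.
    size_t n = input_len < FRAME_LEN ? input_len : FRAME_LEN;
    memcpy(frame, input, n);

    // Epilogue: compare the value on the stack against the stored secret.
    uint32_t seen;
    memcpy(&seen, frame + BUF_LEN, sizeof seen);
    return seen == secret;
}
```

Any write long enough to reach the SFP or RIP must pass through the canary first, so the epilogue check fails before the overwritten RIP is ever used.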

Properties

[Figure omitted]

Example

[Figure omitted]

Mitigation: Pointer Authentication

32-Bit and 64-Bit Processors
  1. 32-bit processor: integers and pointers are 32 bits long
    • Can address \(2^{32}\) bytes ≈ 4 GB of memory
  2. 64-bit processor: integers and pointers are 64 bits long
    • Can address \(2^{64}\) bytes ≈ 18 exabytes ≈ 18 billion GB of memory
    • No modern computer can support this much memory
    • At most 42 bits are needed to address all of memory
    • 22 bits are left unused (the top 22 bits in the address are always 0)

Design Idea

Recall stack canaries: A secret value stored in memory

  1. If the secret value changes, detect an attack
  2. One canary per function on the stack

Idea: Instead of placing the secret value below the pointer, store a value in the pointer itself!

When storing a pointer in memory, replace the unused bits with a pointer authentication code (PAC)

Before using the pointer in memory, check if the PAC is still valid

  • If the PAC is invalid, crash the program
  • If the PAC is valid, restore the unused bits and use the address normally

This applies to the RIP, the SFP, any other pointers on the stack, and any pointers outside of the stack (e.g. on the heap)

[Figure omitted]

Each possible address has its own PAC

  1. Example: The PAC for the address 0x000000007ffffec0 is different from the PAC for 0x000000007ffffec4
  2. If an attacker changes the address without changing the PAC, the PAC will no longer be valid

Only someone who knows the CPU’s master secret can generate a PAC for an address

  1. An attacker cannot generate a PAC for their malicious pointer without the master secret
  2. An attacker cannot generate a PAC using a PAC for a different address

CPU’s master secret is not accessible to the program

  1. Leaking program memory will not leak the master secret
  2. Contrast with canaries, which can be leaked
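The sign-then-check flow can be sketched as follows, using the 42 address bits and 22 unused bits from the numbers above. The mixing function toy_pac is a stand-in chosen for this example and is not cryptographically secure. Real hardware uses a dedicated keyed algorithm, and the key is held by the CPU rather than passed as an argument.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADDR_BITS 42
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

// Toy stand-in for the CPU's keyed function: mixes the address with the key
// and keeps the top 22 bits. NOT secure; for illustration only.
static uint64_t toy_pac(uint64_t addr, uint64_t key) {
    return ((addr ^ key) * UINT64_C(0x9E3779B97F4A7C15)) >> ADDR_BITS;
}

// Stores the PAC in the 22 unused top bits of the pointer.
uint64_t pac_sign(uint64_t addr, uint64_t key) {
    addr &= ADDR_MASK;
    return (toy_pac(addr, key) << ADDR_BITS) | addr;
}

// Checks the PAC. On success, writes the address with the unused bits
// restored to 0 and returns true; on failure the program should crash.
bool pac_auth(uint64_t signed_ptr, uint64_t key, uint64_t *addr_out) {
    uint64_t addr = signed_ptr & ADDR_MASK;
    if ((signed_ptr >> ADDR_BITS) != toy_pac(addr, key)) {
        return false;
    }
    *addr_out = addr;
    return true;
}
```

Changing the address bits while keeping the old PAC makes pac_auth fail, whereas copying a signed pointer unchanged still verifies.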

How to Attack

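  1. Find a vulnerability to trick the program into generating a PAC for any address
  2. Learn the master secret
  3. Guess a PAC: Brute-force
  4. Pointer reuse: If the CPU already generated a PAC for some address, we can copy it and use it elsewhere

Why pointer reuse is possible

In the PAC (Pointer Authentication Code) mechanism, a pointer's signature is generated from the address, a key, and a context. Normally, a PAC is tied to one specific address and context. In some situations, however, an attacker may try to reuse a PAC. Whether reuse is possible depends on the following factors:

  1. Same context and key

    • If two pointers are signed with the same key under the same context, a signed pointer (the address together with its PAC) copied from one location still verifies when it is used at the other.
    • Exploiting this usually requires a specific vulnerability or misconfiguration.
  2. Signing gadgets

    • An attacker may use existing code in the system (signing gadgets) to generate or reuse PACs.
    • These gadgets may let the attacker produce a valid PAC under certain conditions.
  3. Shared contexts and keys

    • In some implementations, different contexts or privilege levels may use the same key, which makes PAC reuse possible.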

Mitigation: Address Space Layout Randomization (ASLR)

[Figure omitted]

[Figure omitted]