Lecture 2: x86 Assembly and Call Stack¶
Compiler, Assembler, Linker, Loader¶

- Compiler: Converts C code into assembly code (RISC-V, x86)
- Assembler: Converts assembly code into machine code (raw bits)
    - Think 61C’s RISC-V “green sheet”
- Linker: Deals with dependencies and libraries
    - You can ignore this part for 161
- Loader: Sets up memory space and runs the machine code
Endianness¶
Word (32-bit machine)¶
On each row of the grid, we put 4 bytes = 1 word.
We mainly focus on 32-bit machines in this class :)

We can combine all the bytes on a row to form a word.
We need to care about two things:
- how a byte is depicted
- how a word is depicted
To be specific:
- If I ask for the byte at address 0x00000000, you should say 0x11.
- If I ask for the word at that same address, you should say 0x44332211.
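A small C sketch of the byte-vs-word question above. The `mem` array is a made-up stand-in for the grid: the first four bytes match the example, the last four are just filler I assumed, and `byte_at` / `word_at` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical memory: mem[i] stands in for address 0x00000000 + i.
   The first four bytes match the example; the rest are assumed filler. */
static const uint8_t mem[8] = {0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88};

/* Byte at an address: just the single byte stored there. */
static uint8_t byte_at(uint32_t addr) {
    return mem[addr];
}

/* Word at an address, read the little-endian way: the byte at the
   lowest address becomes the least significant byte of the word. */
static uint32_t word_at(uint32_t addr) {
    return (uint32_t)mem[addr]
         | (uint32_t)mem[addr + 1] << 8
         | (uint32_t)mem[addr + 2] << 16
         | (uint32_t)mem[addr + 3] << 24;
}

/* Usage: the two answers from the notes. */
static void check_byte_and_word(void) {
    assert(byte_at(0) == 0x11);
    assert(word_at(0) == 0x44332211u);
}
```

Building the word with shifts (instead of casting a pointer) keeps the result the same no matter what machine runs the sketch.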
Little-endian words¶
- We can combine four bytes on a row to form a word.
- However, x86 is little-endian, which means the word formed from the first four bytes is actually 0x44332211!
- This is just like the dates: each group of 4 bytes is a word. The only difference is how you interpret those bytes (the order they appear).

Why is it called a little-endian word?
The least significant byte is stored at the smallest address.
Take 0x44332211: the LSB is 0x11, and it's stored at the smallest address.
You can understand "smallest address" in two ways:
1) The picture above is memory, and the smallest address is the leftmost one.
2) The Digital table
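To convince myself that "LSB at the smallest address" is all there is to it, here is a rough C sketch that goes the other direction and writes a word out byte by byte. `store_le32` and `host_is_little_endian` are names I made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit word into out[0..3] in little-endian order:
   out[0] is the smallest address and receives the least significant byte. */
static void store_le32(uint32_t word, uint8_t out[4]) {
    out[0] = (uint8_t)(word & 0xFF);
    out[1] = (uint8_t)((word >> 8) & 0xFF);
    out[2] = (uint8_t)((word >> 16) & 0xFF);
    out[3] = (uint8_t)((word >> 24) & 0xFF);
}

/* Returns 1 if the machine running this code stores words little-endian,
   by looking at the byte that sits at the lowest address of a word. */
static int host_is_little_endian(void) {
    uint32_t probe = 0x44332211u;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0x11;
}

/* Usage: 0x44332211 ends up in memory as 11 22 33 44. */
static void check_store(void) {
    uint8_t bytes[4];
    store_le32(0x44332211u, bytes);
    assert(bytes[0] == 0x11 && bytes[3] == 0x44);
}
```

On an x86 machine `host_is_little_endian()` should return 1, but the sketch does not rely on that.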
Memory Layout¶

Register
Registers are located on the CPU
This is different from the memory layout
Memory: addresses are 32-bit numbers
Registers are referred to by names (ebp, esp, eip), not addresses



Why is Struct so weird???
Ask David for help :(
X86 Architecture¶
- Little-endian
    - The least-significant byte of multi-byte numbers is placed at the first/lowest memory address
    - Same as RISC-V
- Variable-length instructions
    - When assembled into machine code, instructions can be anywhere from 1 to 15 bytes long
    - Contrast with RISC-V, which has fixed-length, 4-byte instructions
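To make both points concrete, here is a sketch with a handful of 32-bit x86 encodings stored as C byte arrays. The instructions are my own picks, not from the lecture, and `mov_immediate` is a hypothetical helper. Note how the lengths differ and how the immediate inside the last instruction is itself stored little-endian.

```c
#include <assert.h>
#include <stdint.h>

/* A few 32-bit x86 instructions and their machine-code bytes. */
static const uint8_t enc_ret[]         = {0xC3};                         /* ret                   */
static const uint8_t enc_push_ebp[]    = {0x55};                         /* push %ebp             */
static const uint8_t enc_mov_esp_ebp[] = {0x89, 0xE5};                   /* mov %esp, %ebp        */
static const uint8_t enc_sub_esp[]     = {0x83, 0xEC, 0x08};             /* sub $8, %esp          */
static const uint8_t enc_mov_imm[]     = {0xB8, 0x11, 0x22, 0x33, 0x44}; /* mov $0x44332211, %eax */

/* Decode the 32-bit immediate of a `mov $imm32, %eax` encoding.
   Byte 0 is the opcode; bytes 1..4 hold the immediate, LSB first. */
static uint32_t mov_immediate(const uint8_t *insn) {
    return (uint32_t)insn[1]
         | (uint32_t)insn[2] << 8
         | (uint32_t)insn[3] << 16
         | (uint32_t)insn[4] << 24;
}

/* Usage: different instructions, different lengths. */
static void check_lengths(void) {
    assert(sizeof enc_ret == 1);
    assert(sizeof enc_push_ebp == 1);
    assert(sizeof enc_mov_esp_ebp == 2);
    assert(sizeof enc_sub_esp == 3);
    assert(sizeof enc_mov_imm == 5);
}
```

A RISC-V version of this table would have every entry at exactly 4 bytes.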

Register¶

Syntax¶
- Register references are preceded with a percent sign %
    - Example: %eax, %esp, %edi
- Immediates (constant values) are preceded with a dollar sign $
    - Example: $1, $161, $0x4
- Memory references use parentheses and can have immediate offsets
    - Example: (%esp) dereferences memory at the address contained in ESP
    - Example: 8(%esp) dereferences memory 8 bytes above the address contained in ESP
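One way I read the three operand kinds, as a toy C model. The `Machine` struct, the helper names, and the `movl` instructions in the comments are all my own assumptions for illustration; real x86 has more registers and a much bigger memory.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 64

/* Toy machine: two registers plus a tiny byte-addressed memory. */
typedef struct {
    uint32_t eax;
    uint32_t esp;
    uint8_t  mem[MEM_SIZE];
} Machine;

/* Little-endian 32-bit load from memory. */
static uint32_t load32(const Machine *m, uint32_t addr) {
    return (uint32_t)m->mem[addr]
         | (uint32_t)m->mem[addr + 1] << 8
         | (uint32_t)m->mem[addr + 2] << 16
         | (uint32_t)m->mem[addr + 3] << 24;
}

/* Little-endian 32-bit store to memory. */
static void store32(Machine *m, uint32_t addr, uint32_t value) {
    m->mem[addr]     = (uint8_t)(value & 0xFF);
    m->mem[addr + 1] = (uint8_t)((value >> 8) & 0xFF);
    m->mem[addr + 2] = (uint8_t)((value >> 16) & 0xFF);
    m->mem[addr + 3] = (uint8_t)((value >> 24) & 0xFF);
}

/* movl $imm, %eax : the immediate is the value itself. */
static void mov_imm_to_eax(Machine *m, uint32_t imm) {
    m->eax = imm;
}

/* movl offset(%esp), %eax : read memory at the address ESP + offset.
   offset = 0 models (%esp); offset = 8 models 8(%esp). */
static void mov_mem_to_eax(Machine *m, uint32_t offset) {
    m->eax = load32(m, m->esp + offset);
}

/* Usage: $161 is a constant, (%esp) and 8(%esp) are memory reads. */
static void check_operands(void) {
    Machine m = {0};
    m.esp = 16;
    store32(&m, 16, 0xAAAA0001u);   /* value at (%esp)  */
    store32(&m, 24, 0xBBBB0002u);   /* value at 8(%esp) */
    mov_imm_to_eax(&m, 161);
    assert(m.eax == 161);
    mov_mem_to_eax(&m, 0);
    assert(m.eax == 0xAAAA0001u);
    mov_mem_to_eax(&m, 8);
    assert(m.eax == 0xBBBB0002u);
}
```

The thing to notice is that `%esp` alone names the register's value, while the parentheses turn that value into an address to read from.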


Stack Layout¶
You can see the whole process here :)

One thing to note, the order in which things are placed on the stack:
- Local variables: ...
- Struct objects: ...

Steps to Function Calling
- Push arguments on the stack
- Push old eip (rip) on the stack
- Push old ebp (sfp) on the stack
- Adjust the stack frame
- Execute the function
- Restore everything
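The steps above can be sketched as a toy simulation in C. Everything here is an assumption made for illustration: the `Cpu` struct, the word-array stack, and the helper names are hypothetical, the function takes exactly two arguments, and arguments are pushed in reverse order as in the usual 32-bit x86 C calling convention. In a real call the saved rip is the address of the instruction after the `call`; the sketch just saves whatever `eip` holds.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy CPU: three registers and a small stack (byte address = index * 4). */
typedef struct {
    uint32_t eip, ebp, esp;
    uint32_t stack[STACK_WORDS];
} Cpu;

/* The stack grows toward lower addresses: move esp down, then write. */
static void push(Cpu *c, uint32_t value) {
    c->esp -= 4;
    c->stack[c->esp / 4] = value;
}

/* Read the word at esp, then move esp back up. */
static uint32_t pop(Cpu *c) {
    uint32_t value = c->stack[c->esp / 4];
    c->esp += 4;
    return value;
}

/* Steps 1 to 4: set up a frame for a hypothetical two-argument callee. */
static void call_enter(Cpu *c, uint32_t arg1, uint32_t arg2,
                       uint32_t callee_addr, uint32_t locals_bytes) {
    push(c, arg2);              /* 1. push arguments, last one first        */
    push(c, arg1);
    push(c, c->eip);            /* 2. push old eip (rip)                    */
    c->eip = callee_addr;       /*    and jump to the callee                */
    push(c, c->ebp);            /* 3. push old ebp (sfp)                    */
    c->ebp = c->esp;            /* 4. adjust the frame: ebp marks its top,  */
    c->esp -= locals_bytes;     /*    esp moves down to fit the locals      */
}

/* Step 6: undo everything in reverse order. */
static void call_return(Cpu *c, uint32_t num_args) {
    c->esp = c->ebp;            /* discard the locals                       */
    c->ebp = pop(c);            /* restore old ebp from the sfp             */
    c->eip = pop(c);            /* restore old eip from the rip             */
    c->esp += 4 * num_args;     /* caller removes the arguments             */
}

/* Usage: after a full call and return, the registers are back where they started. */
static void check_call(void) {
    Cpu c = { .eip = 0x1000, .ebp = 0x80, .esp = 0x80 };
    call_enter(&c, 1, 2, 0x2000, 8);
    /* 5. the function body would execute here */
    call_return(&c, 2);
    assert(c.eip == 0x1000 && c.ebp == 0x80 && c.esp == 0x80);
}
```

Inside the callee, the frame reads upward from `ebp` as sfp, rip, arg1, arg2, which is why arguments sit at positive offsets from `ebp` and locals at negative ones.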
