Wait, How Does a Program Even Run?

By Dylan · 10 min read

I've been confused by "how programs run" for a long time. Stack pointer, stack frame, stack top—these words kept showing up, and I could never keep them straight. I hadn't really learned assembly back then, so every explanation felt like it was assuming something I didn't have.

But I'm starting to think it's not actually that complicated. This is me trying to walk through it—maybe it'll help someone else who's been stuck in the same place.


Everything Lives in Memory

If you think of memory as a container, then running a program is really about putting things into that container and letting the CPU work with them.

What things? The obvious answer is data—the numbers you're calculating, the strings you're processing, the pixels you're drawing. But here's the part that took me a while to internalize: the instructions themselves—the logic of your program, the "what to do"—are also just data sitting in memory. When you double-click an executable or type ./program, the operating system reads your program file and copies its contents into memory. Your functions, your if-statements, your loops—they all become bytes in memory, no different in nature from the integers and strings they operate on.

This is what the .text segment is: the region of memory where your compiled code lives. The name "text" is historical (as opposed to "data"), but the key insight is that code is data too. The CPU doesn't receive instructions through some special channel. It reads bytes from memory addresses, interprets them as instructions, and does what they say. Then it moves to the next address and repeats.

That's what "running a program" means. The CPU reads bytes from memory, treats some of them as instructions, and modifies other bytes according to those instructions. There's no magic. It's bytes all the way down.
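
You can actually poke at this. The sketch below treats a function's machine code as ordinary bytes and prints the first few of them. Strictly speaking, casting a function pointer to a data pointer isn't portable C, but on a typical Linux or macOS toolchain it compiles and shows you real opcodes:

#include <stdio.h>

int add(int a, int b) { return a + b; }

int main(void) {
    // The compiled body of add() is just bytes sitting in .text.
    // This cast is a common non-portable trick to read them.
    const unsigned char *code = (const unsigned char *)add;
    for (int i = 0; i < 8; i++)
        printf("%02x ", code[i]);
    printf("\n");
    return 0;
}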


What Memory Actually Looks Like

When your program loads, its memory space looks roughly like this:

High addresses
┌─────────────────────────┐
│      Kernel Space       │  ← You can't touch this
├─────────────────────────┤
│        Stack            │  ← Grows toward lower addresses
│          ↓              │
│                         │
│     (unmapped gap)      │
│                         │
│          ↑              │
│         Heap            │  ← Grows toward higher addresses
├─────────────────────────┤
│         .bss            │  ← Uninitialized globals (zeroed)
├─────────────────────────┤
│        .data            │  ← Initialized globals
├─────────────────────────┤
│        .text            │  ← Your actual code (instructions)
└─────────────────────────┘
Low addresses

Let me explain each region, because for years I knew these names without understanding what they actually were.

.text is where your compiled instructions live—I mentioned this above. Every function you wrote becomes a sequence of machine instructions stored here.

.data holds global and static variables that you initialized with a value. If you write static int x = 42; somewhere, that 42 lives in .data. The value is baked into the executable file itself—when the OS loads your program, it copies this section directly into memory.

.bss holds global and static variables that you didn't initialize, or initialized to zero. Here's the clever part: .bss doesn't actually take up space in your executable file. The file just records "I need 1000 bytes of .bss," and when the OS loads the program, it allocates that space and fills it with zeros. This is why uninitialized globals default to zero in C—they live in .bss, and .bss is zeroed by contract. The name stands for "Block Started by Symbol," a historical artifact from 1950s assemblers.
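
To make the .data/.bss split concrete, here's a tiny sketch (the variable names are just for illustration):

int in_data = 42;      // initialized with a nonzero value: stored in .data
int in_bss;            // no initializer: only its size is recorded, lands in .bss
int also_bss = 0;      // explicitly zero: compilers typically put this in .bss too

int main(void) {
    return 0;          // nothing to do; we only care where the globals end up
}

On Linux you can check this with the size tool (size ./a.out), which reports how many bytes of text, data, and bss the executable declares.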

Above .bss, there's a large region of memory that's initially empty. This is where heap and stack live. They start at opposite ends—heap allocates toward higher addresses, stack allocates toward lower addresses. Why this design? At compile time, we don't know how much stack space we'll need (depends on how deep function calls go) or how much heap space we'll need (depends on runtime allocations). Having them at opposite ends means they can share the available space flexibly.
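
One way to make this layout feel real is to print an address from each region and compare them. This is just a sketch: the exact numbers change every run (address space layout randomization), and casting a function pointer to void* isn't strictly portable, but on a typical Linux x86-64 process you'll see .text lowest, the heap above it, and the stack far above both:

#include <stdio.h>
#include <stdlib.h>

int global_counter;                         // lives in .bss

int main(void) {
    int local = 0;                          // lives on the stack
    int *heap_block = malloc(sizeof(int));  // lives on the heap

    printf(".text (main)   : %p\n", (void *)main);   // common non-portable trick
    printf(".bss  (global) : %p\n", (void *)&global_counter);
    printf("heap  (malloc) : %p\n", (void *)heap_block);
    printf("stack (local)  : %p\n", (void *)&local);

    free(heap_block);
    return 0;
}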


Walking Through a Real Program

Let me make this concrete. Here's a simple C program:

#include <stdlib.h>           // for malloc and free

int counter = 0;              // global variable

int add(int a, int b) {
    int sum = a + b;          // local variable
    return sum;
}

int main() {
    int x = 10;               // local variable
    int* p = malloc(4);       // heap allocation
    *p = add(x, 5);
    counter++;
    free(p);
    return 0;
}

Let's trace what happens when this runs.

Compilation: The compiler translates add() and main() into machine instructions. These go into .text. The global variable counter, initialized to 0, goes into .bss (zero-initialized globals don't need space in the file).

Loading: You type ./program. The OS reads the executable, allocates a chunk of memory for this process, copies .text and .data into place, zeros out .bss, sets up the stack pointer to point at the top of the stack region, and jumps to the entry point (which eventually calls main).

main() starts: The CPU executes the first instruction of main. The stack pointer moves down to make room for main's local variables—x and p. The value 10 gets written into the memory location for x.

malloc(4): This function asks the heap allocator for 4 bytes. The allocator finds (or requests from the OS) a suitable chunk of memory in the heap region, marks it as used, and returns the address. That address gets stored in p, which lives on the stack.

add(x, 5): The CPU jumps to the add function. But first, it pushes the return address onto the stack (so it knows where to come back to), and the stack pointer moves down again to make room for add's parameters (a, b) and local variable (sum). The values 10 and 5 get copied into a and b. The addition happens, result goes into sum, then into the return value location.

add() returns: The stack pointer moves back up. The memory that held a, b, and sum is now "freed"—not erased, just abandoned. The stack pointer is above it now, so the next function call will overwrite it.

counter++: The CPU reads the value at counter's address (in .bss), adds 1, writes it back.

free(p): The heap allocator marks those 4 bytes as available again. The memory isn't erased—it's just noted as reusable.

main() returns: The stack pointer moves back up. The program ends.

Notice what happened: .text never changed (it's just instructions being read). .bss changed (counter went from 0 to 1). The stack grew and shrank as functions were called and returned. The heap had memory allocated and freed. Each region has its own lifecycle.


Why the Stack Works the Way It Does

The stack's design follows directly from how function calls work. Think about it: when main calls add, add must finish before main can continue. When add calls something else, that must finish before add can continue. Function calls are inherently last-in, first-out. The most recently called function must return first.

This is why the stack is a stack—not because someone thought stacks were elegant, but because function calls are a stack. The data structure matches the problem. And because of this LIFO property, we don't need complex memory management. We just need one pointer (the stack pointer) that moves down when we need space and moves up when we're done. No fragmentation, no searching for free blocks, no garbage collection. Just pointer arithmetic.

Now let's tackle the terminology that confused me for years.

Stack region vs stack frame: The stack region is the entire memory area reserved for the stack—on Linux, typically 8MB per thread. A stack frame is the slice of that space used by a single function call. When main calls add, you have two frames stacked on top of each other. The region is the container; frames are what's inside.

Stack pointer: Just an address. It points to the current "top" of the stack (which is actually at the bottom of the diagram, because addresses grow downward). When a function is called, the stack pointer decreases. When it returns, the stack pointer increases. That's all it does.

"The stack grows downward": This confuses everyone because we draw high addresses at the top. So "growing" means the stack pointer gets smaller. If the stack pointer was at address 1000 and a function needs 32 bytes, the new stack pointer is 968. We say it "grew" 32 bytes, but the number went down. This is just convention—on some architectures it grows up instead. It doesn't matter which direction; what matters is that it grows and shrinks in a predictable, LIFO pattern.

Here's what this looks like in assembly (RISC-V here, but every architecture does something equivalent). When the compiler processes a function, it counts up how much stack space is needed (local variables, saved registers, padding for alignment) and emits something like:

foo:
    addi sp, sp, -32    # Reserve 32 bytes (move stack pointer down)
    sw   ra, 28(sp)     # Save return address
    # ... function body ...
    lw   ra, 28(sp)     # Restore return address
    addi sp, sp, 32     # Release 32 bytes (move stack pointer up)
    ret

That -32 is determined at compile time. The compiler knows exactly how much space each function needs. But how many times this code runs, how deep the call chain goes—that's only known at runtime. This is why stack overflow happens: each call adds a frame, and if you recurse too deep, you run out of that 8MB.
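
If you want to see that limit for yourself, the sketch below (compiled without optimizations) recurses with no base case, so every call adds another frame until the stack region runs out and the process dies, usually with a segmentation fault. The array just inflates each frame so it happens faster:

int exhaust(int n) {
    char padding[1024];          // make each frame bigger so the limit arrives sooner
    padding[0] = (char)n;
    // The addition happens after the recursive call returns, so this is
    // not a tail call: every level keeps its own frame alive.
    return exhaust(n + 1) + padding[0];
}

int main(void) {
    return exhaust(0);           // crashes once the stack region is exhausted
}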


Why the Heap Exists

If the stack is so elegant, why do we need a heap at all?

Because the stack has a fundamental limitation: everything on the stack dies when the function returns. The moment add returns, its stack frame is gone. If you wanted to keep sum around for later, you couldn't—the stack pointer moved past it.

Sometimes you need data that outlives the function that created it. You create an object in function A, return it somehow, and function B uses it long after A is finished. The stack can't do this. It's structurally impossible—A's frame is gone.
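
The classic way to get bitten by this is returning a pointer to a local variable. Here's a sketch contrasting the broken stack version with the heap version (the function names are made up for illustration):

#include <stdlib.h>

// Broken: sum lives in this function's stack frame, and that frame is
// abandoned the moment the function returns. The pointer dangles.
int *make_sum_broken(int a, int b) {
    int sum = a + b;
    return &sum;                 // most compilers warn about this
}

// Works: the heap block outlives the function that created it.
// The caller takes over responsibility for calling free().
int *make_sum(int a, int b) {
    int *sum = malloc(sizeof *sum);
    if (sum)
        *sum = a + b;
    return sum;
}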

You might ask: why not put that data in .data or .bss? Those regions also live for the entire program. The problem is that .data and .bss are fixed at compile time—the compiler decides exactly how many bytes they contain, and that number never changes. You can't add more at runtime, and you can't release any of it until the program ends.

Sometimes you don't know how much memory you need until runtime. A user uploads a file—how big? A network response arrives—how many bytes? You can't reserve stack space for "however many bytes" because the compiler needs to know the number. And you can't use .data/.bss for the same reason.
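
Here's the kind of situation where only the heap will do, in a sketch: the count arrives at runtime, and we allocate exactly that much:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    size_t n;
    if (scanf("%zu", &n) != 1)      // the size shows up at runtime, not compile time
        return 1;

    int *values = malloc(n * sizeof *values);
    if (values == NULL)             // the allocator is allowed to say no
        return 1;

    for (size_t i = 0; i < n; i++)
        values[i] = (int)i;

    free(values);                   // our job to hand it back
    return 0;
}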

Sometimes you need a lot of memory. The stack is typically 8MB. If you need 100MB for a data structure, the stack simply can't hold it.

The heap solves all these problems. It's a region of memory where you explicitly ask for space (malloc) and explicitly release it (free). The memory lives until you say otherwise. You can allocate any amount (within system limits). The tradeoff is that you're now responsible for management—forget to free, you leak memory; free too early, you get use-after-free bugs; free twice, you corrupt the allocator.
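
To make those three failure modes concrete, here's a sketch of each one, with the dangerous lines left commented out on purpose:

#include <stdlib.h>
#include <string.h>   // only for the strcpy in the commented-out line

void heap_pitfalls(void) {
    char *leaked = malloc(64);
    (void)leaked;                 // never freed: the allocator keeps these 64 bytes
                                  // marked as in use forever. That's a leak.

    char *dangling = malloc(64);
    free(dangling);
    // strcpy(dangling, "oops");  // use-after-free: writing into memory the allocator
                                  // may already have handed to someone else

    char *twice = malloc(64);
    free(twice);
    // free(twice);               // double free: corrupts the allocator's bookkeeping
                                  // (glibc usually detects this and aborts)
}

int main(void) {
    heap_pitfalls();
    return 0;
}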


Stack and Heap Are Both Just RAM

At this point you might ask: aren't the stack and heap both just... places to store data? And you'd be right. They're both in RAM. To the CPU, they're just addresses—reading from stack address 0x7fff0100 is the same operation as reading from heap address 0x1234000. There's no hardware distinction.

The difference between stack and heap is a management strategy, not a hardware difference. Stack memory is managed automatically with a single pointer, because function calls are LIFO. Heap memory is managed by an allocator, because some data needs to outlive its creating function. Both are just bytes in RAM. The complexity is in the bookkeeping, not the storage.
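
One way to convince yourself of this: a function that takes an int pointer neither knows nor cares which region that pointer refers to. A minimal sketch:

#include <stdlib.h>

// To the CPU this is just a load, an add, and a store at some address.
void increment(int *value) {
    (*value)++;
}

int main(void) {
    int on_stack = 0;
    int *on_heap = malloc(sizeof(int));
    if (on_heap == NULL)
        return 1;
    *on_heap = 0;

    increment(&on_stack);   // a stack address
    increment(on_heap);     // a heap address; same machine code runs either way

    free(on_heap);
    return 0;
}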

That's basically it. There's more to dig into—how allocators actually work, virtual memory, all that—but the basic picture is just this: memory regions with different management strategies.