From Bitmasks to Types: Zig's Quest for Bit-Level Order

By Dylan

Recently, I’ve been spending my time building a few system tools in Zig—starting with a simple shell.

Writing a shell inevitably brings you face-to-face with file redirection and process communication, which means calling open quite a lot. While digging through the Zig standard library (specifically the Nightly 0.16.0-dev branch on Codeberg) to see how it wraps Linux system calls, I stumbled upon a curious piece of code:

pub fn open(path: [*:0]const u8, flags: O, perm: mode_t) usize {
    if (@hasField(SYS, "open")) {
        return syscall3(.open, @intFromPtr(path), @as(u32, @bitCast(flags)), perm);
    } else {
        return syscall4(
            .openat,
            @bitCast(@as(isize, AT.FDCWD)),
            @intFromPtr(path),
            @as(u32, @bitCast(flags)),
            perm,
        );
    }
}

What caught my eye was the parameter signature: flags: O. In most systems programming contexts, you expect flags to be a plain integer. Yet here, it has its own type O, and right before the syscall, it’s forcibly cast back to a u32 using @bitCast.

Following the definition of O, I realized it wasn't a set of integer constants at all. Instead, it was a packed struct(u32).

Why the name O? If you’ve spent any time with the C open syscall, you’ll recognize the constants starting with O_: O_RDONLY, O_CREAT, O_TRUNC. Since C lacks namespaces, developers were forced to use prefixes to manually categorize constants. Zig simply took that prefix and turned it into a type name. Thus, O_CREAT became O.CREAT. You see this naming DNA all over the Zig standard library, for example MAP (for MAP_PRIVATE) or PROT (for PROT_READ).

To my surprise, the definition of O is a massive switch block based on the target architecture (native_arch), with each branch providing a tailor-made packed struct(u32).

Take the common x86_64 and the uniquely laid-out sparc64 as examples:

pub const O = switch (native_arch) {
    .x86_64 => packed struct(u32) {
        ACCMODE: ACCMODE = .RDONLY, // bit 0-1
        _2: u4 = 0,
        CREAT: bool = false,        // bit 6
        // ...
    },
    .sparc64 => packed struct(u32) {
        ACCMODE: ACCMODE = .RDONLY,
        _2: u1 = 0,
        APPEND: bool = false,
        _4: u2 = 0,
        ASYNC: bool = false,
        _7: u2 = 0,
        CREAT: bool = false,        // bit 9
        // ...
    },
    // ... other architectures
};

This discovery kept me thinking for a long time. Why doesn't Zig just use a bunch of constants? Why go through the trouble of defining such a complex struct just to cast it back to an integer?

A Dust-Covered C Feature

To understand Zig’s approach, we first have to talk about a somewhat obscure feature in C: Bit-fields.

In C, if you want to be precise about memory, you can tell the compiler exactly how many bits a variable should occupy. It looks like this:

struct {
    unsigned int read_only : 1;
    unsigned int write_only : 1;
    unsigned int mode : 2;
} flags;

That : 1 is a bit-field. It’s intuitive; it lets you treat bits like struct fields without the hassle of bitwise operations.

But reality is harsh. The C standard is notoriously vague about the physical layout of bit-fields. In ISO/IEC 9899 (C99 Standard), section 6.7.2.1, the committee gives this rather non-committal definition:

[Image: excerpt from the C99 bit-field specification, section 6.7.2.1]

Essentially, this description says: it doesn't mandate whether bits are allocated from high-order to low-order or vice versa. It doesn't even settle what happens when a field doesn't fit in the space remaining in a storage unit: whether it moves to the next unit or straddles the boundary is left to the implementation.

This implementation-defined nature is practically fatal for systems programming. It means the committee handed over the decision-making power to compiler authors. If you need to map a hardware register or match a Linux kernel protocol across different architectures, you simply cannot trust C bit-fields. Change the compiler or the architecture, and your bit layout might fall apart. Consequently, generations of systems programmers have avoided this feature like the plague.

Layout Should Not Be a Black Box

Zig’s packed struct(u32) is effectively C’s bit-fields, but fixed.

When we write packed struct(u32), the u32 is critical. It explicitly tells the compiler that this entire struct is, physically, a single 32-bit unsigned integer. It’s no longer a loose collection of fields; it’s a bit-container of a fixed width. Crucially, Zig mandates the order of allocation: it must start from the Least Significant Bit (LSB).

This predictability brings a massive benefit: cross-architecture consistency.

Back to that switch (native_arch) I saw. In the Linux kernel, the numeric values of flags are often incompatible between architectures. For O_CREAT, it’s defined as 0100 (octal, bit 6) on x86, but 01000 (bit 9) on alpha, and 0400 (bit 8) on mips.

In the C system libraries, you’d often find a mess of macros to handle this fragmentation:

// Simplified logic from Linux kernel uapi/asm/fcntl.h
#if defined(__i386__) || defined(__x86_64__)
#define O_CREAT        00000100  /* bit 6 */
#define O_DIRECTORY    00200000  /* bit 16 */
#elif defined(__alpha__)
#define O_CREAT        00001000  /* bit 9 */
#define O_DIRECTORY    00100000  /* bit 15 */
#elif defined(__mips__)
#define O_CREAT        00000400  /* bit 8 */
#define O_DIRECTORY    00200000  /* bit 16 */
#endif

This fragmentation means that if you want to manipulate raw bits directly, you have to write completely different masking logic for each architecture. In Zig, however, regardless of the underlying bit layout, the programmer always sees the same struct field: .CREAT. Zig uses the packed struct to shield you from these messy architectural differences. The type system handles the "physical order," so you can focus on the logic.

I also realized why @bitCast was used in the source. At the physical level, this O struct is identical to a u32. @bitCast is simply telling the compiler: "re-interpret this 32-bit memory and pass it to the syscall register."

It Runs Just as Fast

I initially suspected this abstraction might be bloated. After all, writing 0x40 feels much closer to the metal.

To verify this, I ran a more rigorous experiment. I simulated a common scenario: setting multiple flags at once. In the packed struct version, I assigned values to three fields individually; in the manual version, I performed a single bitwise OR.

You can open my verification source code (/verify_bits.zig) and the Zig-generated assembly file (/verify_bits.s) in a new tab to compare them.

// verify_bits.zig snippet
export fn test_struct(flags: *O) void {
    flags.ACCMODE = .WRONLY; // bits 0-1 set to 1 (WRONLY)
    flags.CREAT = true;     // bit 6 set to 1
    flags.TRUNC = true;     // bit 9 set to 1
}

To see exactly what the compiler was doing, I exported the assembly file with ReleaseFast optimization:

# -femit-asm tells the compiler to output a .s file
# -OReleaseFast enables full optimizations
zig build-lib verify_bits.zig -femit-asm -OReleaseFast

When I opened the resulting verify_bits.s, my worries vanished:

; test_manual (Apple Silicon / AArch64)
ldr     w8, [x0]            ; Load 32-bit value into register w8
and     w8, w8, #0xfffffffd ; Clear bit 1 (high bit of ACCMODE); bit 0 is set by the OR below
mov     w9, #577            ; 577 is 0x241 in hex (bits 0, 6, 9)
orr     w8, w8, w9          ; Perform bitwise OR
str     w8, [x0]            ; Store back to memory
ret

; test_struct
; The compiler realized the logic is identical and reused the code
b       _test_manual

The compiler is incredibly smart. It realized that the three field assignments could be merged into a single "read-modify-write" operation. The machine code it generated was not just consistent with my manual bitmasking code—it achieved full reuse at the binary level.

This "zero-cost" evolution is fascinating. Zig isn't rejecting the efficiency of C; it’s just giving us a more honest way to describe the bit-level logic we’ve been keeping in our heads for decades.

We used to stare at hex codes like 0x40 and cross-reference them with manuals. Now, the code is the manual. This shift from "magic numbers" to "strong types" brings a certain peace of mind to low-level debugging. You no longer have to squint at hex values—because every bit finally has a name.


The Footnotes That Inspired Me

[1] Zig Language Reference: packed struct - The official documentation on packed semantics and LSB ordering.

[2] std/os/linux.zig (Zig Codeberg) - The starting point of this essay, featuring the massive cross-architecture switch for O.