Understanding Memory


Note: This section is a work-in-progress.

The "Dunning-Kruger effect" 1 is an ironic phenomenon: people tend to overestimate their own understanding and abilities, particularly in areas where they have little knowledge or experience.

Yet even veteran programmers can fall victim to some variation of this effect. Our egos often convince us that we know precisely what will happen when our code executes. We wrote it, after all.

In reality a modern program is an incredibly complex apparatus built atop an even more complex hierarchy of hardware and software abstractions. All operating in miraculous unison. Few among us actually understand program execution from the physics of logic gates to the corner cases of network protocols. Most of the time we can't even get the layer we're working at right. Hence version numbers.

The good news is we don't have to know it all. In Chapter 1's Dreyfus Model, the "Competent" stage was marked by the rude realization that the learner's knowledge is remarkably limited. In response, the learner needs to de-prioritize the less relevant details and focus on those pertinent to their end goals. To separate the wheat from the chaff.

Systems programming requires a mental model of "what the computer is doing", but it doesn't have to be exhaustive. In truth, programming languages which give developers "full control"2 over the hardware - like C and Rust - deal primary with the concepts and mechanisms of one thing: memory.

If you understand the bulk of how and why memory works, you're well on your way to mastery of low-level programming.

Isn't systems programming more than just managing memory?

Certainly. Remember the three hypothetical engineers introduced in Chapter 1, when discussing what defines a "system program"?

Each engineer held a different view, because each came from a specialization requiring unique expertise. For example:

  • Distributed systems developers understand consensus protocols and fault tolerance.

  • Device driver developers work with kernel APIs and interrupt handling.

  • Microcontroller firmware developers interface with analog components and read device datasheets.

But these facets of systems programming are largely domain-specific. Effective use of memory is a sort of universal bottleneck, it's necessary but not sufficient for writing performant applications. Regardless of domain.

This chapter will cover universal computer architecture principles relevant to controlling memory. The meat of what every systems programmer ought to know. We'll put these principles into practice in Chapter 6, implementing a stack storage abstraction that maximizes both safety and portability.

Memory Knowledge Dump

Memory is perhaps the most single important topic in this book. This is our final conceptual chapter, the rest of our adventure will focus on writing a Rust library. Grab a coffee now (Yerba Mate if you must) because we're gonna really get into the mechanical details here.

We'll start by looking at memory from a software perspective, the model of most systems programmers work with day-to-day. Then we'll dig into the attacker's perspective, learning about how memory corruption bugs are turned into devastating exploits. Once you can subvert rules and assumptions, you truly understand how something works. At least when it comes to security.

Armed with a deeper understanding of memory, we'll examine how Rust provides memory safety guarantees. In detail.

Then we'll tackle two more narrow but nonetheless important topics: integer representation issues and Rust's !#[no_std] attribute.

We'll conclude our memory world tour by stepping into the shoes of an attacker: looking at real-world Rust CVEs, learning reverse dynamic debugging, and writing an exploit or two.

What about the hardware perspective?

The Fundamentals: Memory Hierarchy section of the Appendix takes a hardware-centric view, looking at performance tradeoffs within the modern memory hierarchy. Highly recommend it as a supplement to this section.

Learning Outcomes

  • Develop a mental model of memory and program execution
  • Understand how Rust actually provides memory safety, including current limitations
  • Learn to debug Rust code using Mozilla rr3 (an enhanced variant of gdb4)
  • Understand how attackers exploit memory corruption bugs, step-by-step
  • Write your first exploit: bypass modern protections to get an interactive shell!

2

In both programming and modern life, you never quite have full control. In programming that's because both compilers and interpreters make oft-inscrutable decisions for you (e.g. aggressive optimization5) and, rarely, even contain bugs6.

3

rr. Mozilla (Accessed 2022).

4

GDB: The GNU Project Debugger. GNU project (Accessed 2022).

6

One particular funny case is CVE-2020-246587, a failed compiler-inserted stack protection. As an aside, vulnerabilities patched by new compiler versions are an interesting category. Which can include hardware vulnerabilities (e.g. CVE-2021-354658).

7

CVE-2020-24658 Detail. National Institute of Standards and Technology (2020).