Pseudocode is a lie

And what I use instead

Jun 04, 2026

Monday’s post was about a single algorithm — Union-Find, α(n) ≈ constant. This is the technical companion: what it took to publish that algorithm. Both pieces are part of the June sprint for The Algorithm Codex, which you can also get in print at Gumroad or as part of the Compendium. More at the end.

A kangaroo is eating from someone's hand. — Photo by Ellephant on Unsplash

Pseudocode is informal enough to be ambiguous and formal enough to be annoying. Every advantage it claims is better served by something else.

Three claims for pseudocode, all false

The canonical defense of pseudocode is Introduction to Algorithms (Cormen, Leiserson, Rivest, and Stein), known as CLRS. More than thirty years in print. Four editions. Generations of computer science students. One of the most widely sold technical books in the field. If you’ve studied algorithms, you learned from CLRS or from someone who did. And CLRS uses pseudocode throughout, defends it explicitly, and makes the case as well as it can be made.

The argument for pseudocode rests on three things. First, it hides details that don’t matter for the explanation — variable types, secondary operations, boilerplate. Second, it speaks at the same conceptual level as the narrative; you can invent operations, give them names, use notation that fits the idea you’re explaining. Third, it survives time: programming languages rise and fall, but pseudocode is language-agnostic, and a textbook meant to teach the same concepts to students who will write Python, Java, Rust, or something that doesn’t exist yet can’t be tied to any of them.

Problem 1: Pseudocode is not intuitive. If you want to convey an idea intuitively — how a process unfolds, why an algorithm works, what the conceptual structure is — use a diagram, a narrative, an informal sketch. These are genuinely intuitive; they don’t impose formal constraints. Pseudocode has syntax. It requires for, while, if, formatting conventions, notation that the reader has to parse before they can extract the idea. The claim that pseudocode is easier to read than real code is false. It’s different from real code, but not simpler. For genuinely intuitive explanation, there are better tools.

Problem 2: Pseudocode is not formal. Real code has formal semantics. There is a defined mapping from source to behavior; a compiler or interpreter enforces it. Pseudocode has no such mapping. The reader is left to translate it into whatever language they’ll actually use, and that translation is entirely implicit: undocumented and unenforced. The gap between “swap the elements at positions i and j” and “no, you can’t, you forgot you were working with an immutable list” is not a minor translation detail. That gap is where the data structure’s mutability, the cost model, and the choice of representation actually live. If your pseudocode rules were formal enough to specify the translation unambiguously, you’d have a compiler. It would just be code.
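The swap example is easy to make concrete. Here is a minimal sketch in Python; `swap` is a hypothetical helper written for this illustration, not something from a library. The same one-line "swap the elements" instruction works on a list and fails on a tuple, and only real code forces you to notice:

```python
def swap(items, i, j):
    """Swap the elements at positions i and j, in place."""
    items[i], items[j] = items[j], items[i]


mutable = [3, 1, 2]
swap(mutable, 0, 2)  # a list supports item assignment, so this works

immutable = (3, 1, 2)
try:
    swap(immutable, 0, 2)  # a tuple does not, so the same instruction fails
    swap_failed = False
except TypeError:
    swap_failed = True
```

The interpreter settles in one run a question the pseudocode never even raises.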

Problem 3: Pseudocode is not portable. The language-agnostic argument sounds compelling, but it doesn’t survive inspection. Pseudocode isn’t language-agnostic in any useful sense; it’s one more language the reader has to learn. Give students pseudocode and none of them gets something they can run; every reader has to translate. Pick one real language instead (a widely used, readable one), and many of your readers understand it directly, while the rest have a formal, precise text to translate from. Pseudocode doesn’t spare readers translation; it makes all of them translate.

My prior is strong: intuition first, formalization later. An algorithm, a formula, a mathematical concept: explain it intuitively first, then formally. Pseudocode tries to do both and delivers neither. It isn’t informal enough to convey ideas; it isn’t formal enough to prove properties. Pseudocode charges you for formal syntax and gives you no formal guarantee.

Two registers: diagrams for intuition, real code for formality

Use two kinds of explanation, not one.

For intuitive content, use diagrams and narration. Diagrams don’t have to be formal — informal sketches work fine: an arrow from one box to another, a tree branching as a recursion unfolds, a before/after pair showing what partition does to an array. They explain intuitively, expose the underlying reasoning, and don’t impose the cost of formal syntax.

For formal content, use real code. Every supposed advantage of pseudocode — the ability to invent operations, hide irrelevant detail, speak at a high conceptual level — exists in real programming languages too. Real languages have abstraction and encapsulation. To defer an operation, stub it: a named function with an empty body or a one-line comment. graph.connect(u, v), database.search(query), parser.parse(source). Real code already has the vocabulary for subjects, predicates, and attributes on those subjects. You don’t need bespoke notation. You need a sufficiently expressive language.
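As a rough illustration of stubbing, here is a sketch with a hypothetical `Graph` class (the class, its attributes, and `shortest_path` are all invented for this example). One operation is implemented, the other is only named, and the code still runs:

```python
class Graph:
    """A hypothetical graph type; only the interface matters for the explanation."""

    def __init__(self) -> None:
        self.edges: set[tuple[str, str]] = set()

    def connect(self, u: str, v: str) -> None:
        """Record an undirected edge between u and v."""
        self.edges.add((u, v))
        self.edges.add((v, u))

    def shortest_path(self, source: str, target: str) -> list[str]:
        """Deferred: the stub names the operation; a later section fills it in."""
        raise NotImplementedError


graph = Graph()
graph.connect("a", "b")
```

I chose to have the stub raise rather than leave the body empty, so an accidental call fails loudly instead of silently returning `None`. Either style gives you the "invented operation" that pseudocode promises, with a name, a signature, and types the reader can check.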

That language is Python. Of the widely used general-purpose languages, Python is the most readable. There are more ergonomic languages (Haskell, or Lisps such as Racket), but their readability advantage comes at the cost of familiarity, and a textbook audience that doesn’t already know those languages won’t benefit from them. Python is known, readable, and expressive. There is no good reason not to use Python in a programming book, even a book that isn’t about Python.

Literate programming keeps prose and code from drifting

If we’re using real code anyway, formal and executable, why not make that code a real deliverable? A library with unit tests and an import statement, not just a code block in a document.

The obstacle: docs and code drift apart when they live in separate files. Change the explanation, and you have to remember to update the code. Change the code, and you have to remember to update the prose. In practice, one always lags the other. Pseudocode dodges this: it never has to match anything, so it’s never wrong. Real code is always either in sync or not.

Literate programming is Donald Knuth’s answer, from his 1984 paper. The original system, WEB, was built around Pascal; its successor, CWEB, around C. You wrote a single source file interleaving prose and code; a tool called tangle extracted the code into a runnable program, and a tool called weave rendered the prose into a readable document. The writing and the code are the same artifact. They can’t drift because they’re literally the same file.

Modern tools have rediscovered the idea. Jupyter notebooks mix prose and executable code. Quarto does the same at book scale. But neither fully extracts the code into a standalone packageable artifact: the notebook is the program, and you can’t easily ship it as a library.

`illiterate` makes the book and the package the same source

illiterate closes that gap. A build step extracts the tagged blocks from .qmd files ({export=...}) and assembles them into actual Python source files. The book renders; the package builds from the same source. Three things follow automatically: the documentation can’t describe an algorithm that doesn’t compile; the tests run against the same code the reader sees; and the book ships a working library as a side effect of being written.
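To give a feel for what such a build step does, here is a toy extractor. It is a sketch under my own assumptions, not illiterate's actual implementation: the regular expression, the `tangle` function name, and the sample document are all invented for the illustration, and it only handles the simplest form of the tag.

```python
import re
from collections import defaultdict

FENCE = "`" * 3  # built at runtime so this example can sit inside a fence itself

# Assumption: an exported block opens with a python fence tagged {export=path}.
EXPORT_BLOCK = re.compile(
    FENCE + r"python \{export=(?P<path>[^}]+)\}\n(?P<code>.*?)" + FENCE,
    re.DOTALL,
)


def tangle(document: str) -> dict[str, str]:
    """Group exported code by target path, preserving document order."""
    files: dict[str, list[str]] = defaultdict(list)
    for match in EXPORT_BLOCK.finditer(document):
        files[match["path"]].append(match["code"].strip("\n"))
    return {path: "\n\n\n".join(chunks) + "\n" for path, chunks in files.items()}


sample = "\n".join([
    "Some prose about the entry point.",
    FENCE + "python {export=pkg/demo.py}",
    "def double(x):",
    "    return 2 * x",
    FENCE,
    "More prose about the helper.",
    FENCE + "python {export=pkg/demo.py}",
    "def quadruple(x):",
    "    return double(double(x))",
    FENCE,
])

sources = tangle(sample)  # one entry: "pkg/demo.py", holding both functions
```

Two blocks in the prose become one module on disk, in the order the reader met them. A real tool has more to worry about (nested fences, other languages, writing files), but the core idea fits in a dozen lines.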

In the Algorithm Codex (a hundred algorithms across sorting, graphs, data structures, and dynamic programming), every chapter follows this pattern. No pseudocode anywhere.

Quicksort, with the package built from the page

A Quicksort chapter opens with the public entry point, the function the reader calls and the one that lands in the package:

```python {export=codex/sorting/quick.py}

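from collections.abc import Sequence  # the typing.Sequence alias is deprecated
from codex.types import Ordering, default_order

# The def name[T] syntax (PEP 695) needs Python 3.12 or newer.
def quicksort[T](items: Sequence[T], order: Ordering[T] = default_order) -> list[T]:
    """Return a new sorted list; the caller's sequence is left untouched."""
    items = list(items)  # copy, so the in-place helper never mutates the input
    _quicksort(items, 0, len(items), order)  # sort the half-open range [0, len)
    return items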
```

The recursive helper is two cases: a base case that bails on slices of length zero or one, and a recursive case that partitions and recurses on both halves:

```python {export=codex/sorting/quick.py}

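def _quicksort[T](items: list[T], lo: int, hi: int, order: Ordering[T]) -> None:
    """Sort the half-open range items[lo:hi] in place."""
    if hi - lo < 2:  # base case: zero or one element is already sorted
        return
    pivot_index = partition(items, lo, hi, order)  # pivot lands in its final slot
    _quicksort(items, lo, pivot_index, order)      # everything left of the pivot
    _quicksort(items, pivot_index + 1, hi, order)  # everything right of the pivot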
```

Both blocks export to the same file; illiterate appends them in document order. Next in that order is the partition step, its own block and its own paragraph. Each conceptual piece earns its own prose frame and its own extracted function.
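For readers who want to run the helper right away, here is one way the partition step could look. This is a sketch, not the Codex's own block: I use the Lomuto scheme, and I assume an ordering is a plain callable that returns True when its first argument sorts before its second. The `Ordering` alias and `default_order` below are stand-ins I define locally, since `codex.types` isn't shown here.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: an ordering is a callable returning True when `a` sorts before `b`.
Ordering = Callable[[T, T], bool]


def default_order(a, b) -> bool:
    """Stand-in default: ascending order via the < operator."""
    return a < b


def partition(items: list, lo: int, hi: int, order: Ordering = default_order) -> int:
    """Lomuto partition of items[lo:hi]; returns the pivot's final index."""
    pivot = items[hi - 1]  # last element of the half-open range
    boundary = lo          # items[lo:boundary] all sort before the pivot
    for i in range(lo, hi - 1):
        if order(items[i], pivot):
            items[i], items[boundary] = items[boundary], items[i]
            boundary += 1
    # Drop the pivot between the two regions.
    items[boundary], items[hi - 1] = items[hi - 1], items[boundary]
    return boundary


data = [5, 2, 9, 1, 7, 3]
p = partition(data, 0, len(data))  # data is now [2, 1, 3, 5, 7, 9] and p == 2
```

The return value matches what the recursive helper expects: everything in `items[lo:p]` sorts before the pivot, everything in `items[p + 1:hi]` does not, and the pivot itself never needs to be touched again.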

When the book renders, Quarto executes every demo block. If an algorithm is wrong, the build fails before the chapter describing it can reach the page. Every build re-checks every algorithm in the Codex.
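A demo block of this kind might look like the following. It is a sketch under assumptions: `check_sorter` and `broken_sort` are hypothetical names I made up, and I assume the block is an ordinary executed cell in which an uncaught exception stops the render. The check compares a sorting function against the built-in `sorted` on a few fixed edge cases and a batch of random inputs.

```python
import random


def check_sorter(sort, trials: int = 200, seed: int = 0) -> None:
    """Compare `sort` against the built-in `sorted`; raise AssertionError on a mismatch."""
    rng = random.Random(seed)
    cases = [[], [1], [2, 1], [3, 1, 2, 1]]  # fixed edge cases, including a duplicate
    cases += [
        [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        for _ in range(trials)
    ]
    for items in cases:
        original = list(items)
        assert sort(items) == sorted(original), f"wrong result for {original}"
        assert items == original, "the input was mutated"


def broken_sort(items):
    """A deliberately wrong sorter: it silently drops duplicates."""
    return sorted(set(items))


check_sorter(sorted)  # a correct sorter passes silently

try:
    check_sorter(broken_sort)  # in a real demo block this would go uncaught
    build_would_fail = False
except AssertionError:
    build_would_fail = True
```

In the book you would call the check without the `try`, so the failure propagates and the page never renders. Random testing doesn't prove an algorithm correct, but it does catch the kind of mistake that pseudocode would happily let through.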

The book is the primary source. The algorithms are real code that runs. The two have never disagreed.

Until next time, stay curious.

illiterate is open-source at github.com/apiad/illiterate, a small Python tool that extracts code from Markdown and assembles it into runnable source files. If you write a book, a tutorial, or a course where the code has to match the text, the setup described here takes about twenty minutes to replicate.

The algorithms it enforces are in The Algorithm Codex, available in print on Gumroad, or free to read online at matcom.github.io/codex/. Both the Codex and my other books are bundled at apiad.gumroad.com/l/compendium.