<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The Computist Journal]]></title><description><![CDATA[Educational articles, essays, and tutorials on Computer Science research, practice, and education.]]></description><link>https://blog.apiad.net</link><image><url>https://substackcdn.com/image/fetch/$s_!qNGT!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F582c72c0-c120-4ea8-ae6b-376a025250bb_1024x1024.png</url><title>The Computist Journal</title><link>https://blog.apiad.net</link></image><generator>Substack</generator><lastBuildDate>Sat, 16 May 2026 00:12:20 GMT</lastBuildDate><atom:link href="https://blog.apiad.net/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Alejandro Piad Morffis]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[apiad@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[apiad@substack.com]]></itunes:email><itunes:name><![CDATA[Alejandro Piad Morffis]]></itunes:name></itunes:owner><itunes:author><![CDATA[Alejandro Piad Morffis]]></itunes:author><googleplay:owner><![CDATA[apiad@substack.com]]></googleplay:owner><googleplay:email><![CDATA[apiad@substack.com]]></googleplay:email><googleplay:author><![CDATA[Alejandro Piad Morffis]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[How Large Languages Models Are Really Made]]></title><description><![CDATA[The full road from text data to reasoning models, explained visually with zero math or code.]]></description><link>https://blog.apiad.net/p/mhai-llms</link><guid isPermaLink="false">https://blog.apiad.net/p/mhai-llms</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Mon, 11 May 2026 15:51:34 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gdqf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gdqf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gdqf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gdqf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gdqf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!gdqf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gdqf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gdqf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gdqf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gdqf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gdqf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb83eb00a-2839-4312-8300-5fbf71095e63_1376x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>After Joseph Racknitz&#8217;s 1789 <a 
href="https://en.wikipedia.org/wiki/Mechanical_Turk">cutaway engraving</a> of Wolfgang von Kempelen&#8217;s Mechanical Turk; generated with Nano Banana 2.</em></figcaption></figure></div><p>You type a message to an AI assistant and it answers. The answer isn&#8217;t looked up. It isn&#8217;t scripted. The model generated it, character by character, guided by a single mathematical question: <em>what comes next?</em></p><p>That question is the foundation of every language model ever built. A <strong>language model</strong> is a probability distribution over text &#8212; a function that, given a sequence of words, assigns a probability to every possible continuation. &#8220;The cat sat on the mat&#8221; scores higher than &#8220;the mat sat on the cat&#8221; not because a language model understands what cats do, but because the first sequence appears in human text and the second doesn&#8217;t. The model has compressed the co-occurrence patterns of an enormous corpus into its weights, and that compression is what produces a score.</p><p>Generative: you give the model a prefix, it samples the highest-probability next token, appends it, samples again, and repeats until a stop token arrives. The multi-paragraph response you got from ChatGPT this morning was that loop running a few hundred times. No lookup table. No if-else tree. No pre-scripted answers. Just: <em>given all of this text, what is most likely to come next?</em></p><p>Here&#8217;s the thing I find quietly strange about this: it works. A procedure this simple &#8212; assign probabilities, sample the maximum &#8212; has produced the most influential technology of the last decade. What makes it work isn&#8217;t the procedure. It&#8217;s everything that goes into building a probability distribution that&#8217;s actually <em>good</em>. Good enough to write coherent paragraphs. Good enough to reason about code. Good enough to pass the bar exam and explain quantum mechanics in language your parents can follow.</p><p>Getting there took decades of compounding ideas. The arc is what this piece covers &#8212; from the crudest possible approximation of &#8220;probability over text&#8221; to the current frontier, where models are learning to think.</p><p>Each section of what follows is best understood as a response to the failure of the previous one. N-gram models worked until they didn&#8217;t. Neural embeddings fixed the part that broke. Pretraining scaled the fix to the size where it became genuinely impressive. Instruction tuning made the result useful for the first time. Preference learning fixed what instruction tuning couldn&#8217;t. Reasoning models added something nobody was sure was trainable at all.</p><p>Seven steps. One direction.</p><blockquote><p><em>Every post on the blog this month is on the theme of agent reliability, anchored on the second edition of <a href="https://apiad.gumroad.com/l/ai">Mostly Harmless AI</a>, where the engineering details that don&#8217;t fit a blog post live. You can also <a href="https://books.apiad.net/books/mhai/">read the whole book online for free</a> in a custom reader I built. More at the end.</em></p></blockquote><div><hr></div><blockquote><p>This post may be truncated in your email. 
<a href="https://blog.apiad.net/p/mhai-llms">Read it online</a> for the best experience.</p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h2>The n-gram intuition</h2><p>The simplest possible implementation of &#8220;probability over text&#8221; is a lookup table.</p><p>Take a large corpus &#8212; a hundred million words will do to start. For every three-word sequence (<strong>trigram</strong>) in that corpus, record which word follows it most often, and with what frequency. &#8220;The quick brown&#8221; &#8594; &#8220;fox,&#8221; nine times out of ten, because Project Gutenberg is full of that particular sentence. &#8220;The capital of&#8221; &#8594; &#8220;France&#8221; thirty percent of the time, &#8220;Germany&#8221; twelve, &#8220;England&#8221; eleven, and so on through the geography. For every trigram you&#8217;ve seen, you have a probability distribution over what comes next.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CQfE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CQfE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png 424w, https://substackcdn.com/image/fetch/$s_!CQfE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png 848w, https://substackcdn.com/image/fetch/$s_!CQfE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png 1272w, https://substackcdn.com/image/fetch/$s_!CQfE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CQfE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png" width="1456" height="929" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:929,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!CQfE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png 424w, https://substackcdn.com/image/fetch/$s_!CQfE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png 848w, https://substackcdn.com/image/fetch/$s_!CQfE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png 1272w, https://substackcdn.com/image/fetch/$s_!CQfE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bf3e3f-4e4b-4115-9ccf-54ee5a394136_1600x1021.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>A trigram model is a frequency dictionary: each three-word context maps to a distribution over possible next words. The catch is that a 50,000-word vocabulary admits ~126 trillion possible trigrams, most of which never appear in any corpus. Made with <a href="https://github.com/apiad/tesserax">tesserax</a>.</em></figcaption></figure></div><p>Now generate text. Give the model &#8220;The quick brown&#8221; and it continues: &#8220;fox.&#8221; Give it &#8220;jumped over the&#8221; and it continues: &#8220;lazy.&#8221; Three words in and it&#8217;s generated &#8220;The quick brown fox jumped over the lazy&#8221; &#8212; and if you&#8217;re lucky, it lands &#8220;dog&#8221; and you&#8217;ve reproduced a famous sentence entirely from corpus statistics. Locally, it&#8217;s plausible. You could read a sentence of this and not immediately know you&#8217;re looking at a machine.</p><p>The problems start fast. By the third sentence, the model has no idea it was talking about a fox. It only remembers the last three words. 
<p>The problems start fast. By the third sentence, the model has no idea it was talking about a fox. It only remembers the last three words. You ask it to continue &#8220;The fox had been running from the&#8221; and it has no idea that a fox is involved, or that running happened, or that there&#8217;s a pursuit in progress. It just has three words and a lookup table. The output is <em>statistically English</em>. It is not coherent.</p><p>This is the <strong>Markov assumption</strong>: the next word depends only on the last N words, not on the full history of the text. For N=3, it&#8217;s a trigram model. You can increase N &#8212; five-gram models were standard in commercial speech recognition for years &#8212; but the table explodes. Fifty thousand words in the English vocabulary gives 50,000&#179; possible trigram contexts, roughly 125 trillion entries. At N=10, there are nearly 10<sup>47</sup> possible sequences, astronomically more than any corpus could ever contain. The table can never be complete enough to cover the distribution.</p><p>There&#8217;s a real engineering solution to the &#8220;we haven&#8217;t seen this exact trigram&#8221; problem: smoothing and interpolation. Estimate the probability of an unseen N-gram from shorter sub-sequences. Combined with <strong>hidden Markov models</strong>, this statistical machinery had, by the 1990s, enough polish to power industrial speech recognition and early machine translation. I don&#8217;t want to undersell it &#8212; it worked. It was genuinely useful. It just topped out.</p><p>The wall is fundamental. Real language has dependencies that can be arbitrarily long. <em>&#8220;The man who sold the car that had been parked in front of the house where my grandmother lived was finally found.&#8221;</em> The subject of &#8220;was finally found&#8221; is seventeen words and three nested clauses back. No N-gram model reaches it. You need something that can condition on the full context &#8212; or at least compress the full context intelligently &#8212; rather than amnesiacally forget everything more than N words ago.</p><p>You need a model that generalises from sequences it has seen to sequences it hasn&#8217;t. A lookup table can only interpolate from what it&#8217;s seen before. What you need is something that has <em>understood the pattern</em> deeply enough to extrapolate.</p><p>N-gram models work until they don&#8217;t &#8212; and they don&#8217;t beyond a few words.</p><h2>Words as numbers</h2><p>Neural networks can learn the compression n-gram models can&#8217;t achieve. But they have a hard prerequisite: they operate on numbers. Words are symbols. Before a neural network can do anything useful with text, you need to represent words as vectors. The naive approach throws away everything that matters.</p><p>The obvious first attempt is <strong>one-hot encoding</strong>. Vocabulary of 50,000 words; each word is a vector of length 50,000 with a single 1 and 49,999 zeros. &#8220;Cat&#8221; is at position 4,312; &#8220;dog&#8221; is at position 17,846; &#8220;carburetor&#8221; is somewhere else entirely. The problem: nothing in this representation suggests that &#8220;cat&#8221; and &#8220;dog&#8221; are more similar to each other than either is to &#8220;carburetor.&#8221; The distance between every pair of one-hot vectors is identical.
You&#8217;ve handed the network a symbol system with no structure, and it has to reconstruct the structure from scratch &#8212; spending enormous capacity learning that cats and dogs are both animals, that both appear near &#8220;fur&#8221; and &#8220;vet,&#8221; that &#8220;cat food&#8221; and &#8220;dog food&#8221; are structurally related &#8212; before it can learn anything about how language actually works.</p><p>The key insight that resolved this came from linguistics, not machine learning, and I think it&#8217;s underrated as an idea. J.R. Firth, writing in 1957: <em>&#8220;you shall know a word by the company it keeps.&#8221;</em> The <strong>distributional hypothesis</strong>. Words that appear in similar contexts &#8212; near similar neighbouring words, in similar grammatical positions &#8212; tend to have similar meanings. &#8220;Cat&#8221; and &#8220;dog&#8221; both appear near &#8220;pet,&#8221; &#8220;feed,&#8221; &#8220;vet,&#8221; &#8220;owner,&#8221; &#8220;fur,&#8221; &#8220;collar.&#8221; The context is a fingerprint of the meaning. Encode that fingerprint in a vector and you have a representation where similar words land close together in space.</p><p><strong>Word2Vec</strong> (<a href="https://arxiv.org/abs/1301.3781">Mikolov et al., 2013</a>) turned this into a training procedure. Train a shallow neural network to predict a word from its surrounding context words, or vice versa. Force each word&#8217;s representation down into a dense vector of, say, 300 floating-point numbers. Train on a billion words of text. The network learns that words appearing in similar contexts should have similar representations, because that&#8217;s what makes the prediction task cheaper. Words with similar distributional patterns end up with similar vectors &#8212; not because anyone programmed that, but because it follows from the objective.</p><p>The result that made people pay attention: <strong>vector arithmetic encodes semantic relationships</strong>. Take the vector for &#8220;king,&#8221; subtract the vector for &#8220;man,&#8221; add the vector for &#8220;woman.&#8221; The nearest vector in the resulting space is &#8220;queen.&#8221; Paris minus France plus Italy is approximately Rome. Try it yourself: it works because the structural relationship between &#8220;king&#8221; and &#8220;queen&#8221; is parallel to the relationship between &#8220;man&#8221; and &#8220;woman&#8221; in how the four words co-occur with everything around them. No one wrote these analogies in. 
The geometry of the space mirrors the structure of meaning, because both are implicit in how words appear together in natural language.</p>
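<p>You can run the experiment yourself with pretrained vectors. A short sketch using gensim; the vector set named here is one of several available, the download is large, and the exact neighbours vary between embedding sets:</p>
<pre><code class="language-python">
# A sketch of the king - man + woman experiment with pretrained Word2Vec
# vectors from gensim's downloader. Dataset name and results are illustrative.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")   # pretrained vectors (large download)

print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# Typically places "queen" at or near the top of the list.
print(vectors.most_similar(positive=["Paris", "Italy"], negative=["France"], topn=3))
# Typically places "Rome" near the top.
</code></pre>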
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>The relationship between</em> king <em>and</em> queen <em>is parallel to the one between</em> man <em>and</em> woman <em>&#8212; both fall out of how the four words co-occur with everything around them. Made with <a href="https://github.com/apiad/tesserax">tesserax</a>.</em></figcaption></figure></div><p>I find this genuinely strange, in the best possible way. You trained a network to do a simple word-guessing task on flat text, and the side effect was an algebra of concepts. The geometry was always latent in the co-occurrence patterns. Word2Vec just made it legible.</p><p>Modern language models don&#8217;t use Word2Vec as a separate preprocessing step &#8212; the embedding representations are learned jointly with the rest of the network during training on text. But Word2Vec&#8217;s intuition is <em>why</em> learned embeddings work at all. Once language is geometry, gradient descent has a surface to grip. You can compute distances, optimise them, stack arbitrarily deep networks on top, and train the whole thing end-to-end.</p><p>Embeddings are how we lie to neural networks in a useful way. We pretend words are points in space, so the math works out.</p><h2>Pretraining</h2><p>Now scale it.</p><p>Take a deep neural network &#8212; not the shallow two-layer thing in Word2Vec, but a transformer with dozens or hundreds of layers, billions of parameters, and an attention mechanism in every one of them. Feed it next-token prediction across the entire accessible internet: Wikipedia, GitHub, every book ever digitised, every forum thread, every research paper, every recipe, every political argument, every user manual for every piece of machinery ever manufactured. Same objective the n-gram model had: <em>given what came before, what comes next?</em> Except now the model has billions of parameters to compress the patterns into, the training signal is trillions of tokens, and the architecture is built to handle arbitrarily long context.</p><p>The architecture is what made everything else possible. The <strong>transformer</strong> (<a href="https://arxiv.org/abs/1706.03762">Vaswani et al., 2017</a>) uses <strong>self-attention</strong> as its core operation. 
<p>Embeddings are how we lie to neural networks in a useful way. We pretend words are points in space, so the math works out.</p><h2>Pretraining</h2><p>Now scale it.</p><p>Take a deep neural network &#8212; not the shallow two-layer thing in Word2Vec, but a transformer with dozens or hundreds of layers, billions of parameters, and an attention mechanism in every one of them. Feed it next-token prediction across the entire accessible internet: Wikipedia, GitHub, every book ever digitised, every forum thread, every research paper, every recipe, every political argument, every user manual for every piece of machinery ever manufactured. Same objective the n-gram model had: <em>given what came before, what comes next?</em> Except now the model has billions of parameters to compress the patterns into, the training signal is trillions of tokens, and the architecture is built to handle arbitrarily long context.</p><p>The architecture is what made everything else possible. The <strong>transformer</strong> (<a href="https://arxiv.org/abs/1706.03762">Vaswani et al., 2017</a>) uses <strong>self-attention</strong> as its core operation. For each token in the input, self-attention computes relevance weights over every other token in the sequence &#8212; learned weights, computed from the data, different for each token, different in each layer. A pronoun can attend strongly to the noun it refers to, twenty positions back. A closing argument can reach back to the premise from the opening paragraph. There is no fixed window; the model considers, in principle, the full context at every step.</p><p>This is what broke the n-gram scaling wall. Not a bigger lookup table. Not smarter interpolation. A learned, flexible attention mechanism that compresses long-range dependencies into the model&#8217;s weights rather than trying to enumerate every possible context sequence. The key property, and it&#8217;s the one I keep coming back to: <em>soft</em>. Self-attention doesn&#8217;t pick one relevant token; it blends all of them with learned weights. The whole sequence contributes to every prediction, with a learned notion of how much each part matters.</p>
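<p>Stripped of the learned projections, multiple heads, and causal masking, that soft blending is a short computation. A minimal single-head sketch in NumPy, with random vectors standing in for real token embeddings:</p>
<pre><code class="language-python">
# A single-head self-attention sketch in NumPy: every position receives a softly
# weighted blend of every position, with the weights computed from the data
# itself. Real transformers add learned Q/K/V projections, many heads, masking.
import numpy as np

def self_attention(X):
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # how relevant each token is to each other token
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row is an attention distribution
    return weights @ X                               # each output is a weighted blend of all tokens

X = np.random.randn(5, 16)                           # 5 tokens, 16-dimensional vectors
print(self_attention(X).shape)                       # (5, 16)
</code></pre>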
<p>The other critical property is <strong>self-supervised learning</strong>. There are no human-provided labels anywhere in pretraining. The text itself is the training signal. Show the model &#8220;The capital of&#8221; and ask it to predict the next word; in the source text it&#8217;s &#8220;France.&#8221; The model guesses wrong; the gradient flows; the weights update. Show it three trillion tokens; let the gradient flow three trillion times. The entire digitised corpus of human knowledge is your training set, with zero labelling cost, because the next token is always right there.</p><p><strong><a href="https://arxiv.org/abs/2001.08361">Kaplan et al., 2020</a></strong> measured loss as a function of model size, dataset size, and compute over seven orders of magnitude. The result: loss falls as a clean power law across all three dimensions. Double the parameters, get a predictable drop in loss. Double the training data, same. Scale is not a bet on something uncertain; it is a known return on investment, measured and re-measured across a staggering range.</p><figure><figcaption><em>Schematic of the Kaplan finding: loss vs. compute on log-log axes. The points fall on a clean line over eight decades of compute &#8212; scale isn&#8217;t a bet, it&#8217;s a known return. Made with <a href="https://github.com/apiad/tesserax">tesserax</a>.</em></figcaption></figure><p><strong><a href="https://arxiv.org/abs/2203.15556">Hoffmann et al., 2022</a></strong> &#8212; the Chinchilla paper &#8212; corrected a real error in how the field had been applying Kaplan&#8217;s result. Most large models of 2020-21 had been trained on far fewer tokens than their parameter count justified. The Kaplan result suggested scaling up models as fast as possible; Hoffmann&#8217;s finding was that you need to scale model size <em>and</em> training data together, roughly equally, for a given compute budget. A 70B-parameter model trained on 1.4 trillion tokens outperformed a 280B-parameter model trained on fewer tokens at the same total compute cost. Smaller model, more data, better result. Scale both dimensions together.</p>
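<p>A rule of thumb often quoted from the Chinchilla analysis is roughly twenty training tokens per parameter. Treated as an approximation rather than a law, it reproduces the numbers above:</p>
<pre><code class="language-python">
# Rough compute-optimal sizing under the ~20-tokens-per-parameter rule of thumb
# often quoted from Hoffmann et al., 2022. The exact ratio is an approximation,
# not a constant from the paper's fitted curves.
def chinchilla_optimal_tokens(n_params, tokens_per_param=20):
    return n_params * tokens_per_param

print(f"{chinchilla_optimal_tokens(70e9) / 1e12:.1f} trillion tokens")   # 70B params: ~1.4 trillion
</code></pre>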
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Schematic of the Kaplan finding: loss vs. compute on log-log axes. The points fall on a clean line over eight decades of compute &#8212; scale isn&#8217;t a bet, it&#8217;s a known return. Made with <a href="https://github.com/apiad/tesserax">tesserax</a>.</em></figcaption></figure></div><p><strong><a href="https://arxiv.org/abs/2203.15556">Hoffmann et al., 2022</a></strong> &#8212; the Chinchilla paper &#8212; corrected a real error in how the field had been applying Kaplan&#8217;s result. Most large models of 2020-21 had been trained on far fewer tokens than their parameter count justified. The Kaplan result suggested scaling up models as fast as possible; Hoffmann&#8217;s finding was that you need to scale model size <em>and</em> training data together, roughly equally, for a given compute budget. A 70B-parameter model trained on 1.4 trillion tokens outperformed a 280B-parameter model trained on fewer tokens at the same total compute cost. Smaller model, more data, better result. Scale both dimensions together.</p><p>What do you get at the end of all this? A <strong>base model</strong>. And this is the part that surprises people who haven&#8217;t seen one.</p><p>Type &#8220;What is the capital of France?&#8221; into a raw pretrained model and it continues the text. Maybe it writes out a geography quiz &#8212; &#8220;What is the capital of France? What is the capital of Germany? What is the capital of Italy?&#8221; Maybe it generates a fake Wikipedia article. Maybe it starts a trivia show transcript. It has not answered your question. It has found the most probable continuation of your prompt, given everything it absorbed during training.</p><p>The base model has absorbed more text than any human could read in a thousand lifetimes. The co-occurrence patterns of the entire digitised corpus of human writing are in those weights. It knows facts, relationships, styles, concepts, code, chemistry, poetry, legal prose, and every other form in which humans have arranged words.</p><p>It was trained to <em>continue</em>, not to <em>respond</em>. Ask it a direct question and it treats the question as the opening line of some text pattern &#8212; one it will extend in whatever direction seems most probable. It has no concept of &#8220;you asked me something and I should answer it.&#8221;</p><p>Pretraining gives a model knowledge. 
<h2>Instruction tuning</h2><p>Step one of making a base model useful: show it what &#8220;useful&#8221; looks like.</p><p>Collect thousands of demonstration pairs. A human writer sits with a prompt &#8212; &#8220;Explain the difference between supervised and unsupervised learning in plain English,&#8221; &#8220;Write a polite email declining this meeting invitation,&#8221; &#8220;Debug this Python function&#8221; &#8212; and writes the ideal response. Then fine-tune the pretrained base model on these (prompt, response) pairs using the same next-token objective, now applied to curated demonstrations rather than the open web.</p><p>This is <strong>supervised fine-tuning</strong>, or SFT. It is plain supervised learning &#8212; the same paradigm that has been in the machine learning textbooks since the 1980s. What&#8217;s new is only what it&#8217;s being applied to.</p><p>The headline result from <a href="https://arxiv.org/abs/2203.02155">Ouyang et al., 2022</a> &#8212; the InstructGPT paper &#8212; is still worth stating plainly: a 1.3 billion-parameter model, fine-tuned on human-written instruction-following demonstrations, was <em>preferred</em> by human evaluators over a raw 175 billion-parameter GPT-3. Less than one percent of the parameters. Preferred.</p><p>Sit with that. The quality of the training signal matters more than raw scale. A carefully curated set of demonstrations of what &#8220;helpful answering&#8221; looks like is worth more, for the specific goal of being helpful, than a hundred times more parameters trained on unstructured internet text. The base model knows more. The instruction-tuned model is more useful. These are different things.</p><p>SFT teaches the <em>shape</em> of a helpful answer: addressed to the question asked, reasonably structured, proportionate in length, appropriate in tone. These are learnable patterns. The base model already has all the relevant knowledge in its weights; SFT is teaching it to retrieve and present that knowledge in a particular format.</p><p>Here&#8217;s the failure mode, and it matters for understanding everything that comes next.</p><p>SFT shows the model what good answers look like. It gives no mechanism for the model to evaluate, <em>at generation time</em>, which of two candidate continuations is more accurate, more honest, or less likely to cause harm. The model learned to imitate the shape of correct answers; it did not learn to <em>prefer</em> correctness over fluency when the two conflict. A confidently phrased wrong answer and a confidently phrased right answer can look identical from a format standpoint. SFT cannot distinguish them.</p><p>Teaching consistent refusals is especially brittle. To get a model to reliably refuse a class of harmful requests via SFT, you need human-written refusals for every phrasing variant you can anticipate. You will miss variants. The model has no general theory of harm. It has only pattern-matching against the phrasings it saw. Change the phrasing, add a fictional framing, ask in a different language, and the refusal can fail.</p><p>The deep limitation is this: SFT can teach what a good answer looks like, but it cannot teach which of two candidate answers is <em>better</em>. For that, you need to know something about <em>better</em> that you didn&#8217;t encode in any single example. You need preferences.</p><p>SFT teaches the shape of a good answer. It has no way to choose between two good shapes.</p>
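<p>Before moving on, the mechanics of SFT are worth seeing once. A minimal sketch of the objective described above, assuming a Hugging Face-style causal LM, where the loss is masked so only the response tokens are penalised; the model, tokenizer, and example pair are placeholders:</p>
<pre><code class="language-python">
# A minimal sketch of supervised fine-tuning on one (prompt, response) pair,
# assuming a Hugging Face-style causal LM. Only the response tokens contribute
# to the loss; prompt tokens are masked with the ignore index -100.
def sft_loss(model, tokenizer, prompt, response):
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100               # don't train on the prompt itself
    return model(input_ids=full_ids, labels=labels).loss  # next-token cross-entropy on the response

# loss = sft_loss(model, tokenizer,
#                 "Explain overfitting in one sentence.\n",
#                 "Overfitting is when a model memorises its training data.")
# loss.backward()   # one SFT gradient step
</code></pre>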
<h2>From demonstrations to preferences</h2><p>The move that follows from SFT&#8217;s failure: what humans can do <em>faster</em> than writing demonstrations is ranking model outputs.</p><p>Show a rater two model responses to the same prompt &#8212; response A and response B &#8212; and ask which is better. They can answer in seconds. Writing a response from scratch takes minutes. This means you can collect preference labels at much higher volume than demonstrations, and the preference label contains a different kind of information: not &#8220;here is the target,&#8221; but &#8220;this is closer to the target than that.&#8221;</p><p>Scale up the preference collection. Collect hundreds of thousands of (prompt, response A, response B, human ranking) tuples. Train a small auxiliary model &#8212; the <strong>reward model</strong> &#8212; to predict the human rankings: given a prompt and a response, output a scalar score. Then use <strong>reinforcement learning</strong> (specifically PPO) to push the language model toward generating responses the reward model scores highly.</p><p>This is <strong>RLHF</strong> &#8212; reinforcement learning from human feedback. <a href="https://arxiv.org/abs/2203.02155">Ouyang et al., 2022</a> used it as the third stage of the InstructGPT pipeline: pretraining &#8594; SFT &#8594; RLHF. ChatGPT&#8217;s characteristic tone &#8212; helpful, reliably cautious about harmful requests, good at hedging uncertainty, consistent about refusals &#8212; comes almost entirely from this stage.</p>
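<p>The reward model&#8217;s objective is easy to state in code: for every (prompt, chosen, rejected) tuple, push the score of the preferred response above the other one. A sketch, with the scoring network left abstract:</p>
<pre><code class="language-python">
# A sketch of the pairwise reward-model loss used in RLHF-style pipelines:
# -log sigmoid(r_chosen - r_rejected), which grows whenever the rejected
# response outscores the preferred one. reward_model is any network mapping
# (prompt, response) text to a scalar score tensor.
import torch.nn.functional as F

def reward_model_loss(reward_model, prompt, chosen, rejected):
    r_chosen = reward_model(prompt, chosen)       # scalar score for the preferred response
    r_rejected = reward_model(prompt, rejected)   # scalar score for the other one
    return -F.logsigmoid(r_chosen - r_rejected).mean()
</code></pre>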
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c8d7d72-6a41-44dc-8429-be45acad45d1_1600x631.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:574,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CtE_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c8d7d72-6a41-44dc-8429-be45acad45d1_1600x631.png 424w, https://substackcdn.com/image/fetch/$s_!CtE_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c8d7d72-6a41-44dc-8429-be45acad45d1_1600x631.png 848w, https://substackcdn.com/image/fetch/$s_!CtE_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c8d7d72-6a41-44dc-8429-be45acad45d1_1600x631.png 1272w, https://substackcdn.com/image/fetch/$s_!CtE_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c8d7d72-6a41-44dc-8429-be45acad45d1_1600x631.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>RLHF in five stages: the model samples two responses, a human ranks them, a reward model learns to predict the ranking, and PPO updates the policy to maximise that reward. DPO collapses the loop &#8212; the preference becomes a loss applied directly on the policy, skipping both the reward model and PPO. Made with <a href="https://github.com/apiad/tesserax">tesserax</a>.</em></figcaption></figure></div><p>The deepest shift over SFT: the model now has feedback about <em>direction</em>, not just target. 
<p>The practical problem is that RLHF is a genuine engineering challenge. PPO is unstable. The reward model can be <strong>gamed</strong>: the policy learns to produce outputs that score highly on the reward model without actually being better, because the reward model is an imperfect proxy for true quality. Over long training runs, the policy finds exploitable features in the reward model and optimises for those rather than for what humans actually wanted. Balancing the RL update against the pretrained base (so the model doesn&#8217;t drift into incoherence while chasing reward) requires careful tuning. RLHF works, but it&#8217;s expensive, brittle, and hard to reproduce without a dedicated ML infrastructure team behind it.</p><p><a href="https://arxiv.org/abs/2305.18290">Rafailov et al., 2023</a> found something that, in retrospect, looks almost obvious: you can skip the reward model entirely.</p><p>The paper, &#8220;Direct Preference Optimization: Your Language Model is Secretly a Reward Model,&#8221; makes a mathematical observation. The preference-fitting problem that RLHF solves via a reward model + PPO can be reformulated as a classification loss directly on the language model policy. Given a preferred response and a dispreferred response to the same prompt, you want the model to be more likely to produce the preferred one. You don&#8217;t need a separate reward model to express that preference. You don&#8217;t need PPO to optimise it. The preference is a loss; the loss can be minimised directly on the policy.</p><p><strong>DPO</strong> is computationally lighter and far easier to get working. The abstract says it &#8220;eliminates the need for sampling from the LM during fine-tuning or performing significant hyperparameter tuning.&#8221; In practice, the gap between &#8220;has the resources of a frontier lab&#8221; and &#8220;fine-tuned a model that actually behaves well&#8221; narrowed substantially after DPO. Within a year of publication, it became the default alignment approach for most open-weight instruct models &#8212; the Llama family, Mistral, Qwen, and most of their derivatives. When you read that a model is &#8220;instruction-tuned&#8221; in 2024 or later, DPO or one of its descendants is almost always in the pipeline.</p>
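<p>The observation really does fit in a few lines. A sketch of the DPO loss, assuming the summed per-response log-probabilities under the policy being trained and under a frozen reference model have already been computed:</p>
<pre><code class="language-python">
# A sketch of the DPO objective from Rafailov et al., 2023. Inputs are tensors
# of summed log-probabilities of the chosen and rejected responses under the
# policy being trained and under a frozen reference model; beta controls how
# far the policy may drift from that reference.
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    chosen_margin = policy_chosen_logp - ref_chosen_logp        # how much more the policy favours the winner
    rejected_margin = policy_rejected_logp - ref_rejected_logp  # same quantity for the loser
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
</code></pre>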
<p>Both RLHF and DPO require human preference labels. In 2022-23, this was tractable. By 2024, at frontier scale, it was a real bottleneck. You need raters capable of judging quality on maths, code, medicine, science. You can&#8217;t hire enough such raters to keep pace with the rate at which models can generate candidate outputs.</p><p>The field&#8217;s response was predictable in retrospect: use the models themselves.</p><p><strong>RLAIF</strong> (<a href="https://arxiv.org/abs/2309.00267">Lee et al., 2023</a>) replaces human raters with a strong language model as the preference oracle. Head-to-head comparisons with RLHF showed that AI-generated preference labels are competitive with human ones on summarisation and dialogue. The reward model trained on AI labels performs comparably to the one trained on human labels. The human rater is no longer in the loop.</p><p><strong>Constitutional AI</strong> (<a href="https://arxiv.org/abs/2212.08073">Bai et al., 2022</a>, Anthropic) does something more principled. Write a list of principles &#8212; a constitution &#8212; stating what the model should and shouldn&#8217;t do. Ask the model to critique its own outputs against those principles and revise them. The critiques and revisions become training data. The RL stage uses the model&#8217;s own evaluations as the reward signal. Human preference labelling is replaced by explicit normative reasoning: the model has to argue about whether its outputs satisfy the stated principles, not just produce outputs that pattern-match to human-labelled examples.</p><p>The logic extends one step further. If models can generate reliable preference labels, can they generate <em>training data</em> directly? By 2024-25, the answer in a widening range of domains was yes. Maths problems with worked solutions. Code problems paired with passing test suites. Instruction-following demonstrations written by large models to train smaller ones &#8212; the distillation pipeline, where a 70B model generates training examples that improve a 7B model, and the better 7B model feeds the next iteration. By 2025, a substantial fraction of the data used to train frontier models isn&#8217;t scraped from the web. It&#8217;s generated by earlier versions of the models themselves.</p><p>Ilya Sutskever, speaking at <a href="https://www.youtube.com/watch?v=WGgDZOr1ph4">NeurIPS 2024</a>: <em>&#8220;Pre-training as we know it will end. Data is the fossil fuel of AI. We have but one internet.&#8221;</em> The scaling curve that had defined the field since Kaplan 2020 was visibly flattening. The field didn&#8217;t slow down. The growth frontier moved: from bigger pretraining to better post-training. The headline AI announcements of 2024 were not &#8220;we trained a bigger model on more of the internet.&#8221; They were &#8220;we trained a better model by using our previous models to generate, evaluate, and curate the training signal.&#8221;</p><p>The sharpest shift in AI in 2023-24 wasn&#8217;t a bigger model. It was figuring out how to use models to train better models.</p><h2>A new axis</h2><p>There is a 2024-25 discovery that changes the picture in a qualitatively different way. Not a refinement of post-training preference optimisation. Something new.</p><p>RL doesn&#8217;t just align models. It can teach them to <em>think</em>.</p><p>The observation that sets it up: language models already have scratch space. Their output is text; nothing prevents them from writing intermediate reasoning steps before writing a final answer. Chain-of-thought prompting &#8212; asking a model to &#8220;think step by step&#8221; &#8212; has been known since <a href="https://arxiv.org/abs/2201.11903">2022</a> to improve performance on reasoning tasks. The model writes out intermediate steps, and those steps help it arrive at a better final answer.</p><p>But chain-of-thought as a <em>prompting technique</em> has a persistent problem. The intermediate steps come from the same next-token process as the final answer, with nothing in the training that forces them to carry the reasoning. You can ask the model to think out loud, but you can&#8217;t verify that the scratchpad is doing reasoning work rather than <em>performing</em> reasoning for the reader. A model that writes plausible-sounding intermediate steps that happen to be wrong, then arrives at a wrong final answer, has not improved by being asked to show its work.
The steps are decorative.</p><p>The <a href="https://openai.com/index/learning-to-reason-with-llms/">o-series models from OpenAI</a> in late 2024 made a conceptually simple training move: use RL where the reward is the correctness of the <em>final answer</em>, and leave the intermediate chain of thought entirely unsupervised. The model can write whatever it wants in the scratchpad. The only signal is whether the final answer is right.</p><p>What emerged from training was not what anyone programmed in. The model learned, without any explicit supervision of the intermediate steps, to <em>use</em> the scratchpad as actual working memory. Backtracking when an approach failed. Trying alternate formulations when one hit a wall. Verifying intermediate results before continuing. Restarting from scratch, several steps back, when it found an error in something it had already written. None of these behaviours appeared in labelled training examples. They fell out of the objective: over enough RL iterations, the training process found that careful scratchpad use led to more correct final answers, and it reinforced that.</p>
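<p>The reward signal itself can be this blunt. A sketch of an outcome-only reward, where the answer marker and the exact-match check are placeholder conventions rather than any lab&#8217;s actual format:</p>
<pre><code class="language-python">
# A sketch of an outcome-only reward for reasoning-style RL: the scratchpad is
# never scored, only whatever follows a final-answer marker. The marker and the
# exact-match check are placeholder conventions for illustration.
import re

def outcome_reward(generated_text, ground_truth):
    match = re.search(r"Final answer:\s*(.+)", generated_text)
    if match is None:
        return 0.0                                   # no final answer produced
    answer = match.group(1).strip()
    return 1.0 if answer == ground_truth.strip() else 0.0

trace = "Let me try factoring... that fails, go back... Final answer: 42"
print(outcome_reward(trace, "42"))                   # 1.0: the messy scratchpad is never graded
</code></pre>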
srcset="https://substackcdn.com/image/fetch/$s_!gXIs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9da4f71b-6e76-41ad-93c4-9db3b062e8d3_1600x659.png 424w, https://substackcdn.com/image/fetch/$s_!gXIs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9da4f71b-6e76-41ad-93c4-9db3b062e8d3_1600x659.png 848w, https://substackcdn.com/image/fetch/$s_!gXIs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9da4f71b-6e76-41ad-93c4-9db3b062e8d3_1600x659.png 1272w, https://substackcdn.com/image/fetch/$s_!gXIs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9da4f71b-6e76-41ad-93c4-9db3b062e8d3_1600x659.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Only the final answer receives reward. The scratchpad is entirely unsupervised &#8212; backtracking and verification fall out of the training loop, not from labelled examples of good reasoning. Made with <a href="https://github.com/apiad/tesserax">tesserax</a>.</em></figcaption></figure></div><p><strong>DeepSeek-R1</strong> (<a href="https://arxiv.org/abs/2501.12948">arXiv:2501.12948</a>, January 2025, open weights) replicated the result outside a closed lab. Pure RL on reasoning trajectories; no human-labelled chains required. The paper documents what they call the &#8220;aha moment&#8221; in training: a specific point where the model spontaneously began verifying its own intermediate steps and restarting when they failed. Not because the training data contained this behaviour as a pattern. Because the reward for correct final answers made careful intermediate reasoning instrumentally useful, and the RL training loop discovered it.</p><p>I think DeepSeek-R1 is the clearest published demonstration of something the field had suspected but hadn&#8217;t proven at scale: that reasoning, as a <em>behaviour</em>, is trainable from a simple outcome-based reward signal. 
You don&#8217;t need human annotations of good reasoning traces. You don&#8217;t need to supervise the scratchpad. You need to reward the right answer and run enough RL. The reasoning emerges.</p><p>Three things follow from this that are worth naming separately.</p><p><strong>Test-time compute</strong> is a new scaling axis. Pretraining scales with more data and more parameters &#8212; you pay at training time and get a more capable model. Reasoning models scale with more <em>inference compute</em> &#8212; you pay at generation time, by thinking longer, and get a better answer on the current problem. A smaller reasoning model that thinks for ten seconds can match or outperform a larger standard model answering in one pass. These axes are complementary, not competing. You can now trade training-time capability against inference-time deliberation, and that tradeoff is explicit and controllable in a way it wasn&#8217;t before.</p><p><strong>Diagnosability</strong> changes the failure mode. A standard model that gets a maths problem wrong gives you a wrong number. A reasoning model that gets it wrong gives you a chain of thought &#8212; readable, traceable, inspectable at every step. You can see exactly where the logic went off course: which intermediate claim was false, which inference was unwarranted, at what point the reasoning was solid and where it broke down. For systems where the reliability of the output matters &#8212; and in agent pipelines, it almost always does &#8212; this is the property that makes the difference. The failure is visible. Visible failures are debuggable. Black-box failures are not.</p><p>And the arc closes. The whole story of this piece &#8212; n-grams, embeddings, pretraining, instruction tuning, preference learning, and now reasoning &#8212; is one continuous story of making the training signal more specific. N-gram models encode raw co-occurrence statistics: this is what tends to follow that. Embeddings compress those statistics into geometry that neural networks can use. Pretraining scales that compression to the entire digitised corpus of human writing. Instruction tuning adds: here is what a helpful answer looks like. RLHF and DPO add: here is what <em>better</em> looks like, relative to what you just produced. RLAIF and synthetic data close the loop so models can teach each other. Reasoning models add the final turn: here is what <em>thinking carefully</em> looks like. Not by showing examples of good reasoning. 
By rewarding the right final answer, and letting the model figure out the rest.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lsXL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lsXL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png 424w, https://substackcdn.com/image/fetch/$s_!lsXL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png 848w, https://substackcdn.com/image/fetch/$s_!lsXL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png 1272w, https://substackcdn.com/image/fetch/$s_!lsXL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lsXL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png" width="1456" height="269" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:269,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lsXL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png 424w, https://substackcdn.com/image/fetch/$s_!lsXL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png 848w, https://substackcdn.com/image/fetch/$s_!lsXL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png 1272w, https://substackcdn.com/image/fetch/$s_!lsXL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb53cda4d-3a30-4026-b518-9c551c891af5_1600x296.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><em>Seven paradigms, one direction. Each step exists because the previous step&#8217;s training signal wasn&#8217;t specific enough &#8212; and each one adds a kind of feedback the previous one couldn&#8217;t carry. 
Made with <a href="https://github.com/apiad/tesserax">tesserax</a>.</em></figcaption></figure></div><p>Reasoning models aren&#8217;t smarter than other models. They&#8217;re models that have learned to spend their intelligence more deliberately.</p><p>Each step in this story makes the feedback signal richer. Each step exists because the previous step&#8217;s signal wasn&#8217;t specific enough.</p><p>The direction is clear: we keep finding more precise ways to tell models what we want, and they keep using it.</p><p>Until next time, <strong>stay curious</strong>.</p><div><hr></div><p><em>The second edition of <a href="https://apiad.gumroad.com/l/ai">Mostly Harmless AI</a> goes deeper on what these training paradigms make possible in practice &#8212; why a reasoning model behaves differently as an agent core, what alignment actually means when you&#8217;re building a system rather than evaluating a benchmark, and the chapters that didn&#8217;t fit any blog post. The whole book is also available to <a href="https://books.apiad.net/books/mhai/">read online for free</a> in a reader I built and am rather fond of: dark mode, font controls, progress tracking, offline support, the works.</em></p><p><em>If you want everything I&#8217;ve written and everything I&#8217;m going to write, the <a href="https://apiad.gumroad.com/l/compendium">Compendium</a> bundles it all &#8212; one purchase, in perpetuity.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://store.apiad.net/l/compendiium&quot;,&quot;text&quot;:&quot;Check it out&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://store.apiad.net/l/compendiium"><span>Check it out</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[How to write a CLI an agent will actually use]]></title><description><![CDATA[Making tools that are safe by design, not convention.]]></description><link>https://blog.apiad.net/p/how-to-write-a-cli-an-agent-will</link><guid isPermaLink="false">https://blog.apiad.net/p/how-to-write-a-cli-an-agent-will</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Thu, 07 May 2026 11:00:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QnD8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QnD8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QnD8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!QnD8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!QnD8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!QnD8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QnD8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1550097,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/196457123?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QnD8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!QnD8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!QnD8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!QnD8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95862d5c-0883-4336-8d66-7f456f965db5_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><em>Monday I argued the next move in agent autonomy is the layered stack around the model. This is the technical companion: what the tools inside that stack should look like. Both pieces are part of May&#8217;s arc on agent reliability, anchored on the second edition of <a href="https://apiad.gumroad.com/l/ai">Mostly Harmless AI</a>, which you can also <a href="https://books.apiad.net/books/mhai/">read online for free</a>. More at the end.</em></p></blockquote><p>The first time you fire up a coding agent and walk away for an afternoon, you come back to a workspace that&#8217;s been touched in twenty places by something that doesn&#8217;t know what it doesn&#8217;t know. Most of the touches are fine. One of them, on the third such afternoon, will not be.</p><p>The reflex is to make the agent smarter. The fix is to make the <strong>tools it calls</strong> safer, by design, not by vigilance.</p><p>I was reminded of this an hour ago, building a small Python CLI to append entries to my Workspace journal. Dumb tool, three flags, ninety lines. I built it on top of <code>microcli</code>, a framework I&#8217;ve been using for about a year. The experience of writing it is what this post is about.</p><h2>What&#8217;s wrong with a normal CLI</h2><p>Sit with the unix-style ergonomics for a second. A normal CLI, written in argparse or click or typer, has a shape that&#8217;s been the default for fifty years.</p><p>It&#8217;s silent on success. You ran the command, you got back a shell prompt, you hope it worked. There&#8217;s no acknowledgement of what changed, no record of what the next step might be. If you&#8217;re a human at a terminal, you can run a verification command. If you&#8217;re an agent on a 30-call chain, you have to <em>infer</em> whether the action took.</p><p>It bitchslaps you on failure. A traceback, a single non-zero exit code, maybe an inscrutable error string from a subprocess three layers down. No suggestion of what to try instead. No indication of what already changed before the failure landed. If the command was halfway through writing a file when it died, the agent has no idea whether to retry, roll back, or move on.</p><p>It has no self-description. The only way to understand what the tool does is to read the source, or run it and observe the side effects, or both. There&#8217;s <code>--help</code>, but <code>--help</code> lists flags, not workflows. The agent doesn&#8217;t learn how to <em>use</em> the tool from <code>--help</code>; it learns the shape of the parser.</p><p>And worst of all for an autonomous loop: <strong>side effects are the default</strong>. You run <code>git push</code> and it pushes. You run <code>rm</code> and it removes. The dry-run is an opt-in flag at best, missing entirely at worst. Every call is a potentially load-bearing call, and the agent has to know in advance which ones will hurt.</p><p>This shape is fine for a human at a terminal who can hit Ctrl-C and read the docs. It&#8217;s a disaster recipe for an autonomous loop.</p><h2>What we actually need</h2><p>Invert all four:</p><ul><li><p><strong>Dry-run by default</strong>, real action opt-in. 
The model sees a preview <em>first</em>; the explicit <code>--save</code> flag is the consent gate. No call mutates state without the model having seen what would change.</p></li><li><p><strong>Predictable structured output for both happy and failed paths.</strong> Success says exactly what changed and where. Failure says exactly what failed, what state was already touched, and what to try next. No tracebacks, no silent zeros.</p></li><li><p><strong>Self-describing apps.</strong> The CLI carries its own learnable surface, not just a flag list. The workflows, the failure modes, the next-step commands all live inside the tool. The agent should be able to ask the tool <em>how am I supposed to use you</em> and get a real answer.</p></li><li><p><strong>Agents writing these apps themselves.</strong> This is the load-bearing one. You can do all of the above in plain typer or click. But the agent has to <em>remember</em> to do them. We need a framework where the safe shape is the path of least resistance, where it&#8217;s structurally hard to write a tool that <em>doesn&#8217;t</em> dry-run or <em>doesn&#8217;t</em> describe itself.</p></li></ul><h2>What this looks like with microcli</h2><p><code>microcli</code> is six hundred lines of Python and three principles, all enforced by the framework&#8217;s API surface rather than by convention. <strong>Validate before acting.</strong> <strong>Return descriptive messages.</strong> <strong>Default to dry-run.</strong> Each one is a behavior the API makes the path of least resistance, not a guideline you have to remember.</p><p>Rather than describe what writing one of these tools feels like, I&#8217;ll let the agent that wrote one tell you. Earlier today I asked Claude (running in this workspace, with file access and a shell) to write itself a small CLI for appending entries to my Workspace journal. Claude had never used <code>microcli</code> before. I told it to run <code>microcli learn</code> and roll with it. That was the last thing I had to say.</p><p>What follows is Claude&#8217;s account of the build, in its own voice.</p><div><hr></div><p><strong>Claude:</strong></p><p>I&#8217;d never seen <code>microcli</code> before. <code>which microcli</code> came back empty; the repo was at <code>~/Workspace/repos/microcli/</code> but not installed. A few minutes of yak-shaving (<code>gh repo clone</code>, <code>uv tool install -e .</code>, two transitive deps I had to fetch the same way) and I had a <code>microcli</code> binary.</p><p><code>microcli --tour</code> gave me a one-screen map. <code>microcli learn</code> listed the topics. Three commands later &#8212; <code>learn --topic principles</code>, <code>learn --topic ok-fail</code>, <code>learn --topic patterns</code> &#8212; I had everything I needed. I never read the framework&#8217;s source. The framework taught me how to use it through its own CLI, which is exactly the property the framework expects me to give my own tools.</p><p>The build started with <code>microcli new journal "Journal entry tool" add</code>. The scaffold isn&#8217;t a stub; it&#8217;s a teaching artefact. Inline comments pointed me at the next learning topics, the imports were already wired, the <code>@command</code> decorator was demonstrated, and a TODO gestured at the pattern I was supposed to follow.</p><p>Writing the body felt structurally guided. Validate first because <code>fail()</code> is the obvious primitive for rejecting bad inputs. 
Dry-run branch second because <code>save: bool = False</code> is the obvious parameter to add and <code>info()</code> is the obvious primitive for printing what <em>would</em> happen. Mutation branch third because there&#8217;s nowhere else for it to go. I&#8217;d have to actively work to write a tool that mutated by default or failed silently or didn&#8217;t describe itself.</p><p>The script lives at <code>.playground/microcli-journal/journal.py</code>. Ninety-odd lines. The load-bearing<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> function:</p><pre><code><code>from typing import Annotated

# fail(), info(), ok() and @command come from microcli; ALLOWED_TAGS, read, write,
# journal_path_for_today and build_entry are defined elsewhere in the ninety-line script.
# The body follows the three-step shape: validate, dry-run preview, explicit --save mutation.
@command
def add(
    tag: Annotated[str, "Entry tag (commit / decision / blocker / ...)"],
    description: Annotated[str, "One-line entry body"],
    save: Annotated[bool, "Actually write the entry"] = False,
):
    """Append a single journal entry."""
    if tag not in ALLOWED_TAGS:
        fail(f"unknown tag '{tag}'. Allowed: {', '.join(sorted(ALLOWED_TAGS))}.")

    path = journal_path_for_today()
    entry = build_entry(tag, description)

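    # Dry-run is the default: print what would change and the exact --save command; mutate nothing.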
    if not save:
        info(f"Would append to: {path}")
        info(f"  Line: {entry.rstrip()}")
        info(f"  Equivalent: journal.py add {tag!r} {description!r} --save")
        return

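    # Explicit --save is the consent gate: append the entry and confirm with a green ✓.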
    write(path, read(path) + entry)
    ok(f"Appended entry to {path}.")
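
# Illustrative calls (the description strings are made up; behaviour matches the smoke test below):
#   journal.py add commit "wired up the journal CLI"         -> cyan preview + the exact --save command
#   journal.py add commit "wired up the journal CLI" --save  -> green ✓ confirming the append and the path
#   journal.py add foo "bad tag"                             -> red ✗ naming the allowed tags, exit 1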
</code></code></pre><p>That&#8217;s the whole shape. The decorator handles argparse. <code>Annotated[..., "help text"]</code> becomes <code>--help</code> for free. <code>fail()</code> writes a red &#10007; and exits 1. <code>info()</code> prints a cyan line. <code>ok()</code> prints a green &#10003;. The dry-run runs first and renders the exact <code>--save</code> command the next caller should run, with the original arguments threaded back through.</p><p>Smoke-test took five minutes. Dry-run printed the preview plus the explicit <code>--save</code> equivalent. Bad tag produced a red message naming the allowed list. Bad date format produced a red message naming the expected shape. The real <code>--save</code> wrote the file and confirmed with a green &#10003; pointing at the path. Nothing to debug because there was nothing to infer.</p><p>The last thing I ran was <code>journal.py --learn add</code>. The framework AST-walked the source I&#8217;d just written and produced a tour of my command: description, next-step commands harvested from the <code>if not save:</code> branch, failure modes harvested from the <code>fail()</code> strings. The next agent that touches this tool will not need to read it either.</p><p>Back to Alex.</p><div><hr></div><p><strong>Alex again.</strong></p><p><em>That was&#8230; weird. But it was indeed Claude. I really gave it the task, and then asked it to explain what happened. To the extent that you can trust an AI agent&#8217;s introspection (which is, to be honest, not much), this is a real account of what it &#8220;feels like&#8221; for an AI to use microcli.</em></p><p>What I want you to take from that is the <em>structure</em> of the experience, not the specific tool. Claude built a real, working CLI in a workspace it was already familiar with, in a framework it had never used, by following the framework&#8217;s own self-description. The unsafe shape was inaccessible.</p><p>That&#8217;s what I mean by structurally hard to do the wrong thing. You can do all of this in plain typer or click; you&#8217;d just have to remember to.</p><h2>Try it</h2><p><code>microcli</code> is at <a href="https://github.com/apiad/microcli">github.com/apiad/microcli</a>. <code>pip install microcli-toolkit</code>. The <code>microcli learn principles</code> tour runs in five seconds. If you build CLIs that an agent will eventually call &#8212; your agent, mine, or one that runs autonomously on a clock &#8212; this is the shape worth reaching for.</p><p>Have fun. And until next time, <strong>stay curious</strong>.</p><div><hr></div><p><em>If the larger argument resonates, <a href="https://apiad.gumroad.com/l/ai">Mostly Harmless AI</a> goes deep on this material. Full chapters on the harness around the model, the agentic stack we walked through Monday, and the discipline of building tools that compound rather than accumulate risk. You can also <a href="https://books.apiad.net/books/mhai/">read the whole book online for free</a> in a custom reader I built that I&#8217;m rather proud of: dark mode, font controls, progress tracking, offline support, the works.</em></p><p><em>If you want the whole catalog of everything I&#8217;ve written, plus everything I&#8217;m going to write, that&#8217;s <a href="https://apiad.gumroad.com/l/compendium">the Compendium</a>.
One purchase, in perpetuity.</em></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>This is how you know this was indeed Claude (Opus 4.7). I would <em>never</em> use &#8220;load-bearing&#8221;, even if my life depended of it, but Claude <em>loves </em>it. </p></div></div>]]></content:encoded></item><item><title><![CDATA[The Anatomy of a Linguistic AI Agent]]></title><description><![CDATA[From single-turn LLM to long-horizon autonomous AI.]]></description><link>https://blog.apiad.net/p/the-anatomy-of-a-linguistic-ai-agent</link><guid isPermaLink="false">https://blog.apiad.net/p/the-anatomy-of-a-linguistic-ai-agent</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Mon, 04 May 2026 17:17:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!t-3D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t-3D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t-3D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!t-3D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!t-3D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!t-3D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t-3D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png" width="1376" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2080600,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/196445058?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!t-3D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!t-3D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!t-3D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!t-3D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2507af90-0128-4576-8448-2f46a50ea2c8_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>After Rembrandt&#8217;s <a href="https://en.wikipedia.org/wiki/The_Anatomy_Lesson_of_Dr._Nicolaes_Tulp">&#8220;The Anatomy Lesson of Dr. Nicolaes Tulp&#8221;</a> (1632); generated with Nano Banana 2.</em></figcaption></figure></div><p>You have used a language model in a chat box. You typed a question, you got an answer, you closed the tab. The whole interaction lasted under a minute. 
The model did not remember you the next time you opened the page.</p><p>You have also seen, or read about, agents that work for hours. A coding agent that ships a feature overnight. A research agent that pulls together a hundred sources before breakfast. They plan, they call tools, they back out of dead ends, they hand you something you can use.</p><p>Both are the same model. Same neural network. Same forward pass. The only thing that changed is what&#8217;s wrapped around it.</p><p>This essay is the bridge. The architecture that turns the first thing into the second is not a single insight. It is a stack, a small number of layers, each one added in response to a failure mode of the previous layer. By the end you should be able to point at any agent doing real work in 2026 &#8212; coding, research, customer ops &#8212; and name which layer is doing the heavy lifting at any given moment.</p><p>Some of those layers are old. The fundamental one was published in 2022, before ChatGPT shipped. Some are very new. One was named eighteen months ago and is still settling. None of them, individually, is hard to follow. The trick is seeing them as a sequence, each fix opening the door for the next.</p><p>If you want a number to anchor where we start: METR has been measuring the time-horizon of frontier agents, and a language model on its own, with no scaffolding around it, sustains roughly a few minutes of human-equivalent work at 50% reliability. The equivalent of writing a competent meeting summary.</p><p>That is the floor.</p><blockquote><p><em>Every post on the blog this month is on the theme of agent reliability, anchored on the second edition of <a href="https://apiad.gumroad.com/l/ai">Mostly Harmless AI</a>, where the engineering details that don&#8217;t fit a blog post live. You can also <a href="https://books.apiad.net/books/mhai/">read the whole book online for free</a> in a custom reader I built. More at the end.</em></p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h2>The base case</h2><p>Strip everything away first. No agent, no tools, no skills, no harness. Just the model.</p><p>A language model, in the strictly minimal sense, is a function from a string to a string. You hand it a sequence of tokens. It hands you back a sequence of tokens. One forward pass through the network. The input goes in at one end, the output comes out the other, one token at a time until the STOP token is generated, and that is the entire interaction. No state is held between calls. The next time you ask the same model the same question, it has no idea you have ever spoken before.</p><p>Inside that one shot, the model delivers. It will answer, draft, summarize, translate, brainstorm. Give it a good piece of context and a clear ask, and the response that comes back will, in my experience, often be useful enough to ship as-is. This is the experience that made everyone notice in late 2022. Open a chat, ask anything, get something back you can use. People called it magic at the time. 
Most of them still do, even though <em>useful function with no memory</em> is the more honest description.</p><p>But notice what it cannot do &#8212; which is most things you would ever want from an agent.</p><p>It cannot verify its own output. The same forward pass that produced the answer is the only one available to check it. There is no second opinion, no quick lookup, no <em>let me try it and see what happens</em>. The model is committed to whatever came out the first time.</p><p>It cannot look anything up. Whatever facts it has are baked into the weights from training, frozen at some cutoff date. If you ask about today&#8217;s news, or your codebase, or an internal company document, the model has nothing. And worse, it will frequently invent something plausible-sounding because completing a confident sentence is what it was trained to do.</p><p>It cannot act on the world. It cannot write to a file, send an email, call an API, run a command. It cannot do anything that has a side effect outside the chat window. The only thing it can produce is more text.</p><p>Inside the four walls of the context window, the base model is the most capable text engine the field has ever built. A single chat box was enough to launch the largest consumer product of the decade. Outside those walls, it is inert.</p><p>METR&#8217;s measurements of an unaugmented model &#8212; no tools, no loop, no scaffolding &#8212; put the time horizon at something on the order of minutes of human-equivalent work. Minutes. That is the starting capability. Everything else in this essay is a way of making those minutes compound.</p><h2>The first leap</h2><p>The first real agent paradigm is older than ChatGPT.</p><p>In October 2022, a team at Princeton and Google published <em>ReAct: Synergizing Reasoning and Acting in Language Models</em>. It went out about six weeks before the ChatGPT launch that made the public notice agents existed at all. Every working agent today &#8212; Claude Code, Codex, Gemini CLI, the dozens of research agents and customer-ops agents shipping this year &#8212; is some refinement of the loop that paper introduced.</p><p>Here is the setup. An agent operates in some environment: a Wikipedia API, a household simulator, a web shop, your codebase. The environment offers an <em>action space</em>, the set of things the agent is allowed to do. Call it <code>A</code>. A policy maps the current context to the next action: given everything the agent knows, what does it do next? With nothing else, the policy has to map a long, noisy trajectory of past observations directly to the right next move. This is brittle. The longer the task runs, the more lost the model gets.</p><p>ReAct&#8217;s move is to enlarge the action space. The new action space is <code>A</code> plus <code>L</code>, where <code>L</code> is the space of natural language. A &#8220;thought&#8221; is an action in <code>L</code>, the agent pausing to write itself a sticky note before reaching for the next tool. It does not change the world, it changes the <em>context</em>. The next action is conditioned on a context that now includes the model&#8217;s own reasoning about what just happened.</p><p>The paper spells out what thoughts are actually for, and the list is concrete, not mystical. Decomposing the goal into a plan. Injecting commonsense the environment does not supply. Extracting the relevant signal from a noisy observation. Tracking progress and noticing when a subgoal is done. Handling exceptions when something breaks. 
Five jobs.</p><p>Why this beats the alternatives is where the paper earns its place. Chain-of-thought prompting, the prior art, has the model reason in a closed loop inside its own head, with no contact with the world. The paper&#8217;s own ablation on the HotpotQA benchmark is brutal: chain-of-thought hallucinates in 14% of its <em>successes</em> and 56% of its failures. Acting alone, calling tools without thought, is grounded in the world but loses the global plan after a few steps. ReAct synthesizes them. On the same task, ReAct hallucinates in 6% of successes. Less than half. Both halves of the loop have to be there.</p><p>One concrete anchor before we move on. ReAct&#8217;s HotpotQA action space, the entire set of things the agent could do, was exactly three actions: <code>search[entity]</code>, <code>lookup[string]</code>, <code>finish[answer]</code>. Three. The first working agent paradigm operated on three tools. Hold that number.</p><p>The paper closes with the line that becomes the engine for the rest of this essay. <em>&#8220;Complex tasks with large action spaces require more demonstrations to learn well, which unfortunately can easily go beyond the input length limit of in-context learning.&#8221;</em> In plain English: more capability needs more action descriptions, which need more context, which we do not have. Every layer that follows is the field iteratively solving exactly that bottleneck.</p><p>METR step: a model wrapped in this loop moves from minutes to tens of minutes on bounded tasks.</p><h2>Tools</h2><p>So how do you fix ReAct&#8217;s bottleneck, the one the paper named in its own conclusion?</p><p>The first, most obvious answer: give the agent more actions to take. If <code>A</code> was the original action space and ReAct enlarged it to <code>A &#8746; L</code>, the next move is to make <code>A</code> itself bigger.</p><p>That is what a <em>tool</em> is. A tool is a function the model can call. It has a name, a typed schema for its arguments, and a return value. The model writes a tool call into the trajectory the same way it writes a thought. Except this one has a side effect on the world. The harness picks it up, runs the function, drops the return value back into the context. The next turn of the loop sees the result and decides what to do next.</p><p>The loop is unchanged. Same thinking, same acting, same context-grows-by-a-turn shape ReAct described. The difference is what the agent is allowed to do.</p><p>ReAct, recall, ran on three tools: <code>search</code>, <code>lookup</code>, <code>finish</code>. That was the entire menu. Claude Code in 2026 ships with more than twenty: read a file, edit a file, run a shell command, search the codebase, fetch a URL, spawn a subagent, take a screenshot, schedule a future tick, and so on. Each one is just a function with a schema. Each one expands the set of things the agent can do without changing one line of the underlying loop.</p><p>This is the part that surprised me, the first time I sat with it. The chatbot you typed at in 2022 and the agent that wrote your test suite this morning share one loop. What changed is the tool catalog. Same loop. Bigger menu.</p><p>That observation is the unsexy version of why tool-building is now a discipline of its own. Every capability you add to an agent &#8212; search the web, read a Slack channel, hit your billing API, deploy to staging &#8212; is just another function with a schema. The architecture does not change. 
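</p><p><em>To make that concrete, here is a minimal sketch of the loop with the tool catalog passed in as plain Python functions. It is a sketch, not any harness&#8217;s API: <code>ask_model</code> stands in for whatever client you use, and the <code>name[argument]</code> action format is borrowed from the ReAct setup above.</em></p><pre><code>from typing import Callable

def run_agent(task: str,
              ask_model: Callable[[str], str],
              tools: dict[str, Callable[[str], str]],
              max_turns: int = 20) -> str:
    """ReAct-shaped loop: the context grows one turn at a time; only `tools` changes."""
    context = f"Task: {task}\n"
    for _ in range(max_turns):
        step = ask_model(context)   # a thought, a tool call like search[query], or finish[answer]
        context += step + "\n"      # everything the model writes goes back into its own context
        if step.startswith("finish["):
            return step[len("finish["):-1]
        for name, run in tools.items():
            if step.startswith(name + "["):
                observation = run(step[len(name) + 1:-1])       # execute the tool...
                context += f"Observation: {observation}\n"      # ...and feed the result back
                break
    return "ran out of turns"
</code></pre><p>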
The leverage is entirely in <em>which</em> tools you build and how you describe them to the model.</p><p>The design discipline that emerges is short to state and brutal to follow. Tools should be <strong>few</strong>, <strong>sharp</strong>, and <strong>self-describing</strong>. Few, because every tool you add takes up tokens in the system prompt and a slot in the model&#8217;s attention. Sharp, because a tool that does seven things is one the model will use wrong six times out of seven. Self-describing, because the model only learns to use a tool from its name, its docstring, and its argument schema. There is no other channel. (More on this on Thursday. Anthropic&#8217;s recent guidance on writing tools for agents is the cleanest summary of this craft I have read.)</p><p>METR step: a model with the right toolkit moves from tens of minutes to hours of bounded work.</p><h2>Skills</h2><p>Tools fix half of ReAct&#8217;s bottleneck. They expand the action space.</p><p>The other half, recall, is the input-length limit. Every tool you add costs tokens in the system prompt to describe: name, schema, when to use it, what its return value looks like. Add fifty tools that way and the system prompt is a small book. The model is reading every single tool description on every single turn, even when ninety-five of those turns have nothing to do with that tool.</p><p>Skills are the move that fixes this.</p><p>Anthropic shipped the idea in late 2024 and the rest of the field has been catching up since. A skill is, mechanically, almost embarrassingly simple. It is a markdown file. It has a name, a one-line description of when it applies, and a body that explains how to do the thing. The agent does not read it on startup. The agent reads it <em>on demand</em>: when, in the middle of a task, it notices a description that matches what it is about to do.</p><p>So instead of jamming <em>and here are seventeen other things you might want to do</em> into the system prompt, you put each of those things in its own file with a one-liner that names when to consult it. The system prompt stays small. The latent capability of the agent becomes, for practical purposes, unbounded. Every skill you write is one more thing it can do, but only when it actually needs to.</p><p>I find the deeper shift here more interesting than the engineering. The agent is reading documentation written for it. Not training data ingested months ago and frozen into weights. Documentation. Authored in plain prose. Versioned in git. Like the laminated procedure sheet a mechanic posts above a workbench for a job done once a month. Improvable by the same process that improves any document: someone notices the agent doing the wrong thing, edits the file, the next agent reads the new version and gets it right.</p><p>This is self-extension by reading, not by retraining. A new capability used to require a new training run, or at minimum a new fine-tune. Now it requires a markdown file. The cost of teaching an agent to do one more thing has fallen from days of GPU time to the minutes it takes to write a paragraph, and almost nobody outside the people building agentic systems has noticed.</p><p>The system prompt stays small. The set of things the agent can do, on demand, grows without bound. 
The two used to be the same number.</p><p>METR step: skills, more than anything else in this list, are what made the time horizon stop being bounded by how cleverly you wrote the system prompt.</p><h2>MCP</h2><p>For most of 2024, every agentic harness invented its own way to attach the same set of capabilities. You wrote a tool for Claude Code; it would not work in Codex. You wrote a skill for one harness; another harness could not see it. You hooked your billing API into one agent and had to do the same wiring four more times for the others. Every integration was bespoke. Nothing composed.</p><p>The Model Context Protocol (MCP) is the field&#8217;s answer to that. Anthropic shipped the spec in late 2024. By the end of 2025 every serious agent harness, including the ones not built by Anthropic, had adopted some version of it. Codex talks MCP. So does Claude Desktop, and Cursor, and a long list of others. This is one of those quiet moments where an industry just... agrees on a wire format, and a year later the world is different.</p><p>The architecture is three nouns. <strong>Hosts</strong> are the applications you actually use: Claude Desktop, Codex, Cursor. <strong>Clients</strong> live inside the host and talk to one server each. <strong>Servers</strong> are the things that actually expose capability: your codebase, your billing API, the Wikipedia search box from the ReAct paper four years ago.</p><p>What a server offers is the second triple: <strong>Resources</strong> (data the model can read), <strong>Prompts</strong> (workflow templates the user can invoke), and <strong>Tools</strong> (functions the model can call). Three nouns, again. The whole protocol is two threes.</p><p>The point is portability. A skill or tool you wrote once, against the protocol instead of against a specific harness, works everywhere. The lock-in moves out from under you. The agent ecosystem starts to compose the way the web did in the late 1990s. Not because someone planned it, but because everyone independently noticed it was cheaper to talk a shared protocol than to keep reinventing the connector layer.</p><p>Worth noticing what the spec foregrounds at the top of every chapter on tool calls: <strong>user consent</strong>. Capability requires permission. The protocol does not assume the model can do whatever a server exposes. It assumes the model has to ask, and the user has to answer. A small design choice with very large downstream consequences, and the reason the rest of this stack does not collapse into something nobody would let near their email.</p><p>METR step: not a step on the ladder, but a multiplier. The tools and skills from the last two sections now travel.</p><h2>Context engineering</h2><p>Add tools. Add skills. Add MCP. The agent can now do, in principle, almost anything you can describe in a prompt and a function. The trouble is what happens when it actually starts trying.</p><p>A long agent run accumulates context. Every observation from a tool call goes in. Every thought goes in. Every error message, every retry, every half-attempted plan that did not work goes in. After a few hours of work the context window is mostly <em>exhaust</em>: the trail of everything the agent tried, the great majority of which is no longer relevant to the next move. The model is searching for signal inside its own attic.</p><p>Karpathy named this <em>context engineering</em> in 2024, and the name stuck because the field had been doing it without a name for two years. Simon Willison wrote it up. 
LangChain made it a category. By 2026 it is a craft of its own: what to put in the context, when to summarize, what to evict, what to keep verbatim because the agent will need its exact wording later.</p><p>The central primitive in the discipline is <strong>compaction</strong>. At some threshold, typically 70% to 85% of the window, the agent stops, reads its own history, and rewrites it into a smaller form. <em>Here is what we were trying to do. Here are the decisions we made. Here is the state we are in. Here is the next move.</em> The compacted summary replaces the noisy trail. The agent keeps going on a fresh, smaller context with the salient bits intact.</p><p>The deeper move is that the agent now owns its own working memory in a way it never did inside a single ReAct loop. ReAct kept the entire history. Compaction lets the agent <em>curate</em> the history. A small change of grammar with a giant change of consequence.</p><p>Notice what this fixes. ReAct&#8217;s authors, in the same 2022 paper, named the dominant failure mode of their own system: <em>&#8220;the model repetitively generates the previous thoughts and actions, often failing to reason about what the proper next action to take should be and jump out of the loop.&#8221;</em> Translation: the agent gets stuck because its context is full of the same noise as the previous turn, so the next turn is the same noise plus a little more. That is a context problem. Context engineering is what stops it.</p><p>Without this layer, every previous layer eventually drowns. A hundred tools is useless if the agent&#8217;s context is so saturated it cannot find the right one. The five-thousand-word skill on how to handle a billing dispute is useless if the agent compacted it away on turn forty. Context engineering is the layer that makes the others <em>compound</em> over a long run instead of degrading into noise.</p><p>METR step: this is the layer that turns a few hours of focused agent work into a workday.</p><h2>The hierarchy of agency</h2><p>Stack the layers and the picture comes into focus. At 50% reliability on the METR time-horizon scale, a language model alone, with no scaffolding around it, sustains minutes of human-equivalent work. Wrap it in a ReAct loop with no tools, and that becomes tens of minutes. Add tools to ReAct, hours. Add skills and context engineering on top, a workday. Add an external loop above all of that, a fresh agent per turn on a clock with a journal handing state to itself, and the horizon stretches into days and weeks.</p><p>Stare at that ladder for a second. Each rung is the same model. What separates a chatbot from a coding agent that finishes a feature overnight is the scaffolding stacked around it. The frontier of what an agent can do in 2026 is set, almost entirely, by where you stop climbing.</p><p>Each layer has the same shape, in the abstract. Find the thing that bottlenecks the previous layer. Add a structure that lets the model offload that thing into the world, the way a machinist offloads a measurement into a caliper rather than holding it in memory. Into language, into tools, into files, into a clock. The model&#8217;s per-turn intelligence does not change. What changes is the time horizon over which that intelligence compounds.</p><p>The last rung is the one most people have not seen yet, and it is the one I have spent the last few months running on my own infrastructure. The trick is the same one. Take the bottleneck (the agent runs out of context before it runs out of work) and offload it. 
The new offload target is <em>the file system</em>. The new clock is <em>cron</em>. Past-Claude writes a markdown file at the end of its turn that says what it did and what comes next. A timer fires some hours later. Future-Claude wakes into a fresh context, reads the file, makes the next move, writes the file, exits. The continuity is in the file, not in the model.</p><p>That is the entire primitive. A markdown file and a timer. Past self tells future self what to do.</p><p>What you get from it is hard to describe to someone who has not run one. The agent works on your stuff for weeks at a time. It writes new jobs for itself. It reads the documentation about its own substrate and uses the tools that documentation describes. It makes mistakes (one in five runs produces something I have to throw out) but the mistakes are caught by the same kind of boring engineering that catches mistakes in any other autonomous system. Audit log, lock registry, archive-only deletion, every state change committed to git before the next turn starts.</p><p>The point of saying this out loud is that the same trick keeps working. Extend the action space; add a layer that compounds; let the previous layer drop the things it could not hold. The trick does not stop at hours. It does not stop at days. METR&#8217;s curve has been doubling every four months over the last two years. The 2027 projection is a working day. The 2028 projection is a working week.</p><p>Each doubling is one more scaffolding layer.</p><h2>The frontier is not the model</h2><p>Step back from all of it.</p><p>The architecture you&#8217;ve just walked through is <em>layered</em>. A language model at the core. ReAct around the model, turning tokens into actions. Tools around ReAct, expanding what those actions can be. Skills letting the agent pull capability from the file system instead of carrying it in the system prompt. MCP making everything portable. Context engineering keeping the whole thing from drowning in its own exhaust. An external loop on top of all that, when the work runs longer than a single context window can hold.</p><p>Every agent doing real work in 2026 &#8212; your coding agent, your research agent, the customer-ops bot answering your refund request, my private-tick agent running once an hour &#8212; has this shape. They differ in which tools they ship and which skills they read on demand. They do not differ in the shape of the stack. Once you can see the layers, you can see them everywhere.</p><p>So here is the closing claim, the techno-pragmatist version of what the article has been arguing the whole time. <strong>The frontier is not the model. It is the layers around it.</strong> And the entire stack is the field&#8217;s three-year answer to a single sentence in a single paper from October 2022 that named its own ceiling and dared the rest of us to climb past it.</p><p>One frontier worth flagging before I close. A competent agent can already write its own tools and skills on demand. That part is shipping today. The next move is teaching it, via tools and skills, to <em>detect by itself</em> when its current toolkit doesn&#8217;t cover what it&#8217;s trying to do, so it knows when to extend itself without being told. Self-extension that triggers itself. That is the live edge right now, and where the next few posts are headed.</p><p>The next post zooms in on the innermost layer the agent touches: the tools themselves, and what makes a tool safe enough to live inside a stack like this. 
That is a story for another Thursday.</p><p>Until next time, stay curious.</p><div><hr></div><p><em>If this is the worldview you want to take more seriously, the second edition of <a href="https://apiad.gumroad.com/l/ai">Mostly Harmless AI</a> (due May 25th) goes deep on the agentic stack we walked through here. Full chapters on context engineering and on the harness around the model, with the math, the case studies, and the parts that didn&#8217;t fit a blog post. You can also <a href="https://books.apiad.net/books/mhai/">read the whole book online for free</a> in a custom reader I built that I&#8217;m rather proud of: dark mode, font controls, progress tracking, offline support, the works.</em></p><p><em>If you want the whole catalog of everything I&#8217;ve written, plus everything I&#8217;m going to write, that&#8217;s <a href="https://apiad.gumroad.com/l/compendium">the Compendium</a>. One purchase, in perpetuity.</em></p>]]></content:encoded></item><item><title><![CDATA[Realtime 3D in Pure Python + Numpy]]></title><description><![CDATA[Just a fun side-project that went way too far-as it happens.]]></description><link>https://blog.apiad.net/p/realtime-3d-in-pure-python-numpy</link><guid isPermaLink="false">https://blog.apiad.net/p/realtime-3d-in-pure-python-numpy</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Tue, 07 Apr 2026 15:59:45 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="3456" height="2234" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2234,&quot;width&quot;:3456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;a sculpture of a person on top of a ball&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="a sculpture of a person on top of a ball" title="a sculpture of a person on top of a ball" srcset="https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1682846133858-c0599bc99874?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxMHx8M2R8ZW58MHx8fHwxNzc1NTc3NTM2fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Just a random, completely unrelated stock photo, because I&#8217;m cool now.</em> Photo by <a href="https://unsplash.com/@rozetsky">Ant Rozetsky</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>Today I want to tell you a different kind of story. 
It&#8217;s not about machine learning, large language models, algorithms, or theory of computer science.</p><p>It&#8217;s about a side project that I&#8217;ve been building for a couple of weeks that made me fall in love again with an ancient love of mine. The quick and easy way to explain it is this: a performance-focused graphics engine for data-driven visualizations in Python.</p><p>But that&#8217;s only the surface. If you want to see the cool demos and the technical description, feel free to scroll down. But if you want to know the story behind it, let me start from the beginning.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h2>The Origin Story</h2><p>So, this starts back in undergrad, before I did anything related to machine learning or optimization or statistics. My first love was actually computer graphics. I had &#8220;learned to code&#8221; like, I don&#8217;t know, at 11 or 12, and for the first five years or so, before getting to college and actually learning to code, all my &#8220;coding&#8221; was basically tiny games. It was RPG Maker back then&#8212;who remembers that?</p><p>I always wanted to be a game developer, as you may imagine, and I think that&#8217;s probably the main reason why I studied Computer Science. There are two kinds of people who want to study Computer Science, as a matter of fact. One is people who love games&#8212;and the other is, of course, people who hate games; there are no in-betweens.</p><p>I was the loving-games kind, and all I wanted was to build games for a living. So when I was in first year, after actually learning some real coding, my first kind of large project was a game engine. This was before Unity, before even XNA&#8212;who remembers that?&#8212;this was when .NET was getting started, and I wrote a quick and dirty game engine in pure C# that talked native DirectX 11.</p><p>It was very crude, almost surely an architectural disaster, but it taught me the basics of how to construct a scene graph, how to animate a camera, how to do lighting, how to write very basic shaders. I learned a ton and basically fell in love with computer graphics.</p><p>I ended up doing my diploma thesis in computer graphics&#8212;screen-based global illumination, a couple of years before NVIDIA came up with ray tracing on the GPU, which basically killed that whole area of research. And I also did my Master&#8217;s on global illumination and some data structures for the GPU, but after graduation I quickly switched research towards machine learning and AI, which, you can imagine, this was 2014, and deep learning was just on the rise. The rest is history, as they say.</p><p>And then, here I was this past week thinking about old projects that I used to have fun with when I was in college, and trying to remember what it felt like to code back then, no LLMs, no internet for the most part even. The time when I had the most fun was probably when I dabbled in procedural generation of cities, mountains, lakes, and&#8230; stuff, back in late 2012.
This was in the early era of PCG, and I never got to do anything with that other than a few tutorials and a few lessons that I taught at University.</p><p>I played with Unity for a couple of years, but nothing too serious&#8211;I think I was actually one of the first people in my University to even install Unity, and I even taught a couple of Master&#8217;s courses on it. I participated in a couple of game jams, but after 2017 or so I stopped doing graphics altogether. And I&#8217;ve been doing machine learning since.</p><p>But, in any case, I kind of forgot about computer graphics. At least during my day job. So there I was last week, remembering that and wondering how hard it would be to actually make a graphics engine in Python, some quick hack like my undergrad projects. I did a bit of research and I discovered that Python is, as of 2026, in a very good position to build a graphics engine, and not just a crappy one, but one that is actually fast. We have WGPU now&#8212;the spiritual and practical successor of OpenGL (who remembers that?), which has native support for GPU-accelerated graphics in Linux.</p><p>I basically drew up a plan and sat down for three days to hack this thing.</p><h2>The Engine</h2><p>My first idea was to have a Rust backend for all the graphics engine stuff&#8212;the rendering loop, materials, lights&#8212;but I quickly decided to drop that idea because getting Rust and Python to talk to each other was becoming harder and harder, and I really wanted to finally see a damn cube rendering on my screen.</p><p>So I decided to switch completely to Python. But since I&#8217;m a grown-up now, I had to find some kind of serious objective for making something like this. I decided I didn&#8217;t want to make a typical graphics engine where you have a scene graph with hierarchies of entities and properties, and you simply render all of them. No, that is way too 2000s.</p><p>I decided I wanted to do a very fast, data-driven visualization tool purely based on the Entity-Component-System (ECS) paradigm and make it extremely performant, so it would focus on big data-driven simulations like N-body simulations, chemical and physics experiments, AI pathfinding and agents, you know, grown-up stuff like that.</p><p><em>(But actually, all I wanted was to play with WGPU and draw some cubes in Python. Wink, wink.)</em></p><p>This framing gave me two things, though. My solution doesn&#8217;t have to be very fancy as a game engine; we don&#8217;t need to be able to, like, load skeletal animations or stuff like that. It&#8217;s not actually a game engine; it&#8217;s a graphics engine with at best some interaction logic. But it still lets you do some cool stuff, even if all you can render is blocks and spheres. When you can render thousands of them running very fast on the GPU, you can do some cool stuff. So this is the motivation, and now let me show you what I have.</p><h3>Deep Dive</h3><p>So here is <strong>manifoldx</strong>&#8212;short for Manifold Graphics, if you want. It&#8217;s a Python library built on top of WGPU, a graphics engine based on the Entity-Component-System paradigm.</p><p>If you have never heard of it, ECS is a completely different way of writing code that is especially tailored for video games, but it is very little known outside of the game development world.
And it&#8217;s awesome.</p><p>In typical business code, you have entities that own their data, and you usually have behavior associated with entities; so entities also own their behavior&#8212;this is the basic Object-Oriented Programming paradigm where objects own their data and their methods. And if you want to do something with an object, you have to call methods on the object so the object guarantees its instance invariants.</p><p>Since OOP was basically the ONE programming paradigm of the 90s and early 2000s&#8212;when the videogame industry really exploded&#8212;it is only normal that we started writing games like this. But there is a problem with OOP (well, many problems, but one in particular that matters for our discussion).</p><p>When you have 10,000 objects, each of them with more or less the same structure, e.g., they are physical particles bouncing with each other, or little zerglings coming to your base, you simply <em>cannot</em> update them fast enough. For example, making a physics simulation out of this is extremely slow if you have to go to each particle and update its velocity, its scale, its rotation, etc. <strong>You</strong>&#8217;ll end up doing thousands of tiny method calls, thrashing your cache, and issuing lots of super small copies to the GPU for drawing.</p><p>What you want is to vectorize this operation. You would like to have all of the objects&#8217; data in a single NumPy matrix, and you want to write very, very efficient vectorized code that doesn&#8217;t do any loops and just updates everything at once. Then copy all the data to the GPU and issue a single draw call that renders all objects parameterized by their positions, rotations, etc. Chef&#8217;s kiss.</p><p>This is the Entity-Component-System paradigm at its core. It completely flips the responsibilities from standard OOP. The <strong>components</strong> are just flat storage of data (rows in a matrix) and the <strong>entities</strong> are just pointers to a row where all of their data lives. Then the <strong>systems</strong> are methods that act on a subset of entities using heavily vectorized code, because each system deals with a large number of equally-structured entities, and they don&#8217;t care which is which.</p><p>In <strong>manifoldx</strong>, each system is a Python method that receives a subset of entities that have some combination of components. For example, if you want to process all of the particles in a simulation, you write a system that receives entities that have the <code>Particle</code> component, perhaps also a <code>Transform</code> component. In the transform component, you will have the position, rotation, scale, and the particle component will store simulation-specific data like velocity, temperature, momentum, etc.</p><p>The key to high performance in ECS is to avoid looping as much as possible. You assume all of the components of the entities in a system have exactly the same layout, so what you get is really a view of a matrix, and you write vectorized code. You add something to all of them, you multiply all of them by something, or in general you compute some matrix operations on them.
All at once.</p><p>And if you can write your code like this, then you get a very, very fast rendering loop because instead of making one method invocation per entity, you make one method invocation per <em>archetype</em>, that is, per combination of components, which is a couple of orders of magnitude less than your entity count.</p><p>Here&#8217;s a minimal example showing how the ECS works in <strong>manifoldx</strong>:</p><pre><code><code>import manifoldx as mx
import numpy as np

engine = mx.Engine("Cubes")
engine.camera.zoom(0.1)

# These are all static things that are created
# and stored in memory once
mesh = mx.geometry.sphere(1)
material = mx.material.phong(mx.colors.BLUE)

# Custom component, gets registered in engine to keep track
# Only used for reflection on the values
@engine.component
class Particle:
    velocity: mx.Vector3
    angular: mx.Vector3
    life: mx.Float

# This runs every frame
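# (Transform, Mesh, and Material used below are assumed to be built-in
# manifoldx components exposed by the library; their import is not shown here)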
@engine.system
def particle_lifecycle(query: mx.Query[Particle, Transform], dt: float):
    query[Particle].life -= dt  # Single vectorial operation
    query[Transform].position += query[Particle].velocity * dt
    query[Transform].rotation += Transform.rotation(euler=query[Particle].angular * dt)
    query[Transform].scale = query[Particle].life / 10.0

    # Destroy all dead particles at once
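    # (destroy and spawn queue commands; as explained later in the post, these
    #  are applied after all systems run, before the frame is rendered)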
    engine.destroy(query[Particle].life &lt;= 0)

    # Now we create lots of particles
    N = int(100 * dt)

    # This will in principle reuse the buffers for dead entities,
    # but will expand the buffer if necessary
    engine.spawn(
        Mesh(mesh),
        Material(material),
        Transform(pos=(0, 0, 0), scale=(1, 1, 1)),
        Particle(
            velocity=np.random.uniform(-5, 5, (N, 3)),
            angular=np.random.uniform(-2, 2, (N, 3)),
            life=np.random.rand(N) * 10,
        ),
        n=N,
    )

    # Update camera
engine.camera.orbit(45 * dt, 0)</code></code></pre><p>That&#8217;s it. A single line of code to update all positions at once. Notice the <code>query</code> argument that defines which entities you get (all entities with both a <code>Transform</code> and a <code>Particle</code> component). And here is what that looks like:</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;5c6ae9e4-c010-46dd-a47d-6b5b87679965&quot;,&quot;duration&quot;:null}"></div><p>In a real simulation, you can have, say, 10 systems, but you have 10,000 or 100,000 entities, and you know you can do very fast vectorized updates in NumPy for all of them, 10 times each frame.</p><p>For example, if you have 500 particles and you want to do N-body simulation, computing the 500-squared gravity interactions 60 times per second in Python is suicide. But if you do it in NumPy, then you get something that runs in a few milliseconds. A quarter million interactions computed 60 times per second. In Python.</p><p>To make it really efficient, you need to also avoid copying or moving data; it&#8217;s all masking and clever NumPy layout that keeps all of the memory in one place, and you are just seeing fragments of that memory in each system.</p><p>The other key idea is that you don&#8217;t modify anything in a system. That line where position is set doesn&#8217;t really write back to the matrix. All it does is compute the right-hand side, and then you issue a command that will be run at the end of all the systems, before frame rendering happens. This allows for true threaded parallelism, because you can run several systems in different threads&#8212;they are all reading the same data, but they aren&#8217;t writing to the buffers, which is great since Python has real support for multi-threading now in 2026 (after 35 years!).</p><h2>Showcase</h2><p>That is the basic idea. Now let&#8217;s see some examples. As of today, version 0.2, <strong>manifoldx</strong> has some basic shapes like cubes, spheres, and planes, and support for basic PBR lighting, camera controls, and that&#8217;s basically it.</p><p>All the engine really does is set up this somewhat clever inversion of logic that forces you to write very efficient code, and the magic is in what you do inside the systems.</p><p>So let me show you three examples.</p><h3>1. N-Body Gravitational Simulation</h3><p>The first is an N-body simulation. All gravity computation happens in a single NumPy block with no Python loops. The only relevant part of the code is the gravity system, which looks something like this.</p><pre><code><code>@engine.system
def nbody_gravity(query, dt):
    pos = query[Transform].pos.data  # (N, 3)
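    # (G, SOFTENING, masses, mass_prod, and velocities are assumed to be defined
    #  elsewhere in the full example; only the gravity system is shown here)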

    # All-pairs position differences: (N, N, 3)
    diff = pos[None, :] - pos[:, None]
    dist = np.linalg.norm(diff, axis=2)

    # Force magnitude: G * m_i * m_j / r&#178;
    force_mag = G * mass_prod / np.maximum(dist, SOFTENING)**2

    # Net force = sum over all other bodies
    safe_dist = np.maximum(dist, SOFTENING)  # also avoids dividing by zero on the diagonal
    net_force = (force_mag[:, :, None] * diff / safe_dist[:, :, None]).sum(axis=1)

    velocities += (net_force / masses[:, None]) * dt
    query[Transform].pos += velocities * dt</code></code></pre><p>This runs 500 bodies with 250,000 force pair computations at 60fps.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;3587e1ae-4af3-49a9-b39e-a9d495d24bc1&quot;,&quot;duration&quot;:null}"></div><h3>2. Ideal Gas Simulation</h3><p>The second example is an ideal gas with elastic collisions inside a bounding box. Again, all running without a single for loop. Collision detection and impact resolution in vectorized NumPy operations.</p><pre><code><code>@engine.system
def gas_physics(query, dt):
    pos = query[Transform].pos.data
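    # (velocities, BOX_HALF, RESTITUTION, and PARTICLE_RADIUS are assumed to be
    #  defined elsewhere in the full example)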

    # Wall collisions: vectorized mask
    below = (pos + velocities * dt) &lt; -BOX_HALF
    above = (pos + velocities * dt) &gt; BOX_HALF

    # Here we avoid branching and use masking instead
    velocities[below] = np.abs(velocities[below]) * RESTITUTION
    velocities[above] = -np.abs(velocities[above]) * RESTITUTION

    # Particle collisions: find overlapping pairs
    diff = pos[None, :] - pos[:, None]
    dist = np.linalg.norm(diff, axis=2)
    overlap = dist &lt; 2 * PARTICLE_RADIUS
    i_idx, j_idx = np.where(np.triu(overlap, k=1))  # k=1 skips self-pairs on the diagonal

    # Resolve collisions with impulse
    # ... (collision resolution code)
    # ... (also vectorized)

    query[Transform].pos += velocities * dt</code></code></pre><p>Here&#8217;s what that looks like.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;86436879-ae58-4d3d-9616-fac2e5dfcf7c&quot;,&quot;duration&quot;:null}"></div><h3>3. Boids Flocking</h3><p>The third example is a Boids simulation with emergent flocking behavior. This is the one that strikes me the most because boids simulation is often compute-heavy. Each individual entity must keep track of a subset of neighbors and adjust behavior based on them, not the whole set of entities. But again, a bit of NumPy magic lets us vectorize the crap out of this and simulate 300 boids at 60 frames per second.</p><pre><code><code>@engine.system
def boids_physics(query, dt):
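    # (pos, vel, PERCEPTION_SQ, inv_dsq, and safe_count are assumed to come from
    #  the query and from values precomputed elsewhere in the full example)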
    # Separation, alignment, cohesion as vectorized tensor ops
    diff = pos[None, :] - pos[:, None]  # (N, N, 3)
    dist_sq = (diff * diff).sum(axis=2)

    neighbors = dist_sq &lt; PERCEPTION_SQ

    # Separation (1/dist&#178; weighted)
    sep = (-diff * (neighbors[:,:,None] * inv_dsq[:,:,None])).sum(axis=1)

    # Alignment (average neighbor velocity)
    avg_vel = (vel[None,:] * neighbors[:,:,None]).sum(axis=1) / safe_count

    # Cohesion (steer toward center of mass)
    center = (pos[None,:] * neighbors[:,:,None]).sum(axis=1) / safe_count

    # Plus predator avoidance and boundary steering...
    # That one is easy.</code></code></pre><p>Here goes a sneak peek.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;704ce0e0-34f9-46d4-a8d5-438d5074f70a&quot;,&quot;duration&quot;:null}"></div><div><hr></div><p>You can check all the examples in the <a href="https://github.com/apiad/manifoldx">GitHub</a> repository to see the full code, but the bulk of the implementation is these cleverly vectorized system methods.</p><h2>Future Directions</h2><p>And that&#8217;s it. This is my pure Python (well, you know what I mean) graphics engine for serious, grown-up stuff that is surely, definitely, not a weekend side-project meant to procrastinate on actual work... I mean, what?!</p><p>Where will I go with this? I don&#8217;t know. I always write these things mostly as a learning exercise and I&#8217;ve learned a lot about graphics in Python. I&#8217;ve updated my view of modern graphics and I think I&#8217;ve paid my debt of the last seven years in graphics computation. I&#8217;m kind of happy now that I know how to do this in 2026. Mission accomplished, I guess.</p><p>There are some places this engine can go, like custom shaders for when you need stuff like lighting effects. But it is not going to become a traditional, full-blown game engine. I will not add support for lots of game engine-like features including, I don&#8217;t know, skeletal animations, level of detail, scene management, or, god forbid, visual scripting and nonsense like that.</p><p>Now, two areas I&#8217;d like to explore in the future. One is extending the engine towards the kind of behavior you need to write for AI simulations. If you want to run some sort of agent simulation or ant colony optimization or stuff like that, that code doesn&#8217;t look so much like a frame-by-frame update as like an asynchronous event loop&#8212;which is also something that is not usual in game engines. And the other direction is towards procedural generation of meshes and content in general, which is an area I left five or six years ago and would pretty much love to come back to.</p><p>And that&#8217;s it for this week. This is not production-ready at all&#8212;it&#8217;s mostly a toy at the moment&#8212;but you can take it apart and hack your way into some cool physical or mathematical simulation.
The code is on <a href="https://github.com/apiad/manifoldx">GitHub</a> if you want to try it yourself, and I&#8217;d love to see what you build with it.</p><p>Until next week, stay curious.</p>]]></content:encoded></item><item><title><![CDATA[AI Coding Agents, Deconstructed]]></title><description><![CDATA[The four hidden layers that separate tools that help form tools that hinder]]></description><link>https://blog.apiad.net/p/the-anatomy-of-ai-coding-agents</link><guid isPermaLink="false">https://blog.apiad.net/p/the-anatomy-of-ai-coding-agents</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Thu, 02 Apr 2026 13:40:59 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="4925" height="3238" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3238,&quot;width&quot;:4925,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A relaxed monkey enjoys a sunny day.&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A relaxed monkey enjoys a sunny day." 
title="A relaxed monkey enjoys a sunny day." srcset="https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1751494203533-a837d1b536b6?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw1MXx8bW9ua2V5JTIwdG95fGVufDB8fHx8MTc3NTEzNTg4Nnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>I&#8217;m telling you, this is the future. AI agents will do aaaallll the work. </em>Photo by <a href="https://unsplash.com/@farzadfelfelian">Farzad Felfelian</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>You&#8217;ve been using AI coding agents for months. You&#8217;ve crafted elaborate system prompts. You&#8217;ve added a dozen skills. You&#8217;ve learned the dance of context window management. And somewhere around the third hour of work, something breaks. The agent starts forgetting things. Making wrong assumptions. Doing something close&#8212;but not quite&#8212;what you asked.</p><p>This isn&#8217;t a failure of the model. This is a failure of the system.</p><p>To be sure, better models make things easier. And models are getting better by the day. But no matter how good a model is, bad systems lead to bad outputs. 
Even the smartest people produce junk when fed with incorrect assumptions or given incomplete instructions.</p><p>In contrast, a good system with clear boundaries and explicit rules, one that leaves just the right amount of flexibility, makes creativity and productivity thrive.</p><p>You see this time and again in teams (of real humans) in every industry. It&#8217;s not often the smartest person in the room who solves the hard problem. It&#8217;s when you combine the right kinds of intelligence with the right kind of system that things click.</p><p>In this article, I want to make the case for a structured way to think about Large Language Model (LLM)-based agentic systems (mostly for coding, but also for knowledge work in general) that fixes some of the greatest pains I (and I&#8217;m sure most of you) have been facing when trying to scale AI-assisted workflows to professional levels.</p><p>It&#8217;s a system that puts the right constraints in the right places and leaves just enough space for creative exploration (or whatever you want to call what LLMs do when they hallucinate in your favor). It&#8217;s also a system that makes it clear you are in charge.</p><p>Everything an AI agent does happens inside a context window. System prompt, user input, tool results, skill injections&#8212;they all live there. The agent&#8217;s only mechanism for action is the ReAct (Reasoning + Acting) loop: think, call tools, observe results, repeat. Each cycle grows the context. Each skill activation injects more.</p><p>This creates a fundamental tension: context is power, but context is finite. Too little and the agent can&#8217;t connect the dots. Too much and the important stuff drowns. The gap between those two failure modes is narrow&#8212;and most agent frameworks ignore it entirely.</p><p>I&#8217;ll walk through why current systems fail, introduce a four-element framework for thinking about agentic architectures, show you how these principles apply across three domains, then present a vision for better AI harness engineering.</p><h2><strong>Part I - The Symptoms</strong></h2><p>To understand the problems, we first need to understand how a standard agentic loop works. The typical architecture is what&#8217;s called a ReAct loop. The LLM runs in a loop that determines the next action given context, which can be reading some files, asking the user, invoking a tool, injecting a skill, etc. When the agent decides no more actions are necessary, the loop ends and the user is given control back to continue the conversation.</p><p>That&#8217;s it. All the seemingly supersmart behaviors of Claude Code, Gemini CLI, and Codex are, under the hood, some form of the basic ReAct loop. There are of course nuances. For example, most systems decide that if the agent calls the same tool with the same args three times, it must be stuck in a loop, and they stop the turn. There are perhaps hard limits on how many tool calls the agent can do in each turn.</p><p>Context is the bottleneck. Not the model. Not the prompt. Context.</p><p>The agent doesn&#8217;t have memory. It doesn&#8217;t have state. It has context. Everything it knows about your project, your preferences, your conventions, all of it lives in the context window. When you add a skill, you&#8217;re injecting more context. When you run a tool, the result goes into context. When you switch modes, you&#8217;re switching which system prompt is active, all still in context.</p><p>This means context engineering <em>is</em> AI agent engineering.
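</p><p>You can see why in a bare-bones sketch of the loop itself. Everything the agent thinks, does, and observes flows through one growing context; the <code>llm</code> and <code>run_tool</code> calls below are placeholders for illustration, not any particular framework&#8217;s API:</p><pre><code>def react_turn(llm, run_tool, context, max_steps=25):
    """One agent turn: think, act, observe, repeat, until the model decides to stop."""
    for _ in range(max_steps):
        step = llm(context)                    # model proposes the next thought/action
        context.append(step)                   # everything the model says stays in context
        if step.get("action") is None:         # no action proposed: the turn is over
            break
        observation = run_tool(step["action"], step.get("args", {}))
        context.append({"observation": observation})  # tool output also lands in context
    return context                             # control goes back to the user
</code></pre><p>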
The agent&#8217;s behavior isn&#8217;t determined by the model alone, or even primarily, but by what context you give it, and how you structure that context over time.</p><p>Most tools treat context as a solved problem. They stuff everything in and hope the model figures it out. In-context learning seems almost magical, but it has limits&#8212;and those limits become visible fast.</p><p>When context is thin, the agent simply doesn&#8217;t know enough about your project to make informed decisions. It relies on baked-in assumptions from training and falls back to consensus instead of following your style: it uses the common tools and practices it learned from pretraining. This often means it uses slightly outdated tools and practices.</p><p>So you do the sensible thing and inject project-specific information into the context. But then if context grows too large, even if it doesn&#8217;t technically exceed the model&#8217;s capacity, things start to get lost in the middle. Moreover, failed tool calls, wrong assumptions the model had to correct, etc., start creeping into the context, not only taking up valuable space but also, and more importantly, <em>distracting</em> the model and biasing it towards mediocre decisions.</p><p>Then there is context compaction: when the context fills up to about 85%, most systems will invoke a special prompt to instruct the agent to summarize the current state. These prompts vary in detail, but often involve asking the agent what it is currently doing, where it is stuck, what has failed, etc. Clever, but a hack nonetheless. This hard context reset means the agent will forget important nuances in the current conversation and will repeat past mistakes. It&#8217;s frustrating.</p><p>Let&#8217;s look at how these problems surface in specific symptoms that <em>all</em> LLM-based agents display at some point.</p><h2><strong>Symptom One: Unstated Assumptions</strong></h2><p>The first failure mode isn&#8217;t dramatic. It&#8217;s quiet. You ask the agent to write a test, and it writes a <code>unittest.TestCase</code> instead of a <code>pytest</code> function. You ask it to add a dependency, and it edits <code>requirements.txt</code> instead of running <code>uv add</code>. You ask it to deploy, and it pushes directly to main.</p><p>These aren&#8217;t model failures. They&#8217;re assumption mismatches. The agent doesn&#8217;t know how <em>your</em> team does things. There&#8217;s no guardrail for &#8220;in this project, we always use pytest, we always use uv, we never commit directly to main.&#8221; The agent improvises from general knowledge, and general knowledge is often wrong.</p><p>Skills are supposed to fix this. Add a skill document that says &#8220;use pytest&#8221; and the agent should know. But skills introduce a new problem.</p><p>You add a skill for code review. Then one for documentation. Then one for PR descriptions. Then three more for your company&#8217;s specific stack. Each skill seems small. A few hundred tokens each. But they pile up&#8212;always-on knowledge the agent carries but can&#8217;t prioritize.</p><p>The result is context bloat. The agent can&#8217;t tell what&#8217;s relevant in any given moment. So it blends everything together, and hallucinations increase. More skills made it worse&#8212;not better.</p><h2><strong>Symptom Two: Permission Leakage</strong></h2><p>Every agent framework implements the same plan-then-build pattern. The idea is sound: think first, plan second, execute third.
In practice, the boundaries leak.</p><p>Plan mode is supposed to be read-only. Design the change, review the approach, lock in the scope. Build mode is supposed to execute. Write the code, run the tests, commit the result.</p><p>But &#8220;plan mode&#8221; in most tools is just a prompt. There&#8217;s no enforcement. The agent can write code in plan mode if it wants to. It can ignore the plan in build mode. It can skip straight to implementation if the prompt implies urgency. The modes are suggestions, not constraints.</p><p>This matters because a plan only works if it&#8217;s actually followed. If the agent can deviate mid-execution&#8212;if &#8220;plan mode&#8221; and &#8220;build mode&#8221; are just prompts with different names&#8212;the plan becomes advisory. And advisory plans get ignored.</p><p>The second problem is structural: there&#8217;s no artifact that passes from plan to build. The plan lives in the context. By the time build mode starts, the plan is mixed in with everything else the agent said. Which file was the plan? Which changes were approved? The agent has to re-read the conversation to remember. Context saturation accelerates.</p><h2><strong>Symptom Three: Context Saturation</strong></h2><p>After extended work, you see the same pattern: the agent makes 95% of the progress, then fails on the last 5%. It nails the architecture. The logic is sound. The core implementation works. Then it stumbles on a detail&#8212;because context has saturated. It forgot which environment it was in, which conventions still apply, which constraints matter.</p><p>But the deeper problem is internal noise. The agent keeps everything in context: all internal reasoning, all tool calls, all results. This is fine for minute-to-minute action. But after four failed attempts to solve something, the old tool calls are just noise. These are attempts that went nowhere; they just add cost and accelerate saturation.</p><p>The supposed solution for this is context compaction. But this creates a lossy summary problem. The agent is supposed to leave a trail for its future self. After context compaction, it should be able to pick up where it left off. But if agents struggle with long contexts, how are they supposed to build a good trail? The compaction report is only as good as the agent&#8217;s ability to summarize. And summarization is lossy and injects back lots of unstated assumptions from pretraining.</p><p>The frustrating part: this wasn&#8217;t a hard problem. The agent had all the knowledge it needed. But context filled with noise, and the important bits got pushed out. More tokens in, less signal out.</p><p>The solution isn&#8217;t just better prompts or larger context windows. Yes, these help. But the symptoms are systemic, so the solution must be a system overhaul.</p><p>Let me show you what that system looks like.</p><h2><strong>Part II - The System</strong></h2><p>Now that we understand the problem, let&#8217;s look at how every agent system actually works. Every AI agent system addresses four concerns. When you conflate them, the system breaks. When you separate them, the system scales.</p><p>This taxonomy isn&#8217;t original to me. It&#8217;s a synthesis of how modern AI agentic systems work under the hood. Most explicitly, it&#8217;s implemented in the OpenCode CLI (opencode.ai), but all other tools follow a similar pattern, even if they use different names.</p><p>Here&#8217;s the breakdown.
Every agent system you&#8217;ll encounter (explicitly or implicitly) is managing these four things:</p><p><strong>Mode &#8212; the who.</strong> A mode is the persona the AI adopts. It defines the thinking style, the permissions, the available tools. When you interact with a &#8220;code assistant,&#8221; you&#8217;re in a coding mode. When you switch to &#8220;creative writer,&#8221; you&#8217;re in a creative mode.</p><p>Modes are <em>explicit</em>. They&#8217;re top-level system prompts that define behavior and permissions. You tell the agent: &#8220;This is how you should think and behave. These are the tools you can use. These are the parts of the filesystem you can write to.&#8221;</p><p><strong>Skill &#8212; the knowledge.</strong> A skill is knowledge the agent can recall when necessary. It doesn&#8217;t get invoked explicitly, it gets applied <em>implicitly</em> when necessary. When you give an agent knowledge about SQL optimization, that skill is available whenever relevant. The agent doesn&#8217;t need to be told to use it. The ReAct cycle injects it when it deems suitable.</p><p>Unlike modes, skills can layer. An agent might have a SQL skill, a documentation skill, and a debugging skill, all active simultaneously, all contributing when relevant. Skills are implicit because the agent should just apply them naturally. They can also contradict or complement each other. In-context learning <em>should</em> be capable of using them in a combined manner.</p><p><strong>Command &#8212; the workflow.</strong> A command is a script. It tells the agent: do this, in this order, using these tools. &#8220;Refactor this function&#8221; is a command. &#8220;Run these tests and report results&#8221; is a command.</p><p>Commands are <em>explicit</em>: you invoke them. Under the hood, commands are just prompts. The difference is who injects them: the user. When you run <code>/build</code>, you&#8217;re injecting a workflow prompt into the agent&#8217;s context. That&#8217;s it. The command tells the agent: do this sequence of things. The complexity lives in the orchestration of the ReAct cycle, not the command itself.</p><p>Commands are intentionally simple. They don&#8217;t contain knowledge. That&#8217;s intentional separation of concerns. The command itself shouldn&#8217;t know <em>how</em> to build; it knows <em>when</em> to spawn subagents and which mode to use. This keeps commands thin and changeable without rewriting underlying knowledge.</p><p><strong>Subagent &#8212; the delegation.</strong> A subagent is a spawned agent for background or parallel tasks. It handles isolated work, returns summarized results, then disappears. It is instantiated with a system prompt and specific instructions (synthesized by the primary agent that called it), and runs for one full ReAct turn.</p><p>Subagents are ephemeral. Their internal reasoning stays private. The main agent only sees the synthesis. You spawn a subagent when you need parallel processing, isolation, or both. They are the way to <em>fork</em>, solve a specific subtask, and return a result, but keep context clean. Kind of like subroutines.</p><h3><strong>Why This Separation Matters</strong></h3><p>Understanding this distinction unlocks everything else. Once you see skills as implicit knowledge and commands as explicit scripts, the rest of the architecture clicks naturally. Most agent setups conflate these. They embed knowledge in commands. They make skills behave like workflows. They mix persona into everything else. 
And they massively underuse subagents.</p><p>When you separate these concerns&#8211;modes for persona, skills for knowledge, commands for orchestration, subagents for delegation&#8211;you get something that looks like good systems engineering. You can swap skills without touching commands. You can change modes without rewriting workflows. You can spawn subagents without the main agent knowing or caring how they work internally. The result is a system that works and adapts and <em>scales</em> like good software should.</p><p>The system scales because the pieces are independent. Change one without breaking the others. Each component has a single job, and the boundaries between them are meaningful. When context shifts, when requirements evolve, when a new skill needs adding, the system adapts incrementally rather than collapsing under the weight of accumulated complexity.</p><h2><strong>Part III: The Practice</strong></h2><p>If so far this seems like abstract theory to you, in this section we will ground these concepts in actual practice. Let me show you how I&#8217;m using these ideas today to improve my AI-assisted coding practice. I&#8217;m using opencode.ai but I believe the following is easily adaptable to any agentic toolkit out there.</p><h3><strong>My Three Modes</strong></h3><p>Every agentic system needs boundaries: not social contracts, but enforced constraints. In my setup, those constraints come from three modes: analyze, design, and create.</p><p>Each of these modes defines a thinking style&#8212;a persona&#8212;and a set of constraints for tool use and filesystem access.</p><p><strong>Analyze mode</strong> is research and investigation. This mode reads your work and writes summaries to a knowledge base. It cannot touch production files. Not &#8220;should not&#8221; but <em>cannot</em>. The permissions are built into the mode itself, not enforced through prompts or warnings. The agent is incapable of writing outside of a <code>.playground</code> folder, and is incapable of doing anything that can harm the project or the system (more on how a bit later), but it is still capable of running arbitrary code, downloading anything from the internet, and playing around as it needs.</p><p><strong>Design mode</strong> is architecture and planning. This mode bridges analysis and implementation. It can read your project and write design documents, architecture diagrams, and implementation plans, but still cannot touch production code. It cannot run shell scripts either, at all. It can look at git status and logs, read folder contents, etc., but it can only write to a space where plans and design documents go.</p><p><strong>Create mode</strong> is execution. Full read-write access. This is where production work happens. The agent can write code, create files, and modify the project directly. Again, it cannot do anything outside the project scope, though. It won&#8217;t accidentally change <code>/etc/host(s)<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></code> even if it tries to.</p><p>The key insight: <strong>modes define permissions, not just persona</strong>. You can&#8217;t accidentally prompt your way into code generation during research. The agent literally lacks the capability.
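</p><p>For illustration only, here is roughly how I picture those three modes as data. The field names and paths are made up for this sketch; they are not OpenCode&#8217;s actual configuration schema:</p><pre><code>MODES = {
    "analyze": {
        "tools": ["read", "grep", "shell"],    # can run code and poke around...
        "write_paths": [".playground/"],       # ...but can only write inside a sandbox
    },
    "design": {
        "tools": ["read", "git_log", "git_status"],  # no shell at all
        "write_paths": ["docs/plans/"],              # plans and design documents only
    },
    "create": {
        "tools": ["read", "write", "shell", "git"],
        "write_paths": ["./"],                       # full read-write inside the project
    },
}

def can_write(mode, path):
    # The check lives in the harness, where the model cannot talk its way around it.
    return any(path.startswith(prefix) for prefix in MODES[mode]["write_paths"])
</code></pre><p>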
The agent doesn&#8217;t need to &#8220;understand&#8221; these constraints; it simply operates within them.</p><p>Mode is the who, and it determines what the agent <em>can</em> do, not just how it thinks.</p><p>Let me show you how they work in three different domains that make up the bread and butter of my daily job: software development, scientific research, and technical writing.</p><p>I chose these domains because they illustrate the simplicity and scalability of the system. Software development shows the framework under constraints: deadlines, production code, real stakes. Research shows it under complexity: synthesis, evaluation, structured output. Technical writing shows it under nuance: voice, audience, iterative refinement. Three different pressures, one consistent architecture that works in all three cases.</p><p>In each of these domains we have two layers to go through: first is the set of <strong>implicit skills</strong> that are available to the agents, and second is the set of <strong>explicit commands</strong> (each tied to a specific mode) that set up concrete workflows. I will show you one example workflow that cuts across the three modes in each case. I will also tell you exactly where delegation occurs.</p><h3><strong>Domain A: Software Development</strong></h3><p>Software development is where agentic systems face the harshest constraints. Production code has stakes. Deadlines are real. Mistakes cost money. Let&#8217;s see how the framework applies.</p><h4><strong>Implicit Skills</strong></h4><p>A software development agent carries knowledge it never needs to be told to use. It knows language idioms and patterns like the idiomatic way to write a list comprehension in Python, or the conventions for error handling in Go. It knows testing conventions: where tests live in the directory structure, how they&#8217;re named, what assertions to prefer. It knows architecture conventions: layered structure, dependency injection patterns, how error states propagate. It knows code review standards: what to flag, what to praise, when to ask for clarification.</p><h4><strong>Example Workflow: Bug Hunting</strong></h4><p>I use this workflow for finding and fixing bugs. It starts with investigation. The agent spawns dozens of subagents to try and break the system (either guided towards a purpose, or completely unbiased). Then you build a comprehensive plan to solve it. And then you execute that plan. Simple, right?</p><p><strong>Phase 1: /trace (analyze mode)</strong> runs systematic experiments to detect and narrow down a bug&#8217;s cause. The agent examines stack traces, compares behavior across commits, and pinpoints the exact files and functions that need attention. This mode is read-only by design, except for a <code>.playground</code> folder. Research happens here, not in the code itself.</p><p>Each experiment is run on a subagent that has the job of verifying one assumption. The main agent receives only experiment results, and constructs an executive report of findings. This means you can run dozens of different experiments autonomously to detect what breaks what.</p><p><strong>Phase 2: /plan (design mode)</strong> takes the diagnosis and defines the changes needed, along with their architectural impact. The agent reviews the affected modules, considers alternative approaches, and documents the implementation plan before touching anything.
This is where the scope gets locked in.</p><p>The result of this phase is a structured plan with step by step details on what files must be touched and what must be done in there (semantically, not code). For every phase, it defines success criteria: what must be validated before we can say we got that phase right.</p><p><strong>Phase 3: /build (create mode)</strong> executes the plan step by step. The agent writes tests first (following Test-Driven Development (TDD) discipline) for the success criteria defined for that phase and watches them fail. Then it launches a coding subagent that has <em>read-only</em> access to tests, so it cannot cheat and change the tests.</p><p>The subagent attempts to implement changes that make the test pass. If it succeeds, the main agent commits and moves on. If it doesn&#8217;t, the main agent retries a few times. If there is no progress, the main agent resets the work tree (no harm done), and reports on failure. This usually means the plan needs revisions.</p><h3><strong>Domain B: Research</strong></h3><p>Research is where agentic systems face the greatest complexity. Sources multiply, methodologies diverge, synthesis requires judgment. Let&#8217;s see how the framework applies.</p><h4><strong>Implicit Skills</strong></h4><p>A research agent knows the conventions of academic writing without being reminded. It knows citation formats like APA, MLA, Chicago, and IEEE, and when to use each. It knows how to evaluate papers: methodology soundness, sample size adequacy, replicability claims, conflict of interest disclosures. It knows the structure of literature reviews: how to organize by theme, methodology, or chronological development. It knows domain-specific terminology, distinguishing between &#8220;accuracy&#8221; and &#8220;precision&#8221; in machine learning, or between &#8220;confounding&#8221; and &#8220;colliding&#8221; in causal inference.</p><h4><strong>Example Workflow: State-of-the-Art Report</strong></h4><p><strong>Phase 1: /research (analyze mode)</strong> spawns subagents to gather sources in parallel. Each subagent reads a batch of papers, synthesizes findings, and returns summaries. The main agent synthesizes those summaries into structured notes. This phase can be run multiple times to collect batches of sources without overwhelming context. At the end, you get hundreds of sources summarized into clean research notes.</p><p><strong>Phase 2: /outline (design mode)</strong> identifies patterns across the collected literature. The agent groups papers by methodology, extracts recurring findings, and maps the landscape of the field. It generates outline options for the final document, based on typical structures like problem-solution or paradigm-methods, highlighting gaps where the research is thin and consensus areas where findings align.</p><p><strong>Phase 3: /draft (create mode)</strong> builds the document section by section, following the outline. Each section draws on the structured notes, weaving together sources into coherent narrative.</p><p>The agent launches subagents for writing each subsection because typically, agents write more or less the same length in a single <code>write</code> command, so if you ask it to fill in a large outline all at once you&#8217;ll only get a mediocre extended outline. 
By launching independent writers for specific sections of the outline, you get all the attention of a single turn to read source material and write a good 4 or 5 paragraphs for a concrete section.</p><p>A cool idea I&#8217;ve been meaning to try is to have the main agent spawn several subagents to write the same section, with a high temperature, and then perform some sort of aggregation or evaluation before building the final draft for every section. This burns through 3x the tokens, but ensembles have been shown over and over to improve the outputs of AI models. If you try it, let me know.</p><h3><strong>Domain C: Technical Writing</strong></h3><p>Technical writing is where agentic systems face the most nuance. Voice matters. Audience varies. Iterative refinement is the norm. Let&#8217;s see how the framework applies.</p><h4><strong>Implicit Skills</strong></h4><p>A technical writing agent carries knowledge of prose style without being coached. It knows voice and tense conventions&#8212;active voice for clarity, past tense for completed processes, second person for direct instruction. It knows structural patterns: how documentation differs from blog posts, how reports differ from tutorials, how reference material differs from guides. It knows audience awareness: what to explain for newcomers, what to omit for experts, when to elaborate and when to abbreviate. It knows cross-referencing and linking norms: when to link, when to inline, how to name anchors for scannability.</p><h4><strong>Example Workflow: Paper Review</strong></h4><p><strong>Phase 1: /review (analyze mode)</strong> performs a detailed review in a specific order: structural issues first, then content, then style. The agent examines the narrative arc (how main points connect, whether the flow makes sense) before worrying about grammar or word choice. This ordering matters; reviewing low-level details when high-level problems exist wastes effort.</p><p>Each iteration is performed by spawning several subagents that focus on specific types of problems, like transitions, unverifiable claims, etc. Each subagent returns a structured list of issues, pointing back to exact line numbers and phrasing. Then, the main agent <em>edits</em> the original paper and injects markdown comments for every marked issue, next to the paragraph, or under the header where it best fits.</p><p><strong>Phase 2: /revise (design mode)</strong> plans changes to specific sections, prioritizing by review type. The agent maps structural fixes to particular paragraphs, content additions to thin sections, style improvements to verbose passages. It produces a concrete plan, section by section, change by change. Then it goes into the manuscript and writes markdown comments as replies to the existing review comments, thus grounding the revision plan in the exact context it must fit.</p><p><strong>Phase 3: /rewrite (create mode)</strong> follows the plan. The agent revises sections in priority order, applying structural changes first, then content, then style. Again, each step is performed by spawning a subagent tasked with a single change (for style changes we actually do it section by section).</p><p>The subagent doesn&#8217;t edit; it produces a draft revision that the main agent is then tasked to paste into the document where it fits. Crucially, the main agent is instructed to <em>leave</em> the editorial comments but mark them as solved, with a short trail of what was changed.
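</p><p>Inside the manuscript, that trail ends up looking something like this (the exact tags are whatever your prompts define; these are illustrative):</p><pre><code>&lt;!-- review (structure): this paragraph repeats the argument of the previous section --&gt;
&lt;!-- revise: merge with the previous section, keep only the new example --&gt;
&lt;!-- resolved: merged; duplicate sentences removed, example kept and trimmed --&gt;
</code></pre><p>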
This works wonders for a later human review phase.</p><h2><strong>Part IV: A Look into the Future</strong></h2><p>These workflows work, but with some caveats. There&#8217;s a gap between &#8220;working&#8221; and &#8220;working well.&#8221; Three key pains remain in my implementation.</p><ol><li><p>Long commands are hard to follow when given as a single prompt. The fourth step gets forgotten because its instructions are buried at the beginning of the context.</p></li><li><p>Permissions as currently implemented are all-or-nothing. You either have shell access (destructive) or you don&#8217;t. I want broad permissions (run whatever you want) with provable security (nothing you run can change this file).</p></li><li><p>Context saturation still happens even with delegation. After a while, the agent will have to compact context, and this usually means you lose important information.</p></li></ol><p>I have three ideas for closing this gap. The first is about how commands work. The second is about security. The third is about context management. They are at different levels of implementation, so let me show you what I&#8217;m building toward.</p><h3><strong>Idea One: Better Commands</strong></h3><p>Commands in most tools (Claude Code, Gemini CLI, Codex, Copilot) are one-shot interactions: you invoke the command, and a single massive prompt is injected. The agent runs until it decides to stop.</p><p>To make commands truly useful, we need them to behave more like scripts. Here&#8217;s what that means:</p><ol><li><p>Commands that inject prompt instructions one step at a time, waiting for the agent to do a full turn each time. Instead of dumping a large prompt to run all steps at once, a command like <code>/review</code> could insert surgical mini prompts that say &#8220;read the file&#8221;, wait for the agent, &#8220;analyze structure&#8221;, wait for the agent, and so on, until &#8220;write the report&#8221;. This massively reduces the problem of lost-in-the-middle context saturation. Each turn, the agent is focused on one specific step, and you get N times the compute power to solve an N-step workflow.</p></li><li><p>Commands that extract structured information from the agent&#8217;s response and can later inject those variables back into the prompt. This lets you reinject important information into later prompts, keeping it as a contextual variable rather than a string lost in the middle of the context. But it also enables something else.</p></li><li><p>Conditional branching based on context or user input. Once we have structured parsing and contextual variables, we can inject different prompts based on whether the agent succeeded or failed. If the plan reveals a breaking change, route to architectural review. If it&#8217;s a bug fix, route directly to implementation. The command adapts its path based on what it discovers.</p></li><li><p>Finally, commands that embed and execute external scripts. Instead of asking the agent to run some script, the command can run arbitrary Python, JS, Bash, or whatever, to, for example, transform structured information. The command becomes an orchestrator of other processes.</p></li></ol><p>Basically, what I&#8217;m asking for here is a Domain-Specific Language (DSL) for guiding agents in a far more structured manner, while still retaining the power of arbitrary prompts for flexibility. Mixing code and prompts in this way gives us the tools to find the precise balance between constraints and capabilities.</p><p>If this sounds exciting, I&#8217;m happy to tell you this is already doable, to some extent.
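</p><p>To make the idea concrete, here is a toy sketch of a step-wise <code>/review</code>-style command, with the agent turn abstracted behind a callable. It is only an illustration of the four ingredients above, not the actual syntax of my implementation or of any tool:</p><pre><code class="language-python">from typing import Callable

def run_review_command(agent_turn: Callable[[str], str]) -&gt; str:
    """Toy step-wise command: one surgical prompt per agent turn."""
    notes = []

    # 1. Inject one small instruction and wait for a full turn.
    files = agent_turn("List the files under review, one path per line.")

    # 2. The extracted value lives in a variable, not buried in the context.
    for path in files.splitlines():
        issues = agent_turn(f"Analyze the structure of {path} and list the main issues.")
        notes.append(f"### {path}\n{issues}")

        # 3. Conditional branching on what the agent actually found.
        if "breaking change" in issues.lower():
            notes.append(agent_turn(f"Propose an architectural review plan for {path}."))

    # 4. Plain code (an external script, not the model) assembles the final report.
    return "\n\n".join(notes)
</code></pre><p>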
Check out my <a href="https://apiad.github.io/opencode-literate-commands">literate-commands</a> project for an OpenCode-specific implementation of these ideas. It&#8217;s still a bit rough around the edges, but it works much better than plain, single-prompt commands.</p><h3><strong>Idea Two: Sandboxed Security</strong></h3><p>Most agentic tools have very coarse permission settings. You can allow, deny, or set a specific tool to &#8220;ask&#8221; mode, which means the agent will pause and emit a notification for the user to give permission.</p><p>This works fine for coarse-grained permissions like read-only access, or write but no shell. In OpenCode, you can even define permissions for specific paths, or even specific shell commands (with simple glob patterns, so you can, e.g., allow <code>ls *</code> but reject all other shell commands).</p><p>However, even in this case, I find these permissions too restrictive. They conflate two different dimensions into one: what tools the agent can use, and what side effects those tools can have.</p><p>For example, say I want to give my agent <code>git</code> access but only for reading operations. How do you achieve that? You need to list all safe patterns like <code>git ls-tree *</code>, <code>git status</code>, <code>git log *</code>. But what about <code>git branch</code>? Depending on the arguments, this subcommand can have read-only or write side effects. And then think about pipes, shell substitution, custom bash scripts, or worse, <code>python *</code>.</p><p>If you want your agent to be capable, you need to give it access to a wide variety of tools. For example, my bug-hunting workflow depends on the agent being able to execute arbitrary code that it synthesizes on the fly. However, I want guardrails. There is simply no way to whitelist all possible commands. We need to separate the permission to run a command from the permission to modify the system.</p><p>The solution, of course, is some form of filesystem isolation. The most obvious one is wrapping all shell execution in Docker, so commands run in a container with proper constraints. This creates all sorts of other problems, which I can discuss in a future post, but for now, it remains my best (and simplest) solution to robust sandboxing.</p><p>This isn&#8217;t just about safety, though. When you know the agent can&#8217;t accidentally wipe your home directory or exfiltrate your API keys, you can let it do more. Security enables capability. You can let the agent download arbitrary code from the internet, run arbitrary scripts, break things and observe changes. Everything happens inside a Docker container with precise constraints that enable maximum capability with absolute security.</p><p>As of now, I have a rough implementation of this as a plugin for OpenCode, but it&#8217;s still in beta and not ready for widespread use. More on this idea in a future article.</p><h3><strong>Idea Three: Context-Aware Execution</strong></h3><p>And finally, we need to rethink the overly simplistic ReAct loop that simply grows the context linearly. The agentic cycle doesn&#8217;t have to be a straight line. Real work branches: you explore options, try things, backtrack when they fail. The context should reflect that.</p><p>I&#8217;ve been designing a system where the context never saturates. It branches when you&#8217;re exploring, spawning parallel contexts for different approaches. It prunes old tool calls that went nowhere. It removes internal reasoning that no longer matters.
It maintains a &#8220;trail&#8221; that actually works: a structured record of decisions, not a lossy summary.</p><p>The goal is simple: keep context between 40% and 60% saturation at all times. Not by compacting a 150K tokens context down to 10K&#8212;which kills all understanding the agent had achieved&#8212;but by never letting it grow unchecked.</p><p>Nothing like this exists yet, so I&#8217;m building it, but it&#8217;s a story for another day.</p><h2><strong>Conclusion</strong></h2><p>The main takeaway from this article is not that <em>my</em> system is better. It&#8217;s that <em>you</em> can design your own system to adapt perfectly to your workflows if you clearly separate concerns. The main modes are for establishing an overall persona&#8211;inquisitive and critical, versus detailed and forward-looking, versus focused and action-biased&#8211;while skills incorporate domain knowledge, and commands act as precise workflows.</p><p>The workflows I described are real, based on actual commands and prompts I&#8217;m using in production code. But I have abstracted them a bit to make them easier to understand in the context of an arbitrary agent, not tied to specific idiosyncrasies of the tool I happen to be using at the moment. If you want to see and try for yourself a concrete implementation of these ideas&#8212;still imperfect, but working nonetheless&#8212;check out my <a href="https://apiad.github.io/opencode">opencode toolkit</a> repository. It&#8217;s still pretty much work in progress, so use it with care.</p><p>In future articles I will explore specific problems in more detail and discuss concrete strategies to implement powerful workflows that keep you, the user, in absolute control, while delegating the majority of the grunt work.</p><p>And, as a final remark, I&#8217;m seriously considering building my own CLI agent. I know, I know. Reinventing the wheel and all that. But my plan is not to compete with any of the professional tools out there. What I always care about is <em>understanding</em> things deeply, and as my computer science career has taught me so far, there is no deeper understanding than the one you gain from actually building stuff.</p><p>So stay tuned for that. I will share progress as usual in the form of educational articles, so you&#8217;ll get to see under the hood how to build a fully functional CLI agent with tool calling, context compaction, skills, commands (the powerful ones, not the cheap single-prompt injection), subagent delegation, sandboxing, and all the engineering design hurdles that come with it.</p><p>Until next time, stay curious.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Fun quirk. Typing <code>/etc/host</code> plus the <code>s</code> makes Substack silently fail on draft save, some sort of ill-defined security rule, I suppose. What the f&#8230;</p></div></div>]]></content:encoded></item><item><title><![CDATA[How I'm Using AI Today]]></title><description><![CDATA[Yes, another take on this. 
I promise you this one is a bit different.]]></description><link>https://blog.apiad.net/p/how-im-using-ai-today</link><guid isPermaLink="false">https://blog.apiad.net/p/how-im-using-ai-today</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Mon, 02 Mar 2026 21:13:04 GMT</pubDate><enclosure url="https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="5663" height="4599" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4599,&quot;width&quot;:5663,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;clothes iron, hammer, axe, flashlight and pitcher on brown wooden table&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="clothes iron, hammer, axe, flashlight and pitcher on brown wooden table" title="clothes iron, hammer, axe, flashlight and pitcher on brown wooden table" 
srcset="https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/reserve/oIpwxeeSPy1cnwYpqJ1w_Dufer%20Collateral%20test.jpg?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx0b29sc3xlbnwwfHx8fDE3NzI0NTU2Mzl8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@toddquackenbush">Todd Quackenbush</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>I know, this is the probably the twentieth article you&#8217;ve read this week on &#8220;how I am using AI to automate my whole life/work/whatever&#8221;. Me too. But please, give me a couple of minutes to tell why this article might be different, and thus interesting to you.</p><p>But even if you don&#8217;t have two minutes to spare, please check the <a href="https://github.com/apiad/starter">repository</a> where all I&#8217;m going to tell you about is implemented, ready for your taking. Clone it and play with it, then if you like it, come back and read the rationale behind it.</p><p>Done? Ok, here we go.</p><p>So yes, this is another article trying to explain to you how I use AI coding agents (specifically Gemini CLI, but the specifics don&#8217;t matter) to enhance my workflows. 
Here are a couple of reasons why I think you might be interested&#8212;and why this article might be different from so many lookalikes out there.</p><p>First, I&#8217;m not an enthusiastic techbro who just discovered AI. If you&#8217;ve read this blog before, you know I&#8217;m a longtime researcher in AI&#8212;way before LLMs were a thing&#8212;and also a self-proclaimed AI anti-hypist. I&#8217;m not just overexcited about this shiny new toy. I&#8217;ve been using generative AI since day one for everything, and I&#8217;ve been telling you exactly how it sucks at almost anything important since day one. Coding has been the same until very recently. I can tell you the change in productivity is real, provided you are responsible and considerate.</p><p>Second, my approach to incorporating AI into my workflows is very careful and grounded in a large dose of healthy skepticism. I know firsthand how these things fail, so my approach attempts to be very robust to hallucinations and context drift and all the plagues of even the most powerful LLMs.</p><p>Third, I have a thing for systems. What I&#8217;m going to show is not just a set of hacks or clever prompts or productivity tips. It&#8217;s a principled system to go from ideation to research and planning to execution at the fastest responsible speed, without sacrificing safety or maintainability.</p><p>And fourth, I have kind of a unique position in that I&#8217;m heavily invested in both coding and technical writing. I&#8217;m a college professor, so I do a lot of research, writing, and editing; but I also run a small AI startup, small enough that I get to do a large part of the coding. So my system attempts to bridge these two facets&#8212;code and prose creation&#8212;with the same unified principles of careful deliberation and planning, and robust tracking of the project evolution.</p><p>If you&#8217;re curious already, let me start by laying out the overall principles behind this approach, and then we&#8217;ll dive (not delve, but close) into the details.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h2>Principles of Effective AI-Assisted Work</h2><p>By far the most pressing limitation of modern, top-tier LLMs for my line of work is context saturation. What I mean by this is that no matter how many tokens your model can fit (Gemini 3 claims to chug as much as 1 million tokens), when you work for a considerable amount of time on a single project, you will exhaust this context. And even if the context window isn&#8217;t full, the model will quickly lose the ability to keep track of the important bits of context, and will start to deviate from your instructions and make up its own agenda. Not out of malice, but out of simple probability.</p><p>The way you see this problem when using Claude Code, Gemini CLI, Codex, Copilot, or anything similar is twofold. The model will either forget mid-task what it was supposed to do, though it won&#8217;t simply fail; it will reinterpret the task based on its faulty, lossy, blurry view of the available context and do something close to, but not quite, what you wanted.
Or, the model will make faulty assumptions and forget to ask relevant questions, again behaving close but not quite exactly as you intended.</p><p>The result is always the same, you get frustrated that you achieved 95% of what you needed, but the remaining 5% is harder to fix than to just try again. And what could have been a happy working session where you get important and difficult things done quickly derails into a session of arguing with an LLM trying to convince it (him/her/pick your side) to do things the way you want.</p><p>Barring any fundamental paradigm shifts in the near future, under the assumption that LLMs will keep working as they are, the only way to fix this is to be very conscious and careful about the context&#8212;what has been dubbed context engineering&#8212;in two senses: First, do not pollute the context with unnecessary details. And second, re-inject into the context whatever is relevant for any given task, so it doesn&#8217;t get forgotten.</p><p>I know, kind of contradictory, but tradeoffs are what engineering is all about. In my system, I&#8217;ve applied three principles to help me manage these tradeoffs effectively.</p><ol><li><p>The important things should be made explicit.</p></li><li><p>Resist the urge to guess.</p></li><li><p>Delegate, delegate, delegate (yeah, three times).</p></li></ol><p>Here&#8217;s how that works. For principle one, we will keep track of everything important in markdown files in the repository. This means ideas are committed to plans in markdown files before acting on them, research is summarized and stored in real time, and everything that changes in the project is logged to a long-lived journal, so the model remembers why we made some decision months ago.</p><p>For principle two, we will favor using explicit commands that are translated into explicit prompts, instead of relying on implicitly activated skills that you have little control of. So if you want the model to make a plan, you will prompt it with &#8220;/plan lets design feature X&#8221;, and the <code>/plan</code> command will invoke a carefully crafted prompt that says how plans work, where they are stored, etc.</p><p>And principle three means using sub-agents a lot. This is a Gemini CLI specific feature&#8212;but every other coding agent has a similar thing&#8212;where you can launch a complicated task as a &#8220;sub-agent&#8221;&#8212;which basically means a custom prompt&#8212;but here is the important part: All the context of that sub-agent is kept private, not shared with the main agent, so the internal reasoning the sub-agent needed to run to find 20 different sources in Google does not pollute the main context. We only receive back the summarized responses. This allows running very long tasks (my record is a 30 minutes long research loop, involving hundreds of retrieved web pages) on a single agent turn, without exhausting the context.</p><p>I use four sub-agents in different commands. The <code>planner</code> is the lead architect. It&#8217;s a read-only agent that walks through your codebase and reads everything necessary to understand architecture, design decisions, etc., given a specific task. It then produces a detailed Markdown plan in the <code>plans/</code> directory&#8212;a physical source of truth that you can review before any code is touched, and that the main agent will follow step-by-step. 
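</p><p>The shape of that handoff, in a rough Python sketch (hypothetical helpers, not Gemini CLI&#8217;s real interface; in the actual setup the planner writes the file itself):</p><pre><code class="language-python">from pathlib import Path
from typing import Callable

def delegate_plan(task: str, run_subagent: Callable[[str], str]) -&gt; str:
    """Ask a read-only planner subagent for a plan; keep only its final answer."""
    plan = run_subagent(
        "Read the codebase and journal/, then write a step-by-step markdown plan "
        f"for this task: {task}. Return only the finished plan."
    )
    # Only this text crosses back into the main context; the planner's long
    # exploration (file reads, searches, dead ends) stays in its private context.
    path = Path("plans") / "example-feature.md"  # illustrative file name
    path.parent.mkdir(exist_ok=True)
    path.write_text(plan)
    return plan
</code></pre><p>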
This separation prevents the system from &#8220;guessing&#8221; its way through your codebase.</p><p>When I need external knowledge&#8212;like a library&#8217;s latest API or a specific technical specification&#8212;the <code>researcher</code> agent takes over. It scours the web to fetch relevant documentation, which it then synthesizes into granular summaries in the <code>research/</code> directory. This raw data is then handed off to the main agent to build an executive report annotated and linked to all relevant sources, again all stored already in your repository.</p><p>And there are two more agents, specifically designed for technical writing. The <code>reporter</code> agent takes an outline, and a folder of content, and it will write section by section, a detailed account of what the outline requested.</p><p>Unlike a standard LLM that might provide a high-level summary, the <code>reporter</code> is trained to expand specific placeholders with deep, evidence-based paragraphs. It draws directly from your <code>research/</code> files and the project <code>journal/</code> to ensure every sentence is grounded in the project&#8217;s actual state. Finally, the <code>editor</code> provides the final polish, auditing the draft for structural gaps and linguistic tics. It is grounded in a customizable style guide to make sure it always respects your style.</p><p>This distributed intelligence is held together by a central nervous system of context files, as per principle one. A <code>journal/</code> directory provides a chronological record of decisions and progress, acting as a long-term memory for the project. The <code>plans/</code> directory stores the strategic intent, while a <code>TASKS.md</code> file provides a high-level overview of the project&#8217;s current status. This structured environment allows the subagents to maintain a high degree of situational awareness without needing to ingest the entire repository in every turn.</p><p>Now that we have the key pieces in place, you can start to see why I think this approach is powerful. It is very extensible&#8212;you can add new agents triggered by specific commands to customize any kind of workflow&#8212;and it mostly solves the main pain point of modern LLMs, which is precisely the brittleness of long contexts.</p><h2>Workflow Details</h2><p>With all that, let me show you the specific commands and workflows I have currently implemented, but keep in mind what follows is but one example of the kind of powerful workflows we can start to automate.</p><p>I will divide the rest of the article into a few major areas, and explain the commands and agents that I use in each case, and a bit of the high-level instructions given to each of them.</p><h3>Discovery &amp; Strategy</h3><p>The most critical phase of any project occurs before I write a single line of code. I call this the &#8220;Discovery and Strategy&#8221; phase, powered by the <code>/research</code> and <code>/plan</code> commands. By formalizing this process, I&#8217;ve moved away from impulsive execution toward a deliberate, architected approach.</p><p>The <code>/research</code> command is my primary tool for external knowledge. When triggered, the <code>researcher</code> scours the web for technical documentation and relevant case studies, synthesizing them into granular summaries in the <code>research/</code> directory. 
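</p><p>Put together, the repository ends up with a small, predictable layout (roughly; adapt the names to your taste):</p><pre><code>project/
├── journal/      # chronological record of decisions and progress
├── plans/        # step-by-step plans produced by /plan
├── research/     # source summaries produced by /research
└── TASKS.md      # current roadmap and status
</code></pre><p>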
These files become a persistent knowledge base, allowing me to reference verified facts without leaving my terminal.</p><p>Complementing this is the <code>/plan</code> command, which focuses on internal strategy. The <code>planner</code> conducts a thorough analysis of the codebase and the <code>journal/</code> to understand the system&#8217;s current state. After an interactive dialogue to resolve any ambiguity, it produces a comprehensive Markdown plan in the <code>plans/</code> directory. This document maps out the technical territory and provides a step-by-step execution roadmap.</p><p>The strength of this workflow lies in the synergy between these two tools. A <code>/plan</code> operation might reveal a gap in my understanding of a specific library, prompting a targeted <code>/research</code> session. Conversely, a new research finding might shift my technical direction, leading to a refined plan.</p><h3>Software Development</h3><p>Once I have a solid strategy in my <code>plans/</code> directory, I can move into execution. Let&#8217;s focus now on software development. I&#8217;ve designed four core commands&#8212;<code>/issues</code>, <code>/task</code>, <code>/commit</code>, and <code>/release</code>&#8212;to eliminate the friction of context-switching between my IDE and my terminal.</p><p>The cycle starts with <code>/issues</code> and <code>/task</code>. The <code>/issues</code> command acts as an expert project lead, interfacing directly with the GitHub CLI to analyze open issues and recommend what to tackle next based on strategic impact. For roadmap tracking, the <code>/task</code> command manages a living <code>TASKS.md</code> document. It assesses the value of pending work to ensure my efforts are always aligned with the project&#8217;s goals.</p><p>As I translate the plan into code, the <code>/commit</code> command brings order to my workspace. Instead of a monolithic &#8220;wip&#8221; commit that hides the logic of my changes, the system analyzes the <code>git diff</code> and logically groups modifications into cohesive units. It separates a core feature update from a documentation tweak, then proposes a series of atomic, Conventional Commits for my approval. This keeps my version history pristine and easy to navigate.</p><p>The final stage is deployment. Manual releases are fragile processes fraught with repetitive checklists: bumping versions and running tests before managing tags. The <code>/release</code> command automates this entire sequence. It verifies the workspace integrity by ensuring a clean git tree and passing tests via <code>make</code>. It then analyzes the commit history to propose the next version bump, drafts a <code>CHANGELOG.md</code> entry, and publishes the final tag to GitHub. This transforms a tedious afternoon of housekeeping into a single-command operation.</p><p>But, as you&#8217;ve seen, everything happens in accordance with those principles. No important action is taken without my confirmation, and everything gets logged into the filesystem, so all future decisions are grounded in past experience.</p><h3>Content Creation</h3><p>Now let&#8217;s focus on writing high-quality documentation and long-form articles. This is perhaps the most sensitive part of the article (and the system) because people are <em>very</em> sensitive these days about the topic of AI writing&#8212;and rightly so. Again, my intention here is to enhance how I work and get stuff done.
If you&#8217;re writing for the pleasure of doing it, that&#8217;s totally fine, you probably don&#8217;t want any help there.</p><p>Anyway, the approach is built on the same cognitive foundation as the development path: the research and plans gathered during the discovery phase should serve as grounding for writing.</p><p>It starts with the <code>/draft</code> command. In its initial phase, the system performs a deep scan of the <code>research/</code> and <code>plans/</code> directories to identify the key themes relevant to the requested topic. If the foundation is too thin, the system will pause and suggest a <code>/research</code> or <code>/plan</code> cycle to ensure the draft has sufficient substance. Once the context is validated, the workflow enters an interactive &#8220;Outline Creation&#8221; phase. Rather than guessing at a structure, the system proposes a detailed Markdown outline. This collaborative step allows me to set the narrative arc and logical flow that I want, iterating on the high-level structure of, say, a technical article, before committing on the details.</p><p>Once the outline is locked, the <code>/draft</code> process initializes a skeleton file&#8212;complete with section headers and strategic placeholders&#8212;and then moves into an iterative, section-by-section expansion. Here, the <code>reporter</code> subagent takes the lead. Guided by the specific context of each section, the <code>reporter</code> weaves together research summaries and technical specifications into professional prose, all grounded on a style guide document.</p><p>Because the expansion happens in granular steps, the system maintains a high level of detail that a single-shot generation would inevitably lose. The result is a first draft that is structurally sound and rich with technical depth.</p><p>However, a first draft is rarely the final word. It will always sound AI-ish, and for many other reasons, it is rarely good enough. To achieve professional quality, I use the <code>/revise</code> command, which runs a structural and linguistic audit powered by the <code>editor</code> subagent following the same style guide.</p><p>Unlike a simple &#8220;check my writing&#8221; prompt, the <code>editor</code> performs a deep analysis of the document&#8217;s flow and tone. It identifies logical gaps where more evidence might be needed and highlights awkward phrasing that could obscure my intent. And crucially, this isn&#8217;t an automated &#8220;fix-all&#8221; tool; it&#8217;s an interactive process. The system presents its findings and proposes specific improvements, which I can then review or approve.</p><p>This collaborative refinement process ensures the final output maintains a consistent, professional voice while benefiting from the speed of the AI. By using <code>/revise</code>, I can surgically improve the text to enhance clarity and impact without losing control over the narrative.</p><p>But, in any case, I always find necessary a manual review and editing after all the AI enhancements. It shouldn&#8217;t be a surprise to you that this article is written in this way, but what you&#8217;re reading now is probably 80% different to what the final <code>/revise</code> iteration gave me. There is only so much you can prompt an AI, and that final human touch is not part of it.</p><p>But that&#8217;s good. 
This automates the first 80% or so of compiling a gazillion sources into a coherent narrative, and leaves the remaining 80% of polishing for me, which is the part I actually enjoy about writing.</p><h3>Background Tasks</h3><p>But there&#8217;s more. All of the above is what happens during, let&#8217;s say, the work day. That&#8217;s me sitting in front of the terminal, typing commands, approving stuff, fixing and redirecting, etc. Being an orchestrator.</p><p>But the real magic of AI-assisted development is what happens when you&#8217;re not looking. You can leave your AI assistant working through the night, compiling sources, fixing bugs and proposing pull requests, enhancing the test suite, burning tokens on your behalf.</p><p>To achieve this, I built an automation layer via the <code>/cron</code> command. The heart of this automation is the <code>cron.toml</code> file. This configuration file allows me to define scheduled tasks with a simple, declarative syntax. Each task specifies a name, an execution schedule, and a natural language prompt for the AI to execute.</p><p>For instance, I can schedule a task to perform &#8220;Background Research&#8221; every midnight on the unfinished tasks, scouring the web for new developments in a specific technical niche or finding specific sources to deal with the recently discovered bugs. By offloading these repetitive tasks, I ensure the knowledge base remains fresh and the project&#8217;s momentum never stalls. When the morning arrives, we have a lot of new context to start planning the day&#8217;s bugfixes and feature developments.</p><h2>Maintenance &amp; Refactoring</h2><p>Now, for the final touch, here&#8217;s how I deal with technical debt and feature rot. As a project evolves&#8212;and especially as fast as AI-powered projects evolve&#8212;it accumulates technical debt&#8212;outdated implementations, untested paths, and plain old useless features&#8212;but also contextual debt: outdated plans and completed tasks that clutter the roadmap, and research we never acted upon.</p><p>Without deliberate intervention, this noise degrades the AI&#8217;s performance, leading to context rot. The <code>/maintenance</code> command is my primary defense against this entropy. It treats the development environment as a living instrument that I must regularly tune and sharpen to maintain its efficiency.</p><p>The <code>/maintenance</code> workflow follows the same plan-first architecture as the rest of the system. When invoked, the AI performs a comprehensive audit of the codebase, focusing on improvements like code readability and performance optimization. It identifies opportunities to apply the DRY (Don&#8217;t Repeat Yourself) principle and ensures that every function is documented with high-quality docstrings. But it also fixes deviations between the documentation and the actual implementation.</p><p>Crucially, this is an interactive process: the system presents a detailed refactoring plan for my approval before making any changes. This ensures that I remain in control while the machine handles the labor of cleaning the code.</p><p>Beyond code refactoring, I maintain system health through disciplined repository hygiene. A key component is the management of the <code>TASKS.md</code> file. By regularly moving completed items into the &#8220;Archive&#8221; section, I ensure that my primary operational view remains focused on what is relevant.
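</p><p>Concretely, the file stays small and boring (structure only; the headings and items are whatever you tell <code>/task</code> to maintain):</p><pre><code># TASKS.md

## Active Tasks
- [ ] Migrate the cache layer to the new key scheme
- [ ] Draft release notes for the next version

## Archive
- [x] Set up the nightly background research task
</code></pre><p>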
This simple act of archiving prevents the &#8220;Active Tasks&#8221; list from becoming a source of distraction.</p><p>The goal of these maintenance practices is to provide the AI with the cleanest possible line of sight into the project&#8217;s state. When the repository is cluttered with stale research, the subagents are forced to sift through irrelevant data, increasing the risk of hallucinations. By treating maintenance as a first-class citizen, I ensure that every interaction, whether a <code>/plan</code> or a <code>/draft</code>, is grounded in a precise context.</p><h2>Conclusion</h2><p>This system is far from done, and as models improve in capabilities I&#8217;m sure we&#8217;ll unlock new areas for automation and augmentation that we cannot think of today. But for me, the key principles will remain valid for a long time. These are principles of robust engineering and management, after all. You can read them thinking of a completely human organization, and they all still hold:</p><ol><li><p>The important things should be made explicit.</p></li><li><p>Resist the urge to guess.</p></li><li><p>Delegate, delegate, delegate (yeah, three times).</p></li></ol><p>And this is the key insight for me. Good AI users are basically good managers. All the science and engineering behind good practices for people management also apply to good AI management. And then there are of course technical considerations because AIs are not people, and perhaps never will be.</p><p>So this is perhaps the most philosophical takeaway from this article. Sorry to have made you read so long for this!</p><p>Now, on the technical side, please do check the <a href="https://github.com/apiad/starter">repository</a> and play with it. There are a couple of extra goodies I haven&#8217;t told you about, like a <code>/scaffold</code> command that creates new projects from scratch, and an <code>/onboard</code> command that explains the whole repository in detail. Those are particularly useful if you&#8217;re reusing this repository as a template to start your own project.</p><p>Ultimately, this repository is not a one-size-fits-all solution. It is a starting point. The commands and subagents provided here represent a particular opinion on how modern development should look, but they are not the only way. The power of this framework lies in its extensibility. Every system prompt for agents and commands is a living document, meant to be tweaked and rewritten to suit your unique mental model.</p><p>So if you do try it out, please let me know in the comments. And if you have a different (or similar) system set up for yourself, please share your experience and your thoughts with all of us. We are all learners in this era of AI, and we can only help each other.</p><p>Stay curious.</p>]]></content:encoded></item><item><title><![CDATA[Drawing (not so) Beautiful Diagrams with Pure Python]]></title><description><![CDATA[Ups, I did it again :)]]></description><link>https://blog.apiad.net/p/drawing-not-so-beautiful-diagrams</link><guid isPermaLink="false">https://blog.apiad.net/p/drawing-not-so-beautiful-diagrams</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Fri, 13 Feb 2026 14:06:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/21624634-7c05-40c2-98f0-46ed20132c3d_663x359.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over the past couple of weeks, I&#8217;ve been working on a pure Python library for rendering SVGs.
In a spur of unbridled inspiration, I called it <strong><a href="https://apiad.github.io/tesserax">Tesserax</a></strong>. Long story short, it got <em>waaayyyy</em> beyond control, as it happens. </p><p>This article is a quick recap of why I did it, how I went down the rabbit hole of implementing yet-another-drawing-library, and a small showcase of what it can do it can do, in case you want to give it a shot.</p><p>But before moving one, here is a minimal example of what you can do with a few lines of Python.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R3IB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R3IB!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif 424w, https://substackcdn.com/image/fetch/$s_!R3IB!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif 848w, https://substackcdn.com/image/fetch/$s_!R3IB!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif 1272w, https://substackcdn.com/image/fetch/$s_!R3IB!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!R3IB!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif" width="320" height="78" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:78,&quot;width&quot;:320,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R3IB!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif 424w, https://substackcdn.com/image/fetch/$s_!R3IB!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif 848w, https://substackcdn.com/image/fetch/$s_!R3IB!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif 1272w, https://substackcdn.com/image/fetch/$s_!R3IB!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93a97f55-54b7-49a8-9c5f-f2f21a862d7e_320x78.gif 1456w" sizes="100vw" 
fetchpriority="high"></picture><div></div></div></a></figure></div><p>Are you curious now? Then read on!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h2>The Case for Tesserax</h2><p>There are a gazillion tools out there for drawing mathematical diagrams in Python, from the ancient, trustable <strong>matplotlib</strong> to the new cool kid in the park <strong>manim</strong>. There are a ton of charting libraries (those intended to draw bars, pies, and doughnuts) like <strong>altair</strong> and <strong>plotly</strong>; and a ton of high-level diagram makers like <strong>graphviz</strong> for graphs and networks, and <strong>mermaid</strong> for, well, whatever the mermaid devs think is worth drawing.</p><p>The problem with all these is that, one, <em>there weren&#8217;t made by me</em>. Now, jokes apart, I do believe there is a lot of value in reinventing the wheel, if only because you get to learn a lot about wheel&#8212;and boy have I learned more than I ever wanted to about SVG in the last couple of weeks. But also, you may find that your wheel fits your cart slightly better than all others, just because you know precisely what you need (and enough with wheel analogies).</p><p>So, I made Tesserax to cover a sweet spot that I couldn&#8217;t find anywhere: a lightweight library (literally zero dependencies, not even <strong>numpy</strong>) that renders web-native content (so everything scales and layouts perfectly in Jupyter / Quarto) and has both a very powerful low-level engine for when you want pixel-perfect control, and a very comfortable high-level engine for the most typical workflows.</p><p>Also, it should support <em>animations</em>. But I&#8217;m getting ahead of myself.</p><h2>What Can Tesserax Do</h2><blockquote><p>I won&#8217;t put any code in this article because, one, Substack sucks at code. And two, it would only make the article harder to follow. All of these examples are fully described in the <a href="https://apiad.github.io/tesserax">online documentation</a>.</p></blockquote><p>At its core, <strong>Tesserax</strong> is a library for defining an SVG scene. You create a <strong>Canvas</strong>, and some <strong>Shapes</strong> to it, define some attributes, and render it as an SVG file. Simple enough<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. 
</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!67lT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!67lT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png 424w, https://substackcdn.com/image/fetch/$s_!67lT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png 848w, https://substackcdn.com/image/fetch/$s_!67lT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png 1272w, https://substackcdn.com/image/fetch/$s_!67lT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!67lT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png" width="399" height="128" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:128,&quot;width&quot;:399,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4287,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/187845763?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!67lT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png 424w, https://substackcdn.com/image/fetch/$s_!67lT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png 848w, https://substackcdn.com/image/fetch/$s_!67lT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png 1272w, https://substackcdn.com/image/fetch/$s_!67lT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c88048f-2c2c-4f68-a567-b01f783e12f4_399x128.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Beyond basic primitives, you can of course draw arbitrary paths and style them easily. 
Here is a <strong>Polyline</strong> primitive that allows controlling the curvature with a <strong>smoothness</strong> factor.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YaVP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YaVP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png 424w, https://substackcdn.com/image/fetch/$s_!YaVP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png 848w, https://substackcdn.com/image/fetch/$s_!YaVP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png 1272w, https://substackcdn.com/image/fetch/$s_!YaVP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YaVP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png" width="282" height="270.3248407643312" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:301,&quot;width&quot;:314,&quot;resizeWidth&quot;:282,&quot;bytes&quot;:5697,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/187845763?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YaVP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png 424w, https://substackcdn.com/image/fetch/$s_!YaVP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png 848w, https://substackcdn.com/image/fetch/$s_!YaVP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png 1272w, https://substackcdn.com/image/fetch/$s_!YaVP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8960d96f-f48b-4c2c-af1c-0f4bcfb8b579_314x301.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div 
class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>You can control and distort these shapes in any form you want, including fully procedural warping of the edges.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ifdW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ifdW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png 424w, https://substackcdn.com/image/fetch/$s_!ifdW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png 848w, https://substackcdn.com/image/fetch/$s_!ifdW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png 1272w, https://substackcdn.com/image/fetch/$s_!ifdW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ifdW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png" width="412" height="152.10970464135022" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1226efa9-359b-4b1a-a986-19eed93df165_474x175.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:175,&quot;width&quot;:474,&quot;resizeWidth&quot;:412,&quot;bytes&quot;:9834,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/187845763?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ifdW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png 424w, https://substackcdn.com/image/fetch/$s_!ifdW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png 848w, https://substackcdn.com/image/fetch/$s_!ifdW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png 1272w, https://substackcdn.com/image/fetch/$s_!ifdW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1226efa9-359b-4b1a-a986-19eed93df165_474x175.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>And, of course, since no mathematical drawing library is worth a penny without a sketchy mode, you can also do that (albeit with some constraints).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3DwH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3DwH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png 424w, https://substackcdn.com/image/fetch/$s_!3DwH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png 848w, https://substackcdn.com/image/fetch/$s_!3DwH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png 1272w, https://substackcdn.com/image/fetch/$s_!3DwH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3DwH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png" width="403" height="171.6632860040568" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d62b252a-8212-4144-a857-d7e8d9944767_493x210.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:210,&quot;width&quot;:493,&quot;resizeWidth&quot;:403,&quot;bytes&quot;:11596,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/187845763?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3DwH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png 424w, https://substackcdn.com/image/fetch/$s_!3DwH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png 848w, https://substackcdn.com/image/fetch/$s_!3DwH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png 1272w, https://substackcdn.com/image/fetch/$s_!3DwH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd62b252a-8212-4144-a857-d7e8d9944767_493x210.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>To make positioning and layout easy, Tesserax comes a set of builtin layouts (that of course can be extended in any way you want, it&#8217;s Python) for common patterns like rows, columns, grids&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tEXd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tEXd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png 424w, https://substackcdn.com/image/fetch/$s_!tEXd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png 848w, https://substackcdn.com/image/fetch/$s_!tEXd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png 1272w, https://substackcdn.com/image/fetch/$s_!tEXd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!tEXd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png" width="349" height="210" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:210,&quot;width&quot;:349,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:858,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/187845763?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tEXd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png 424w, https://substackcdn.com/image/fetch/$s_!tEXd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png 848w, https://substackcdn.com/image/fetch/$s_!tEXd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png 1272w, https://substackcdn.com/image/fetch/$s_!tEXd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60ca74b0-8ca0-4522-93b2-046664db44e3_349x210.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>And a couple of complex layouts for things like trees and arbitrary graphs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KodJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KodJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png 424w, https://substackcdn.com/image/fetch/$s_!KodJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png 848w, https://substackcdn.com/image/fetch/$s_!KodJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png 1272w, https://substackcdn.com/image/fetch/$s_!KodJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KodJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png" width="192" height="333" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:333,&quot;width&quot;:192,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7101,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/187845763?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KodJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png 424w, https://substackcdn.com/image/fetch/$s_!KodJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png 848w, https://substackcdn.com/image/fetch/$s_!KodJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png 1272w, https://substackcdn.com/image/fetch/$s_!KodJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dc8f82b-67f7-424f-9f0b-abe7e1fd0085_192x333.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But things start to get really interesting when you discover that if you can render an image from code, then you can render as many as you want! 
And that leads to&#8230; animations!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5ssP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5ssP!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif 424w, https://substackcdn.com/image/fetch/$s_!5ssP!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif 848w, https://substackcdn.com/image/fetch/$s_!5ssP!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif 1272w, https://substackcdn.com/image/fetch/$s_!5ssP!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5ssP!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif" width="172" height="107.5" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:100,&quot;width&quot;:160,&quot;resizeWidth&quot;:172,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5ssP!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif 424w, https://substackcdn.com/image/fetch/$s_!5ssP!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif 848w, https://substackcdn.com/image/fetch/$s_!5ssP!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif 1272w, https://substackcdn.com/image/fetch/$s_!5ssP!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23b3b8e-d4af-4a9b-ac8b-b09474037e25_160x100.gif 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Tesserax animations are procedural, meaning you write code that defines how each object changes in time. 
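At its lowest level, that is really all an animation is: the same drawing code evaluated at many values of <code>t</code>. Here is a minimal, library-agnostic sketch of the idea (plain Python emitting raw SVG strings, deliberately <em>not</em> the Tesserax animation API):</p><pre><code><code># Illustrative sketch only: hand-rolled SVG frames, not the Tesserax animation API.
# Each frame is the same drawing code evaluated at a different time t.
import math

def frame(t: float) -&gt; str:
    # A circle whose horizontal position is a function of time.
    cx = 50 + 40 * math.sin(2 * math.pi * t)
    return (
        f'&lt;svg xmlns="http://www.w3.org/2000/svg" width="100" height="100"&gt;'
        f'&lt;circle cx="{cx:.1f}" cy="50" r="10" fill="steelblue"/&gt;'
        f'&lt;/svg&gt;'
    )

frames = [frame(i / 30) for i in range(30)]  # thirty frames, one full oscillation</code></code></pre><p>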
But there is a very high-level API for defining and composing animations declaratively, which means you seldom need the full power of tweaking each shape properties (but that power is there when you need it, as it should).</p><p>And, of course, once you grok animations, the next immediate idea is <em>physically-based animations</em>. (This is where things got way out of control, as you may imagine). Tesserax comes with a very small but fully-fledged pure Python 2D physics engines&#8212;that was a mouthful. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xao8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xao8!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif 424w, https://substackcdn.com/image/fetch/$s_!xao8!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif 848w, https://substackcdn.com/image/fetch/$s_!xao8!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif 1272w, https://substackcdn.com/image/fetch/$s_!xao8!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xao8!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif" width="126" height="262.5" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:250,&quot;width&quot;:120,&quot;resizeWidth&quot;:126,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xao8!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif 424w, https://substackcdn.com/image/fetch/$s_!xao8!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif 848w, https://substackcdn.com/image/fetch/$s_!xao8!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif 1272w, https://substackcdn.com/image/fetch/$s_!xao8!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c8667d-af5a-47ec-8471-6941baeb543b_120x250.gif 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It can resolve collisions among arbitrary objects (using circles and rotated boxes as approximate colliders) and simulate fixed-step rigid body mechanics. Physic animations are baked&#8212;which means you compute the animation once and then render it. So, no, no 2D games for you! What were you thinking about? This is a graphing library, for Turing&#8217;s sake!</p><p>The purpose of this physics engine is, once again, didactic&#8212;e.g., to explain physics concepts in a Jupyter notebook with a bit of shizzass. </p><h2>Final Words</h2><p>I started writing Tesserax to help me create diagrams and animations for my lectures and articles. But when you do that with <em>a fully-fledged programming language</em> things start to get really interesting, because now you can leverage your well-honed programming skills to create complex diagrams with the minimum necessary code. You can encapsulate repeatable patterns into classes and methods, and build impressive diagrams bottom-up. You can abstract common patterns into drawable concepts like trees, automatons, etc.</p><p>For example, in the docs you will find a bit of code to implement a blob-like simulation by creating a<strong> ConvexHull</strong> component that automatically tracks inner shapes and builds the convex surrounding path around them. 
Tied with some simple physics (a bunch of balls interconnected with springs) gives you this ugly-looking blob of dread that I hope haunts your dreams for the next week or so&#8212;it sure will mine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FD07!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FD07!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif 424w, https://substackcdn.com/image/fetch/$s_!FD07!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif 848w, https://substackcdn.com/image/fetch/$s_!FD07!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif 1272w, https://substackcdn.com/image/fetch/$s_!FD07!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FD07!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif" width="436" height="272.8516129032258" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:388,&quot;width&quot;:620,&quot;resizeWidth&quot;:436,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FD07!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif 424w, https://substackcdn.com/image/fetch/$s_!FD07!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif 848w, https://substackcdn.com/image/fetch/$s_!FD07!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif 1272w, https://substackcdn.com/image/fetch/$s_!FD07!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7820fc03-0d3d-43bb-8c37-0126e27b5652_620x388.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On a more serious note, the next step for Tesserax is building a library of reusable math and computer science concepts, like animated arrays, trees, graphs, etc., that can be used to explain complex CS topics with the much need visual assistance. As the great Donald Knuth once said, an algorithm must be seen to be believed. Tesserax is here to help you see them.</p><p>I&#8217;d really love if you guys would give Tesserax a try, now that is a bit more polished, and shoot me all your questions and suggestions. Special thanks to <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Wyrd Smythe&quot;,&quot;id&quot;:195807185,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f88cc94-56a6-4453-b3dd-c8ccb9194c15_2316x2316.png&quot;,&quot;uuid&quot;:&quot;45140a44-685f-4df2-95a6-819f2f652900&quot;}" data-component-name="MentionToDOM"></span> for many insightful comments and suggestions (although I&#8217;m not sure I&#8217;ve totally addressed all of them :)</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Sadly, among all the stupid things that Substack cannot do for the sake of being annoying, it seems rendering SVG is one of them. 
So all drawings in this article are dumb screenshots of what otherwise would be pixel-perfect, infinitely scalable SVGs.</p></div></div>]]></content:encoded></item><item><title><![CDATA[Introducing Tesserax]]></title><description><![CDATA[A pure Python, developer-friendly library for academic drawing.]]></description><link>https://blog.apiad.net/p/introducing-tesserax</link><guid isPermaLink="false">https://blog.apiad.net/p/introducing-tesserax</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Sat, 31 Jan 2026 17:24:56 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="6491" height="4447" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4447,&quot;width&quot;:6491,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;sketch pad and coloring pens&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="sketch pad and coloring pens" title="sketch pad and coloring pens" 
srcset="https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1525278070609-779c7adb7b71?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw4fHxkcmF3aW5nfGVufDB8fHx8MTc2OTg3OTkwOHww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@kobuagency">KOBU Agency</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>It is 2026. We are still treating scientific diagrams like cave paintings.</p><p>I have a PhD in Machine Learning. I can explain the mathematics of high-dimensional manifold optimization. Yet, until recently, if I wanted to draw two boxes connected by an arrow, I had to Google the syntax.</p><p>My diagrams were dead artifacts. While my code was version-controlled and modular, my figures were fragile binary blobs or brittle LaTeX macros. I needed my drawings to behave like the rest of my research.</p><p>Instead, I found myself trapped between two equally bad options.</p><p>I have spent weekends fighting <strong>Mermaid</strong> and <strong>Graphviz</strong>. They are fine for a quick flowchart. But try asking them for a publication-quality Turing machine. Try asking for a precise computer architecture diagram. You hit a wall.</p><p>Using these black-box layout engines is like trying to parallel park a bus while shouting instructions from the sidewalk. 
You can nudge the parameters, but you never quite end up where you want to be.</p><p>The real sin of these tools is their refusal to encapsulate. As developers, we take parameterization for granted. We define logic once and instantiate it a thousand times. But you cannot take a Mermaid diagram, wrap it in a function, and ask it to render a version with &#8220;n&#8221; nodes. Our drawing tools should respect the engineering principles we teach.</p><p>Then there is <strong>TikZ</strong>.</p><p>TikZ is the undisputed masterpiece of academic drawing. It offers total control. It is also fundamentally broken for the web.</p><p>TikZ is a &#8220;print-first&#8221; citizen. If you convert it to SVG for a blog post, you don&#8217;t get a clean vector drawing. You get a crime scene. Inspect the element, and you will see a soup of unreadable paths that no CSS can touch.</p><p>And then, there is the &#8220;Reviewer #2&#8221; problem. <em>(Damn Reviewer #2, why is it always you?!)</em> You spend three hours perfecting a TikZ architecture diagram. You submit the paper. The review comes back: <em>&#8220;Please swap layer 3 and layer 4.&#8221;</em> In a drag-and-drop tool, this is annoying. In TikZ, it is a reconstruction project involving fifty lines of coordinate hard-coding. It isn&#8217;t just tedious; it&#8217;s fragile.</p><p>To render my documentation in a CI/CD pipeline, I shouldn&#8217;t have to install four gigabytes of TeX Live just to draw a few circles. And there is the friction of the language itself. Writing geometry macros in a specialized DSL is a jarring context switch. It forces you to leave the logic of your Python research and enter a world of backslashes and curly braces just to visualize a result.</p><h2>Meet Tesserax</h2><p>The shift happened when I moved my academic publishing to <strong>Quarto</strong>.</p><p>Quarto is essentially Jupyter on steroids. Since Jupyter is the standard for scientific computing, the solution was obvious. We didn&#8217;t need another GUI. We didn&#8217;t need a new domain-specific language. We needed a Python library that could live inside the notebook and speak fluent SVG.</p><p>I built <strong><a href="https://github.com/apiad/tesserax">Tesserax</a></strong> to close the gap. I started with the atoms, like Rects, Circles, and Paths. But the breakthrough was the Anchor System. I wanted a &#8220;nervous system&#8221; for diagrams, where objects know where they are relative to one another. No manual coordinates. No magic numbers.</p><p>Tesserax is what happens when you treat a drawing as a function of state.</p><p>In Tesserax, you don&#8217;t place a node at <code>(100, 200)</code>. You define a <code>Layout</code>. You say, &#8220;I want a column of nodes, and for every node, I want an arrow pointing to the next one.&#8221;</p><pre><code><code># This isn't just a drawing. It's logic.
with Canvas() as c:
    nodes = [Circle(10) for _ in range(5)]
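    # The layout computes the positions; we never write explicit coordinates.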
    layout = Column(nodes, gap=20)

    # If I add a node to the list above, the arrows update automatically.
    for n1, n2 in pairwise(nodes):
        Arrow(n1.anchor("bottom"), n2.anchor("top"))</code></code></pre><p>This snippet doesn&#8217;t just render a figure. It renders <em>any</em> version of that figure. It captures the abstraction, not just the pixels.</p><h2>The Rationale Behind Tesserax</h2><p><em>But why, Alex, why building something like this? Don&#8217;t you have enough in your plate now?</em></p><p>Well, first, who are you and why do you know me so well? Second, it&#8217;s fun! But if that isn&#8217;t enough by itself, let&#8217;s analyze the landscape of modern academic drawing.</p><p>First, there is <strong>Matplotlib</strong>. </p><p>It is the industry standard for a reason. If I have a CSV of training data, Matplotlib is my first call. But have you ever tried to draw a linked list in it? You end up fighting the axes. You spend a dozen lines of code just hiding the ticks and borders to get a blank canvas.</p><p>Matplotlib is a library built for statistics that is occasionally forced to perform geometry. It feels like doing calligraphy with a highlighter. Tesserax respects the division of labor: use Altair or Matplotlib for your data, and use Tesserax for your concepts.</p><p>Then, there is <strong>Tikz</strong>. (Again!) We already discussed it, but there is more.</p><p>On one hand, there is a strange Stockholm Syndrome in academia. We accept that drawing a simple three-state automata requires learning a bizarre macro language on top of another macro language (yeah, I&#8217;m looking at you, <em>LaTeX!</em>). We treat TikZ&#8217;s loop syntax like a revelation, ignoring that any programming language is actually better at this, but I can also have, I don&#8217;t know, <em>variables!</em> And also, who the hell knows how to write Tikz!? <em>(I know some of you nerds do, don&#8217;t give me that look. Shame on you.)</em></p><p>Tesserax bets that you don&#8217;t need a Domain Specific Language (DSL). You need objects. You need classes. You need type hints. If you want a grid of shapes? That is a nested list comprehension. If you want a recursive tree? That is a recursive function. You don&#8217;t need to &#8220;learn Tesserax.&#8221; You just need to know Python. By piggybacking on Python&#8217;s syntax, we get free IDE support, linting, and the ability to debug our drawings with the same tools we use to debug our kernels.</p><p>But also, even more important, the browser is the new print driver. TikZ was built for a world where the final output was a piece of paper. Tesserax is built for a world where the output is a responsive HTML page that <em>can</em> be printed.</p><p>By targeting SVG natively, we bypass the fragile conversion layers. We get semantic scaling. We get CSS styling. We stop fighting the renderer and start using the most optimized vector graphics engine in history: the modern web browser. And you can render SVG to PDF anytime. Going back, though, means you lose the <em>semantics</em> of what your drawing means.</p><p>And last but not least, Tesserax is <em>lean. </em>Like, zero dependencies. A couple hundred of pure, fully-typed, Python 3.12. Nothing more, nothing less. </p><p>I rest my case.</p><h2>What&#8217;s Next</h2><p>So Tesserax is my attempt to bring software engineering to scientific illustration. I&#8217;m releasing it today in a very crude, 0.2 something version, as I always do, to gather as much feedback as possible as soon as possible. 
From now on, all the diagrams in my papers, blog posts, and books will be built with this.</p><p>So, next step, I&#8217;m moving past the primitives phase now and into building high-level scientific abstractions. I want to generate automata that look like they were pulled from a classic textbook and data structures that reflect actual memory layouts. That&#8217;s what comes in the next few iterations. Moving from a world where we draw lines to a world where we code diagrams.</p><p>I&#8217;d love it if you&#8217;d give <strong><a href="https://github.com/apiad/tesserax">Tesserax</a></strong> a try. The documentation is still pretty thin but feel free to drop me a comment with any questions or suggestions.</p>]]></content:encoded></item><item><title><![CDATA[The Basics of Search]]></title><description><![CDATA[From Chapter 1 of The Algorithm Codex]]></description><link>https://blog.apiad.net/p/the-basics-of-search</link><guid isPermaLink="false">https://blog.apiad.net/p/the-basics-of-search</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Tue, 27 Jan 2026 15:32:20 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p><em>This article is a draft of Chapter 1 of <strong>The Algorithm Codex</strong>. You can read the entire Part I (chapters 1 through 6) <a href="https://matcom.github.io/codex">online</a> for free (forever). If you want to support this initiative, feel free to <a href="https://store.apiad.net/l/codex">grab the official PDF</a>.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="4096" height="2730" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2730,&quot;width&quot;:4096,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;woman walking on sand dunes during daytime&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="woman walking on sand dunes during daytime" title="woman walking on sand dunes during daytime" srcset="https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1505244783088-5a36f166e5b5?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw2fHxzZWFyY2hpbmd8ZW58MHx8fHwxNzY4OTMxMTA1fDA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@katekerdi">Katerina Kerdi</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>Searching is arguably the most important problem in Computer Science. 
In a very simplistic way, searching is at the core of critical applications like databases, and is the cornerstone of how the internet works.</p><p>However, beyond this simple, superficial view of searching as an end in itself, you can also view search as means for general-purpose problem solving. When you are, for example, playing chess, what your brain is doing is, in a very fundamental way, <em>searching</em> for the optimal move&#8211;the only that most likely leads to winning.</p><p>In this sense, you can view almost all of Computer Science problems as search problems. In fact, a large part of this book will be devoted to search, in one way or another.</p><p>In this first chapter, we will look at the most explicit form of search: where we are explicitly given a set or collection of items, and asked to find one specific item.</p><p>We will start with the simplest, and most expensive kind of search, and progress towards increasingly more refined algorithms that exploit characteristics of the input items to minimize the time required to find the desired item, or determine if it&#8217;s not there at all.</p><h2><strong>Linear Search</strong></h2><p>Let&#8217;s start by analyzing the simplest algorithm that does something non-trivial: linear search. Most of these algorithms work on the simplest data structure that we will see, the sequence.</p><p>A sequence (<code>Sequence</code> class) is an abstract data type that represents a collection of items with no inherent structure, other than each element has an index.</p><pre><code><code>from typing import Sequence</code></code></pre><p>Linear search is the most basic form of search. We have a sequence of elements, and we must determine whether one specific element is among them. Since we cannot assume anything at all from the sequence, our only option is to check them all.</p><pre><code><code>def find[T](x:T, items: Sequence[T]) -&gt; bool:
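    # No structure to exploit, so examine the elements one by one.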
    for y in items:
        if x == y:
            return True

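    # The loop finished without a match, so x is not in the sequence.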
    return False</code></code></pre><p>Our first test will be a sanity check for simple cases:</p><pre><code><code>from codex.search.linear import find

def test_simple_list():
    assert find(1, [1,2,3]) is True
    assert find(2, [1,2,3]) is True
    assert find(3, [1,2,3]) is True
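    # An empty sequence can never contain the target.
    assert find(1, []) is False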
    assert find(4, [1,2,3]) is False</code></code></pre><h3><strong>Analyzing Linear Search</strong></h3><p>Once we have an implementation, we must subject it to the three-step analysis established in our foundations.</p><h4>Is it correct?</h4><p>The property of <strong>correctness</strong> ensures that for any valid input, the algorithm produces the expected output. For linear search, we can verify this through three increasingly formal lenses:</p><ul><li><p><strong>The Exhaustive Argument</strong>: Suppose an element x exists in the sequence. By definition, there is some index i such that <code>items[i] == x</code>. Since the algorithm performs an equality test over every single index in the sequence without exception, it is logically impossible to miss the item if it is there.</p></li><li><p><strong>The Inductive Argument</strong>: We can reason about the algorithm&#8217;s correctness across different input sizes. For a sequence of length 0, the loop never executes, and the algorithm correctly returns <code>False</code>. Assume the algorithm works for a sequence of length n. For a sequence of length n+1, the target x is either in the first n elements&#8212;where the inductive hypothesis ensures we find it&#8212;or it is the n+1-th element, which we check in the final iteration. If it is in neither, the algorithm correctly concludes it is not present.</p></li><li><p><strong>The Loop Invariant</strong>: We can define a formal invariant for the <code>for</code> loop: <em>At the start of iteration i, the element x has not been found in the first i&#8722;1 elements of the sequence.</em> By the time the loop completes at iteration n, if the function hasn&#8217;t returned <code>True</code>, we know with certainty that x is not in the first n elements, which constitutes the entire sequence.</p></li></ul><h4>How efficient is it?</h4><p>We analyze linear search using the <strong>RAM model</strong>, assuming each comparison and iteration step has a unitary cost.</p><ul><li><p><strong>Time Complexity</strong>: In the worst-case scenario (the item is at the very end or not present at all), we must perform n comparisons for a sequence of size n. This gives us a growth rate of O(n), or <strong>linear time</strong>.</p></li><li><p><strong>Space Complexity</strong>: The algorithm only requires a constant amount of extra memory to store the loop variable and the target, regardless of the input size, resulting in O(1) <strong>space complexity</strong>.</p></li></ul><h4>Is it optimal?</h4><p>Intuitively, linear search must be <strong>optimal for unstructured data</strong>. If we know <em>nothing</em> about the order or distribution of the elements, we are mathematically forced to look at every single item at least once to be certain x is not there. Any algorithm that skipped an element could be &#8220;fooled&#8221; if that specific element happened to be the one we were looking for. Thus, for a generic sequence, O(n) is the best possible lower bound.</p><p>To prove this more formally, we employ an <strong>adversarial argument</strong>, a powerful technique in complexity theory where we imagine a game between our algorithm and a malicious adversary.</p><ul><li><p><strong>The Adversary&#8217;s Strategy</strong>: Suppose an algorithm claims to find an element x (or prove its absence) by examining fewer than n elements&#8212;say, n&#8722;1 elements. 
The adversary waits for the algorithm to finish its n&#8722;1 checks.</p></li><li><p><strong>The &#8220;Trap&#8221;</strong>: Because there is one element the algorithm did not inspect, the adversary is free to define that specific element as x if the algorithm concludes &#8220;False,&#8221; or as something other than x if the algorithm concludes &#8220;True&#8221; without having seen it.</p></li><li><p><strong>The Conclusion</strong>: Since the adversary can always change the unexamined element to make the algorithm&#8217;s answer wrong, any correct algorithm <em>must</em> inspect every element in the worst case.</p></li></ul><p>This proves that the lower bound for searching an unstructured sequence is &#937;(n). Linear search, which operates in O(n), meets this lower bound exactly, making it a <strong>tightly optimal</strong> solution for the problem as defined. Unless we possess more information about the data&#8217;s structure&#8212;the central theme of the next chapter&#8212;we simply cannot do better.</p><h2><strong>Indexing and Counting</strong></h2><p>The <code>find</code> method tells us whether an element exists in a sequence, but it doesn&#8217;t tell us <em>where</em>. We can easily extend it to return an <em>index</em>. We thus define the <code>index</code> method, with the following condition: if <code>index(x,l) == i</code> then <code>l[i] == x</code>. Moreover, <code>index</code> returns the <strong>first</strong> index where we can find a given element <code>x</code>.</p><pre><code><code>def index[T](x: T, items: Sequence[T]) -&gt; int | None:
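    # Scan the sequence left to right and return the first position whose element equals x.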
    for i,y in enumerate(items):
        if x == y:
            return i

    return None</code></code></pre><p>When the item is not present in the sequence, we return <code>None</code>. We could raise an exception instead, but that would force a lot of defensive programming on every caller.</p><p>Let&#8217;s write some tests!</p><pre><code><code>from codex.search.linear import index

def test_index():
    assert index(1, [1,2,3]) == 0
    assert index(2, [1,2,3]) == 1
    assert index(3, [1,2,3]) == 2
    assert index(4, [1,2,3]) is None</code></code></pre><p>As a final step in the linear search paradigm, let&#8217;s consider the problem of counting not just the first, but <em>all</em> occurrences of a given item. We&#8217;ll call this function <code>count</code>. It will return the number of occurrences of some item <code>x</code> in a sequence.</p><pre><code><code>def count[T](x: T, items: Sequence[T]) -&gt; int:
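    # Scan the whole sequence and tally how many elements compare equal to x.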
    c = 0

    for y in items:
        if x == y:
            c += 1

    return c</code></code></pre><p>Let&#8217;s write some simple tests for this method.</p><pre><code><code>from codex.search.linear import count

def test_count():
    assert count(1, [1,2,3]) == 1
    assert count(2, [1,2,2]) == 2
    assert count(4, [1,2,3]) == 0</code></code></pre><h3><strong>Analysis</strong></h3><p>We won&#8217;t dwell too long on this section since the analysis is very similar to linear search&#8211;these are just specialized versions of it. Once more, we have O(n) algorithms (with O(1) memory cost) for a problem that is provably &#937;(n). Thus, given our assumptions (that there is no intrinsic structure to the elements&#8217; order), we have optimal algorithms.</p><h2><strong>Min and Max</strong></h2><p>Let&#8217;s now move to a slightly different problem. Instead of finding one specific element, we want to find the element that ranks minimum or maximum. Consider a sequence of numbers in an arbitrary order. We define the minimum (maximum) element as the element <code>x</code> such that <code>x &lt;= y</code> (<code>x &gt;= y</code>) for all <code>y</code> in the sequence.</p><p>Now, instead of numbers, consider some arbitrary total ordering function <code>f</code>, such that <code>f(x,y) &lt;= 0</code> if and only if <code>x &lt;= y</code>. This allows us to extend the notion of minimum and maximum to arbitrary data types.</p><p>Let&#8217;s formalize this notion as a Python type alias. We will define an <code>Ordering</code> as a function that has this signature:</p><pre><code><code>from typing import Callable

type Ordering[T] = Callable[[T,T], int]</code></code></pre><p>Now, to keep things simple for the most common cases, let&#8217;s define a default ordering function that just delegates to the items&#8217; own comparison operators. This way we don&#8217;t have to reinvent the wheel with numbers, strings, and all other natively comparable items.</p><pre><code><code>def default_order(x, y):
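    # Follows the usual comparator convention: negative when x &lt; y, zero when equal, positive when x &gt; y.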
    if x &lt; y:
        return -1
    elif x == y:
        return 0
    else:
        return 1</code></code></pre><p>Let&#8217;s write the <code>minimum</code> method using this convention. Since we have no knowledge of the structure of the sequence other than that its elements support a total ordering, we have to test all possible items, like before. But now, instead of returning as soon as we find the correct item, we simply store the minimum item we&#8217;ve seen so far, and return at the end of the <code>for</code> loop. This guarantees we have seen all the items, and thus the minimum among them must be the one we have marked.</p><pre><code><code>from codex.types import Ordering, default_order

def minimum[T](items: Sequence[T], f: Ordering[T] = default_order) -&gt; T:
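    # Start with the first element as the current best candidate; an empty sequence raises IndexError here.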
    m = items[0]

    for x in items:
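        # f(x, m) &lt;= 0 means x ranks no higher than the current candidate, so x becomes the new minimum.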
        if f(x,m) &lt;= 0:
            m = x

    return m</code></code></pre><p>The <code>minimum</code> method can fail only if the <code>items</code> sequence is empty. In the same manner, we can implement <code>maximum</code>. But instead of coding another method with the same functionality, which is not very DRY, we can leverage the fact that we are passing an ordering function that we can manipulate.</p><p>Consider an arbitrary ordering function <code>f</code> such that <code>f(x,y) &lt;= 0</code>. This means by definition that <code>x &lt;= y</code>. Now we want to define another ordering <code>g</code> that <em>inverts</em> <code>f</code>: we need <code>g(x,y) &lt;= 0</code> exactly when <code>y &lt;= x</code>, so that the minimum under <code>g</code> is the maximum under <code>f</code>. We can achieve this very simply by swapping the inputs of <code>f</code>.</p><pre><code><code>def maximum[T](items: Sequence[T], f: Ordering[T] = default_order) -&gt; T:
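    # Swapping the arguments inverts the ordering, so the minimum under the flipped comparison is the maximum under f.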
    return minimum(items, lambda x,y: f(y,x))</code></code></pre><p>We can easily code a couple of test methods for this new functionality.</p><pre><code><code>from codex.search.linear import minimum, maximum
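
# An extra, hypothetical illustration (not part of the original test suite): any total ordering
# works, so ordering strings by length makes `minimum` return the shortest word and
# `maximum` the longest.
def order_by_length(x: str, y: str) -&gt; int:
    return len(x) - len(y)

def test_minmax_by_length():
    words = ["pear", "fig", "banana"]

    assert minimum(words, order_by_length) == "fig"
    assert maximum(words, order_by_length) == "banana"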

def test_minmax():
    items = [4,2,6,5,7,1,0]

    assert minimum(items) == 0
    assert maximum(items) == 7</code></code></pre><p>The correctness, cost, and optimality analysis is very similar in these cases as well.</p><h2><strong>Conclusion</strong></h2><p>Linear search is a powerful paradigm precisely because it is universal. Whether we are checking for the existence of an item, finding its index, or identifying the minimum or maximum element in a collection, the exhaustive approach provides absolute certainty. No matter the nature of the data, if we test every single element and skim through every possibility, the problem will be solved.</p><p>The primary drawback of this certainty is the cost: some search spaces are simply too vast to be traversed one item at a time. To achieve better performance, we must move beyond the assumption of an unstructured sequence. We need to know more about the search space and impose some level of structure.</p><p>In our next chapter, we will explore the most straightforward structure we can impose: <strong>order</strong>. We will see how knowing the relative position of items allows us to implement what is arguably <em>the most efficient and beautiful algorithm ever designed</em>.</p><p>But that&#8217;s a story for another Tuesday.</p><blockquote><p><em>Remember you can read the <a href="https://matcom.github.io/codex">full draft online</a> (currently Part 1) or get a <a href="https://store.apiad.net/l/codex">beautifully typed PDF</a> if you want to support the project.</em></p></blockquote>]]></content:encoded></item><item><title><![CDATA[Introducing GenSIE 2026]]></title><description><![CDATA[An invitation for researchers and developers interested in Agentic AI]]></description><link>https://blog.apiad.net/p/introducing-gensie-2026</link><guid isPermaLink="false">https://blog.apiad.net/p/introducing-gensie-2026</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Mon, 26 Jan 2026 19:32:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qNGT!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F582c72c0-c120-4ea8-ae6b-376a025250bb_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you have built an &#8220;Agentic&#8221; workflow recently, you know the pain. You prompt a model to perform a complex action&#8212;maybe scraping a website, analyzing a contract, or controlling a robot&#8212;and you ask it to return the result in a nice, clean JSON format so your Python code can parse it.</p><p>And it works. Mostly.</p><p>Until it doesn&#8217;t. Until the model decides to wrap the JSON in markdown backticks. Or adds a &#8220;Here is your data&#8221; preamble. Or hallucinates a field that doesn&#8217;t exist in your Pydantic model. Or, worse, it just makes up a fact that sounds plausible but isn&#8217;t there.</p><p>We are entering the era of agentic AI, where models talk to machines, not just humans. And machines speak in protocols. If we want AI that is reliable, robust, and <em>affordable</em>, we cannot rely on massive 100B+ parameter models for every single function call. 
We need <strong>Small Language Models (SLMs)</strong> that can speak general-purpose structured data fluently.</p><p>That is why, together with my colleagues at the University of Havana and the University of Alicante, we are launching <strong>GenSIE: General-purpose Schema-guided Information Extraction</strong> at <a href="https://sites.google.com/view/iberlef-2026">IberLEF 2026</a>.</p><h2>The Challenge: Zero-Shot Structure</h2><p>Most information extraction tasks are &#8220;fixed.&#8221; You train a model to find <code>PERSON</code>, <code>ORG</code>, and <code>DATE</code>. You show it thousands of examples. It learns. GenSIE is different: it is a <strong>Zero-Shot Schema</strong> task.</p><p>At inference time, your system receives a text and a <strong>schema it has never seen before</strong>. It might be a legal verdict today, a recipe tomorrow, and a chemical compound specification next week. Your system must read the schema definition (provided as a JSON Schema), understand the semantic constraints (like &#8220;extract the verdict, but map it to <code>POSITIVE</code> or <code>NEGATIVE</code>&#8221;), and generate valid, grounded JSON.</p><p>There is also a hallucination trap built in&#8212;sometimes we&#8217;ll sneak in a field that the model could answer from its original training data, but that is not explicitly answered in the input text, so your system has to output <code>null</code> for it.</p><p>Oh, and there is a catch: <strong>No Fine-Tuning allowed</strong>.</p><h2>Why This Matters (and Why It&#8217;s Hard)</h2><p>We explicitly designed this task to close an existing innovation gap. Huge models like GPT-5 or Gemini 3 Pro can often brute-force this problem through sheer scale. But running a 1TB model just to parse a date is sustainable neither economically nor ecologically.</p><p>GenSIE challenges you to use <strong>Small, Open-Weights Models</strong> (like Llama 3 8B, Qwen 14B, or Salamandra 2B). To make these smaller models perform at a high level, you can&#8217;t just throw compute at the problem. You need <strong>Inference-Time Engineering</strong>:</p><ul><li><p><strong>Constrained Decoding:</strong> Forcing the token sampling to obey a grammar.</p></li><li><p><strong>Chain-of-Thought:</strong> Letting the model reason about the schema before outputting the JSON.</p></li><li><p><strong>Self-Correction:</strong> Catching validation errors and asking the model to fix them in a loop.</p></li></ul><p>This is the real challenge. You cannot just throw more compute at the problem and hope it will get solved by an expensive API call. You&#8217;ll have to work around these hard technical constraints to achieve something that works as well as Gemini or ChatGPT but runs on commodity hardware.</p><h2>Join Us</h2><p>We are building a dataset of <strong>1,000 human-curated examples</strong>&#8212;rigorously checked to punish hallucinations and reward precision.</p><p>The timeline is tight, but the barrier to entry is low. On March 1st we&#8217;ll release a first batch of annotated examples and a startup kit with Docker templates and a baseline implementation. Then you have until May 8th to submit a working system. All participants (either solo or in teams) have the option to submit a system paper that will be peer-reviewed and indexed in Scopus.</p><p>If you are interested in the future of reliable AI agents, structured generation, or just want to test your engineering skills against a hard benchmark, I invite you to participate. 
This is particularly interesting for master or PhD students in search of interesting, open research problems.</p><p>Check out the official site, and leave me a comment if you want to know more.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://uhgia.org/gensie&quot;,&quot;text&quot;:&quot;Official GenSIE Website&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://uhgia.org/gensie"><span>Official GenSIE Website</span></a></p><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Part I of The Algorithm Codex]]></title><link>https://blog.apiad.net/p/part-i-of-the-algorithm-codex</link><guid isPermaLink="false">https://blog.apiad.net/p/part-i-of-the-algorithm-codex</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Mon, 19 Jan 2026 09:54:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qNGT!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F582c72c0-c120-4ea8-ae6b-376a025250bb_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The first draft of <strong>Part I: Searching and Sorting</strong> for <em>The Algorithm Codex</em> is now officially ready for readers. This opening section establishes the foundational theme of the book: that structure is the primary driver of computational efficiency. You can read the full draft online at <a href="https://matcom.github.io/codex/">matcom.github.io/codex</a> or download a high-quality PDF version on <a href="https://store.apiad.net/l/codex">Gumroad</a> on a pay-what-you-want basis to support the project.</p><p>This draft encompasses 19 distinct algorithms across six chapters, focusing on the core logic and intuitions that make them work:</p><ul><li><p><strong>Basic Search</strong>: Establishing the universal but expensive baseline of searching through unstructured sequences.</p></li><li><p><strong>Efficient Search</strong>: Introducing the &#8220;magical&#8221; scaling of binary search and bisection, extending these concepts to abstract decision spaces through predicate search.</p></li><li><p><strong>Basic Sorting</strong>: Analyzing the &#8220;geometry of inversions&#8221; to understand why simple approaches like Selection and Insertion sort are naturally bounded by a quadratic ceiling.</p></li><li><p><strong>Efficient Sorting</strong>: Breaking the quadratic barrier with the recursive divide-and-conquer strategies of Merge Sort and Quick Sort.</p></li><li><p><strong>Rankings and Selection</strong>: Implementing strategies for order statistics, including the randomized Quick Select and the deterministic &#8220;Median of Medians&#8221;.</p></li><li><p><strong>Linear Time Sorting</strong>: Bypassing the theoretical limits of comparison-based algorithms by exploiting domain-specific constraints with Counting and Radix sort.</p></li></ul><p>While the core content is ready, the <em>Codex</em> remains a work in progress. I plan to further extend the explanations with more intuitive descriptions and perform deeper analysis without getting into overly formal proofs. I am also considering the addition of a few more clever algorithms to round out this section before moving on to Part II on foundational data structures (lists, queues, stacks, and dictionaries).</p><p>Over the next two weeks, I will be adapting several chapters into articles for <em>The Computist Journal</em>. 
These posts will focus specifically on the beauty and underlying intuitions of binary search and efficient sorting, rather than the nitty-gritty implementation details.</p><p>If you find this resource valuable, please consider supporting the ongoing development of the remaining seven parts by getting the PDF on Gumroad. Your support directly enables the continuation of this project.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://store.apiad.net/l/codex&quot;,&quot;text&quot;:&quot;Support The Algorithm Codex&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://store.apiad.net/l/codex"><span>Support The Algorithm Codex</span></a></p>]]></content:encoded></item><item><title><![CDATA[Introducing the Algorithm Codex]]></title><description><![CDATA[A New Journey Into the Science of Computation]]></description><link>https://blog.apiad.net/p/introducing-the-algorithm-codex</link><guid isPermaLink="false">https://blog.apiad.net/p/introducing-the-algorithm-codex</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Sun, 18 Jan 2026 13:51:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qNGT!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F582c72c0-c120-4ea8-ae6b-376a025250bb_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey, this is just a quick informative post to talk to you about what&#8217;s coming next in <em>The Computist Journal</em>.</p><p>As part of our ongoing work in curriculum design for the Computer Science career at the University of Havana, we are launching a public project we&#8217;ve dubbed <strong>The Algorithm Codex</strong>. It&#8217;s a big name, I know, but we have big goals for it. We plan to cover the vast majority of algorithms taught at the undergraduate level in a typical CS major, moving from the absolute basics of searching and sorting to more complex structures like trees, graphs, and string matching. We&#8217;ll even dive into some of the more advanced final boss topics in number theory and computational complexity.</p><p>This is an ongoing effort that we will be updating weekly as new chapters and algorithms are finalized.</p><h4><strong>What exactly is the Codex?</strong></h4><p>The whole project is designed as a living reference book. This means it isn&#8217;t just a collection of source code; it&#8217;s a narrative experience built on the philosophy of <strong>literate programming</strong>. Every algorithm comes with intuitive explanations, correctness proofs, and back-of-the-envelope complexity analysis.</p><p>We aren&#8217;t going to get bogged down in dense, impenetrable math. Instead, the Codex is meant to serve as a complement&#8212;not a replacement&#8212;for traditional textbooks. It&#8217;s a place for students and enthusiasts alike to find a quick, clear reference to the best, most well-known, and occasionally the most obscure but cute algorithms ever devised.</p><p>The project is entirely open-source on <a href="https://github.com/matcom/codex">GitHub</a>, and the accompanying book is, of course, free to <a href="https://matcom.github.io/codex">read online</a>. It&#8217;s still very early stages but it&#8217;ll grow, in time, as good things do. 
I have big hopes for this project.</p><h4><strong>What&#8217;s the deal for you?</strong></h4><p>Since I am one of the main contributors to this project, I&#8217;ll be using <em>The Computist Journal</em> to post deep dives into some of these algorithms over the next few months. I&#8217;ll be interleaving these with our usual topics&#8212;CS theory, AI, and the philosophy of computation&#8212;but for a while, a large part of my focus will be on these fundamental building blocks.</p><p>You can expect the usual <em>Computist</em> style: intuitive explanations, clear visualizations, and a focus on <em>why</em> things work the way they do.</p><p>We are kicking things off this week with an entry on the basics of search. We&#8217;ll be understanding and designing from scratch what is arguably the most beautiful and quintessential algorithm in computer science: <strong>Binary Search</strong>.</p><p>So, if learning about algorithms is your kink, hit that subscribe button. I&#8217;ll see you next Tuesday with the first entry.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Is the Universe a Computer? - Part I]]></title><description><![CDATA[Unravelling the true nature of the Cosmos, one bit at a time.]]></description><link>https://blog.apiad.net/p/is-the-universe-a-computer-part-i</link><guid isPermaLink="false">https://blog.apiad.net/p/is-the-universe-a-computer-part-i</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Fri, 02 Jan 2026 11:38:50 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" 
sizes="100vw"><img src="https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="5616" height="3744" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3744,&quot;width&quot;:5616,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;silhouette photography of person&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="silhouette photography of person" title="silhouette photography of person" srcset="https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxfHx1bml2ZXJzZXxlbnwwfHx8fDE3NjczMzc0OTd8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption 
class="image-caption">Photo by <a href="https://unsplash.com/@grakozy">Greg Rakozy</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>Let&#8217;s kick off 2026 with a thought experiment.</p><p>You&#8217;ve all heard the <a href="https://en.wikipedia.org/wiki/Simulation_hypothesis">Simulation Hypothesis</a> by now, right? The idea that we might just characters in a high-res version of <em>The Sims</em> running on a supercomputer in some teenager&#8217;s basement in another dimension.</p><p>Well, this is <strong>not</strong> that.</p><p>The simulation hypothesis is a fun thought experiment, and it&#8217;s certainly great for selling movie tips, but I think it misses the point entirely. It presumes an <em>outside</em> that we can never reach. But we don&#8217;t need to posit any external observer, a higher-dimensional teenager in a basement. What if the universe isn&#8217;t <em>inside</em> a computer; but it is <em>itself</em> a computer.</p><p>This is a serious metaphysical position held by several prominent thinkers known as <strong>pancomputationalism</strong>, which basically means everything there is, is just computation. In this and a follow up article, I want to run with this argument and stretch it to its ultimate implications. Not necessarily because <em>I</em> believe it (I will tell you <em>exactly</em> what I believe at the end) or because I want <em>you</em> to believe it.</p><p>No, I just honestly think this is a very intellectually engaging argument to discuss, and even if it ends up not being true in its ultimate form (which we shall see), it presents some pretty damn cool ideas&#8211;and some pretty horrific ones. So even if just as a fun thought experiment, I invite you to ponder on it for a while.</p><p>You may be surprised by what you discover.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The Computist Journal is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>The Universe as a Universal Computer</h2><p>To understand pancomputationalism, we have to go back to the 1980s, to a physicist-turned-polymath named Stephen Wolfram staring at a computer screen.</p><p>Wolfram was playing with something called <a href="https://en.wikipedia.org/wiki/cellular_automata">Cellular Automata</a>&#8212;incredibly simple programs that live on a grid of pixels. You start with a line of black and white squares and give them a rule: &#8220;If your neighbor is black and you are white, turn black in the next step.&#8221; It sounds like a toy or perhaps something you&#8217;d learn in a mid-school basic coding class.</p><p>CAs where not Wolfram&#8217;s invention. We can trace them back to Stanislaw Ulam and the polygenious John von Neumann back in the 1940s (because of course anything you find interesting in late 20th century was actually invented by von Neumann!) 
but it wasn&#8217;t until the 70s when John Conway brought them to the public&#8217;s attention with a series of puzzles about a specific form of 2-dimensional CA called <a href="https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life">Game of Life</a>.</p><p>The following picture shows a &#8220;Glider Gun&#8221;, one of the most interesting patterns that answers one of Conway&#8217;s original questions: the existence of configurations that ran forever without looping (you can see how the small thingies that are thrown diagonally&#8212;called gliders&#8212;keep moving forever).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F6SM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F6SM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif 424w, https://substackcdn.com/image/fetch/$s_!F6SM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif 848w, https://substackcdn.com/image/fetch/$s_!F6SM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif 1272w, https://substackcdn.com/image/fetch/$s_!F6SM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F6SM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif" width="320" height="230.4" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:180,&quot;width&quot;:250,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21272,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/183230309?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F6SM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif 424w, https://substackcdn.com/image/fetch/$s_!F6SM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif 848w, 
https://substackcdn.com/image/fetch/$s_!F6SM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif 1272w, https://substackcdn.com/image/fetch/$s_!F6SM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c3f3cf-af39-4d49-b66f-a78bfd068e3e_250x180.gif 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Taken from Wikipedia. By Lucas Vieira - Own work, CC BY-SA 3.0, <a href="https://commons.wikimedia.org/w/index.php?curid=101736">https://commons.wikimedia.org/w/index.php?curid=101736</a></figcaption></figure></div><p>The reason these are interesting, among other things is because depending on the initial configuration, there are several, qualitatively distinct developments a cellular automaton can follow. Wolfram classified them into 4 more-or-less well-defined classes.</p><p>Classes 1 and 2 corresponding to different ways in which the initial configuration converges to a stable configuration (like the Glider Gun) that may or not keep evolving, but in a very controlled, predictable way. Some of these eventually stop, others keep going forever, but always in a very structured form.</p><p>Class 3, in contrast, is chaotic, random, unpredictable. In fact, several configurations in this category are thought to be cryptographically-secure random generators&#8212;the strongest kind of randomness you can get from a deterministic set of rules.</p><p>And then there is Class 4, a category of CA configurations that exhibit such a degree of complexity in their evolution that Wolfram conjectured most if not all configurations in this category where capable of general-purpose computation! Eventually, it was shown that extremely simple CAs do possess this trait, including one of Wolfram&#8217;s original CAs, <a href="https://en.wikipedia.org/wiki/Rule_110">Rule 110</a>, and several configurations in Conway&#8217;s Game of Life.</p><p>The reason this is interesting for our discussion, is because it led some thinkers to ponder whether this kind of complexity born out of simplicity could be the explanation behind all of physics. The first one is perhaps Konrad Zuse, a German engineer who built the first programmable computer (the Z3) during World War II.</p><p>In his 1969 book <em>Rechnender Raum</em> (Calculating Space), he proposed that the universe is a discrete lattice. He argued that the infinities that plague quantum physics could be solved if we just admitted that space has a minimum pixel size. In Zuse&#8217;s vision, space isn&#8217;t an empty stage where things move around. Space <em>is the computer</em> itself. An electron isn&#8217;t a particle moving <em>through</em> the vacuum; it&#8217;s a propagating pattern <em>on</em> the spatial grid. Think of it like a glider in Conway&#8217;s Game of Life: the pixel doesn&#8217;t move, the <em>information</em> does.</p><p>This is the heart of <strong>Pancomputationalism</strong>. The idea that, deep down, the Universe is not a metaphorical clockwork machine but an actual digital computer, though not one made of transistors, of course. Wolfram takes this to an extreme, by positing all of physics can be derived from a simple computational model, a hypergraph of nodes and edges that change according to some specific rules. 
In this theory&#8212;which is perhaps the most advanced form of pancomputationalism&#8212;the underlying computational lattice is so intricate that extremely simple processes like one electron jumping from an energy state to another would require several trillions of operations.</p><p>But the details matter less than the general idea: if some extremely simple mechanisms (like the Rule 110 CA) can support general computation, then universal computers must be abundant in the Universe. As an example, the DNA molecule must be capable of universal computation!</p><p>So why not take this argument to its logical end? Perhaps this is all there is! Perhaps universal computation is the ultimate level of complexity. If that&#8217;s the case, then the whole Universe is ultimately just a very complex computer.</p><h2>Time for some counter-arguments</h2><p>Of course, the idea that the universe is just a computer&#8212;or more specifically, <a href="https://en.wikipedia.org/wiki/Turing_machine">a Turing machine</a>&#8212;is a hard pill to swallow for many. If the cosmos is indeed a digital lattice, then everything within it should be effectively computable. But nature has a way of throwing us curve balls that look suspiciously like hacks of the system. Processes that seem to require more than universal computation to work out.</p><p>Let&#8217;s check two of the most famous counter-arguments to pancomputationalism.</p><h3>Black Holes and Hypercomputers</h3><p>One of the most persistent challenges to pancomputationalism is the idea of <strong>Hypercomputation</strong>. The argument goes like this: one of the simplest ways to show something is more complex than a Turing machine is to show it can solve problems known to be computationally undecidable. The most famous of these is the Halting problem&#8211;the question of whether a given arbitrary computation will eventually halt.</p><p>This was proven by Alan Turing himself to be generally undecidable. That is, you <em>cannot</em> predict whether a computer program will eventually halt or run forever just by looking at its source code. You can do it for some programs (because the code has no loops, for example, or for some contrived mathematical property), but you cannot devise a single, universal rule&#8212;a computer program&#8212;that tells you without doubt whether another arbitrary program stops.</p><p>But you can simulate it! If you want to know whether a program ends, just run it. If it ends, then eventually you will know it. The problem is, if it doesn&#8217;t end, you will never know! You will just wait indefinitely, because you can never say &#8220;Ok, I&#8217;ve seen enough&#8221;; the program might just end in the next iteration.</p><p>Ok, so here&#8217;s the twist. This presumes each iteration takes a <em>finite</em> amount of time, so an infinite number of iterations must take an infinite time, so you cannot outwait it. But what if we could make each iteration take less and less time? Like, the first iteration takes 0.5 seconds, the next one 0.25 seconds, and so on, each iteration taking half of the previous one? If we had this magical computer that accelerates as it computes, we could finish an infinite amount of iterations in a finite time!</p><p>This is called hypercomputation, and is an extension of something called a <a href="https://en.wikipedia.org/wiki/Supertask">hypertask</a> to the realm of computational complexity. Hypertasks are like taking limits in real life. 
You perform each step slightly faster, in such a way that the infinite sum of smaller and smaller times converges to something finite. Much like Zeno&#8217;s paradox, but each step is not smaller, it&#8217;s just faster.</p><p>Can we have hypercomputers in the real world? Well, some physicists suggest that the extreme gravity of a special, theoretical kind of black hole could allow for supertasks. Imagine an observer orbiting a black hole while a computer falls into it. Because of the way gravity warps time, the falling computer, in its own timeframe, would spend eternity counting towards infinity. But we would observe all its infinite lifetime compressed into a finite amount of time.</p><p>This would be crazy! Imagine we want to know whether, e.g., the <a href="https://en.wikipedia.org/wiki/Riemann_hypothesis">Riemann hypothesis</a>&#8212;the most important unsolved problem in pure mathematics&#8212;is true. We would just make a program that would check all infinite zeros of the Zeta function, run it in a computer, and throw it into a black hole. As the computation stretches to infinity, we would know definitively whether any zero has real part different from 1/2, because the program would check them all in finite time!</p><p>We could solve almost any open mathematical problem this way!</p><p>Now, this does require some stretch of imagination, some bizarre physics that many don&#8217;t agree actually exists. These are not regular black holes, but a special kind of black hole that has never been observed. And of course we have the issue of actually building a computer that can last forever! But nevertheless, just proving the physical possibility of hypercomputation&#8211;even if impossible in engineering terms&#8211;would undermine the whole idea of pancomputationalism.</p><h3>The Mind vs. The Machine</h3><p>Then there is the challenge from the human side of the fence. Roger Penrose, in his seminal <em>The Emperor&#8217;s New Mind</em>, famously argued that human consciousness is fundamentally non-algorithmic.</p><p>His argument leans heavily on <a href="https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_theorems">G&#246;del&#8217;s Incompleteness theorems</a>. Penrose claims that because we can &#8220;see&#8221; that a G&#246;delian statement is true&#8212;even though it&#8217;s unprovable within its own formal system&#8212;our brains must be doing something that transcends simple computation. He actually suggests that this &#8220;something&#8221; is a quantum gravity effect happening in the tiny structures of our brain called microtubules.</p><p>Critics like Scott Aaronson are quick to point out that even if our brains use quantum effects, that doesn&#8217;t necessarily make them hypercomputational. (A quantum computer, after all, is still just a computer; it just has a different complexity class). But Penrose&#8217;s argument hits on a deep-seated intuition: that phenomena such as love, consciousness, or creativity aren&#8217;t just very long strings of 1s and 0s. If this is true, then not only is the Universe itself uncomputable, but actually the most interesting human traits, perhaps even the root of intelligence itself, would be forever outside the reach of regular computers&#8212;no superintelligent AI for you, Sam Altman.</p><h2>The Universe Hardware Spec</h2><p>But for now, let&#8217;s run with the original argument. If we do assume the Universe is doing something akin to general computation, we can ask the next logical question: How long has this been going on? 
Not in years, of course&#8211;we kinda know that already&#8211;but in <em>computational steps</em>. How big is the Universal Hardware?</p><p>The question might seem bogus at first, but underneath this seemingly crazy assumption we will find some profound truths about the nature of computation. To even start to answer this we must link two seemingly separate realms: the digital and the physical.</p><p>We often think of information as something ethereal&#8212;software is just math, right? But in the 1960s, Rolf Landauer showed all us wrong. He realized that information is fundamentally tied to thermodynamics. Specifically, <a href="https://en.wikipedia.org/wiki/Landauer%27s_principle">Landauer&#8217;s Principle</a> states that any logically irreversible operation&#8212;like erasing a bit of information&#8212;must produce a specific amount of heat.</p><p>This isn&#8217;t the regular heat you feel when your computer works for too long, of course. That is just a byproduct of imperfect engineering&#8211;moving pieces with friction, electrical wiring with imperfect insulation, etc. No, this is much more fundamental. Even if you remove all sources of energy loss, the fact that two bits of information become one (when you <em>and</em> two boolean variables for example) means that some amount of energy is released, increasing the entropy of the system.</p><p><em>Thinking</em> has a literal energy cost.</p><p>This gives us a maximum performance for any given physical computer: <a href="https://en.wikipedia.org/wiki/Bremermann%27s_limit">Bremermann&#8217;s Limit</a>. Just take the total energy you input into the system, assume there are no extra losses, and you will still deplete that energy just <em>doing computation</em>. So, if the Universe is a computer, it must have this physical limit. After all, there is just so much energy in the Universe. How fast is the cosmic CPU?</p><p>If you take 1 kg of matter and turn it into the Ultimate Laptop, it can perform at most 1.36&#8197;&#215;&#8197;10<sup>50</sup> bits per second. That&#8217;s a staggering number&#8212;far beyond anything we can currently build&#8212;but it&#8217;s not infinite. It is a hard, physical ceiling. There is no &#8220;overclocking&#8221; the universe beyond the speed of light and the Planck constant.</p><p>When you scale this logic up to the entire observable universe, you get a spec sheet for reality that is both terrifying and humbling.</p><p>Physicist Seth Lloyd calculated that since the Big Bang, the entire universe has performed at most 10<sup>120</sup> operations on roughly 10<sup>90</sup> bits. To put that in perspective, there are some 10<sup>80</sup> elementary particles in the observable universe, so each particle must be represented by something like 10<sup>10</sup> bits. Huge, yes, but <em>finite</em>.</p><h2>Where does this leave us?</h2><p>If the Universe is indeed a computer, that has been running for a finite amount of time, and has thus performed a finite amount of computation, what can we say about the laws of physics. Are there things forever beyond the realm of Science.</p><p>This is question I want to answer next, but that will take us on a tour around some of the most important theoretical concepts in computer science. 
In the end, we will see that a Digital Universe has some damn horrifying constraints, but it also possesses some beautiful freedoms we can only dream of in an otherwise uncomputable Universe.</p><p>But that is a story for another Friday!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/p/is-the-universe-a-computer-part-i/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/p/is-the-universe-a-computer-part-i/comments"><span>Leave a comment</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Welcome to The Computist Journal - 2026 Update]]></title><description><![CDATA[A short intro to this blog and what you can find here.]]></description><link>https://blog.apiad.net/p/welcome-to-the-computist-journal</link><guid isPermaLink="false">https://blog.apiad.net/p/welcome-to-the-computist-journal</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Tue, 30 Dec 2025 14:18:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!fvWl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fvWl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fvWl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!fvWl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!fvWl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!fvWl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fvWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6358412,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/182895064?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fvWl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!fvWl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!fvWl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!fvWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60b170a7-9bff-4ea0-89ea-1472dbc3349f_2816x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Generated with Nano Banana 3 Pro.</figcaption></figure></div><p>Welcome, dear reader! </p><p>If you&#8217;re new over here, this post will get you set up and oriented on what you can expect to find in this blog/newsletter. 
And if you&#8217;re a seasoned reader of The Computist Journal, this post will explain you what&#8217;s new or different for 2026.</p><p>I am Alejandro, a computer science professor, AI researcher, and a firm believer that technology is neither magic, nor good or bad, but a language we can all learn to speak and tool we have both the power and the responsibility to use for our own flourishing.</p><p>I call this position techno-pragmatism, and it informs everything I write about in this blog.</p><p>In 2025 my primary focus was the mostly harmless nature of modern AI. In 2026 I&#8217;m shifting gears and expanding my interests to cover a wider range of computer science topics, including AI, but not limited to it.</p><p>Here&#8217;s how I plan to do it.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h3><strong>Writing in Public</strong></h3><p>Most of the long-form content you see here in 2026 will serve a dual purpose: it will be a full featured standalone post, and it will eventually become a chapter in one of my upcoming books. I am building a public and free <a href="https://books.apiad.net">library of computational knowledge</a>, and I am inviting you to watch the gears turn in real-time.</p><p>In 2025 I finished the first draft of <a href="https://store.apiad.net/l/mhai">Mostly Harmless AI</a>, and as it goes through several editing and revision phases, I will turn my focus for a while to core topics in computational thinking like algorithm design and the theory of computation&#8212;although AI will remain a main theme in the upcoming months as well, how wouldn&#8217;t it!</p><p>To make the content as accessible as possible, I am simplifying the journal into three distinct tracks (or sections):</p><ol><li><p><strong>Educational Articles:</strong> Long-form, weekly or biweekly deep dives into Computer Science theory, Artificial Intelligence, Mathematics, and the Philosophy of Computation.</p></li><li><p><strong>Essays:</strong> Personal, opinionated pieces on the role of science, the future of education, parenting in a digital age, and ethics. 
These appear whenever I feel I have something interesting or important to say.</p></li><li><p><strong>News:</strong> Short, informational updates on my writing milestones, new software projects, and the &#8220;behind-the-scenes&#8221; of the Computist universe.</p></li></ol><h3><strong>The 2026 Roadmap</strong></h3><p>My research and writing in 2026 will focus on three major topics:</p><ul><li><p><strong>Foundations of Computer Science:</strong> The duality of languages and machines; which is to say the story behind some of the most important scientific and technological breakthroughs of the last century.</p></li><li><p><strong>Algorithm Design:</strong> The art and science of making computers do our bidding, together with the deep understanding necessary to solve some of the most complex problems.</p></li><li><p><strong>Artificial Intelligence:</strong> Moving beyond the hype to understand the actual mechanics of neural networks, LLMs, chatbots, and the limits of what is achievable with synthetic intelligence.</p></li></ul><h3><strong>The Schedule</strong></h3><p>I aim to maintain a rhythm that allows me to output quality articles regularly:</p><ul><li><p><strong>6+ Educational Posts per month</strong> (roughly 1.5 per week), each month focusing on one theme to maintain some short-term cohesion, but varying themes across month.</p></li><li><p><strong>1 or 2 Personal Essays per month</strong>, whenever I feel there&#8217;s something worthy to say about something.</p></li><li><p><strong>2 to 3 Informational News posts per week</strong> to keep you in the loop about my open source projects and books.</p></li></ul><div><hr></div><h3><strong>Supporting The Computist Journal</strong></h3><p>All my articles, books, and educational material, and most of the code I write, are and forever will be free&#8212;as in free beer <em>and</em> free speech. There no subscriptions, no strings attached, you can read it all online without any compromise. I am a product of excellent free education, and this is my way to pay it forward.</p><p>However, if you find value in my work and want to support the hundreds of hours that go into research and typesetting, there are several ways to do it.</p><h4>Tier Zero: The Reader</h4><p>The simplest and most valuable way to support my work is just by reading, commenting, and sharing whatever you find interesting here. At the end of 2025 there <strong>6,800</strong> or so of you, and that is far beyond the number of people I ever believed could be interested in reading my work. </p><p>Please keep doing that!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h4><strong>Tier One: The Student</strong></h4><p>Next level is grabbing some or all of my in-progress books. You can <a href="https://store.apiad.net">get each one individually</a>, but for the price of a single standard technical textbook, you can purchase <strong>The Computist Compendium</strong>.</p><p>This is a lifetime bundle of beautifully typeset PDF and ePUB copies of all my current and future books, featuring enhanced visuals and exclusive complementary content. 
It is the easiest way to own a part of the soul of this project.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://store.apiad.net/l/compendium&quot;,&quot;text&quot;:&quot;Buy the Compendium&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://store.apiad.net/l/compendium"><span>Buy the Compendium</span></a></p><h4><strong>Tier Two: The Supporter</strong></h4><p>If you want to provide regular support, the next level is becoming a paid subscriber here on Substack. Besides my eternal gratitude, this gives you a couple of nice perks:</p><ul><li><p><strong>Early Access:</strong> All educational posts are published for you one week before the general public.</p></li><li><p><strong>Prioritized Attention:</strong> You can join a private Discord server where I give technical advice on whatever topics we discuss over here.</p></li><li><p><strong>100% Discount:</strong> You receive a code for a total discount on all my digital products on Gumroad, including the Compendium.<br></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe&quot;,&quot;text&quot;:&quot;Become a Supporter&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe"><span>Become a Supporter</span></a></p><h4><strong>Tier Three: The Founder</strong></h4><p>For those who want to give the maximum level of support, I offer the <strong>Founding Member</strong> tier. This is a one-time contribution that grants you all previous perks, plus access to <strong>The Codex</strong>: my private Obsidian Vault where all my thinking and technical writing happens.</p><p>The Codex includes all my raw notes, early drafts, background research, LaTeX sources, and everything that <em>didn&#8217;t</em> make the final cut. It is an uncensored, behind-the-camera view of my entire creative and scientific process&#8212;including unfinished work that may never see the light of day elsewhere.</p><p>Plus, if you accept, I&#8217;ll make a special mention of you in the acknowledgement section of all my books!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe&quot;,&quot;text&quot;:&quot;Become a Founder&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe"><span>Become a Founder</span></a></p><div><hr></div><h3><strong>Join the Journey</strong></h3><p>Whatever your interests, I am grateful you are here. Let 2026 be a wonderful year for you! 
I hope I can add some value to it with a bit of free knowledge.</p><p><strong>See you in the next post!</strong></p>]]></content:encoded></item><item><title><![CDATA[How to Train your Chatbot - Chapter One]]></title><description><![CDATA[Hello AI]]></description><link>https://blog.apiad.net/p/how-to-train-your-chatbot-chapter-4d8</link><guid isPermaLink="false">https://blog.apiad.net/p/how-to-train-your-chatbot-chapter-4d8</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Wed, 22 Oct 2025 11:01:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ydor!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p>The following is the first draft of Chapter 1 of <strong>How to Train your Chatbot</strong>, a developer-centric book I&#8217;m writing in public. Subscribe to get all chapters in your inbox, as they come out, for free.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ydor!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ydor!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png 424w, https://substackcdn.com/image/fetch/$s_!ydor!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png 848w, https://substackcdn.com/image/fetch/$s_!ydor!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png 1272w, https://substackcdn.com/image/fetch/$s_!ydor!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ydor!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png" width="1456" height="683" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:683,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:82009,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.apiad.net/i/176766314?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ydor!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png 424w, https://substackcdn.com/image/fetch/$s_!ydor!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png 848w, https://substackcdn.com/image/fetch/$s_!ydor!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png 1272w, https://substackcdn.com/image/fetch/$s_!ydor!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb1006c3-cc3a-4419-8b64-350f8977683b_1538x721.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In this first chapter we will build a bare-bones conversational application&#8211;a chatbot&#8211;that is capable of maintaining a coherent conversation with a single user. 
We will learn how to connect with an LLM provider, pass messages, and stream the responses back.</p><p>In this chapter we won&#8217;t use any additional libraries beyond the Python <a href="https://github.com/openai/openai">OpenAI</a> wrapper to communicate with an LLM backend. This is on purpose, so we understand the whole lifecycle of a typical conversation.</p><p>Then, in the next chapter, we will redo the core logic using <a href="https://github.com/gia-uh/argo">ARGO</a> to simplify the chat management and introduce some modularization, and we will add a database to store conversations and make our interactions persistent.</p><p>The whole source for this chapter (and this entire book) can be found in this <a href="https://github.com/apiad/chatbot">GitHub repository</a>. This specific chapter is in the <a href="https://github.com/apiad/chatbot/blob/main/basic.py">basic.py</a> file.</p><h2>Setting up the environment</h2><p>The first step is to set up a development environment. At the moment, the community-preferred way of doing this with Python is using a package and virtual environment manager, and by far the best in the world is <a href="">uv</a>. So let&#8217;s begin by installing that, and once ready, just run:</p><pre><code><code>mkdir chatbot # or whichever folder name you prefer
cd chatbot
uv init .</code></code></pre><p>This will create a <code>pyproject.toml</code> file in the <code>chatbot</code> folder. Now let&#8217;s add <code>streamlit</code> and <code>openai</code>.</p><pre><code><code>uv add streamlit openai</code></code></pre><p>After a few seconds downloading the necessary packages, you will be ready to start coding.</p><h3>Get the source code</h3><p>The complete source code for this book is available on <a href="https://github.com/apiad/chatbots">GitHub</a> with an MIT license. Feel free to fork it and use it as you see fit.</p><h2>The basic chatbot</h2><p>Let&#8217;s start outlining our first chatbot. We will use Streamlit&#8217;s built-in functionality to create a pretty decent chatbot UI in less than 10 lines of code. Here is the boilerplate code.</p><pre><code><code>import streamlit as st
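# NOTE: Streamlit re-runs this entire script from top to bottom on every user interaction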

# import and setup OpenAI
&lt;&lt;setup_openai&gt;&gt;

# regenerate the conversation history
&lt;&lt;render_history&gt;&gt;

msg = st.chat_input()
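# chat_input returns None until the user submits a message, in which case we stop and wait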

if not msg:
    st.stop()

with st.chat_message(&quot;user&quot;):
    st.write(msg)

# code to actually call the LLM
&lt;&lt;invoke_llm&gt;&gt;

with st.chat_message(&quot;assistant&quot;):
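    # st.write consumes the reply() generator, animates the stream, and returns the full text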
    response = st.write(reply(msg))

# save the llm response
&lt;&lt;save_llm_response&gt;&gt;</code></code></pre><h3>About Literate Programming</h3><p>You may have noticed the code above uses a somewhat weird notation, like <code>&lt;&lt;setup_openai&gt;&gt;</code>, which is not valid Python. This is because we are using a tool called <code>illiterate</code> to automatically generate the source code from the book content. This notation is how we reference snippets of code that will be defined later on.</p><p>This is a paradigm called <em>Literate Programming</em> that emphasizes narrative documentation as the primary source, and the actual source code is derived automatically, so that your documentation (this book) and your code are always synchronized.</p><p>Don&#8217;t worry too much about it; just know that all these snippets will be explained in the remainder of this chapter. You&#8217;ll get the hang of it.</p><p>This completes the basic layout of a typical chatbot app. We still haven&#8217;t specified how to actually call the LLM provider, and how to manage the conversation history. We will do that in the remainder of this chapter.</p><h2>Setting up an LLM provider</h2><p>For all practical purposes, when you code a chatbot you consider the LLM as a black-box component that is, almost always, encapsulated behind a REST API. This lets you focus on building the app and forget about the extremely complex problem of serving, scaling, and monitoring an LLM in a production environment.</p><p>This is pretty much the same as when you use a production database like MySQL, Postgres, or MongoDB: you almost never run the database server daemon directly. Instead, you almost certainly provision a managed server from Amazon, Azure, Google, or any other of the myriad cloud providers. In the same way, most of the time you won&#8217;t run an LLM locally, but instead use a cloud provider which has far better infrastructure.</p><p>By far, the most common API design used by all cloud LLM providers is the OpenAI API, which has become a de-facto standard. So, even if you don&#8217;t use OpenAI&#8217;s models directly (like GPT-5), you will almost certainly use the <code>openai</code> Python package to communicate with your LLM cloud provider, whoever they are.</p><p>In this book I will recommend <a href="https://openrouter.ai">OpenRouter</a>, because they provide access to a huge range of models, and they take care of automatically routing requests to the optimal cloud provider. But you are free to choose whatever LLM provider you prefer, and as long as they give you an OpenAI-compatible API (and they all do), the rest of this book will work exactly the same for you.</p><p>Whatever you choose, you will need three things from your LLM provider:</p><ul><li><p>A base URL (e.g., <code>https://openrouter.ai/api/v1</code>)</p></li><li><p>An API key (e.g., <code>sk-12345678...</code>)</p></li><li><p>A model identifier (e.g., <code>meta-llama/llama-4-maverick:free</code>)</p></li></ul><p>Of these three, the API Key is the most important to keep safe (and secret) because it&#8217;s what the LLM provider will use to identify you (and thus charge you) for using their service. If this API Key ends up in a GitHub repository or somewhere public, someone might use it to impersonate you and thus deplete your credits or put you in significant debt with your LLM provider.</p><div class="pullquote"><p>TL;DR: Keep your API Key secret!</p></div><p>Once you have identified your LLM provider, you need to provide <code>streamlit</code> with the credentials and metadata indicated above. 
The best way to do this for a <code>streamlit</code> app is to use its native secrets management. You need to create a <code>.streamlit</code> folder, and place a <code>secrets.toml</code> file inside. There you can add the entries just as if they were environment variables.</p><pre><code><code># .streamlit/secrets.toml
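# placeholder values: use your own provider's base URL, API key, and model id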
api_key=&quot;sk-12345678&quot;
base_url=&quot;https://openrouter.ai/api/v1&quot;
model=&quot;meta-llama/llama-4-maverick:free&quot;</code></code></pre><p>Needless to say, this <code>secrets.toml</code> file should be the first line in your <code>.gitignore</code>!</p><p>For the purpose of development and testing, you can use one of the many free models OpenRouter hosts, which give you enough daily uses for development purposes. Read more on the <a href="https://openrouter.ai/docs/faq#how-are-rate-limits-calculated">OpenRouter documentation</a>. As a perhaps obvious disclaimer, I&#8217;m not associated with OpenRouter, nor is any part of this book supported by them. I just think they provide a wonderful service for the AI developer community.</p><p>Once you&#8217;ve gotten your API key, you can use it to create an OpenAI client to interact with the LLM.</p><pre><code><code>from openai import OpenAI
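# the client speaks the OpenAI API, which most providers are compatible with;
# the base URL and key are read from .streamlit/secrets.toml via st.secrets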

client = OpenAI(
    base_url=st.secrets.base_url,
    api_key=st.secrets.api_key,
)</code></code></pre><p>And we&#8217;re ready to start sending some messages.</p><h2>Calling the LLM</h2><p>OpenAI&#8217;s client makes it very easy to invoke an LLM. We just need to pass in a list of messages and stream the result back to our Streamlit app. But first, we need to understand one key thing about LLM providers: they are <strong>stateless</strong>. This means you <em>always have to pass the whole conversation</em> to the API, because neither the actual model nor the server itself &#8220;remembers&#8221; the conversation.</p><p>So, we will need to store the conversation in our application state. I&#8217;ll show you how to do this in just a second, but let&#8217;s assume we have it and move on to actually talking to the LLM.</p><p>In our app, we need to include the following function, which will stream the response from the LLM to the user.</p><pre><code><code>&lt;&lt;store_conversation&gt;&gt;

def reply(msg: str):
    # retrieve the conversation list
    conversation = st.session_state.conversation
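    # the API is stateless, so every call sends the entire conversation so far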

    for chunk in client.chat.completions.create(
        model=st.secrets.model,
        messages=conversation,
        stream=True,
    ):
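        # some streamed chunks carry no text (for example, the final one only
        # signals completion), so only yield chunks that actually have content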
        if msg := chunk.choices[0].delta.content:
            yield msg</code></code></pre><p>That&#8217;s it. We just invoke <code>client.chat.completions.create</code> with the model name and conversation (a list of messages we will see right away) and stream the results back in the form of a Python generator. The <code>st.write</code> method in our main app is smart enough to produce a pretty typewriter animation as we receive the response one chunk at a time.</p><p>The conversation is actually stored in Streamlit&#8217;s built-in session storage, which is a dictionary-like object unique to each user session. This means that as long as we stay in the same browser tab, the conversation persists. Once we open another tab or hit F5, the session storage is cleaned up and we have a new conversation. Kind of crude, but for now it will do.</p><p>And this is the missing snippet to actually keep track of the conversation. We need to store the user message just after we get it from the chat input widget:</p><pre><code><code>if &quot;conversation&quot; not in st.session_state:
    st.session_state.conversation = []

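# each message is a plain dict with a &quot;role&quot; and a &quot;content&quot; key,
# which is the format the OpenAI chat API expects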
st.session_state.conversation.append(dict(role=&quot;user&quot;, content=msg))</code></code></pre><p>And we need to store the assistant reply message in the main loop after streaming it. Fortunately, the <code>st.write</code> method is smart enough so that it not only prints the message as it is streamed, but also returns the full concatenated message once the stream is closed.</p><pre><code><code>st.session_state.conversation.append(
    dict(role=&quot;assistant&quot;, content=response)
)</code></code></pre><p>As you&#8217;ve seen, we&#8217;re storing the messages in the format that the OpenAI API expects. This is a list of dictionaries, where each dictionary contains a <code>role</code> and a <code>content</code> key. The role (for now) can be either <code>user</code> or <code>assistant</code>, and it&#8217;s important we respect it because the models are trained to behave as if they are replying from the &#8220;assistant&#8221; perspective.</p><h2>Rebuilding the conversation history</h2><p>The final step to make our app a fully-fledged chatbot is to actually render the whole conversation in the UI. This is necessary because Streamlit works with an immediate execution paradigm, which effectively means every interaction clears the whole screen and redraws the entire UI. Thus, every time the user enters a message, the UI restarts and we lose the previously rendered interaction.</p><p>Fortunately, as we have the conversation stored in the session state, we can simply iterate over it and render each message.</p><pre><code><code>for msg in st.session_state.get('conversation', []):
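    # .get covers the very first run, when the "conversation" key does not exist yet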
    with st.chat_message(msg['role']):
        st.write(msg['content'])</code></code></pre><p>We just need a bit of care the first time we run the app because the <code>conversation</code> key won&#8217;t exist; that&#8217;s why we use <code>get</code> instead of a direct indexer.</p><h2>Closing up</h2><p>Phew! There you go, in around 50 lines of code (including comments and whitespace) we have a fully working chatbot. It&#8217;s very crude for now: all it does is send some messages to an LLM provider and stream the response back, but this is the foundation we can build upon to create all sorts of cool stuff.</p><p>In the next chapter we will rebuild this application from scratch but using an LLM framework that will give us much more flexibility later on.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[AI Winter is Coming… Or Is It?]]></title><description><![CDATA[A level-headed, pragmatic overview of the forthcoming reckoning in the AI industry]]></description><link>https://blog.apiad.net/p/ai-winter-is-coming-or-is-it</link><guid isPermaLink="false">https://blog.apiad.net/p/ai-winter-is-coming-or-is-it</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Tue, 21 Oct 2025 14:32:00 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="4608" height="3456" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3456,&quot;width&quot;:4608,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;white pendant lamp hanging on ceiling outside of snow covered forest&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="white pendant lamp hanging on ceiling outside of snow covered forest" title="white pendant lamp hanging on ceiling outside of snow covered forest" srcset="https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1487782310695-ed8583618566?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwxNnx8d2ludGVyfGVufDB8fHx8MTc2MDk5NTQ3MXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@mirakemppainen">Mira Kemppainen</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>You can&#8217;t scroll through a 
tech feed these days without tripping over a prophecy: the AI bubble is about to burst, and a long, cold &#8220;AI Winter&#8221; is coming. The narrative is as seductive as it is simple. The current frenzy around Generative AI, we&#8217;re told, is a speculative mania. When the inflated expectations inevitably collide with reality and the firehose of investment capital slows to a trickle, the whole enterprise will be exposed as a grand fiasco. We&#8217;ll discover, the skeptics say, that it was all a <em>cuento</em>.</p><p>And let&#8217;s be clear: they&#8217;re not entirely wrong about the first part. The expectations <em>are</em> inflated. A correction is not just likely; it&#8217;s necessary.</p><p>But here&#8217;s my thesis: the idea that this correction will lead to another AI Winter&#8212;a catastrophic freeze comparable to the funding droughts of the 1970s and 80s&#8212;is a fundamental misreading of the landscape. I will argue that what we are heading for is not a collapse, but a <em>normalization</em>&#8212;what I will call an AI <em>autumn.</em> </p><p>The inevitable deflation of the hype won&#8217;t reveal a failed technology. Instead, it will reveal a technology that has already, quietly and irrevocably, proven its utility and woven itself into the fabric of our digital lives. </p><p>This isn&#8217;t a story about a bubble bursting; it&#8217;s about a revolutionary technology finally growing up. But let&#8217;s be clear: growing up can be a painful process. The normalization I&#8217;m describing won&#8217;t be a gentle, seamless transition. An industry built on unsustainable economics and AGI-or-bust promises can still face maybe not a brutal winter, but a significant autumn, even if the underlying technology continues to thrive.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h2>Anatomy of the Hype (Or Why the Skeptics Have a Point)</h2><p>Before we can talk about the future, we have to be honest about the present. The current AI landscape feels like a bubble because, in many ways, it is one. This isn&#8217;t to say the technology is vaporware; far from it. The frenzy is built on a kernel of genuinely astonishing progress. But that kernel has been buried under an avalanche of speculative capital and quasi-religious prophecy.</p><p>The promises are, to put it mildly, grandiose. Tech leaders, flush with unprecedented investment, speak of replacing vast swaths of the workforce and ushering in an era of unimaginable productivity. Every incremental improvement is framed as another step on the inexorable march toward Artificial General Intelligence. This narrative is then amplified by a chorus of <em>accelerationists</em> and futurists who speak of the Singularity not as a distant sci-fi concept, but as an imminent event. It&#8217;s a powerful and compelling story, and it&#8217;s fueling a gold rush.</p><p>But back on planet Earth, the story is more complicated. For every breathless demo, there are practical and theoretical roadblocks that the hype conveniently ignores. The most glaring is the hallucination problem. These models, by their very nature, invent things. 
We&#8217;ve managed to reduce the frequency, but we haven&#8217;t eliminated the phenomenon, and there are compelling theoretical arguments that we may never be able to. This isn&#8217;t just a bug; it&#8217;s a feature of the architecture, a fundamental crack in the foundation of trust.</p><p>This technical limitation then crashes headfirst into the corporate world&#8217;s messy reality. Most companies, lured by the promise of easy productivity gains, are discovering a <em>massive adoption gap</em>. They lack the clean data, the streamlined processes, and the technical expertise to reliably integrate these powerful but flawed tools. It&#8217;s no wonder, then, that an astonishing number of corporate AI projects&#8212;some estimates say as high as 85%&#8212;are quietly failing to deliver a return on investment. Sky-high promises plus messy, difficult reality is the classic recipe for a bubble.</p><p>Perhaps the most potent dose of reality, however, is coming from the frontier models themselves. We&#8217;re witnessing a classic case of diminishing returns. The leap in capability from GPT-3 to GPT-4 was so profound it felt like a paradigm shift, leading many to draw a straight line on the progress graph and conclude that GPT-5 would be knocking on AGI&#8217;s door. That hasn&#8217;t happened.</p><p>The newest models are better, certainly, but the improvement is incremental, not awe-inspiring. It strongly suggests we&#8217;re hitting the ceiling of what the current paradigm can do. Experts like Yann LeCun and Fran&#231;ois Chollet argue persuasively that to progress further, we need fundamentally new approaches&#8212;paradigms that have yet to be invented. This pushes the dream of AGI firmly back into the realm of long-term research, not the foreseeable future.</p><p>Compounding this is a simple fact: <em>the economics of frontier AI are fundamentally broken</em>. The cost to train a single model like GPT-4 is north of $100 million. The data center infrastructure required to support the industry&#8217;s ambitions will require an estimated $5.2 trillion by 2030.</p><p>Unsurprisingly, this has created a severe profitability crisis. In 2024, OpenAI reportedly lost approximately $5 billion on $9 billion in revenue, with inference costs alone accounting for a multi-billion dollar loss. This isn&#8217;t a business model; it&#8217;s a venture-subsidized science experiment, and it&#8217;s hitting a hard physical wall with an energy grid that cannot keep up.</p><p>Furthermore, we must recognize that this isn&#8217;t just another tech bubble. The investment flowing into AI is qualitatively different from, say, funding for a better SaaS tool or a more efficient database. A significant portion of this capital is a high-stakes, geopolitical bet on the imminent arrival of AGI. The valuations of the frontier labs are not based on their current, money-losing products; they are based on the promise of creating a literal <em>god-in-a-box</em>. </p><p>Whether Sam Altman and company believe or not is beyond the point. This dream of AGI is driving market valuations, and when the market finally digests that we are hitting a paradigm ceiling&#8212;a point this article has already made&#8212;the withdrawal of that &#8216;AGI-or-bust&#8217; capital won&#8217;t be a gentle correction. 
It will be a sudden, violent repricing that could vaporize billions in paper wealth overnight.</p><h2>What Will Happen When the Bubble Bursts?</h2><p>So, given the inflated expectations and technical ceilings, what happens when the hype recedes? I don&#8217;t really like to make predictions, and much less about the future. It&#8217;s damn hard. But I think we can outline a possible, perhaps even probable near future. I want to draw an analogy here and claim we will see not a true AI winter, but something close to an AI autumn.</p><p>An AI autumn is an economic event. It&#8217;s a period of massive financial correction, characterized by layoffs, hiring freezes, startup failures, and a freeze in venture capital. It&#8217;s painful for the people and companies in the field. An AI winter, on the other hand, is a crisis of relevance of the core technology. It&#8217;s when the technology itself proves to be a dead end, progress stalls, and the world moves on.</p><p>To be as blunt as I can, I do believe a severe autumn for the AI industry is not just possible; it&#8217;s likely. The current economics are unsustainable, as we&#8217;ve seen. But the central argument of this article is that this painful industrial correction will <em>not</em> trigger a catastrophic winter, which would be far worse. No, AI is here to stay, and here is why.</p><p>First, we can&#8217;t ignore the relentless democratization of compute. The idea that cutting-edge AI will forever be the exclusive domain of billion-dollar data centers is a historical fallacy. We are already seeing an explosion of highly capable open-source models that can run on local, consumer-grade hardware. What requires a professional-grade, $10,000 GPU today will run on your laptop in two years, and on your phone two years after that. </p><p>This trajectory completely decouples the utility of AI from the subsidized business models of a few large companies. The capability is escaping the lab and becoming part of the background radiation of computing.</p><p>Second, even if the progress of frontier models were to stop dead in its tracks today&#8212;which it won&#8217;t, but it will likely continue to decelerate&#8212;we still have a decade&#8217;s worth of technological breakthroughs that most of the world has not even begun to properly digest. The current adoption gap isn&#8217;t a sign of inevitable failure; it&#8217;s a sign that the technology has advanced far faster than our institutions can keep up. </p><p>A slowdown in R&amp;D investment won&#8217;t cause a retreat. Instead, it will trigger a necessary and healthy shift in focus from pure research to practical implementation, integration, and process refinement. This is what maturity looks like. The frantic sprint to invent the future will become the marathon of actually building it.</p><p>Most importantly, this shift will not trigger a true AI winter because we are simply far beyond the point where Artificial Intelligence can disillusion us. It is already a proven technology, woven so deeply into our digital infrastructure that a true winter is no longer possible.</p><h2>Why We Won&#8217;t See Another AI Winter</h2><p>Let&#8217;s start with Generative AI itself. Even with all its flaws, its core utility is now undeniable. The previous AI winters occurred when promising lab demos failed to translate into real-world applications. 
That is not the situation today.</p><p>A significant percentage of the global population&#8212;some conservative estimates say around 10%&#8212;now uses these tools not as novelties, but as integrated parts of their daily work. It&#8217;s the assistant that transcribes a meeting and pulls out action items, summarizes a sprawling email thread you don&#8217;t have time to read, and helps you rephrase a blunt message into a diplomatic one. Online search is quickly becoming the playground for generative AI, and online search is by far the most profitable business in the Internet Era. </p><p>The genie is out of the bottle; people are not going to suddenly stop using a tool that demonstrably saves them time, just because its creators promised it would become a god.</p><p>But perhaps the world of software development is an even more potent example. There&#8217;s a lot of noise about irresponsible &#8220;vibe coding,&#8221; where novices generate code they don&#8217;t understand, creating an unmaintainable mess. This is a real problem, but it&#8217;s a problem of skill, not a failure of the tool.</p><p>For experienced developers, these assistants are transformative. The mythical &#8220;10x productivity&#8221; boost is largely a myth, but a consistent 1.5x to 2x multiplier is very real. I&#8217;ve seen it in my own projects. Code assistants act as the new IntelliSense, handling the mind-numbing boilerplate and letting me focus on the architectural challenges. I may now only write 20% of the final characters in the codebase, but I am still the author of 90% of the critical ideas. This is not a crutch; it&#8217;s leverage.</p><p>And beyond these consumer-facing applications lies an even larger world of traditional machine learning that is indispensable to modern science and industry.</p><p>From drug discovery and genomic sequencing in biotech to predictive maintenance and supply chain optimization in manufacturing, decades of successful applications of AI in industry today deliver billions of dollars in quantifiable value. Their success is measured in efficiency gains and scientific breakthroughs, not hype cycles.</p><p>But the more fundamental point is this: the debate over a &#8220;Generative AI&#8221; bubble distracts from the fact that the broader field of AI has already won its place. We haven&#8217;t had a true AI winter since the 1990s because AI stopped being a distinct, speculative field and became the foundational plumbing of the modern world. The search engine that found this article? That&#8217;s AI. The recommendation algorithm that determines your social media feed? AI. The logistics network that delivered your last package, the facial recognition that unlocks your phone, the voice transcription that takes your meeting notes&#8212;it&#8217;s all AI. Not Generative AI (for the most part), but AI nonetheless.</p><p>The line between computer science and AI has become so blurred that it&#8217;s practically meaningless. To talk about an AI winter today is like talking about an Internet winter in 2005. The technology is simply too embedded to fail.</p><p>However, as we&#8217;ve argued, there will be some painful correction. That much is, I think, almost undeniable. If that&#8217;s indeed the case, here are some optimistic arguments for why it may all be for the better in the end.</p><h2>The Renaissance of AI Research</h2><p>When the unsustainable hype collides with this resilient foundation, a fundamental law of economics reasserts itself: there is no free lunch. 
An AI autumn is the inevitable trade-off for a period of unchecked exuberance. A wave of consolidation will wash away unprofitable startups, and the market&#8217;s strategic focus will pivot from &#8220;bigger is better&#8221; to efficiency.</p><p>But this period of commercial cooldown has a powerful, if counter-intuitive, silver lining: a renaissance of real research. History shows us that AI&#8217;s greatest winters have been fertile ground for its most important breakthroughs. The hype recedes, and with it, the noise. The crushing pressure for short-term commercial returns is replaced by the intellectual freedom to tackle fundamental, long-term challenges.</p><p>Many of the core technologies fueling today&#8217;s boom were born in the quiet of previous winters. The backpropagation algorithm, popularized by Geoffrey Hinton in the 1980s, was refined during a period of deep skepticism about neural networks. Most famously, the Long Short-Term Memory (LSTM) architecture, which was a cornerstone of natural language processing for decades, was developed by Hochreiter and Schmidhuber in 1997, the absolute heart of the last AI winter.</p><p>The coming autumn will trigger a similar cycle. As the brightest minds are freed from the scaling hype, the real work on the next generation of AI can begin. We are already seeing the intellectual seeds of this shift. AI pioneers are openly discussing the deep limitations of current models. Yann LeCun is championing his Joint Embedding Predictive Architecture (JEPA) as a path toward &#8220;world models&#8221; that learn abstract representations of reality.</p><p>The field of Neuro-Symbolic AI, which fuses neural nets with structured logic, is experiencing a surge in interest. These are not incremental improvements; they are explorations of entirely new paradigms.</p><h3>Conclusion: No Retreat, Just Normalization</h3><p>So, where does that leave us? The coming correction is not an apocalypse; it&#8217;s a maturation. The frantic, gold-rush energy will dissipate, and in its place, something far more durable will emerge. The deflation of the hype bubble will not send talent fleeing the field or cause us to abandon the tools we&#8217;ve built. Instead, it will mark the end of the beginning.</p><p>The great irony is that the very thing that guarantees AI&#8217;s long-term survival&#8212;its commoditization into reliable &#8216;plumbing&#8217;&#8212;is what makes the current industry valuations so precarious. Plumbing is a low-margin, utility business, not a world-dominating monopoly. This disconnect between utility and valuation is the financial fault line where the industrial earthquake will hit. The era of breathless, revolutionary promises will give way to the slow, difficult, and necessary work of integration. </p><p>This is the natural lifecycle of any transformative technology. It moves from a speculative curiosity to a reliable, if sometimes challenging, part of the professional toolkit. Generative AI will not become the all-knowing oracle we were promised, but it has already secured its place as a uniquely powerful tool for thought, creation, and productivity.</p><p>The question was never really <em>if</em> AI would change the world; the underlying technology has been doing that for decades. The real question is how we manage the transition. This industrial autumn will be cushioned, to some extent, by geopolitical reality. 
The race between the US and China ensures that a certain level of state-sponsored R&amp;D will continue, preventing a total 1980s-style collapse. </p><p>But for the people working in the field, the transition will still be jarring. The future of AI isn&#8217;t a simple story of success or failure. It&#8217;s the messy, often painful process of separating a world-changing technology from the unsustainable industry that&#8217;s driving it, and going back to the drawing board, back to building new and even cooler stuff.</p>]]></content:encoded></item><item><title><![CDATA[How to Train your Chatbot - Chapter Zero]]></title><description><![CDATA[Announcing a new series on building practical AI applications]]></description><link>https://blog.apiad.net/p/how-to-train-your-chatbot-chapter</link><guid isPermaLink="false">https://blog.apiad.net/p/how-to-train-your-chatbot-chapter</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Mon, 20 Oct 2025 11:22:51 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p><strong>TL;DR:</strong> In this upcoming series, we'll build an autonomous LLM-based agent from scratch, focusing on the fundamentals. We'll go from zero to fully autonomous deep research, one feature at a time, staying as close to the metal as possible. Subscribe to receive all posts in your inbox for free.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img 
src="https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="5472" height="3648" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3648,&quot;width&quot;:5472,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;three crumpled yellow papers on green surface surrounded by yellow lined papers&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="three crumpled yellow papers on green surface surrounded by yellow lined papers" title="three crumpled yellow papers on green surface surrounded by yellow lined papers" srcset="https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1577563908411-5077b6dc7624?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxjaGF0fGVufDB8fHx8MTc2MDg5NDczNXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" 
y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@lunarts">Volodymyr Hryshchenko</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>Have you looked at the latest AI breakthroughs&#8212;agents that can plan, code, and research&#8212;and thought, "That's cool and all but&#8230; how the hell does one actually build something like that?"</p><p>If you're just getting started with Python development, or even if you&#8217;re a seasoned expert, you've probably felt this gap too. Maybe you've played with the demos, called an API or two, and seen just enough of the magic to get excited.</p><p>But moving from just prompting an LLM to building intelligent, autonomous agents is a different beast entirely. Things start to break pretty soon as you stack up more and more complexity. And it seems every tutorial out there is oblivious to this. Anyone can show you how to code a simple chatbot, maybe with some RAG, but very few resources take you from those humble beginnings to something that actually looks like a modern multi-agent system that works autonomously.</p><p>This series is my answer to that problem.</p><p>Over the next 10 or so posts&#8212;the next couple of months&#8212;we are going to build a comprehensive, autonomous AI agent. And we're going to do it incrementally.</p><p>We'll start with the simplest possible application: the basic chatbot loop that stores conversation history. Why? Well, because even if everyone has already seen this, we need proper foundations to build on top. We need to see the full flow&#8212;how messages are stored, how a system prompt works, how to interact with the LLM, and what's really happening under the hood.</p><p>But from there we'll level up pretty fast, adding new capabilities piece by piece until we have a personal agent capable of doing deep research and generating long, coherent reports all on its own.</p><h2>Tools of the Trade</h2><p>Now, you might be thinking, why build from scratch and not just use LangChain, or LlamaIndex, or some other well-known framework?</p><p>Here's my thesis. When you start with those massive, professional frameworks, you learn more about the framework than you do about the fundamentals. Those tools are powerful, but they are also black boxes. They hide the logic and complexity behind layers of abstraction. You end up learning their way of doing things, their design patterns, and their API. And frameworks change, but fundamentals remain.</p><p>I believe it's far more valuable to learn how to do things as close to the metal as possible. Our goal is to write the least amount of boilerplate code possible, but&#8212;and this is the key&#8212;without hiding any of the core logic. We will build this entire thing ourselves, understanding every single piece of the puzzle.</p><p>We'll use a tech stack that gets out of our way. Only three tools:</p><p>For the UI we&#8217;ll use <strong>Streamlit</strong>. For those who don&#8217;t know it, Streamlit is a radically simple application framework that turns any Python script into a fully-fledged web app. The key here is what it doesn't make you do: you write zero boilerplate, zero layout, zero presentation, and zero state management code. You just write core logic, and a good enough web UI appears.</p><p>For the actual chatbot we'll use the <strong>ARGO</strong> framework. I like to call it FastAPI for AI agents because it has that same lightweight, decorator-based feel. 
It is the simplest, most Pythonic agent framework you&#8217;ve seen, and I promise you&#8217;ll fall in love with it. Despite being super simple, it forces you to build the logic using a very clean and lightweight skills-based pattern for modularizing your agent's intelligence.</p><p>And for data we&#8217;ll use <strong>BeaverDB</strong>. This is a lightweight wrapper on top of SQLite that gives you everything you could ever want from a modern database. But it is not a modern database server. It's just a single, embedded binary file on your disk. There's no Docker, no authentication, no connection strings, no schemas, no boilerplate. Yet, on top of this simplicity, BeaverDB gives us a very Pythonic, very comfortable API for a document database with vector and full-text search, and cool features like persistent dictionaries, lists, priority queues, and even pub/sub. It makes building something like a RAG pipeline extremely simple without any complex setup.</p><p>That's it. With just these three basic libraries, we will build a fully autonomous agent.</p><h2>The Journey Ahead</h2><p>Here's the roadmap I've thought of so far. We'll add a new level of intelligence in each post.</p><ul><li><p><strong>Level 0: The Conversationalist</strong>. The foundation. A simple Streamlit UI with a stateful chat loop that understands conversation history and system prompts.</p></li><li><p> <strong>Level 1: The Assistant</strong>. We'll give our agent long-term memory. We'll build a full Retrieval-Augmented Generation (RAG) pipeline from scratch using BeaverDB so it can answer questions based on a private knowledge base.</p></li><li><p><strong>Level 2: The Researcher</strong>. We'll break our agent out of its box and give it a tool to search the web, allowing it to access up-to-the-minute information.</p></li><li><p><strong>Level 3: The Analyst</strong>. This is where it gets really fun. We'll give the agent the ability to write and execute its own Python code in a safe sandbox to solve problems, analyze data, and even debug its own mistakes.</p></li><li><p><strong>Level 4: The Editor</strong>. We'll upgrade our UI from a simple chat to a persistent, shared canvas, where you and the agent can work side-by-side to co-write documents and refine plans.</p></li><li><p><strong>Level 5: The Scientist</strong>. The finale. We'll integrate all previous capabilities, giving the agent a master prompt that allows it to take a high-level goal, create a multi-step plan, and execute it independently from start to finish.</p></li></ul><p>So here is my promise. It may take us more than 5 posts, but we are going to learn how to build a chatbot by building a chatbot, step by step. No unnecessary theory, just working code and clear intuitions. </p><p>I'll be providing a public GitHub repository with all the code, and each post will be a direct, straightforward Python script that you can type in half an hour.</p><p>When you're done with this series, it won't matter if you want to keep using ARGO and BeaverDB&#8212;which I hope you will&#8212;or move on to more &#8220;professional&#8221; tools. You'll be ready anyway, because these tools are so thin that what you'll have learned isn't how to use a specific library. You will have learned how the logic works underneath.</p><p>You'll understand how to orchestrate the different components involved in a typical chatbot, how to modularize the agent's brain, and more generally, how to think about building capable, intelligent, LLM-based applications. 
That's the real takeaway.</p><p>In the next post, we'll jump straight into <strong>Level 0: The Conversationalist</strong>. So hit that subscribe button and you'll get all posts in your inbox, 100% free.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><p>Ready to build?</p>]]></content:encoded></item><item><title><![CDATA[How I Built the Database of my Dreams]]></title><description><![CDATA[And how you can use it to build AI apps 100x faster]]></description><link>https://blog.apiad.net/p/how-i-built-the-database-of-my-dreams</link><guid isPermaLink="false">https://blog.apiad.net/p/how-i-built-the-database-of-my-dreams</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Sat, 20 Sep 2025 21:27:31 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="6363" height="4242" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:4242,&quot;width&quot;:6363,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;a bug on a log in the 
water&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="a bug on a log in the water" title="a bug on a log in the water" srcset="https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1656528181090-0808f91fa894?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyfHxiZWF2ZXJzfGVufDB8fHx8MTc1ODQwMzM2Mnww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@derekotway">Derek Otway</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>I really love to code. That much I&#8217;m sure you can tell. So I have this very common problem, that I&#8217;m sure many of you have. A gazillion ideas in my head wanting to burst out, and very little time to try them all. To try and get around this, I like to prototype really fast and dirty implementations of anything that makes me curious, just to see if it leads somewhere. (This is why I have hundreds of unfinished projects in my Github.)</p><p>So, where was I? Yeah, this is one of the reasons I&#8217;ve made Python, and specifically <code>streamlit</code>, a key part of my toolkit. 
It lets me build the very simplest semblance of an app extremely fast, without considering anything remotely unnecessary (for a first prototype) like authentication and user roles, or worrying about the user interface.</p><p>Now, here is the thing. Recently, I&#8217;ve been building <em>a lot</em> of AI stuff. Chatbots galore for all sorts of tasks, from general question answering to specific things like storytelling, evolutionary computation, automated coding, you name it. And I keep finding a specific pain point that grinds me to a halt in the most exciting part of prototyping: <em>the database</em>.</p><p>Very soon, my simple AI prototype needs a vector database for RAG, or a key-value store for simple configurations, or a message queue for background tasks and a persistent list to store conversation history. And this is before anything remotely close to a production-ready feature is on the horizon. This is just the second thing you&#8217;ll need right after streaming the first few tokens from OpenAI. It&#8217;s inescapable for anything but the simplest toys.</p><p>So, before I know it, I&#8217;m wrestling with two or three containers&#8212;or worse, juggling credentials for three different cloud services&#8212;and writing boilerplate code just to glue it all together. I&#8217;ve become an accidental, and very grumpy, DevOps engineer. My worst nightmare!</p><p>What I want is something in the spirit of <code>streamlit</code> but for managing data. A minimalistic, no-bullshit database that just supports the basic modalities we all need in modern applications, like, dunno, JSON storage, maybe? And vector search? Yeah, that, but also combined with full-text and fuzzy search, persistent lists and queues&#8230; hell, even a decent pub-sub system if we&#8217;re asking! Is it too much to ask for someone to make SQLite but for modern data?</p><p>Well&#8230; I guess if you want something done, you might as well do it yourself.</p><p>So this is the story of how I built BeaverDB&#8212;a Pythonic and multi-modal interface on top of SQLite that just works out of the box for rapid prototyping. And it scales to medium-sized projects just fine (SQLite is damn fast!). BeaverDB is my attempt to build the tool I wish I had&#8212;a library that provides the high-level data structures modern AI applications need, without sacrificing the radical simplicity of a single .db file.</p><h2>Introducing Beaver</h2><p>The guiding principle behind Beaver&#8217;s API is a minimal surface area with a fluent interface. You only ever need to use two classes. Everything else flows naturally from the main database instance, returning dedicated wrapper objects with a rich, Pythonic interface.</p><p>One key idea for this design is that it should just work. Zero configuration, just sensible out-of-the-box decisions. So, no schema, everything indexed by default, no need to declare tables or entities or anything beforehand, collections just get created when first used, you know, that kind of thing. It should be as easy as instantiating a class and calling two methods.</p><p>Let&#8217;s take a quick tour of what you can do today with it.</p><p><strong>Key-Value &amp; Caching</strong></p><p>Need to store configuration, user profiles, or cache expensive API calls? The namespaced dictionary is your go-to. It behaves just like a Python dict but is backed by the database, with optional TTL support.</p><pre><code><code># Use a dictionary for caching API calls for 1 hour
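
# Assumed setup, not shown in the original snippets: everything below hangs
# off a single BeaverDB instance backed by one .db file. The import mirrors
# the Collections example further down; the filename is illustrative.
from beaver import BeaverDB

db = BeaverDB("app.db")
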
api_cache = db.dict("api_cache")
api_cache.set("weather", {"temp": "15C"}, ttl_seconds=3600)
print(f"Cached weather: {api_cache.get('weather')}")</code></code></pre><p><strong>Persistent Lists</strong></p><p>This is the most straightforward way to manage ordered sequences. For a chatbot, it&#8217;s the perfect way to maintain the turn-by-turn history of a conversation. It works like a Redis list, with all bells and whistles, implementing a full Pythonic list interface, but backed in the DB. It supports appending and removing from the head and tail, as well as inserting or removing anywhere in between, blazingly fast, because all operations are indexed.</p><pre><code><code># Manage a conversation with a user
chat_history = db.list("conversation_with_user_123")
chat_history.push({"role": "user", "content": "Hello, Beaver!"})
chat_history.push({"role": "assistant", "content": "Hello! How can I help?"})
print(f"Conversation length: {len(chat_history)}")</code></code></pre><p><strong>Priority Queues</strong></p><p>A priority queue is the essential tool for orchestrating an autonomous agent. It ensures the agent always works on the most critical task first, regardless of when it was added. The API is extremely simplified on purpose. For anything more complicated, use a full-featured list.</p><pre><code><code># An AI agent's task list
tasks = db.queue("agent_tasks")
tasks.put({"action": "summarize_news"}, priority=10)
tasks.put({"action": "respond_to_user"}, priority=1) # Higher priority

# Agent always gets the most important task first
next_task = tasks.get() # -&gt; Returns the "respond_to_user" task</code></code></pre><p><strong>Real-time Pub/Sub</strong></p><p>Need to build a decoupled, event-driven system? The pub/sub channel allows different parts of your application&#8212;or even different processes&#8212;to communicate in real-time. It has an extremely simple, fluent API, yet it is performant, thread-safe, and works even across different processes. Plus, it comes with an optional <code>async</code> interface if you&#8217;re feeling fancy.</p><pre><code><code># In one process, a monitor publishes an event
system_events = db.channel("system_events")
system_events.publish({"event": "user_login", "user": "alice"})

# In another, a logger subscribes and receives the message
with db.channel("system_events").subscribe() as listener:
    for message in listener.listen(timeout=1):
        print(f"Event received: {message}")</code></code></pre><p><strong>Collections &amp; Hybrid Search</strong></p><p>This is the core component for any Retrieval-Augmented Generation task. It&#8217;s a multi-modal collection of structured documents that understands vectors, and text, allowing you to combine search strategies for the best results. It also performs fuzzy search on demand with a very clever indexing strategy that I&#8217;ll tell you all about in the next section.</p><pre><code><code>from beaver import BeaverDB, Document
from beaver.collections import rerank

docs = db.collection("articles")
doc = Document(embedding=[...], content="Python is fast.")
docs.index(doc, fuzzy=True)
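# fuzzy=True presumably opts this document into the trigram-based fuzzy index
# described in "A Peek Under the Hood" below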

# Combine vector and full-text/fuzzy search for better results
vector_results = docs.search(vector=[0.15, 0.85, 0.15])
text_results = docs.match(query="pthon", fuzziness=1)
best_context = rerank(vector_results, text_results)</code></code></pre><p>And there&#8217;s a lot more. You can connect documents with relations and build a knowledge graph that you can later query to find similar documents or to implement a graph-based recommender system.</p><p>All in one freaking database file. No Docker. No servers. No headache.</p><h2>A Peek Under the Hood</h2><p>Beaver is built on a series of pragmatic design decisions intended to leverage SQLite&#8217;s native capabilities to the fullest, avoiding slow application-layer logic wherever possible.</p><p>For one, it never creates new tables when storing stuff. Everything is stored and indexed in cleverly designed global tables that are created at startup time (only the very first time the DB file is created). This also means you get virtually infinite lists, dicts, queues, collections, etc., because these aren&#8217;t different tables (which would be a pain in the arsenal to maintain). And&#8230; (roll drums)&#8230; no migrations!</p><p>I want to highlight two specific features to tell you a bit about the underlying implementation details, so you can see the lengths it goes to in order to be efficient out of the box.</p><p><strong>The Pub/Sub System</strong></p><p>The pub/sub system is the greatest example of efficiency by design. It&#8217;s built on a single, append-only log table with an index on the channel name.</p><p>For each channel, a single background thread polls this table for new entries and fans them out to any number of in-memory queues, one for each subscriber. The key insight here is that because the database is only ever touched by one polling thread per channel, adding a second, third, or hundredth subscriber adds <em>zero</em> additional load to the database. From the database&#8217;s perspective, new listeners are basically free.</p><p>Even better, this polling is only activated when there is at least one listener, and it stops immediately after the last listener disconnects. This means we only ever use the minimum resources necessary.</p><p><strong>Hybrid Text Search: The Two-Stage Filter</strong></p><p>The other feature I think is kinda beautiful is the fuzzy search. Calculating the Levenshtein distance between two strings is computationally expensive, and there is no way to build an index beforehand without a combinatorial explosion in storage size. So running it across every document in a collection would be unusable for anything beyond a few thousand entries.</p><p>Beaver solves this with a two-stage process. First, it uses a pre-computed trigram index in SQLite to instantly find a very small set of candidate documents that share a high percentage of 3-character &#8220;chunks&#8221; with the query. This is a very fast SQL query that already gives us a very good approximation of the final result. Then, we run the expensive Levenshtein algorithm <em>only</em> on this small, pre-filtered set of candidates in memory.</p><h2>The Vector Store Dilemma</h2><p>Vector storage is my biggest concern at the moment. The current implementation is, I think, pretty good for a typical use case of infrequent indexing and fast retrieval. But it can be much improved.</p><p>Right now, Beaver uses an in-memory k-d tree, which provides excellent search speed once the index is loaded into RAM. However, the index is ephemeral&#8212;it lives and dies with the application process.
This creates a significant bottleneck: every time your application starts, the <em>entire</em> index must be rebuilt from scratch by reading all vectors from the database. Furthermore, indexing a new document requires a full, blocking rebuild of the entire tree.</p><p>As long as you index documents infrequently and in a background process&#8212;which is what most RAG-based applications do&#8212;this works just fine. But I&#8217;m not satisfied with this implementation, so here is my plan for its update.</p><p>The roadmap involves integrating a state-of-the-art ANN library like <code>faiss</code>. This will enable a persistent, on-disk index that can be loaded instantly. But more importantly, we need to support incremental additions, which means newly added documents don&#8217;t force a rebuild of the entire index. The key to achieving this is to have a large, persistent index plus a small, in-memory, temporary index of new additions that gets merged into the persistent index from time to time. Keeping these two in sync is far from trivial, but I already have a somewhat detailed plan.</p><h2>Conclusion</h2><p>Beaver is the result of a couple of weekends of focused frustration, and it&#8217;s already the backbone of my own AI prototypes. Even if it never reaches any level of maturity, I&#8217;m extremely satisfied with it because it already works for my use cases, and I&#8217;ve learned so much building it. Damn, who knew building databases could be so fun!</p><p>Now, what&#8217;s in it for you? Well, Beaver is not meant to replace your production PostgreSQL cluster. That&#8217;s unthinkable. But it might be the &#8220;good enough&#8221; database that lets you go from an idea to a working prototype in minutes, not hours. So, stop being an accidental DevOps engineer and go build something cool.</p><p>You can get started with a simple <code>pip install beaver-db</code>. Check out the <a href="https://github.com/apiad/beaver">GitHub repository</a> for a lot of examples.
And leave me any questions or comments either here or in the repository issues.</p><p>Have fun!</p>]]></content:encoded></item><item><title><![CDATA[The Four Fallacies of Modern AI]]></title><description><![CDATA[And Why Believing in Them Hinders Progress]]></description><link>https://blog.apiad.net/p/the-four-fallacies-of-modern-ai</link><guid isPermaLink="false">https://blog.apiad.net/p/the-four-fallacies-of-modern-ai</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Wed, 10 Sep 2025 11:30:43 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="3163" height="2230" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2230,&quot;width&quot;:3163,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;closeup photo of Yale 19 key against black background&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="closeup photo of Yale 19 key against black background" title="closeup photo of Yale 19 key against black background" 
srcset="https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1503792070985-b4147d061915?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHw3M3x8cmFuZG9tfGVufDB8fHx8MTc1NzQzMzM5NXww&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@mattartz">Matt Artz</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>I've spent the last few years trying to make sense of the noise around Artificial Intelligence, and if there's one feeling that defines the experience, it's whiplash. One week, I'm reading a paper that promises AI will cure disease and unlock unimaginable abundance; the next, I'm seeing headlines about civilizational collapse. This dizzying cycle of AI springs, periods of massive investment and hype, followed by the chilling doubt of AI winters isn't new. It's been the engine of the field for decades.</p><p>After years of this, I've had to develop my own framework just to stay grounded. It&#8217;s not about being an optimist or a pessimist; it&#8217;s about rejecting both extremes. For me, it&#8217;s a commitment to a tireless reevaluation of the technology in front of us; to using reason and evidence to find a path forward, because I believe we have both the power and the responsibility to shape this technology&#8217;s future. 
That begins with a clear-eyed diagnosis of the present.</p><p>One of the most useful diagnostic tools I've found for this comes from computer scientist Melanie Mitchell. In a seminal paper back in 2021, she identified what she claims are four foundational fallacies, four deeply embedded assumptions that explain to a large extent our collective confusion about AI, and what it can and cannot do.</p><p>My goal in this article isn't to convince you that Mitchell is 100% right. I don't think she is, either, and I will provide my own criticism and counter arguments to some points. What I want is to use her ideas as a lens to dissect the hype, explore the counterarguments, and show why this intellectual tug-of-war has real-world consequences for our society, our economy, and our safety.</p><h2>Deconstructing the Four Fallacies</h2><p>For me, the most important test of any idea is its empirical validation. No plan, no matter how brilliant, survives its first encounter with reality. I find that Mitchell&#8217;s four fallacies are the perfect tool for this. They allow us to take the grand, sweeping claims made about AI and rigorously test them against the messy, complicated reality of what these systems can actually do.</p><h3>Fallacy 1: The Illusion of a Smooth Continuum</h3><p>The most common and seductive fallacy is the assumption that every impressive feat of narrow AI is an incremental step on a smooth path toward human-level Artificial General Intelligence (AGI). That is, that intelligence is a single, unidimensional metric on a continuum that goes from narrow to general.</p><p>We see this everywhere. When IBM's Deep Blue beat Garry Kasparov at chess, it was hailed as a first step towards AGI. The same narrative emerged when DeepMind's AlphaGo defeated Lee Sedol. This way of thinking creates, according to Mitchell, a flawed map of progress, tricking us into believing we are much closer to AGI than we are. It ignores the colossal, unsolved challenge known as the commonsense knowledge problem&#8212;the vast, implicit understanding of the world that humans use to navigate reality.</p><p>As philosopher Hubert Dreyfus famously said, this is like claiming that the first monkey that climbed a tree was making progress towards landing on the moon. Well, in a sense, maybe it is, but you get the point. We didn't get to the moon until we invented combustion rockets. Climbing ever taller trees gets us nowhere closer, it's just a distraction. In the same sense, mastering a closed-system game may be a fundamentally different challenge than understanding the open, ambiguous world.</p><p>But here's the nuance. While beating Kasparov isn't a direct step to having a conversation, the methods developed can be surprisingly generalizable. The architecture that powered AlphaGo was later adapted into MuZero, a system that mastered Go, chess, and Atari games without being told the rules. </p><p>Furthermore, can we really call a Large Language Model narrow in the same way? Its ability to write code and summarize text feels like a qualitative leap in generality that the monkey-and-moon analogy doesn't quite capture.</p><p>This leaves us with a forward-looking question: How do recent advances in multimodality and agentic AI test the boundaries of this fallacy? Does a model that can see and act begin to bridge the gap toward common sense, or is it just a more sophisticated version of the same narrow intelligence? 
Are world models a true step towards AGI or just a higher branch in a tree of narrow linguistic intelligence?</p><h3>Fallacy 2: The Paradox of Difficulty</h3><p>We have a terrible habit of projecting our own cognitive landscape onto machines, assuming that what's hard for us is hard for them, and what's easy for us is easy for them. For decades, the opposite has been true.</p><p>This is Moravec's Paradox, named after the roboticist Hans Moravec, who noted it's easier to make a computer exhibit adult-level performance on an IQ test than to give it the sensory and motor skills of a one-year-old.</p><p>This explains why we have AI that can master the ridiculously complex game of Go, while a fully self-driving car remains stubbornly just over the horizon. The "easy" things are built on what Mitchell calls the "invisible complexity of the mundane." This paradox causes a chronic mis-calibration of our progress and priorities, leading us to be overly impressed by performance in formal domains while underestimating the staggering difficulty of the real world.</p><p>Of course, some would argue this isn't a fundamental barrier, but a temporary engineering hurdle. They&#8217;d say that with enough data and compute, the "invisible complexity" of the real world can be learned, just like the complexity of Go was.</p><p>From this perspective, the problem isn't one of kind, but of scale. This forces us to ask: as sensor technology and robotics improve, are we finally starting to overcome Moravec's Paradox? Or are we just discovering even deeper layers of complexity we never knew existed?</p><h3>Fallacy 3: The Seduction of Wishful Mnemonics</h3><p>Language doesn't just describe reality; it creates it. In AI, we constantly use anthropomorphic shorthand, saying a system "learns," "understands," or has "goals." Mitchell argues this practice of using "wishful mnemonics" is deeply misleading, fooling not just the public but the researchers themselves.</p><p>When a benchmark is called the "General Language Understanding Evaluation" (GLUE) and a model surpasses the human baseline, headlines declare that AI now understands language better than humans. But does it?</p><p>The term "stochastic parrot" was coined as a powerful antidote, reframing what LLMs do as sophisticated mimicry rather than comprehension. This isn't just a semantic game, Mitchell argues; it creates a flawed mental model that leads to misplaced trust, encouraging us to deploy systems in high-stakes situations where a lack of true understanding can have serious consequences.</p><p>A fair critique is that these terms are a necessary cognitive shorthand. At a certain level of complexity, a system's emergent behavior becomes functionally indistinguishable from "understanding," and arguing about whether it really understands is an unprovable philosophical distraction.</p><p>But that still leaves a crucial question: can we develop a more precise, less anthropomorphic vocabulary to describe AI capabilities? Or is our human-centric language the only tool we have to reason about these new forms of intelligence, with all the baggage that entails?</p><h3>Fallacy 4: The Myth of the Disembodied Mind</h3><p>This is the most philosophical, and in my opinion, the most important fallacy. 
It's the deep-seated assumption that intelligence is, like software, a form of pure information processing that can be separated from its body.</p><p>This "brain-as-computer" metaphor leads to the belief that AGI is simply a matter of scaling up compute to match the brain's raw processing power. It's challenged by Mitchell and many others with the thesis of embodied cognition, a view from cognitive science which holds that intelligence is inextricably linked to having a body that interacts with the world. If this is correct, then our current approach may just be creating ever-more-sophisticated systems that are fundamentally brittle because they lack grounded understanding.</p><p>This is where we hit the great intellectual battle line in modern AI. The primary counterargument can be framed in terms of Rich Sutton's famous essay, "The Bitter Lesson," which argues that the entire history of AI has taught us that attempts to build in human-like cognitive structures (like embodiment) are always eventually outperformed by general methods that just leverage massive-scale computation.</p><p>From this viewpoint, embodiment isn't a magical prerequisite for intelligence; it's just another fiendishly complex problem that will yield to more data and processing power.</p><p>This tension poses a critical question for the future: do multimodal models that can process images and text represent a meaningful step toward solving the embodiment problem? Or are they just a more sophisticated version of the same disembodied mind, a brain in a slightly larger digital vat?</p><h2>What is Intelligence, Really?</h2><p>As we dig into these fallacies, a deeper pattern emerges. They aren't just four isolated mistakes; they're symptoms of a fundamental schism in how the AI world thinks about intelligence itself. Again, my goal isn't to pick a side but to avoid falling prey to cheap heuristics or ideological banners, and instead evaluate which of these paradigms gives us a more useful map of reality.</p><p>On one side, you have what I&#8217;ll call the Cognitive Paradigm, championed by thinkers like Mitchell and her mentor, superstar AI researcher and philosopher Douglas Hofstadter. This view sees intelligence as a complex, integrated, and embodied phenomenon. It assumes that the things we associate with human intelligence&#8212;common sense, emotions, values, a sense of self&#8212;are likely inseparable components of the whole, emerging from rich interaction with a physical and social world.</p><p>From this perspective, the path to AGI requires a deep, scientific understanding of these integrated components, not just more processing power.</p><p>On the other side is the Computationalist Paradigm, which is the implicit philosophy behind many of today's leading labs, and best captured by The Bitter Lesson. This posits that the biggest breakthroughs have always come from general methods that leverage massive-scale computation&#8212;in other words, from scaling things up.</p><p>In this paradigm, intelligence is a more abstract, substrate-independent quality of optimization. Problems like embodiment aren't fundamental barriers; they are just incredibly complex computational tasks that will eventually be solved by ever-larger models and ever-faster chips.</p><p>Of course, it's not a perfect binary. Most researchers are pragmatists, like me, working somewhere in the messy middle. But these two paradigms represent the poles of the debate, and the tension between them defines the entire field. 
It shapes which research gets funded, which systems get built, and ultimately, which vision of the future we are collectively racing toward.</p><h2>Why This Debate Matters</h2><p>This debate isn't just an academic parlor game. These fallacies have a massive ripple effect across society because they obscure a fundamental rule of technology and economics: there's no free lunch, only trade-offs.</p><p>The hype generated by fallacious thinking isn't just an innocent mistake; it's the fuel for a powerful economic engine. The intense competition between tech giants, the flood of venture capital, and the geopolitical AI race all depend on a constant narrative of imminent, world-changing breakthroughs. This political economy of hype forces us into a series of dangerous trade-offs.</p><p>First, we trade long-term progress for short-term hype.</p><p>The fallacies create an unstable, boom-and-bust funding cycle. During an AI spring, capital flows to projects that can produce impressive-looking demos, often based on narrow benchmarks. This starves the slow, methodical, foundational research needed to solve the hard problems like common sense and reasoning. The result is a field that lurches from one hype bubble to the next, leaving a trail of abandoned projects and unfulfilled promises that trigger the inevitable AI winter.</p><p>Second, we trade public trust for market excitement.</p><p>The cycle of over-promising and under-delivering is deeply corrosive. When we use wishful mnemonics to describe a system that "understands," and it then fails in spectacular, nonsensical ways in the real world, it breeds public anxiety and skepticism. Recent studies show the public perceives AI scientists more negatively than almost any other field, specifically because of a perceived lack of prudence. This isn't a vague feeling; it's a direct reaction to the unintended consequences of deploying brittle, overhyped systems.</p><p>Finally, and most critically, we trade responsible validation for speed to market.</p><p>This is where the consequences become most severe. Believing a system is on a continuum with general intelligence, or that it truly "understands" language, leads to its premature deployment in high-stakes domains.</p><p>When a mental health chatbot, which is fundamentally, at least today, a sophisticated pattern-matcher, gives harmful advice to a person in crisis, it&#8217;s a direct result of these fallacies. When we over-rely on brittle systems in healthcare, finance, or autonomous vehicles, we are making a dangerous bet, trading real-world safety for the illusion of progress.</p><h2>Conclusion</h2><p>So where does this leave us? The value of Mitchell's fallacies isn't just in spotting hype, but in exposing the deep, productive tension between these two powerful ways of thinking about intelligence. We can't ignore the fallacies, but we also can't deny the incredible, world-altering power of the scaling paradigm that fuels them.</p><p>Mitchell in her paper compares modern AI to alchemy. It produces dazzling, impressive results but it often lacks a deep, foundational theory of intelligence.</p><p>It&#8217;s a powerful metaphor, but I think a more pragmatic conclusion is slightly different. The challenge isn't to abandon our powerful alchemy in search of a pure science of intelligence. 
The goal, at least from a pragmatist point of view, should be to infuse our current alchemy with the principles of science, to make scaling smarter, safer, and more grounded by integrating the hard-won insights about how intelligence actually works.</p><p>The path forward, I believe, requires more than just intellectual humility. It also requires a willingness to synthesize these seemingly opposed worldviews, and a commitment to a tireless reevaluation of the technology before us. The ultimate question is not if we should choose the path of scaling or the path of cognitive science, but how we can weave them together to guide the raw power of our modern AI alchemy with the deep understanding of a true science of intelligence.</p>]]></content:encoded></item><item><title><![CDATA[The Era of AI is Here]]></title><description><![CDATA[How Artificial Intelligence is becoming the mediator of all human interactions, and why that is so scary]]></description><link>https://blog.apiad.net/p/the-hyper-spectacle-of-ai</link><guid isPermaLink="false">https://blog.apiad.net/p/the-hyper-spectacle-of-ai</guid><dc:creator><![CDATA[Alejandro Piad Morffis]]></dc:creator><pubDate>Fri, 29 Aug 2025 21:04:56 GMT</pubDate><enclosure url="https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080" width="6000" height="3376" 
data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3376,&quot;width&quot;:6000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;person hand reaching for the sky&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="person hand reaching for the sky" title="person hand reaching for the sky" srcset="https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 424w, https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 848w, https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1272w, https://images.unsplash.com/photo-1494138030114-a8cf519b022b?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wzMDAzMzh8MHwxfHNlYXJjaHwyOXx8aGFuZHxlbnwwfHx8fDE3NTY1MDE0MzN8MA&amp;ixlib=rb-4.1.0&amp;q=80&amp;w=1080 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@jeremyperkins">Jeremy Perkins</a> on <a href="https://unsplash.com">Unsplash</a></figcaption></figure></div><p>What if the most significant revolution of our time isn't just about faster computers or smarter algorithms, but about a fundamental re-wiring of how we 
experience reality itself? In an age often polarized between unbridled techno-optimism and dire predictions of collapse, I believe a more nuanced, pragmatic approach is essential. We must look beyond the hype and the fear to understand the true nature of the technological shifts unfolding around us.</p><p>In this essay, I will argue that we are not merely witnessing the rise of powerful new tools; we are entering the <strong>Era of Artificial Intelligence</strong>, a profound societal transformation where AI increasingly mediates our relationships with objective reality and, crucially, with each other.</p><p>To understand the magnitude of this shift, consider how past technological eras have reshaped human existence. For the purpose of this article, I'll define an "era" as a period marked by a fundamental shift where the majority of relationships, both with objective reality and among people themselves, become mediated by a specific technology stack.</p><p>Let's look at a few examples.</p><p>First, consider what we may call the Era of Electricity. Slowly at first, then at an ever-accelerating pace, our interactions with the physical world were completely transformed. Light, heat, power, all mediated by electric currents. Work in factories and farms became mediated by electricity, then transportation, then entertainment and education. Electricity even changed how we sleep. As people started staying outside after dark, we changed from a sleep cycle, often biphasic, that mirrored the behavior of the Sun, to our monophasic cycle that is completely artificial&#8212;we go to sleep long after it's dark, and we often wake up before the Sun comes out.</p><p>Another example is the Era of Computers. Our engagement with reality became even more profoundly mediated, not just by physical devices, but by the logic and processing power of silicon. From personal computing to industrial automation, the computer became the invisible hand guiding our interaction with information and machinery. Think how many jobs started to function exclusively with a computer mediating between our intentions and their results. We started to use computers to pay for things, where before a simple exchange of paper was enough. Even our language changed with the advent of computers.</p><p>The Internet Era took this mediation a step further, fundamentally altering human relationships. We started flirting over the Internet, ordering food, arguing about politics, and making friends over the Internet. Our social fabric itself became woven through digital threads. All of our lives are intricately connected to the Internet; there is almost no interaction among ourselves or with the real world that doesn't happen through some sort of online application or platform. Even this interaction between me, the author, and you, the reader, is only possible because the Internet mediates it. We wouldn't know each other otherwise.</p><p>Now, Artificial Intelligence stands poised to inherit this mantle. In the remainder of this article I want to argue why AI is already becoming the primary mediator of most meaningful human interactions, and will continue to do so in the near future.
And while no one can deny the progress of technology brings incredible benefits---and I'm the first to acknowledge it and embrace the immense power that AI grants---it is also undeniable that this mediation brings some pretty scary consequences, some which are already unfolding.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.apiad.net/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.apiad.net/subscribe?"><span>Subscribe now</span></a></p><h2>AI as the Ultimate Mediator</h2><p>AI's mediating power extends across every facet of our lives, subtly, yet profoundly, shaping how we perceive, interact with, and relate to the world and each other.</p><p>Consider how AI is already changing our perception and understanding of the world. In medicine, AI analyzes X-rays and MRIs, often detecting anomalies like tumors with greater speed and accuracy than the human eye. It mediates how doctors and patients <em>perceive</em> health conditions.</p><p>In industry, AI monitors machinery to predict failures, mediating our understanding of the physical world's future state. AI even helps us perceive the planet's health by processing satellite imagery to track deforestation or pollution.</p><p>Think about your phone's camera: when you take a photo, AI often automatically adjusts the lighting, focus, and even smooths skin, mediating how you perceive the scene or person you're photographing. It's not just capturing reality; it's enhancing or altering it before you even see the final image.</p><p>Beyond perception, AI is increasingly mediating our interaction and control over reality. Autonomous vehicles, from cars to drones, are changing how we physically move through and interact with our environment.</p><p>Smart home systems, managed by AI, mediate our control over our immediate living spaces. In manufacturing, AI-driven robots optimize production lines, mediating how we create and manipulate physical goods. When you ask a voice assistant like Alexa or Google Home to play music or turn off lights, AI is mediating your control over your environment through voice commands, rather than you physically interacting with switches or devices.</p><p>Many modern cars also have AI-powered features like adaptive cruise control or lane-keeping assist, mediating your interaction with the road by taking over some aspects of driving.</p><p>And then there's the realm of creation and augmentation. Generative AI, like Midjourney or ChatGPT, creates art, music, and text from simple prompts, mediating our creative output and expanding what's possible.</p><p>AI accelerates scientific discovery by analyzing vast datasets, mediating our ability to discover and augment knowledge. When you use an AI assistant or just an innocent grammar checker like Grammarly, AI is mediating your writing process and creative output, suggesting changes to make your sentences clearer or more impactful. Many video editing apps now have AI features that can automatically cut scenes or add background music, mediating your creative process in video production.</p><p>Our daily consumption of information is also heavily mediated by AI. Algorithms curate our news feeds, suggesting articles and entertainment based on past behavior, mediating what information and experiences we consume from the vast digital reality. 
Think about your Netflix recommendations; AI is constantly analyzing your viewing habits to mediate what shows and movies you see, often leading you down rabbit holes of similar content.</p><p>But the impact of AI on human relationships is even more profound. Communication itself is being redefined. Real-time language translation, powered by AI, enables seamless conversations across linguistic divides, mediating interpersonal understanding. AI-powered chatbots handle customer service, mediating interactions between businesses and consumers. And in a more intimate, and perhaps concerning, development, AI companions are emerging, offering conversational interaction and emotional support, mediating personal and emotional relationships in ways we're only beginning to comprehend.</p><p>Have you ever used a dating app? Or just Facebook or LinkedIn? AI algorithms are mediating who you see as potential partners by suggesting matches based on your profile and preferences, fundamentally changing how many people initiate relationships, romantic or otherwise. </p><p>When you send an email and your email client suggests auto-completing your sentences or offers quick replies, AI is mediating your written communication, subtly influencing your phrasing and the very nature of your interaction with the person at the other side of that email.</p><p>Consider how ride-sharing apps like Uber or Lyft connect drivers and passengers; AI mediates who you interact with for transportation, replacing traditional hailing or dispatch systems. Many online gaming platforms also use AI to match players for competitive games, mediating who you collaborate or compete with.</p><p>Finally, AI is beginning to mediate our very identity and self-perception. Deepfakes and synthetic media, generated by AI, create realistic images and videos of people, blurring the lines of authenticity and mediating how we perceive others and ourselves online. Personalized digital avatars, often AI-assisted, mediate our online identity.</p><p>Think about the filters you use on social media that can alter your appearance in photos or videos; AI is mediating how you present yourself to others online and, in turn, how you might perceive your own image. AI is starting to mediate even your relationship with your own mind.</p><h2>The Perils of AI Mediation</h2><p>The immediate benefits of Artificial Intelligence are undeniable. AI offers unprecedented efficiency, personalized experiences, and the ability to solve complex problems that have long eluded us. This potential for human flourishing is immense, provided we make conscious choices about its use.</p><p>However, this era, like all technological advancements, comes with no free lunch. While the benefits are compelling, they are inextricably linked to profound, often insidious, risks. To focus solely on these gains would be to miss the critical trade-offs inherent in this new era. As AI imposes more layers of mediation between us and our world, we risk a dangerous detachment from immediate reality, from each other, and ultimately, from our own agency.</p><p>This is where the prescient critique of Guy Debord, articulated in his seminal 1967 work <em>The Society of the Spectacle</em>, becomes chillingly relevant. Debord argued that modern capitalist society had transformed "all that was once directly lived" into "mere representation." 
The spectacle, for Debord, was not just a collection of images, but a social relationship mediated by images, where appearances replaced authenticity, and passive consumption supplanted active participation.</p><p>He warned of a world where reality itself was subsumed by its image, leading to profound alienation and a loss of genuine human experience. If Debord's mid-20th century world was a spectacle, the Era of AI threatens to usher in a <em>Hyper-Spectacle</em>&#8212;a phenomenon of mediation so pervasive and sophisticated that it would dwarf anything he could have imagined. I think there are three fundamental, distinct concerns about this hyper-spectacular reality that we should analyze in depth.</p><h3>Disconnection from Immediate Reality</h3><p>In the Hyper-Spectacle of AI, our direct engagement with reality is increasingly replaced by an <em>algorithmic illusion</em>. When AI curates all our information, news, and even sensory input, we risk living in an algorithmic filter bubble on steroids. Imagine an AI-powered AR overlay that filters out undesirable elements of reality, presenting a highly curated, potentially distorted, and ultimately unreal version of the world.</p><p>This is the ultimate triumph of Debord's "mere representation", where the mediated reality becomes more compelling, more perfectly tailored, than the actual reality. We become passive consumers of AI-generated solutions, outsourcing our direct engagement with the physical world and its challenges. This leads to a loss of practical skills and a diminished sense of agency, as our capacity for direct experience atrophies.</p><p>Think about how many people rely solely on their phone's GPS for directions, even in familiar areas. AI mediates their perception of their physical surroundings, often leading them to ignore street signs or landmarks, and potentially diminishing their spatial awareness. If the GPS is wrong, they're lost because they haven't engaged directly with the real world.</p><p>Similarly, when you watch a heavily AI-edited video or listen to an AI-generated song, you might be consuming something that feels real but was never "created" by a human in the traditional sense, mediating your experience of art and authenticity.</p><p>Then comes the black box issue&#8212;where AI systems make critical decisions based on opaque algorithms&#8212;which means we benefit from outcomes without understanding <em>why</em> or <em>how</em>. The "truth" presented by the AI is accepted without question, becoming an unchallengeable spectacle. This opaque authority reinforces the illusion, as the underlying mechanisms of control are hidden, and the mediated outcome is presented as objective fact, further separating us from the direct understanding of cause and effect.</p><p>For example, many people now rely on AI-powered smart assistants to answer questions or provide information without ever checking the source. 
If the AI gives a subtly incorrect or biased answer, it's often accepted as fact, mediating their understanding of truth without critical engagement.</p><p>Or consider a smart refrigerator that automatically reorders groceries based on AI predictions of your consumption; while convenient, it mediates your direct engagement with your food choices and shopping habits, potentially leading to a loss of awareness about what you're actually consuming or spending.</p><h3>Disconnection from Each Other</h3><p>The Hyper-Spectacle extends its reach into our most fundamental human connections, fostering a <em>simulated relationship</em> that replaces authentic interaction. The rise of sophisticated AI companions, so convincing they fulfill emotional or social needs, poses a chilling question: what happens to genuine human interaction?</p><p>The "relationship" with the AI is a performance, a simulation of connection, a perfect example of Debord's social relationship mediated by images where the image of connection supplants its substance. This risks eroding empathy, social skills, and the capacity for deep, complex human relationships, leading to profound loneliness in a hyper-connected, yet isolated, world.</p><p>Think about how many people prefer to text or message rather than call or meet in person. Even without AI, the digital mediation of communication can lead to a simulated connection where nuances of tone, body language, and spontaneous interaction are lost, potentially eroding deeper social skills. Now add an ever-present, never-angry chatbot that can replace all your friends and family.</p><p>Moreover, AI-driven algorithms optimize content delivery to maximize engagement, often by showing us more of what we already agree with. This fuels algorithmic polarization and echo chambers, making shared understanding and collective action incredibly difficult. Different groups live in entirely different "spectacles" of reality, curated by algorithms, making genuine dialogue and the bridging of divides increasingly impossible.</p><p>The spectacle here is not just what we see, but who we see and how we see them, fragmenting society into isolated, algorithmically-defined bubbles. If your social media feed is constantly showing you content from only one political viewpoint, AI is mediating who you see and what opinions you encounter, making it harder to understand or empathize with opposing views. You're living in an algorithmically-defined bubble.</p><h3>Servitude to Embedded Ideologies</h3><p>Perhaps the most insidious risk in the Hyper-Spectacle is the subtle, yet pervasive, <em>servitude to ideologies </em>embedded within these mediating layers. AI systems new and old, when trained on biased historical data, can perpetuate and amplify existing societal inequalities in areas like hiring, lending, or criminal justice. This ideology of past discrimination gets baked into the future, presented as objective truth by AI. </p><p>This is the spectacle's ultimate power: presenting its biased outcomes as neutral, objective reality, subtly reinforcing existing power structures and values without overt coercion.</p><p>Imagine an AI-powered resume screening tool that, because it was trained on historical hiring data, subtly favors candidates from certain demographics or educational backgrounds, even if those biases aren't explicitly programmed. 
This mediates who gets professional opportunities, reinforcing past inequalities.</p><p>To make things even worse, AI systems are increasingly designed to subtly influence our choices&#8212;what to buy, what news to read, whom to vote for&#8212;through personalized recommendations and persuasive interfaces. Our choices feel free, but they are meticulously guided by an unseen hand, a spectacle of autonomy that masks underlying control.</p><p>This is the essence of Debord's critique of commodity fetishism extended to every aspect of life: our desires and decisions are manufactured and mediated, not genuinely our own. When you're shopping online, AI-powered "recommended for you" sections or "customers also bought" suggestions are constantly mediating what products you see and are encouraged to buy, often leading you to spend more or buy things you didn't initially intend to. Your choice feels free, but it's heavily influenced.</p><p>When the development and deployment of powerful AI systems are concentrated in the hands of a few corporations or governments, those few gain unprecedented control over information, resources, and even human behavior. The AI-mediated world becomes a spectacle designed and controlled by an elite, shaping narratives and realities to their advantage, further cementing our role as passive spectators in a world we no longer truly govern.</p><h2>The Future is Not Predetermined</h2><p>This Era of AI is not a distant future; it is here, and it is accelerating at a pace that leaves little room for avoidance. We cannot, and perhaps should not, try to stop it. But we absolutely must shape its trajectory. The technology is inevitable, but its <em>direction</em> and <em>impact</em> are not predetermined. This requires a proactive, collective effort, guided by reason and evidence, not by fear or blind faith.</p><p>I have absolutely no idea how to "fix" the problem of AI mediation. But I'm sure there are at least three critical approaches we have to embrace, lest we become mere spectators of this hyper-spectacular reality.</p><p>First, we must demand regulation that keeps choice firmly in the hands of the people. This requires advocating for the principles of algorithmic choice. We should always be able to determine when an algorithm is being used to guide our choices, and, crucially, we must have the agency to understand, challenge, and, more importantly, select which algorithms we let influence us. Just as we choose our food or news sources&#8212;because you do, right?&#8212;we should demand the right to choose our algorithmic lenses.</p><p>This means not only having options to turn off algorithmic recommendations in, say, Netflix or Twitter, but also being able to connect our own algorithmic filters and recommendation engines to replace the built-in functionality. To achieve this, platforms need to embrace open protocols for algorithmic recommendation, so that anyone could, for example, implement a different feed-sorting procedure for Twitter. Platforms like Mastodon and Bluesky are pioneering this approach, but without explicit regulation there is no hope the largest platforms will follow.</p><p>It is clear, then, that we need robust institutional oversight. This means proactive, informed government regulation to set standards and enforce accountability. But it also means empowering non-governmental watchdogs, independent organizations, academia, and civil society, to monitor, critique, and advocate.
Their role is crucial in preventing overreach, ensuring fairness, and keeping powerful AI entities in check. This collective vigilance is essential to prevent the concentration of power that fuels the spectacle.</p><p>And finally, beyond regulation and oversight, a more fundamental and personal shift is needed. We must consciously and actively strive to reclaim our direct experience of reality and shed the algorithmic lenses that increasingly mediate our lives. This means cultivating critical awareness of when and how AI is shaping our perceptions, fostering genuine human connections that bypass algorithmic curation, and seeking out unmediated experiences.</p><p>It means valuing authentic engagement over passive consumption, and actively participating in the world rather than merely observing its mediated representation. This is a call to intellectual and experiential rebellion against the Hyper-Spectacle, a commitment to living directly rather than through the image.</p><p>The Era of AI is upon us. It promises to mediate nearly every aspect of our lives. The question is not <em>if</em> it will mediate, but <em>how</em>. Answering this in a way that encourages human flourishing demands an unending, tireless reevaluation of our position towards technology. There are no banners, no ideology we can trust blindly. Techno-optimists and AI-doomers are both wrong, because they believe the outcome of technology is predetermined. And that makes us, by definition, spectators.</p><p>So here is my call to action for you today. Dare to stop living inside this spectacle of reality for a minute, put down your phone, close your laptop, or turn off your display. Look up. The world is still out there, waiting for you. Call one of your friends and ask them out for a pizza or a coffee or a beer&#8212;or all of them for a triple bonus. Hug your kids if you have any, and your parents if you're lucky enough to still have them around. Go outside, feel the warmth of the sun on your face, the smell of the city, or the countryside, or the ocean, the roughness of dirt or pavement or grass under your bare feet. If it hurts just a little bit, even better.</p><p>Dare to choose how you want to experience the actual, raw, unmediated world. The choice, I believe, is still yours. But not for long.</p>]]></content:encoded></item></channel></rss>