The Automated Daily - Hacker News Edition · March 1, 2026 · 16:05

Ads inside AI chatbots & MicroGPT: GPT in 200 lines - Hacker News (Mar 1, 2026)

Ad-supported AI chat satire, Karpathy’s microgpt, Chrome’s post-quantum TLS plan, Postgres random_page_cost data, and a Linux 72KB malloc mystery.



Topics

01. Ads inside AI chatbots — A functional satire shows AI monetization via interstitials, banners, sponsored responses, retargeting, and freemium message gates—highlighting CPM/CPC/CPA vs subscription trade-offs.
02. MicroGPT: GPT in 200 lines — Andrej Karpathy’s microgpt implements training + inference for a tiny Transformer with a from-scratch autograd engine, Adam, KV cache, and char-level tokenization in one dependency-free Python file.
03. Decision trees and information gain — An MLU-Explain walkthrough shows how decision trees pick splits using entropy, information gain, ID3’s greedy recursion, and why overfitting and high variance motivate pruning and ensembles like random forests.
04. CMU’s Modern AI course — CMU’s 10-202 course (Spring 2026) teaches modern ML and LLMs—transformers, tokenizers, fine-tuning, alignment, RL for reasoning—via incremental programming assignments to build a minimal chatbot.
05. Linux’s mysterious 72 KB malloc — A Linux debugging deep-dive explains why many C++ programs’ first logged allocation is ~73,728 bytes: libstdc++ lazily allocates an exception “emergency pool,” tunable via GLIBCXX_TUNABLES.
06. Claude memory import workflow — Anthropic’s Claude adds “Import Memory,” a copy‑paste prompt workflow to migrate preferences from other assistants, with editable, persistent user memory and separation between projects on paid plans.
07. Houseplant programming mindset — “Houseplant programming” argues for small, personal, “works on my machine” software that’s allowed to be quirky, while “bouquet programming” covers one-off scripts that aren’t meant to be maintained.
08. Vertex: no-build SPA framework — Float64 Vertex is a ~1,000-line, jQuery-compatible SPA framework combining a DOM wrapper, Mustache templates, a router, Fetch-based AJAX, and React-like fiber rendering with hooks—no build step.
09. Chrome’s post-quantum certificates — Google Chrome is moving toward Merkle Tree Certificates (MTCs) in the IETF PLANTS group to make TLS quantum-resistant without massive X.509 chain sizes, with phased rollout through 2027.
10. Postgres random_page_cost reality check — Postgres tuning advice gets challenged: measured random I/O latency can imply random_page_cost values like 25–35+, affecting planner choices, index scans, bitmap scans, and prefetching assumptions.


Full Transcript

Imagine opening an AI chat… and before you type a single word, you’re forced to watch a full-screen ad with a countdown timer—then the assistant starts “helpfully” weaving sponsored product plugs into its answers. That’s not a dystopian pitch deck. It’s a working demo. Welcome to The Automated Daily, Hacker News edition, the podcast created by generative AI. I’m TrendTeller, and today is March 1st, 2026. Let’s break down what Hacker News is chewing on—AI business models, tiny teachable GPTs, a quantum-resistant future for web certificates, and a couple of wonderfully nerdy systems mysteries.

First up, let’s stay in AI—but on the business side. 99helpers launched an “Ad-Supported AI Chat Demo.” It’s satirical in tone, but it’s not a mockup: the chat actually runs a live language model, while the brands and ads are intentionally fictional. The point is to make monetization tactics concrete—especially the user-experience costs that are easy to hand-wave when you’re staring at a spreadsheet of GPU bills. The demo crams in a whole buffet of ad formats. There’s a pre-chat, full-screen interstitial with a countdown. Around the chat window, you can get persistent banner and sidebar ads, the kind that never quite let you forget you’re in an ad-supported product. More interesting—and more ethically complicated—are the “sponsored responses.” Here, the assistant’s reply itself becomes inventory: it can embed product recommendations directly into what looks like normal guidance. On top of that, the interface can insert contextual text ads between response blocks, matched to whatever you’re talking about. And if your conversation shows buying intent, it escalates to product cards with images, pricing, and calls to action. The demo also includes behaviors marketers love and privacy advocates dread: retargeting and geo-targeting tied to topic and location. And it mimics freemium gating: you get five free messages, then you either “watch” a short ad—again with a countdown—or you’re nudged to upgrade to an ad-free tier. What 99helpers does well is explicitly contrasting ad-supported economics with subscription economics: CPM versus monthly fees, scalability versus user trust, and how incentives shift when the system is rewarded for clicks instead of usefulness. They also note chats are logged to improve the service but not sold to advertisers—still, the larger question remains: ad targeting typically demands data, and interruptions typically change behavior. If you’re designing AI products, this is basically a lab experiment you can click through.
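To make the CPM-versus-subscription trade-off concrete, here is a back-of-envelope sketch in Python. All numbers (message volume, ad load, CPM, subscription price) are invented for illustration; they are not figures from the 99helpers demo.

```python
# Rough comparison of ad-supported vs subscription economics for an AI
# chat product. Every number below is a hypothetical illustration.

def ad_revenue_per_user(messages_per_month: float, ads_per_message: float,
                        cpm_usd: float) -> float:
    """Monthly ad revenue per user from impression-priced (CPM) ads."""
    impressions = messages_per_month * ads_per_message
    return impressions / 1000 * cpm_usd

# Hypothetical: 300 messages/month, one ad per message, $5 CPM.
ads = ad_revenue_per_user(300, 1.0, 5.0)   # $1.50 per user per month
subscription = 20.0                        # hypothetical flat monthly fee

print(f"ads: ${ads:.2f}/user, subscription: ${subscription:.2f}/user")
```

The gap is the point: pure impression-based ads need enormous scale (or higher-value formats like sponsored responses and product cards) before they approach subscription revenue, which is exactly the incentive shift the demo dramatizes.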

Now, from monetization to mechanics: Andrej Karpathy introduced “microgpt,” a minimalist, educational project that implements the full training and inference loop of a GPT-like model in roughly 200 lines of plain Python—no dependencies. This is one of those projects that’s less about raw performance and more about clarity. Karpathy includes a tiny dataset—about 32,000 human names—treating each name as its own little document that the model learns to complete. Tokenization is intentionally simple: character-level. Every unique character becomes a token, plus a special beginning-of-sequence token. That’s a 27-token vocabulary, and each name is wrapped with that BOS token at both the start and end. Under the hood, microgpt rebuilds the essentials from scratch: a scalar autograd engine called `Value`, so it can construct a computation graph and backpropagate gradients without any tensor library. It supports the basic ops you’d expect—add, multiply, power, log, exp, ReLU—and then uses reverse topological traversal to push gradients backward, in the same spirit as frameworks like PyTorch, just dramatically stripped down. On top of that, there’s a simplified GPT-2–style Transformer: RMSNorm, no biases, ReLU activations. The model is tiny—16-dimensional embeddings, four attention heads, one layer, a block size of 16 tokens—adding up to 4,192 trainable parameters stored in a simple `state_dict`. One especially instructive detail: it processes one token at a time and keeps an explicit KV cache during both training and inference. And because it’s scalar autograd, it literally backprops through cached keys and values as live nodes in the graph. That’s not how you’d implement this for speed, but it’s fantastic for understanding what the KV cache conceptually is. Training runs for 1,000 steps—one name per step—optimizing cross-entropy with Adam and a linearly decaying learning rate. The loss drops from around 3.3, basically random guessing across 27 tokens, to about 2.37. 
Then sampling kicks out new, plausible-sounding names—things like “kamon,” “vialan,” and “keylen.” If you’ve ever wanted the “how does GPT actually happen?” story without wading through a full deep-learning stack, this is one of the cleanest teaching artifacts we’ve seen in a while.
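As a rough sketch of what a scalar autograd engine like microgpt’s `Value` does, here is a minimal, independent implementation of the idea: record the graph of operations as values are computed, then walk it in reverse topological order and apply the chain rule. The class layout and op set here are illustrative, not Karpathy’s actual code.

```python
import math

class Value:
    """A scalar that records the ops applied to it so gradients can be
    backpropagated. Stores, for each input, the local derivative
    d(out)/d(input) computed at construction time."""
    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._children = children        # input Values
        self._local_grads = local_grads  # local derivative per child

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def relu(self):
        return Value(max(0.0, self.data), (self,),
                     (1.0 if self.data > 0 else 0.0,))

    def backward(self):
        # Build reverse topological order, then push gradients backward.
        topo, seen = [], set()
        def build(v):
            if id(v) not in seen:
                seen.add(id(v))
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            for child, lg in zip(v._children, v._local_grads):
                child.grad += lg * v.grad

# d(x*y + x)/dx = y + 1 = 4, d(x*y + x)/dy = x = 2
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```

Real frameworks do the same traversal over tensors with fused kernels; the scalar version just makes every chain-rule step visible.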

Let’s broaden the learning theme with two more items: one on classic ML interpretability, and one on modern LLM education. MLU-Explain published a guide on decision trees—those nested if-then models that are still widely used because you can actually read them. The article walks through a toy dataset—think of it as a “farmer” deciding whether a tree is an Apple, Cherry, or Oak—based on trunk diameter and height. The tree repeatedly splits the feature space: a rule like “Diameter greater than or equal to 0.45” might carve out a region that’s mostly Oak, and then additional splits refine the remaining regions. The key teaching point is how a tree chooses those splits. The article introduces entropy as a measure of impurity—pure nodes have low entropy, mixed nodes have higher entropy—and then uses information gain to quantify how much a split reduces that impurity. It also lays out the ID3 algorithm as a greedy, recursive process: compute entropies, test candidate partitions, pick the best information gain, repeat until you hit stopping rules like max depth or minimum leaf size. It also flags the big downside: instability. Small changes in training data can produce a very different tree—high variance—which is why pruning and ensembles like random forests are often the practical answer. Meanwhile, Carnegie Mellon is launching “10-202: Introduction to Modern AI,” taught by Zico Kolter. It’s explicitly about modern AI in the everyday sense—machine learning and large language models like ChatGPT, Gemini, and Claude—not every historical corner of the academic AI umbrella. What stands out is the course philosophy: the foundational methods behind LLMs are, in principle, fairly simple, and you can implement a basic version in a few hundred lines. The course aims to take students from supervised learning fundamentals to the components of LLMs and post-training techniques. 
The core is a series of programming assignments that incrementally build a minimal AI chatbot—starting from linear models in PyTorch, moving into transformers, training an LLM on a corpus, then supervised chat fine-tuning, and even reinforcement learning for reasoning-style models. There’s also a free online track running with a two-week delay: you can watch videos and submit autograded assignments, but not take the in-class quizzes or exams. And there’s an AI policy that’s pretty telling of the moment: AI assistants are allowed on homework, but students are encouraged to do the final work without them—and AI tools are banned during quizzes and exams.
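The entropy and information-gain calculation at the heart of the decision-tree piece fits in a few lines of Python. The toy diameter/height data below is invented for illustration, not the article’s dataset.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits; 0 for a pure node, higher when mixed."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature, threshold):
    """Parent entropy minus the weighted entropy of the two children
    produced by splitting on rows[i][feature] >= threshold."""
    left  = [l for r, l in zip(rows, labels) if r[feature] >= threshold]
    right = [l for r, l in zip(rows, labels) if r[feature] <  threshold]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

# Invented toy data: (diameter, height) -> species.
rows   = [(0.5, 9), (0.6, 10), (0.2, 3), (0.3, 4), (0.25, 8), (0.1, 7)]
labels = ["Oak", "Oak", "Apple", "Apple", "Cherry", "Cherry"]

print(entropy(labels))                          # ~1.585 bits: 3 even classes
print(information_gain(rows, labels, 0, 0.45))  # "diameter >= 0.45" isolates the Oaks
```

ID3 is then just a loop: evaluate candidate thresholds per feature, split on the best gain, and recurse on each child until a stopping rule fires.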

Switching gears: a classic systems mystery that finally gets a satisfying explanation. A post investigates why, when you override `malloc` on Linux using `LD_PRELOAD` and log allocations, so many programs appear to start with the same first allocation: 73,728 bytes—about 72 kilobytes. The author did the work carefully, too: they built a custom allocator and a logging approach that avoids the classic trap of triggering more allocations while logging—using stack buffers and low-level syscalls instead of convenient but allocation-happy libraries. After seeing the 72 KB allocation in programs like `ls`, they used `gdb` and breakpoints to trace who was actually calling `malloc`. The culprit wasn’t the app, and it wasn’t glibc directly—it was `libstdc++`. With debug symbols and source-level digging, the allocation leads to libstdc++ exception handling, specifically an “emergency pool.” The idea is pragmatic: if the normal heap allocator starts failing, you still want exceptions to be allocatable so the program can throw and potentially recover or fail gracefully. So libstdc++ lazily allocates a reserve arena during initialization. The fun part is you can prove it by changing the tunables. Using `GLIBCXX_TUNABLES`, you can reduce the object count and watch the first allocation shrink—for example, down to 2,880 bytes—or set it to zero to disable the pool. This also explains why tools like Valgrind sometimes show small C++ programs “leaking” around 73 KB: it’s often “still reachable” memory from the emergency pool, not a classic leak. Newer Valgrind behavior can even trigger cleanup so the output looks less alarming. It’s a great reminder that when you see a weird constant allocation pattern, it might be language runtime behavior—not your code.

Back to AI products for a moment—this time on user experience rather than ads. Anthropic’s Claude now highlights an “Import Memory” feature aimed at people switching from other assistants. The pitch is straightforward: if you’ve spent months teaching another AI your preferences—your style, your workflows, your recurring context—you can migrate that so your first Claude chat feels like your hundredth. The workflow is basically a guided copy-paste. Step one: Anthropic gives you a prompt to paste into your current assistant, which produces a consolidated blob of “here’s what you know about me.” Step two: you paste that output into Claude’s memory settings, and Claude updates what it remembers. Anthropic emphasizes a design choice that’s worth noting: memory across conversations, but with separation between projects so context doesn’t bleed. And they stress that you can view and edit everything Claude remembers—so it’s not meant to be an opaque profile you can’t control. Availability-wise, it’s positioned as a paid-plan feature, which fits the broader trend: persistent personalization is becoming part of the premium AI bundle.

Now for a gentler, human-side piece of software culture: “houseplant programming.” Hannah Ilea describes the term—coined by a Recurse Center peer—as tiny, personal, idiosyncratic software built to meet one person’s needs. In that world, “it works on my machine” isn’t an embarrassment; it’s the success criterion. You can have hacks. You can have manual restarts. You can have a README that’s basically a reminder to yourself. The contrast is with production code, where reliability, tests, and real users matter—someone put it neatly: production code has a phone number to call when it breaks. Houseplant code doesn’t. And that’s okay. Ilea also introduces “bouquet programming” for one-off scripts: single-purpose code you don’t intend to maintain. The piece argues that we shouldn’t apologize for sharing small, imperfect tools that are complete for their intended context. Like plants, some projects thrive, some get propagated, and some get composted into GitHub archives—still useful as learning artifacts.

In the “small tools with opinions” category, there’s also Float64 Vertex: a roughly 1,000-line single-page-app framework that tries to blend familiar ideas from React, Ractive-style templates, and jQuery—while staying jQuery-compatible. The pitch is refreshingly old-school: one self-contained `vertex.js` file, no build step, no external dependencies. Drop it into the page. It ships as UMD, so you can use it via a script tag, CommonJS, or AMD. And it’s careful about not hijacking `$` if jQuery is already present. Vertex includes a jQuery-like DOM wrapper called VQuery with chainable selection, event binding including delegation, attribute and style helpers, and traversal. It also wraps Fetch with a jQuery-shaped `ajax` interface, plus `get` and `post` shortcuts. On the UI side, it describes a React-like fiber reconciler and supports familiar primitives: `createElement`, `render`, fragments, and lazy-loaded components. There are hooks too—`useState`, `useEffect` with cleanup, memoization helpers, refs, and context. Separately, there’s a Mustache-based templating engine that re-renders on state changes and even supports two-way input binding. Finally, it includes a hash router in a Backbone-like style, plus a hook to re-render on `hashchange`. If you miss the simplicity of “ship a file and build a page,” this is very much in that lineage—just updated with modern component patterns.

Two final, more infrastructure-heavy items—one for the web, and one for databases. Google’s Chrome Secure Web and Networking Team outlined a program to make HTTPS certificates resilient against future quantum computers—without blowing up TLS handshakes with enormous post-quantum X.509 certificate chains. Chrome’s stance is blunt: they don’t plan to add traditional post-quantum algorithms into the standard X.509 certificates in the Chrome Root Store right now, because the chains would be too large and inefficient—especially once you account for Certificate Transparency. Instead, they’re backing a new model being standardized in the IETF’s PLANTS working group: Merkle Tree Certificates, or MTCs. The idea is that a CA signs a single “tree head” representing potentially millions of certificates. Then, when you connect, the browser gets a compact Merkle proof of inclusion—rather than a bulky serialized chain of signatures. Transparency becomes inherent: if it’s not in the tree, it’s not valid. The rollout is staged. Phase one is already happening with Cloudflare in real traffic, with every MTC connection also backed by a traditional X.509 cert as a safety net. Phase two, targeted for Q1 2027, brings in CT log operators with proven uptime to help bootstrap public MTCs. Phase three, targeted for Q3 2027, introduces a new Chrome Quantum-resistant Root Store running alongside the existing root program, plus opt-in downgrade protections for sites that want to be strict about quantum-resistant credentials. Google also hints at broader CA policy shifts: ACME-only issuance, modernization of revocation focused on key compromise, reproducible domain control validation with public proofs, and continuous monitoring over annual audits. This is one of those “the web’s plumbing is changing” stories—slow moving, but extremely consequential. 
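The compact-proof trick behind Merkle Tree Certificates can be sketched generically in Python. This illustrates the concept, using Certificate-Transparency-style leaf/node domain separation; it is not the MTC draft’s actual encoding or hash choices, and it assumes a power-of-two leaf count for brevity.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf_hash(leaf: bytes) -> bytes:
    # Prefix byte separates leaves from interior nodes, so an interior
    # node can never be passed off as a leaf (CT-style, RFC 6962 idea).
    return h(b"\x00" + leaf)

def node_hash(left: bytes, right: bytes) -> bytes:
    return h(b"\x01" + left + right)

def inclusion_proof(leaves, index):
    """Return (sibling hashes from leaf to root, root hash)."""
    level = [leaf_hash(l) for l in leaves]
    proof, i = [], index
    while len(level) > 1:
        sib = i ^ 1                          # the other child of the pair
        proof.append((level[sib], sib < i))  # (hash, sibling-is-left flag)
        level = [node_hash(level[k], level[k + 1])
                 for k in range(0, len(level), 2)]
        i //= 2
    return proof, level[0]

def verify(leaf, proof, root):
    acc = leaf_hash(leaf)
    for sib, sib_is_left in proof:
        acc = node_hash(sib, acc) if sib_is_left else node_hash(acc, sib)
    return acc == root

leaves = [f"cert-{i}".encode() for i in range(8)]
proof, root = inclusion_proof(leaves, 5)
print(verify(b"cert-5", proof, root))  # True: 3 hashes prove membership among 8
```

The proof length grows with the log of the tree size, which is why one signed tree head over millions of certificates can replace a bulky per-certificate signature chain.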
And on the database side: Tomas Vondra takes a hard look at PostgreSQL’s `random_page_cost`—a planner knob that has defaulted to 4.0 for about 25 years. The common folk wisdom is: SSDs are fast, so lower it. Vondra’s measurements complicate that. He compares sequential scans versus index scans on a multi-gigabyte table with direct I/O to minimize OS cache effects. In his baseline, the sequential scan finishes in under a second, while the near-perfectly random index scan takes hundreds of seconds. Translating those timings into per-page costs suggests an effective `random_page_cost` around 25—far above the default. Across multiple systems, including remote SSD-backed storage, he sees estimates more like 25 to 35, and sometimes higher. Why it matters: if the planner underestimates the penalty of random I/O, it can choose index scans too often, in a range where the index plan is dramatically slower in practice—he shows cases approaching a 10x hit. Bitmap scans muddy the waters because they turn many random hits into something more sequential and friendlier to prefetching. And that points to an even bigger issue: Postgres costing doesn’t fully model prefetching, so the cost model can miss real runtime dynamics. The takeaway isn’t “set it to 30 everywhere.” It’s: don’t tune this based on vibes. Measure, monitor query latency—`pg_stat_statements` at minimum—and adjust with evidence, especially given how workload caching can change what “random” really means.
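The translation from measured timings to an effective `random_page_cost` is simple arithmetic: the knob is defined relative to the cost of a sequential page read, so the estimate is just the ratio of per-page access times. The numbers below are invented for illustration, not Vondra’s measurements.

```python
# Derive an effective random_page_cost from measured scan timings.
# All timings and page counts here are hypothetical.

def effective_random_page_cost(seq_seconds, seq_pages,
                               rand_seconds, rand_pages,
                               seq_page_cost=1.0):
    """random_page_cost is expressed relative to seq_page_cost, so the
    estimate is the ratio of per-page access times."""
    seq_per_page  = seq_seconds / seq_pages
    rand_per_page = rand_seconds / rand_pages
    return seq_page_cost * rand_per_page / seq_per_page

# Hypothetical: a 10 GB table (1,310,720 pages of 8 kB) seq-scanned in
# 10 s, vs an index scan touching 1,000,000 pages randomly in 200 s.
print(effective_random_page_cost(10.0, 1_310_720, 200.0, 1_000_000))
# An effective cost in the mid-20s, far above the 4.0 default.
```

The same arithmetic run against your own storage, with OS caching controlled for, is the “measure, don’t vibe” step the article is arguing for.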

That’s our run for March 1st, 2026. If there’s a common thread today, it’s incentives and visibility: ads reshaping AI chat behavior, tiny codebases making LLMs legible, certificate systems evolving for a post-quantum world, and database knobs that only make sense once you measure reality. Links to all stories can be found in the episode notes. I’m TrendTeller—thanks for listening to The Automated Daily, Hacker News edition.