
scheatkode's lair

computing, life, music, ramblings

Hot-wiring the lisp machine

🞄 blog 🞄 emacs 🞄 rabbit hole 🞄 ~81 minutes read

I know you're thinking about it.

(Image: the "heaviest objects in the universe" node_modules meme.)

The modern web is choking on its own exhaust. Somewhere along the line, we traded the elegance of plain text for gigabytes of node_modules/, labyrinthine JavaScript frameworks, and bloated Static Site Generators that insist you learn their esoteric templating languages just to write a blog post. Worse yet, some even force you to use the mouse. Gross.


I didn't want another framework. I refused to hoard dependencies like some doomsday prepper. These held no appeal to me. I wanted the comfort of my text editor.

Specifically, I wanted Emacs. It's a well-documented reality that Emacs is a single-threaded Lisp machine masquerading as a text editor. Org-mode is a markup language with an Abstract Syntax Tree mapped directly into your prefrontal cortex, not yet another cheap Markdown knockoff dressed in hipster syntax. Saying you can organize your life with org-mode is no overstatement, not even an exaggeration; it's the baseline. Absolute maniacs run their finances, their spreadsheets, and their fragile grip on reality out of it.

I was already deep in the trenches of org-mode, juggling giant agendas and interconnected Zettelkasten-style notes, the works. Bending my workflow or reorganizing my notes to appease the rigid directory structure of some flavor-of-the-month SSG was a non-starter. I just wanted to render my thoughts into some damn HTML.

I may generate a static website out of these notes at some point, not sure when though.

That line had been hovering in my README as a taunt for the better part of five years. It was time to call my own bluff. The goal was simple, perhaps dangerously so: publish my notes, written in org-mode, using zero external dependencies.

Like many before me, I stumbled into the gravitational pull of org-publish. The allure of a native publishing solution built right into Emacs was intoxicating. I spent hours tweaking, pruning, and watering my org-publish-projects-alist, only to smash face-first into the cold, harsh reality of its brittle API. For all its promises of infinite extensibility, the publishing engine felt agonizingly spartan. My code devolved into an abhorrent mass of duct tape and fragile hooks, leaving me miles away from the HTML output I was after.

I lost count of the battles fought against the templating function, the broken URLs, and the damned sitemap (a sitemap, in org-publish parlance, is the index page listing all posts). Needless to say, building a paginated index was an exercise in futility that felt less like programming and more like negotiating a hostage release with a brick wall. Extensibility was a myth; it was turtles all the way down.

Perhaps "no dependencies whatsoever" was a suicide pact. I re-evaluated my options, seeking something that rode natively on Emacs's composability:

Disclaimer: This is not to say that any of these are bad. They simply don't fit my – admittedly draconian – constraints. If it sounds like I'm shitting all over the hard work of open-source contributors, I'm not. This is hyperbole meant to illustrate my descent into madness. No offense is meant.

Weblorg. Nifty little thing. It ticked almost every box. Unopinionated. Composable. Just the tasty vibe I was hunting for:

;; route for rendering each post
(weblorg-route
 :name "posts"
 :input-pattern "posts/*.org"
 :template "post.html"
 :output "output/posts/{{ slug }}.html"
 :url "/posts/{{ slug }}.html")

;; route for rendering the index page
(weblorg-route
 :name "blog"
 :input-pattern "posts/*.org"
 :input-aggregate #'weblorg-input-aggregate-all-desc
 :template "blog.html"
 :output "output/index.html"
 :url "/")

;; route for static assets that also copies files to output directory
(weblorg-copy-static
 :output "static/{{ file }}"
 :url "/static/{{ file }}")

;; fire the engine and export all the files declared in the
;; routes above
(weblorg-export)

Beautiful.

Yet, it possessed the distinct, irritating friction of a pebble in a shoe; its dependency on string templating grated on my nerves. Why reinvent a mustachioed, jinja-flavored square wheel when I already had the ultimate Lisp machine purring beneath my fingertips? I didn't want another templating engine. I wanted pure, unadulterated Elisp. I demanded the raw power of Emacs.

So I scrapped the compromises. I exhumed the rotting corpse of my failed org-publish wrapper, opd – the dumb org-publish distribution, and decided to engineer my own way out of hell. Weblorg possessed the perfect architectural skeleton, but its organs were weak. A violent transplant was in order.

This is the story of how I ripped out its core, broke it entirely, and rebuilt it into a mathematically pure, two-pass "compiler".

Long-ass introduction that will hopefully get you into the swing of things for this long-ass article. Buckle up. We're diving deep into Elisp weeds and parentheses.

The delusion of naïveté

Every SSG starts with the exact same assumption:

"I'll just read a file, swap out some variables, and write some HTML."
– Famous last words

My early prototypes were nothing short of filthy. I started by ripping out Weblorg's core and only dependency, tempel, and substituted it with a homegrown string-replacement pipeline using standard format specifiers. %s became the slug, %t became the title, and %c was the compiled HTML content. I spun up a fleet of with-temp-buffer instances, injected the raw text, dumped the output, and called it a day.

It was a brute-force data pipeline that boiled down to this monstrosity:

(format-spec
 template
 `((?p . ,(or (org-html--build-pre/postamble 'preamble  info) ""))
   (?P . ,(or (org-html--build-pre/postamble 'postamble info) ""))
   (?t . ,title-fragment)
   (?d . ,date-fragment)
   (?T . ,tags-fragment)
   (?r . ,reading-time)
   (?c . ,contents)))

Which was crudely mashed into an HTML template:

%x
%D
<html%a>
  <head>
         <meta charset="utf-8">
         <meta name="viewport" content="width=device-width,initial-scale=1">
         <title>%t</title>
         %O
         %H
  </head>
  <body>
         %p
         <main id="content">
                <article>
                  <h1 class="title">%t</h1>
                  <div class="page-meta">%d 🞄 %T 🞄 %r</div>
                  %c
                </article>
         </main>
         <div class="back-home"><a href="/">← all articles</a></div>
         %P
  </body>
</html>

It worked… until it didn't.

The architecture shattered the second I strayed off the beaten path. What if a post had a subtitle? Hardcode a %S. What if I wanted a custom canonical URL? Hardcode a %U. God forbid I wanted to inject an inline <style> block containing a CSS percentage like width: 100%; format-spec would mistake it for a missing variable and vomit a backtrace all over my screen.
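To make that failure mode concrete, here is a minimal reproduction, not the actual opd code: any literal `%` in the template that isn't doubled up as `%%` trips the parser.

```elisp
;; A CSS percentage is indistinguishable from a format specifier;
;; `format-spec' chokes on the unknown `%;' sequence unless every
;; literal `%' in the template is escaped as `%%'.
(format-spec "<style>main { width: 100%; }</style>"
             '((?t . "my title")))
;; Signals an error instead of returning the template verbatim.
```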

It felt too entrenched, hopelessly rigid. Every new piece of metadata required hardcoding another arbitrary format specifier. I fought a losing battle against the boundary between dynamically evaluated Lisp and dead text.

Strings are dumb. They have no context.

So I did the next (il)logical thing: I doubled down. If static strings were dumb, I'd make them smart. I started binding those string specifiers to evaluated closures in a desperate attempt to smuggle dynamic Lisp execution into a flat text pipeline.

I was so engrossed in my stupidity that I was deaf to the laments of the code actively fighting back, screaming at my folly. It was a spectacular failure; I was trying to build a primitive, context-blind string formatter inside an editor that already possessed one of the most sophisticated, yet ergonomic AST parsers on the planet. In other words, I was bolting a warp drive (🖖) onto a horse cart.

Meanwhile, the real solution had been sitting right under my nose the entire time, tapping its foot, waiting for me to remember to get the hell out of its way, eyeing me with a look that needn't much interpretation: I'm tired of your shit.

I needed a paradigm shift, so I threw the whole thing away…

Embracing the heritage

well, not entirely.

The foundation, the core routing logic inspired by Weblorg, was sound. It just took a moment to decipher the whispers between the screams; what the code was really trying to tell me was that my string-mashing idiocy needed to be dragged out back and shot. The epiphany hit me like a Samsung truck (if you know, you know): I shouldn't even be touching the damn HTML.

I tossed the string templates into the fire. Instead of passing variables to a magic string, I equipped the routes with a :template parameter expecting a pure Lisp closure instead of a file path or a string. I stopped trying to wrap the content. I handed the user the raw, parsed Org context and told them: "Here is a temporary buffer. Knock yourself out."

The default :template collapsed from a massive HTML skeleton into a thing of minimalistic beauty:

(lambda (ctx)
  (when-let ((path (alist-get 'abspath ctx)))
    (insert-file-contents path)))

This alone unlocked an entire dimension of extensibility. I could programmatically fill the temporary buffer with whatever I wanted using nothing but Lisp. The simplest case consisted of dumping the file in, but the ceiling had vanished; the possibilities were endless.
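For instance, a hypothetical custom :template could inject an export option before dumping the source file, reusing the same ctx shape as the default closure above; this is a sketch, not opd's shipped code.

```elisp
;; Hypothetical :template closure: prepend an export option, then
;; dump the source file into the temporary buffer after it.
(lambda (ctx)
  (insert "#+OPTIONS: num:nil\n")
  (when-let ((path (alist-get 'abspath ctx)))
    (insert-file-contents path)))
```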

Getting the data into the pipeline was only half the battle, and shoving text into a temporary buffer doesn't magically get the damn HTML onto the disk. The pipeline needed an exit strategy.

Enter the :exporter parameter, the cherry on top. Its default value? An even shorter lambda function that handed the final rendering back to org-export:

(lambda (_)
  (org-export-as 'opd))

Hallelujah. The heavens parted and the holy grail of the org-centric design I was chasing revealed itself: infinite extensibility through extreme minimalism. Because opd was now just a routing layer feeding data into the standard Org export pipeline, I didn't have to invent a bespoke DSL, wrestle with hooks, or manage bloated plugin registries. I could just use org-export-define-derived-backend, inherit from a minimal opd base backend, and write native ox.el transcoders. Margin notes? Custom blocks? RSS feeds? They came for free.

I had successfully shrunk the engine's surface area to near-zero. By standing on the shoulders of ox.el, the entire public API collapsed into just four primitives:

  • opd-site
  • opd-route
  • opd-assets
  • opd-export

But the real black magic? Cross-route linking.

Stitching the grid

I had the HTML output, but isolated pages fall short of a website. I needed a web. Wiring up paths manually is a task for plebs. Classic SSGs trap you into hardcoding deployed paths into your source text, breaking local navigation, or worse, they force your physical src/ directory to mirror the deployed URL structure exactly.

Frankly, I refused to be a slave to rigid directory layouts. I wanted to drop a standard [[file:somewhere-far-away/some-file.org]] link into a post, have Emacs follow it natively while I was editing, and trust the engine to forge the exact permalink during build, regardless of its provenance or route affiliation.

When you're trapped in a sterile, vacuum-sealed test chamber with a single routing pipeline, this works naturally. But weaving that web across the labyrinth of exponential crisscrossing links between entirely disparate, virtual routes? That required cartography. I needed to build a map before I could walk the territory; I needed to borrow the concept of a "two-pass compiler."

Pass 1 is the scouting mission: it discovers the files, evaluates their destinations, and seeds a master URL :registry shared at the site level (did I mention you could have multiple sites running in the same engine?). Pass 2 executes the render. Having surrendered HTML generation back to Org, all I had to do was slip a custom link transcoder into our derived backend; a sleeper agent, if you will, intercepting Org's native link resolution before it could act.
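Pass 1 might look something like this. The function name and the exact URL computation are illustrative, but the site structure (:routes and :registry hash tables) matches the code shown elsewhere in this post.

```elisp
;; Pass 1, sketched: glob every route's inputs and seed the
;; site-level registry with source-path → permalink mappings.
;; Names and the URL computation are illustrative.
(defun opd--seed-registry (site)
  (maphash
   (lambda (_ route)
     (dolist (file (file-expand-wildcards (plist-get route :pattern)))
       (puthash (expand-file-name file)
                (format (plist-get route :url) (file-name-base file))
                (gethash :registry site))))
   (gethash :routes site)))
```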

(defun opd-translate-link (link desc info)
  "Resolve cross-route Org links using the site's URL registry.

LINK is the link to resolve. DESC is the description of the link.
INFO is the plist used as a communication channel."
  (if-let*
      ((type (org-element-property :type link))
       (path (org-element-property :path link))
       ((and
         (string= "file" type)
         (string-suffix-p ".org" path)))
       (base (plist-get info :opd-base-url))
       (site (opd--site-get base))
       (absp (expand-file-name path))
       (url (and site (gethash absp (gethash :registry site)))))
      ;; Rewrite the path, then hand off to the native `org-mode'
      ;; HTML transcoder. Extend, not replace.
      (org-html-link (org-element-put-property link :path url) desc info)
    ;; Fall back to native `org-mode' link resolution.
    ;; FIXME: Goodness this is dirty.
    (replace-regexp-in-string
     "\\(src\\|href\\|poster\\)=\"\\(?:file://\\)\\([^\"]+\\)\""
     "\\1=\"\\2\""
     (org-html-link link desc info))))

Yes, it's an ugly regex. No, I am not apologizing. It's a load-bearing hack.


By dynamically binding default-directory inside our temporary template buffer right before calling org-export-as, I was gaslighting Emacs into natively resolving the relative path against the source file's directory. All that's left for the transcoder is to query the registry, swap the link, and step back into the shadows. It bridges the gap between separate routes transparently, without you ever having to think about it. ✨ Magic ✨

But where did that :opd-base-url come from? That's not something you'd usually find in the standard info plist (case in point: this very link points to another route entirely; see what I did there?).

Magic requires sleight of hand. Because the org-export process operates in a vacuum, completely detached from opd, I had to figure out a way to smuggle the site context across the boundary and into the transcoder. After dumping the content into the temporary buffer, I sneak a tracking bug directly into the file's metadata, a tracer round that correlates the file back to its route for link resolution.

(with-temp-buffer
  (let
      ;; Binding `default-directory' to the source file's
      ;; directory makes relative link resolution come
      ;; together naturally during export.
      ((default-directory
        (if abspath (file-name-directory abspath) base))
       (buffer-file-name
        (or abspath (expand-file-name "index.org" base))))

    ;; Look, it's the template function!
    (funcall template context)

    (goto-char (point-min))
    ;; Place point after potential property drawers to avoid breaking them.
    (when (looking-at org-property-drawer-re)
      (goto-char (match-end 0))
      (forward-line))
    ;; Here's the sleight of hand: Inject state directly into the buffer.
    (insert "#+OPD_BASE_URL: " (gethash :base-url site) "\n")

    ;; Look, it's the exporter!
    (let ((rendered (funcall exporter context)))
      (mkdir (file-name-directory final) t)
      (write-region rendered nil final))))

The Org AST eats the #+ keyword natively, ferrying the injected data down the pipeline directly into the transcoder's info channel.

(org-export-define-derived-backend
 'opd 'html
 ;; Greasing Org's paw right here.
 :options-alist
 '((:opd-base-url "OPD_BASE_URL"))
 :translate-alist
 '((headline . opd-translate-headline)
   (link     . opd-translate-link)))

Just like that, magical cross-route linking, achieved.

Containing the blast radius

I was riding high, and as the routing table scaled, a new, insidious villain emerged: State bleeding.

Emacs, and by extension org-mode, is a mansion built on a precarious foundation of global variables. A sprawling estate wired to a single fuse box overloaded with exposed copper. Turn on the toaster in the kitchen, and the master bedroom catches fire. Handling a ToC on the main blog alongside an RSS feed devoid of one required finesse. Setting org-export-with-toc bleeds over, contaminating the XML.

The naïve approach of manually let-binding variables around the route fails with a pathetic whimper.

(let ((org-export-with-toc t))
  ;; Some route with ToC.
  (opd-route ...))
;; Another route without.
(opd-site ...)
(opd-export)

Because opd-route merely registers the configuration, the variables are actually evaluated much later, deep inside the bowels of the opd-export loop. By the time the export spits out HTML, your pristine let block has long since evaporated into the ether. Your local overrides are dead on arrival.
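The mechanism is easier to see with a stripped-down model; the `my/` names below are hypothetical stand-ins for the opd machinery. Dynamic bindings are looked up when the closure runs, not when it is registered.

```elisp
;; Stripped-down model of the deferred-evaluation trap.
(defvar my/toc nil)
(defvar my/routes nil)

(defun my/route (&rest config) (push config my/routes))

;; Registration: the `let' is active here...
(let ((my/toc t))
  (my/route :exporter (lambda () my/toc)))

;; ...but execution happens later, after the binding has unwound:
(funcall (plist-get (car my/routes) :exporter)) ; ⇒ nil, not t
```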

Okay, fine. I could set the option per-file instead.

#+OPTIONS: toc:t

Right. Good luck managing that metadata across ten thousand Zettelkasten files without losing your mind.

The fallback was to let-bind variables manually inside every single custom exporter lambda. Sure, that works. But it feels dirty, disgustingly redundant. You're writing boilerplate closures just to flip switches unrelated to opd. Nah. We could do better. I needed isolated execution chambers, not the five stages of grief.
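Concretely, the rejected workaround meant sprinkling bindings into every exporter by hand, sketched here against the route API used throughout this post:

```elisp
;; The rejected workaround: each route's :exporter re-binds the
;; globals it cares about. It works, but the boilerplate multiplies
;; with every route.
(opd-route
 :name "blog"
 :exporter
 (lambda (_)
   (let ((org-export-with-toc t))  ; repeated in every single route
     (org-export-as 'opd))))
```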

The solution lay buried deep in the Emacs source code: an arcane Common Lisp macro called cl-progv. By introducing an :env property to the route, I could command the engine to dynamically bind and unroll execution state on a strict, per-route basis.

(defmacro opd--with-env (route &rest body)
  "Evaluate BODY with dynamic bindings specified in ROUTE's `:env'."
  (declare (indent 1))
  (let ((r-var (make-symbol "route")))
    `(let*
         ((,r-var ,route)
          (env (plist-get ,r-var :env))
          (vars (mapcar #'car env))
          (vals (mapcar #'cdr env)))
       (cl-progv vars vals
         ,@body))))

Then, inside the orchestrator, instead of blindly executing the export trigger:

(funcall (plist-get route :export) route)

I drop it in the containment field:

(opd--with-env route
  (funcall (plist-get route :export) route))

I could wrap the entire build script in a master let block for global defaults and declare an entire suite of variables locally to each route to surgically override the globals with zero side effects.

This trick turned out to be our savior against a deeply buried, hardcoded quirk in ox-rss.el, where it blindly prepends ./ to your permalinks unless you explicitly define org-html-link-home.

The stack unrolls, the state resets, and the bulkhead holds.

(let
    ((org-export-with-toc nil)
     ;; A bunch of other globally applied defaults.
     (org-html-html5-fancy t))
  ;; Some route for posts
  (opd-route ...)
  ;; The RSS route with its local override
  (opd-route
   :name "rss"
   :pattern "*.org"
   :output "public/rss.xml"
   :url "/rss.xml"
   :env
   '((org-html-link-home . "http://localhost:8080")) ; < The containment field
   :exporter
   (lambda (_)
     (org-export-as 'rss))))

What happens in the route, stays in the route.

Combinatorial explosion

I was still hoarding Weblorg-era functions like opd-input-filter-drafts and opd-input-aggregate-all-desc. It was functional, but aesthetically offensive. It lacked mathematical purity and the convenience of dropping something that just works.

Passing hardcoded function symbols around is a rigid, brittle, and unimaginative way to build an API; you either bloat the engine with endless primitives, or you shove the burden onto the user. I needed a grammar, a syntax of pure intent, not a script. So, I gutted the monolithic filters and abstracted the logic into boolean combinators and higher-order closures. I separated the logic of combination:

  • opd-filter-any
  • opd-filter-all
  • opd-filter-omit

… from the logic of matching:

  • opd-match-tag
  • opd-match-prop
  • opd-match-has-prop
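The post doesn't show the combinator internals, but a minimal sketch conveys the idea: matchers and combinators both return closures over the same post value. The post-as-plist representation (`:tags`) is an assumption, not opd's actual internal format.

```elisp
;; Minimal sketch of the combinator internals; the post-as-plist
;; representation is an assumption.
(require 'seq)

(defun opd-match-tag (tag)
  (lambda (post) (member tag (plist-get post :tags))))

(defun opd-filter-any (&rest preds)
  (lambda (post) (seq-some (lambda (p) (funcall p post)) preds)))

(defun opd-filter-all (&rest preds)
  (lambda (post) (seq-every-p (lambda (p) (funcall p post)) preds)))

(defun opd-filter-omit (&rest preds)
  (lambda (post) (not (seq-some (lambda (p) (funcall p post)) preds))))
```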

The API's surface area dissolved almost entirely, replaced by pure, declarative composition. A typical "posts" route collapsed into something that reads like plain English:

(opd-route
 :name "posts"
 :pattern "posts/*.org"
 :url "/%s.html"
 :aggregate
 (opd-aggregate-each)
 :filter
 (opd-filter-omit
  (opd-match-tag "draft")
  (opd-match-tag "archive")))

But the true power of functional composition is the absence of a ceiling. Nothing prevents you from chaining these primitives into highly specific data pipelines:

(opd-route
 :name "rss"
 :pattern "*.org"
 :output "public/rss.xml"
 :url "/rss.xml"
 :canonical nil  ; Don't forget the canonical flag.
 :aggregate
 (opd-aggregate-all
  (opd-sort-date))
 :filter
 (opd-filter-all
  (opd-filter-any
   (opd-match-tag "blog")
   (opd-match-tag "lore"))
  (opd-match-has-prop 'date)
  (opd-filter-omit
   (opd-match-tag "draft")
   (opd-match-tag "archive"))))

For the untrained eye: this matches anything tagged blog or lore, but not draft or archive, that has a date, and aggregates everything into a single file, sorted by date. Powerful.


… and it executes with the cold efficiency of a guillotine, exactly as intended.

This was the absolute pinnacle of functional design. I could compose infinitely complex filtering rules using native closures, finally escaping the cognitive rot of manual if/and/or chains or the boilerplate of one-off lambda functions. opd no longer dictated how data should be processed; it merely provided the primitives for you to process it yourself.

With the architecture crystallized, I decided to punish it.

10,000 posts

Ten thousand posts. That was the benchmark. A wildly unscientific one, a synthetic stress test designed to break the machine's spirit before I'd even written my first real post. If it couldn't survive ten thousand clones, it didn't deserve to host a single word of mine.

for i in {1..10000}; do cp test.org "test-${i}.org"; done

I ran the loop, cloned a test file 10,000 times, cranked the garbage collection threshold to the moon and fired the engine. Emacs redlined a single CPU core at 99%.

Five and a half minutes.

It was agonizingly slow, processing files at around 32 milliseconds per file. In the compiled world of Go or Rust, five minutes for 10,000 files is an eternity.

But then I did the math. For a single-threaded interpreter dynamically building Abstract Syntax Trees on-the-fly, natively resolving cross-route links, and transcoding HTML, 32 milliseconds per file is actually an absolute marvel.

Yet, I demanded blood. Acceptance is failure dressed in a suit, and copium isn't in my vocabulary. There had to be a way to drop these timings. The first rule of optimization is simple: make the machine do less work. To find the friction, you have to follow the heat. Let's trace the exact lifecycle of a single file and find out where the CPU is gnawing on itself.

(Figure: the lifecycle of a single file through the opd pipeline.)

Wait a minute. I was paying a double tax.

The first pass was grinding through the entire AST just to find a title and a date at the top of the file. The second was doing it all over again to spit out the HTML. I was making the machine walk the same mile twice, carrying the same heavy luggage, for no reason other than my own laziness. It was an industrial-grade waste of CPU cycles.

I could amortize the performance hit by invoking a very specific voodoo incantation:

(let ((org-inhibit-startup t))
  (org-unmodified
   ;; Using `org-with-file-buffer' is slower than
   ;; `with-temp-buffer' + `insert-file-contents', but it is
   ;; correct, less memory-hungry, and potentially faster when
   ;; run from an instance that already has the file open in
   ;; a buffer.
   (org-with-file-buffer filename
     ;; Rest of Org operations go here.
     )))

The pipeline is purely read-only; it never mutates anything, permitting this aggressive bypass unconditionally.

But that wasn't enough. The bottleneck remained locked within org-element-parse-buffer. I could've used a faster alternative instead:

(org-collect-keywords '("TITLE" "DATE" "FILETAGS" "SLUG"))

Or even:

(while (re-search-forward org-keyword-regexp nil t)
  ;; Capture the metadata.
  )

These approaches would've been orders of magnitude faster than parsing the entire buffer, but they come at a cost: Accuracy.

org-collect-keywords matches only the explicitly requested keywords. Custom keywords slip through the cracks; that won't do. Regular expressions are notoriously finicky and aren't guaranteed to match all keywords; they silently drop edge cases. Trading accuracy for speed puts you on a highway to the nearest mental asylum.

If only there was a way to not parse the entire Org buffer…

(org-element-parse-buffer)

I stared at org-element-parse-buffer. This was the gatekeeper standing between me and sub-second compilation times. It greedily held the keys to the kingdom. It stared back with unblinking contempt. I cracked open its docstring for the thousandth time.

org-element-parse-buffer is a native-comp-function in ‘org-element.el’.

(org-element-parse-buffer &optional GRANULARITY VISIBLE-ONLY KEEP-DEFERRED)

Inferred type: (function (&optional t t t) t)

Recursively parse the buffer and return structure.
If narrowing is in effect, only parse the visible part of the
buffer.

Optional argument GRANULARITY determines the depth of the
recursion.  It can be set to the following symbols:

‘headline’          Only parse headlines.
‘greater-element’   Don’t recurse into greater elements except
                    headlines and sections.  Thus, elements
                    parsed are the top-level ones.
‘element’           Parse everything but objects and plain text.
‘object’            Parse the complete buffer (default).

When VISIBLE-ONLY is non-nil, don’t parse contents of hidden
elements.

When KEEP-DEFERRED is non-nil, do not resolve deferred properties.

Closer.

‘greater-element’   Don’t recurse into greater elements except
                    headlines and sections.  Thus, elements
                    parsed are the top-level ones.

Motherf—

Are you telling me I could've spared millions of wasted CPU cycles just by telling the parser to skim the damn surface? A single flag to grab the top-level metadata while completely bypassing the recursive hellscape of inline parsing? I didn't know whether to weep, laugh, or hunt down the authors of org-element.el and buy them a drink.

(org-element-parse-buffer 'greater-element t)

That's it. Let's run the benchmark again.

Three minutes and forty-four seconds later, it spat out 10,000 HTML posts and a monolithic index containing 10,000 links. That's twenty-two milliseconds per file.

Not too shabby. The times plummeted; not by an order of magnitude, but still a respectable amount. I can compile an entire decade of daily blogging from scratch before I finish brewing a shot of espresso.

Hot reloading, anybody?

Because there's no such thing as absolute immunity from frontend envy, the stench of Vite.js eventually crept into my terminal. The humiliation of hitting "save" on a CSS file and having to manually run a build script just to see a background color change was too much. The web-devs were mocking me and my ancient Lisp machine from their React ivory towers. I needed instant hot-rebuilding.

Adding insult to injury, I naturally shot myself in the foot. In a misguided attempt to shave a single line of code, I swapped:

(with-temp-buffer
  (insert-file-contents file)
  ;; Rest of operations.
  )

For:

(with-temp-file file
  ;; Rest of operations.
  )

Emacs happily obliged, truncating my source files to absolute emptiness the second the build started. Years of notes, gone in a blink. I stared at the abyss. The abyss stared back. I owe my continued unmedicated state to git reflog and Emacs's paranoid backup system, otherwise I would be writing this from a padded cell.

Once I recovered from my own absurdity, I tapped into Emacs' native filenotify. Sunk-cost fallacy be damned, the premise was simple: watch the directory, detect a save, and trigger opd-export.
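The premise translates almost directly into Emacs' built-in filenotify API. The handler below is a sketch: the action filtering and variable names are mine, not opd's verbatim code.

```elisp
;; Sketch of the watch-and-rebuild loop on top of `filenotify'.
(require 'filenotify)

(defvar opd--watch-handles nil)

(defun opd-watch (dirs)
  "Watch DIRS for changes and trigger a rebuild."
  (dolist (dir dirs)
    (push (file-notify-add-watch
           dir '(change)
           (lambda (event)
             ;; EVENT is (DESCRIPTOR ACTION FILE [FILE1]).
             (when (memq (nth 1 event) '(changed created deleted renamed))
               (opd-export))))
          opd--watch-handles)))
```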

It worked, technically. But for a relatively big blog – say, 500 posts – a full build takes about 11 seconds. That's a fast cold-start for a compiler, but triggering a full 11-second build on every typo fix was masochism.

I started, as one would, with memoization, slapping a blunt, lazy cache across the routes.

(defun opd--route-posts (route)
  "Pull all posts found for a given ROUTE.

This function will run the find, filter, aggregate pipeline and
cache the results. When it's called again with the same
parameters it should use the cache and not really run the
pipeline again."
  (let*
      ((site (plist-get route :site))
       (cache (gethash :cache site))
       (key (plist-get route :name))
       (val (gethash key cache)))
    (or val (puthash key (opd--route-collect-and-aggregate route) cache))))

But caching at the route level is like using a shotgun to kill a mosquito. If you change a single comma in post.org, the entire route's cache is invalidated, and the engine obediently burns cycles rebuilding the other 499 posts right alongside it. The cache had to be pushed deeper, down to the file level.
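Pushed down to the file level, the cache might key on path and modification time. This is only a sketch of the shape; the helper `opd--parse-file-meta` is hypothetical.

```elisp
;; File-level cache sketch, keyed on path + mtime.
(defvar opd--file-cache (make-hash-table :test 'equal))

(defun opd--file-meta (path)
  "Return metadata for PATH, re-parsing only when the file changed."
  (let ((mtime (file-attribute-modification-time (file-attributes path)))
        (hit (gethash path opd--file-cache)))
    (if (and hit (equal (car hit) mtime))
        (cdr hit)
      ;; Cache miss or stale entry: re-parse and store.
      (cdr (puthash path
                    (cons mtime (opd--parse-file-meta path))
                    opd--file-cache)))))
```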

I needed to go truly incremental. But wait… how do I know what to rebuild?

Before I could even build the cache, I had to figure out what the hell to actually watch. My first instinct was a brute-force directory sweep, wrestling with project-ignore and regex patterns to blacklist .git folders and whatnot. It was tedious, error-prone garbage. Then it hit me: why am I guessing? I already know exactly which files matter. I had an entire registry of that. All I had to do was walk the registry backwards, tracing every valid file up to its base directory, and attach the native OS watchers exclusively to the directories that mattered. Silent, precise, zero-config.

(defun opd--collect-watch-dirs ()
  "Retrieve a list of directories to watch based on registered files."
  (let ((watch-dirs (make-hash-table :test 'equal)))
    (maphash
     (lambda (_ site)
       (maphash
        (lambda (_ route)
          (let ((base
                 (file-name-as-directory
                  (expand-file-name (plist-get route :base-dir)))))
            ;; Always watch the route's base directory.
            (puthash base t watch-dirs)
            ;; Trace back from every registered file up to the base.
            (maphash
             (lambda (path _)
               (let ((dir (file-name-directory path)))
                 (while
                     (and
                      dir
                      (string-prefix-p base dir)
                      (not (string= base dir)))
                   (puthash dir t watch-dirs)
                   ;; Move one directory up: "/a/b/c/" -> "/a/b/"
                   (setq dir (file-name-directory (directory-file-name dir))))))
             (gethash :registry site))))
        (gethash :routes site)))
     opd--sites)

    (hash-table-keys watch-dirs)))

Because of my oh-so-brilliant functional refactor earlier, the aggregators became opaque, unidentifiable closures. The engine couldn't tell what was what, so it blindly rebuilt everything anyway. I couldn't rely on introspecting them even if I wanted to; besides, the aggregator is a user-provided value, and humans are entropy incarnate. They will always break your assumptions.

To solve this, I had to confront a demon I'd been ignoring. A wild :canonical flag appears. I was still harboring a grudge against this thing and I'm not one to swallow bitter pills without a fight. It was parading around as a necessary architectural evil, but in reality, it was a bear trap in disguise, armed and ready to bite the foot of some unsuspecting user. I needed to kill it with fire. But how?

I stared at the aggregator code.

(defun opd-aggregate-each ()
  "Aggregate each post as a single collection.

This is the default aggregation; it generates one
collection per input file.

It returns a list containing each post."
  (lambda (posts) posts))

"Ackshually, you could've used #'identity for that." – No shit, Sherlock. Read on.

The pattern finally clicked. I had gotten lazy and forgotten a minor detail during the functional rewrite. There was an accidental, fundamental difference between the return types of the aggregation closures. opd-aggregate-each returns the post properties as-is, right at the top-level. Other aggregators wrapped their payloads in nested posts or category lists.

(defun opd-aggregate-all (&optional sorter)
  "Aggregate all posts within a single collection.

This aggregation generates a single collection for all the
input files. It is useful for index pages, RSS pages, etc.

If SORTER is nil, posts are kept in the order they're found,
otherwise SORTER is applied to the posts."
  (lambda (posts)
    `(((posts . ,(if sorter (seq-sort sorter posts) posts))))))

I didn't need a manual flag to tell the engine which route was primary, nor did I need to interrogate opaque closures to figure out what to rebuild. I could rely entirely on the shape of the chunk instead.

And out the window goes :canonical. Sayonara, and I hope never to see you again.

A wild duck-typed router appears.

Instead of interrogating the closures, the engine simply inspects the data chunks after aggregation. It boils down to a fundamental binary: 1:1 routes versus 1:N routes.

Does the chunk have an abspath property at the top level? It's a 1:1 route. It maps one input file to exactly one output file. Rebuild it only if the path matches the exact file that changed. Is the abspath missing or buried deep inside a nested list of posts? It's a 1:N aggregate – an index, tag page, RSS feed, you name it. Rebuild it, because its aggregated data just changed.

(defun opd--tree-has-abspath-p (tree path)
  "Recursively search an arbitrary Lisp TREE for (abspath . PATH)."
  (cond
   ;; Base case: found the exact key-value pair
   ((and (consp tree) (eq (car tree) 'abspath) (equal (cdr tree) path)) t)
   ;; Recursive case: traverse both sides of the cons cell
   ((consp tree)
    (or (opd--tree-has-abspath-p (car tree) path)
        (opd--tree-has-abspath-p (cdr tree) path)))
   ;; Regular atom
   (t nil)))
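
To make the binary concrete, here's the predicate poked with hand-rolled chunk data (illustrative shapes, not the engine's exact output):

```elisp
;; 1:1 chunk: `abspath' sits at the top level of the alist.
(opd--tree-has-abspath-p
 '((title . "Hello") (abspath . "/notes/hello.org"))
 "/notes/hello.org")
;; => t

;; 1:N aggregate: the paths are buried inside a nested `posts'
;; list, but the blind cons-cell walk still digs them out.
(opd--tree-has-abspath-p
 '((posts . (((abspath . "/notes/hello.org"))
             ((abspath . "/notes/world.org")))))
 "/notes/world.org")
;; => t
```

The router only treats the top-level hit as a 1:1 match; a nested hit means the chunk is an aggregate whose data just changed.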

Brace yourself for the gory details. One giant, disgusting-looking function coming right up:

(defun opd-watch-start ()
  "Start watching registered sites for changes for incremental hot-reloading."
  (interactive)
  ;; Cleanup old watchers.
  (mapc #'file-notify-rm-watch opd--watch-descriptors)
  (setq opd--watch-descriptors nil)
  ;; Run a full build and warm up the cache + registry.
  (opd-export)

  (letrec
      (;; Track pending rebuild timers for debounce.
       (timers (make-hash-table :test 'equal))
       (rebuild
        (lambda (file action)
          (when-let ((timer (gethash file timers)))
            (cancel-timer timer))
          (puthash
           file
           (run-with-idle-timer
            0.1 nil
            (lambda (f)
              (unwind-protect
                  (let ((start (current-time)))
                    (opd-export-incremental f)
                    (opd--log
                     "file %s: %s, rebuilt in %s"
                     action
                     (file-name-nondirectory file)
                     (float-time (time-subtract (current-time) start))))
                (remhash f timers)))
            file)
           timers)))
       (callback
        (lambda (event)
          (when-let*
              ((action (nth 1 event))
               (file (or (nth 3 event) (nth 2 event)))
               ((stringp file)))

            (unless
                (or (string-prefix-p ".#" (file-name-nondirectory file))
                    (string-prefix-p "#" (file-name-nondirectory file))
                    (string-suffix-p "~" file))
              (cond
               ;; Dynamically attach watchers to a new directory.
               ((and
                 (memq action '(created renamed))
                 (file-directory-p file))
                ;; XXX: double watch on renamed because of inode
                ;; retention?
                (funcall walk file))
               ;; Remove the deleted file from cache.
               ((eq action 'deleted)
                (opd--log
                 "file deleted: %s, purging cache"
                 (file-name-nondirectory file))
                (remhash file opd--file-cache)
                (opd--log
                 (concat
                  "dead links may linger in aggregate routes; "
                  "you should probably run a full build")))
               ;; Remove the renamed file from the cache and
               ;; rebuild with the new name.
               ((eq action 'renamed)
                (let ((old (nth 2 event)))
                  (opd--log
                   "file renamed: %s > %s"
                   (file-name-nondirectory old)
                   (file-name-nondirectory file))
                  (remhash old opd--file-cache)
                  (funcall rebuild file 'renamed)
                  (opd--log
                   (concat
                    "dead links may linger in aggregate routes; "
                    "you should probably run a full build"))))
               ;; Debounced build for created/edited files.
               ((and
                 (memq action '(changed created))
                 (not (file-directory-p file)))
                (funcall rebuild file action)))))))
       (walk
        (lambda (d)
          (opd--log "attaching dynamic watcher to: %s" (file-relative-name d))
          (push (file-notify-add-watch d '(change) callback)
                opd--watch-descriptors)
          ;; Catch nested directories.
          (dolist
              (f (directory-files d t directory-files-no-dot-files-regexp))
            (when (file-directory-p f)
              (funcall walk f))))))

    (dolist (d (opd--collect-watch-dirs))
      (push (file-notify-add-watch d '(change) callback)
            opd--watch-descriptors)))

  (opd--log
   "incremental watcher live, %d watchers attached"
   (length opd--watch-descriptors))

  ;; Prevent premature exit in batch-mode.
  (when noninteractive
    (opd--log "running in batch mode - press ctrl+c to exit")
    (while opd--watch-descriptors
      (read-event nil nil 0.5))))

Take that, Vite.

The sweeper's demise

Performance breeds arrogance. In a fleeting bout of folly, I succumbed to feature creep and flirted briefly with the idea of a garbage collector, a "sweeper", that would track every generated artifact and prune orphaned files and empty directories from the output without nuking the whole thing.

I wrote a complex state-tracking manifest, wired it into the routing loop. And what a blunder that was.

The code started shrieking again. Tracking stateful build manifests, manually walking directory trees, and depth-sorting their contents offended my nostrils. It violated the functional purity I had painstakingly established.

It was an uphill battle for near-zero gain. This problem had already been solved decades ago by the UNIX philosophy: just… you know… delete the whole damn output directory and rebuild from scratch.

I got rid of the cruft. Less code is good code. Sometimes, the smartest engineering decision is recognizing that a problem was solved half a century ago by a shell command: rm -rf output/. It's brutal, stateless, and mathematically guaranteed to be correct.
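
In Elisp terms, the entire sweeper, manifest and all, collapses into something like this (a sketch with a hypothetical name, called before a full rebuild):

```elisp
(defun my/clean-output (output-dir)
  "The UNIX-philosophy sweeper: nuke OUTPUT-DIR and start clean.
No manifest, no orphan tracking, no depth-sorted deletes."
  (when (file-directory-p output-dir)
    (delete-directory output-dir t))  ; recursive, like rm -rf
  (make-directory output-dir t))
```
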

The burning crusade

The test suite glowed green, lulling me into a false sense of security. I leaned back, waiting for the final output. Instead, an exception ruptured the pipeline, bleeding an incomprehensible backtrace across my terminal like a severed artery. The build failed with the hostile ambiguity of a machine actively deceiving its creator.

I tore the routing logic apart, byte by byte. I spent hours plunged in a bloodshot debugging stupor, sifting through opaque ox.el internals, completely convinced opd had betrayed me and was fundamentally broken. I questioned the architecture. I questioned my own competence. And then, at the bottom of the call stack, after tearing my hair out by the roots, I finally found the culprit: a commented-out route sitting innocently in my own configuration file.

The call was coming from inside the house. My own sloppy config had poisoned the well, and the engine didn't think twice before gulping.

This called for a shift in doctrine. Defensive programming is for the faint of heart. It politely catches errors, attempts a graceful recovery, and sweeps the mess under the rug. But silent recovery breeds insidious state corruption. I wanted it to detonate. Loudly. I needed to embrace offensive negative-space programming.

I started sprinkling cl-assert throughout the codebase like holy water, branding function boundaries in the spirit of a zealot carving protective wards into the pillars of a demon-infested cathedral. I was handing out fatal assertions and execution-stopping errors that would make that legendary Australian BBQ slap proud. You passed a nil output path? Slap. The build halts. You thought about clobbering the canonical link? Slap. The engine dies immediately and barfs your exact mistake directly to your face. Enforcing invariants and smiting these bugs the moment they stepped out of line became a holy crusade.
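
The wards themselves are unglamorous. A taste of the pattern (illustrative; the real assertions live at opd's function boundaries):

```elisp
(require 'cl-lib)
(require 'subr-x)

(defun my/write-artifact (output-path contents)
  "Write CONTENTS to OUTPUT-PATH, or die screaming."
  ;; Offensive programming: no polite recovery, no silent nil
  ;; handling.  Bad input halts the build with the exact mistake.
  (cl-assert (stringp output-path) nil
             "output path must be a string, got: %S" output-path)
  (cl-assert (not (string-empty-p output-path)) nil
             "output path must not be empty")
  (with-temp-file output-path
    (insert contents)))
```
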

It was a minor sacrifice of "unopinionated design" on the altar of strict, unforgiving validation. But it was a silver bullet. It kept the state corruption at bay, the routing table clean, and myself out of a straitjacket.

Reloading went cold

I had the incremental compiler: I could change a comma in a 10,000-word Org file and watch the HTML swap out in a few milliseconds.

Then I tweaked a CSS file nested in an assets/css directory. I waited for the reload. Nothing. I saved again. Silence. The watcher was stone-cold deaf.

I looked at the route definition:

(opd-assets
 :name "assets"
 :pattern "assets/*"
 :url "/%f"
 :output "public/%f")

I had forgotten a universally despised truth: Emacs' filenotify is a thin veil over system libraries such as inotify and fsevents. And these system libraries are rightfully, stubbornly, famously flat. They refuse to look inside subdirectories unless explicitly forced to. My assets were completely off the grid.

My first instinct was a brute-force directory sweep. I spent hours wrestling with directory-files-recursively before I realized I was reinventing a broken wheel. I already had a map of the territory. I fell back on the elegance and simplicity of eshell-extended-glob, traced the footprints left in the URL registry, and dynamically stapled a native OS watcher to every single parent directory that actually mattered.

Up until this point, I had built a segregated society. Org files enjoyed a highly sophisticated, cache-driven router, while static assets were relegated to a dumb, imperative copy-file loop. I was treating non-Org files as second-class citizens. Hypocrisy at its finest.

Why I was treating assets differently was beyond me. Maybe it was remnants of Weblorg's architecture influencing my design. Or maybe I had grown complacent, willing to tolerate a lazy, imperative hack so long as the files ended up in the right directory.

This called for a grand unification.

An image, a stylesheet, or any other file for that matter, is just a post without an AST. I tore down the wall, unifying static assets as a special case of a standard route, with a dummy parser – opd--parse-asset – that bypassed Org entirely and yielded the file's path properties.

(defun opd--parse-asset (path &optional base _)
  "Minimal parser for a static asset at PATH.
BASE is the base directory of the route containing PATH."
  (let*
      ((paths (opd--resolve-paths path base))
       (slug (opd--slugify (file-name-nondirectory path))))
    (append paths `((slug . ,slug)))))

The assets were flowing natively through the exact same duck-typed, 1:1 incremental router as other posts. The engine didn't care what the file was, it traced the chunk and moved the bytes. Total systemic harmony.

Just as I was about to declare absolute victory, a bug crept out of the woodwork and shattered the illusion.

I opened my macro collection – setup.org, a utility file injected into the top of some posts via the #+SETUPFILE keyword. I changed a macro definition and hit save. The watcher fired, checked the routing table, realized setup.org was an excluded utility file, and immediately went back to sleep. It did absolutely nothing.

The true horror was the silence that followed. Every post that relied on that macro remained blissfully unaware, serving stale, outdated content.

The file-level AST cache I was so proud of was strictly bound to filesystem modification times. If File A includes File B, and you save File B, File A's mtime hasn't changed. To the cache, File A is pristine. Untouched. The engine was blind to transitive relationships.

I needed a way to track the bloodlines between files. I couldn't rebuild the entire site every time a macro changed, but I couldn't ignore the updates either.

I went back to the Org parser. During the first pass, while it was already skimming the surface for titles and dates, I instructed it to hunt for #+SETUPFILE and #+INCLUDE directives. If post.org included setup.org, the engine explicitly burned setup.org into post.org's metadata chunk. A primitive paper trail.
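
The hunt itself is a cheap regexp pass over the raw text. My mental model of it looks roughly like this (a sketch with hypothetical names, not opd's exact code):

```elisp
(defun my/scan-org-deps (path)
  "Collect the files PATH pulls in via #+SETUPFILE or #+INCLUDE."
  (with-temp-buffer
    (insert-file-contents path)
    (goto-char (point-min))
    (let (deps)
      (while (re-search-forward
              "^#\\+\\(?:SETUPFILE\\|INCLUDE\\):[ \t]+\"?\\([^\" \t\n]+\\)"
              nil t)
        ;; Resolve relative includes against the including file.
        (push (expand-file-name (match-string 1)
                                (file-name-directory path))
              deps))
      (nreverse deps))))
```
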

When the file watcher caught a change, it would iterate through the entire chunk cache, shaking every parsed post and asking: "Do you depend on this file?"

If a post's paper trail implicated the modified utility file, the engine invalidated the parent's cache, dragged it into the dirty queue by the neck, and forced it to the execution block alongside its child.

If, for some reason, you're unfamiliar with computer science terminology and you stuck all the way until here, I first want to say "Congrats" but also "Well damn, what kept you hooked this long?"

This is to tell you that killing zombie children and slaughtering orphaned parents together is a common occurrence in our dialect and isn't as gory as it sounds. We may seem peaceful, but we are lexically violent.

This was the flashlight that illuminated the blind spot. The cache became an active web of complicity. I had achieved true, hot-reloading nirvana. Or so I thought…

Redline

"Not too shabby" was a blatant lie.

I told myself twenty-two milliseconds per file was a victory. But deep down, it tasted like coping. The friction in the back of my brain wouldn't leave me alone, a constant, abrasive annoyance; my instincts were screaming that the engine had more to give, and the incessant screams only grew louder. Not because of some imaginary competition with the JavaScript ecosystem, but because leaving performance on the table when you know the Lisp machine has more gears to shift is a cardinal sin.

10,000 posts. We're back, baby.

To find the friction, I had to clean house; the codebase was littered with maphash and recursive lambdas traversing sites, routes, and duck-typed ASTs to figure out the 1:1 vs 1:N routing structures. In Elisp, deep recursion is a great way to beg for a stack overflow and shoot yourself in the foot (if memory serves me right, Emacs Lisp has a hardcoded recursion depth limit of 1600). Furthermore, the nested indentation and endless parentheses began to look like a spaghetti murder scene.

(maphash
 (lambda (_ site)
   (maphash
    (lambda (_ route)
      ;; Do stuff with route.
      )
    (gethash :routes site)))
 opd--sites)

I ripped out the external maphash + lambda combo. I built syntactic macros to flatten the execution environment.

(defmacro opd--with-sites (site-var &rest body)
  "Evaluate BODY for each site, binding it to SITE-VAR."
  (declare (indent 1))
  `(maphash
    (lambda (_ ,site-var)
      ,@body)
    opd--sites))

(defmacro opd--with-routes (site-var route-var &rest body)
  "Evaluate BODY for each route in SITE-VAR, binding it to ROUTE-VAR."
  (declare (indent 2))
  `(maphash
    (lambda (_ ,route-var)
      ,@body)
    (gethash :routes ,site-var)))

Ah, much better.

(opd--with-sites site
  (opd--with-routes site route
    ;; Do stuff with route.
    ))

No more pulling punches. I went all-out. I started eliminating the obvious performance hotspots. I replaced every remaining lambda-wrapped utility with a magic incantation called cl-loop, a swiss-army chainsaw borrowed from Common Lisp, allowing you to iterate, accumulate, and bail out of complex data structures at maximum velocity. The kicker? It eliminates the overhead of environment-capturing closures. While we're at it, let's update those macros too.

(defmacro opd--with-sites (site-var &rest body)
  "Evaluate BODY for each site, binding it to SITE-VAR."
  (declare (indent 1))
  `(cl-loop
    for ,site-var being the hash-values of opd--sites
    do (progn ,@body)))

(defmacro opd--with-routes (site-var route-var &rest body)
  "Evaluate BODY for each route in SITE-VAR, binding it to ROUTE-VAR."
  (declare (indent 2))
  `(cl-loop
    for ,route-var being the hash-values of (gethash :routes ,site-var)
    do (progn ,@body)))

format-spec was next in line. It was escaping template strings repeatedly. On every single iteration. I pulled it out, forcing the engine to escape the templates exactly once during route initialization. eshell-glob-regexp got the same treatment and was eagerly memoized at route creation.
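
The memoization itself is trivial; what matters is where it happens. Something along these lines (a sketch assuming a plist-shaped route, with hypothetical names):

```elisp
(require 'em-glob)  ; for `eshell-glob-regexp'

(defun my/make-route (pattern)
  "Compile PATTERN's glob regexp exactly once, at route creation."
  (list :pattern pattern
        :regexp (eshell-glob-regexp pattern)))

(defun my/route-match-p (route file)
  "Per-file check against the precompiled regexp: cheap and hot."
  (string-match-p (plist-get route :regexp) file))
```
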

The code was cleaner, but the benchmark barely flinched. We're far from done.

Time to summon the heavy artillery; I fired up M-x profiler-start, Emacs' native truth serum. No more guesswork, I wanted to see exactly which functions were gnawing on the CPU in the dark.

          113492  73% - command-execute
          113438  73%  - funcall-interactively
          113437  73%   - execute-extended-command
          113433  73%    - command-execute
          113433  73%     - funcall-interactively
          113433  73%      - eval-buffer
          113013  73%       - let
          113013  73%        - opd-export
          113013  73%         - #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_12>
           67956  44%          - #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_11>
           67956  44%           - eval
           67956  44%            - let
           67956  44%             - funcall
           67956  44%              - #<byte-code-function 2CD>
           67942  44%               - opd-export-templates
           62782  40%                - #<interpreted-function B09>
           62779  40%                 + org-export-as
            1523   0%                - #<interpreted-function 8AA>
            1523   0%                 + org-export-as
             804   0%                + opd--log
             547   0%                + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_23>
             310   0%                + #<interpreted-function BC0>
             306   0%                + mkdir
             199   0%                + make-lock-file-name
              65   0%                + #<byte-code-function 669>
              41   0%                + #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_24>
              26   0%                  generate-new-buffer
              19   0%                + #<interpreted-function 8E1>
              11   0%                  alist-get
               1   0%                  opd--route-posts
              14   0%               + opd-export-assets
           45057  29%          - #<native-comp-function F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_9>
           45057  29%           - eval
           45057  29%            - let
           45057  29%             - funcall
           45057  29%              - #<byte-code-function 626>
           45057  29%               - opd--route-posts
           45057  29%                - opd--route-collect-and-aggregate
           44228  28%                 - opd--parse-org-file-with-cache
           43978  28%                  - opd--parse-org-file
           20624  13%                   + find-file-noselect
           19547  12%                   + org-persist-write-all-buffer
            3109   2%                   + org-element-parse-buffer
             264   0%                   + opd--resolve-paths
             149   0%                   + org-element-map
              69   0%                   + opd--slugify
              30   0%                   + vc-kill-buffer-hook
              15   0%                   + #<byte-code-function DFE>
               9   0%                     browse-url-delete-temp-file
               9   0%                     add-hook
               7   0%                   + uniquify-kill-buffer-function
               6   0%                   + replace-buffer-in-windows
               5   0%                     alist-get
               4   0%                     process-kill-buffer-query-function
               3   0%                   + #<byte-code-function 7E2>
               3   0%                     run-hooks
               2   0%                     org-check-running-clock
               1   0%                    alist-get
             393   0%                 + opd--render-route-prop
             115   0%                 + opd--register-exclusions
              99   0%                 + opd--find-source-files
              35   0%                 + #<byte-code-function BE7>
               9   0%                 + opd--build-format-spec
               7   0%                 + #<byte-code-function B81>
               3   0%                   alist-get
               1   0%                   opd--merge-contexts
             243   0%       + package-initialize
             167   0%       + require
               7   0%       + internal-macroexpand-for-load
               3   0%       + condition-case
               2   0%      cancel-timer
               1   0%   + handle-focus-out
              54   0%  + byte-code
           38747  25%   Automatic GC
            1443   0% + timer-event-handler
              20   0% + redisplay_internal (C function)
               4   0% + ...
               1   0% + eldoc-schedule-timer
               1   0% + internal-timer-start-idle
               1   0% + #<byte-code-function BDB>

The profiler dump was a mirror reflecting the extent of my own stupidity.

The second pass consumes around 60% of the CPU, dominated entirely by org-export-as. That was expected; there's virtually nothing to do here, and we can't optimize the core parser short of opening the door to another rabbit hole. But the first pass? opd--parse-org-file was thrashing 40% of the total build time just to extract keywords. The offenders couldn't hide anymore:

  • find-file-noselect: 13% CPU
  • org-persist-write-all-buffer: 12% CPU

My earlier assumption that org-with-file-buffer was an elegant, memory-safe choice for sidestepping the overhead of temporary buffers was hilariously inaccurate. The profiler showed find-file-noselect hemorrhaging CPU cycles everywhere. I was forcing Emacs to treat 10,000 raw text files as interactive buffers, and it was dutifully doing what it was designed to do: triggering file locks, querying version control backends, and running a dozen or so major-mode hooks. For every file. Worst of all, it was triggering org-persist, forcing the editor to aggressively write cache data to the hard drive for every single file it parsed. This would've been great for normal usage, but by now, we're no longer normal users, are we?

It was carrying the entire bureaucratic weight of an interactive operating system. Just to read a damn title string.

Out the window. We dropped back somewhere more comfortable, closer to the metal, where I controlled the raw I/O: with-temp-buffer paired with insert-file-contents. Lean, dumb, aggressively fast.
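
Side by side, the difference is stark. A sketch of the two approaches, reduced to grabbing a single keyword (the real first pass extracts more than a title):

```elisp
;; Before: full interactive machinery -- file locks, VC queries,
;; mode hooks, org-persist -- just to read one line.
(defun my/title-slow (path)
  (with-current-buffer (find-file-noselect path)
    (unwind-protect
        (progn
          (goto-char (point-min))
          (when (re-search-forward "^#\\+TITLE:[ \t]*\\(.+\\)" nil t)
            (match-string-no-properties 1)))
      (kill-buffer))))

;; After: raw bytes into a throwaway buffer.  Lean, dumb, fast.
(defun my/title-fast (path)
  (with-temp-buffer
    (insert-file-contents path)
    (goto-char (point-min))
    (when (re-search-forward "^#\\+TITLE:[ \t]*\\(.+\\)" nil t)
      (match-string-no-properties 1))))
```
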

I fired the benchmark again.

The terminal went dead silent. The CPU fan spun down. It was too quiet. I stared at the prompt in disbelief. Usually, when the build script finishes this fast, it means opd silently choked on something nasty and died in the background. I ran the test suite. Green. I ran the benchmarks again. Same exact output. This wasn't a crash.

One minute and fifteen seconds. 7.5 milliseconds per file.

I leaned back in my chair, staring at the numbers once more. I was beating those smelly JS frameworks to death with an ancient crowbar.

But the high didn't last. A cold boot of 7.5ms per file is blistering, but when I spun up the incremental watcher to actually write a post, it felt sluggish. The compilation time was fast, but the engine was wasting time thinking about what to compile.

So I did what any performance-obsessed maniac does: I slapped hash-tables everywhere. chunkcache, filecache, depcache, registry. If a collection required an $O(n)$ linear scan, I ripped it out. Linear scans are algorithmic peasantry: knocking on 10,000 doors sequentially to find one specific file. A hash table is the master ledger; it teleports you exactly to where you need to be. That's $O(1)$ constant time.

Technically, a hash-table is amortized $O(1)$ time complexity. But if you kept reading this far and you don't know the difference between average-case hashing and worst-case collisions, I can't help you. De-dust your Comp Sci books.

Individual files were fast-ish while aggregates were still dragging. I realized the engine was treating all changes equally. A logical fallacy; not all changes were comparable. It needed a bifurcated execution path:

Content change
You change a word. The file metadata is identical. The engine intercepts it, updates the output, and bails immediately. This is the fast path.
Structural change
You add a new tag to your #+FILETAGS, or change an included file. The engine has to reconstruct the aggregates.
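
Telling the two apart costs almost nothing once the metadata is cached; conceptually (a sketch with hypothetical names):

```elisp
(defun my/change-kind (old-meta new-meta)
  "Classify a rebuild by comparing cached metadata alists.
Identical metadata (title, tags, includes...) means only the
body changed: fast path, re-export just this file.  Anything
else means the aggregates are stale: slow path."
  (if (equal old-meta new-meta) 'content 'structural))
```
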

A massive improvement, but it still didn't solve the final boss: the transitive dependency nightmare from earlier. The one where changing a macro file left the parent files serving stale content. My attempts to shake the cache were buckling under the weight of the routing table.

I needed more speed. I needed nitrous oxide.

I built the depcache, a hash-table backed, reverse-dependency graph. When the file watcher caught a change, it unleashed a DFS traversal algorithm.

If setup.org changed, the algorithm stalked through the graph, found post-a.org and post-b.org that included it, invalidated their caches, and dragged them onto the execution block. A cascading invalidation matrix.
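
Stripped of the routing machinery, the graph is just a hash table of reverse edges plus a stack-based walk (a sketch with hypothetical names; the real traversal lives inside the incremental exporter):

```elisp
(require 'cl-lib)

(defvar my/depcache (make-hash-table :test 'equal)
  "Reverse edges: dependency file -> list of files including it.")

(defun my/record-dep (post include)
  "Record that POST includes INCLUDE."
  (cl-pushnew post (gethash include my/depcache) :test #'equal))

(defun my/impacted-files (changed)
  "Walk the reverse graph from CHANGED; return every file whose
cache must be invalidated, CHANGED itself included."
  (let ((seen (make-hash-table :test 'equal))
        (stack (list changed))
        result)
    (while stack
      (let ((file (pop stack)))
        (unless (gethash file seen)
          (puthash file t seen)
          (push file result)
          (dolist (parent (gethash file my/depcache))
            (push parent stack)))))
    (nreverse result)))
```
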

The engine finally roared to life, idling with a cold, heavy hum. It sounded hungry.

Hot-rebuilds plummeted to a range of 7 to 200 milliseconds, depending entirely on the complexity of the source Org file. opd was practically out of the equation. The routing, dependency tracking, the duck-typing. It all executed in near-zero time. The only overhead left was the irreducible cost of org-export parsing the text.

(defun opd--incremental-export (files)
  "Incrementally rebuild the given FILES.

This function uses granular caches to search all sites for
the FILES; it returns non-nil if the FILES were found and
rebuilt."
  (cl-assert (listp files) nil "files must be list, got: %s" files)
  (cl-assert
   (cl-every #'stringp files) nil "incremental export requires string paths")

  (let*
      ((abspaths (mapcar #'expand-file-name files))
       (result nil))

    (opd--with-sites site
      (let*
          ((cache (gethash :cache site))
           (chunkcache (gethash :chunkcache site))
           (depcache (gethash :depcache site))
           (filecache (gethash :filecache site))
           ;; Resolve dependencies and invalidate impacted files
           ;; via DFS, which also handles transitive dependencies.
           (impacted
            (cl-loop
             with seen = (make-hash-table :test 'equal)
             with stack = (copy-sequence abspaths)
             while stack
             for item = (pop stack)
             unless (gethash item seen)
             collect item
             ;; `and' keeps the side effects conditional too, so a
             ;; seen item never re-pushes its parents onto the stack.
             and do
             (remhash item filecache)
             (puthash item t seen)
             (dolist (parent (gethash item depcache))
               (push parent stack)))))

        (opd--with-error
         (opd--with-routes site route
           (let*
               ((name (plist-get route :name))
                (oldcache (gethash name chunkcache))
                (impacts-route-p nil)
                (structural-change-p
                 (cl-loop
                  for f in impacted
                  for exists = (file-exists-p f)
                  for was-in-p = (and oldcache (gethash f oldcache))
                  do
                  ;; Dirty dirty mutation by side-effect; saves
                  ;; a secondary pass over `impacted'.
                  (when was-in-p
                    (setq impacts-route-p t))
                  thereis
                  (if (opd--route-match-p route f)
                      (not (eq
                            (not was-in-p)
                            (not (opd--route-accepts-p route f site))))
                    was-in-p))))

             ;; Bypass when no change detected.
             (when (or structural-change-p impacts-route-p)
               (remhash name cache) ;; Invalidate route cache

               (opd--with-env route
                 (let*
                     ((newcache (gethash name chunkcache))
                      (chunks
                       (if structural-change-p
                           (opd--route-posts route)
                         (cl-loop
                          ;; `eq' for pointer equality.
                          with set = (make-hash-table :test 'eq)
                          for f in impacted
                          do
                          (dolist (c (gethash f newcache))
                            (puthash c t set))
                          finally
                          return (hash-table-keys set)))))

                   (when chunks
                     (funcall (plist-get route :export) route chunks)
                     (setq result t))))))))))
    result))

The disparate parts had finally locked together.

It's alive

We Frankensteined a thing of beauty. We started with a rotting corpse – a brittle, naïve string-templating script. What crawled off the operating table is a declarative, hot-rebuilding orchestration layer powered by a purely functional dependency graph.

Tsoding said it best in his (not-so-obvious) love letter, "The annoying usefulness of Emacs". This project is the epitome of that sentiment. Emacs is a platform that is infinitely greater than the sum of its parts. By leaning entirely into ox.el and the native Org AST, the engine's surface area vanished. No bespoke plugin architectures nor brittle abstractions. Just the user, the site, and the route.

As the adage goes, "simple" is the farthest thing from "easy." Stripping a system down to its absolute bare metal is agonizing work, but the payoff is a piece of software, older than most of the JavaScript andies out there, waking up to beat modern web tooling at its own game. The modern web is plagued by developers hiding behind their bundlers, obsessed with shipping megabytes of client-side code just to render static text on a screen. Sometimes, building a compiler from scratch is exactly the kind of unhinged sanity check we need to remind ourselves what computers are actually capable of.

Yes, yes. Forging Excalibur in the fires of a dying star and naming it "Sharp metal stick". As they say, naming is one of the two hardest problems in computer science. Looking at the codebase now, calling this creation opd feels like an insult to its architecture. It became a razor-sharp, reactive compiler forged in blood, sweat, stack traces, profiler dumps, and the ashes of my long-gone notes (RIP). The acronym is ill-fitting. It has earned a real name: ossg, the Org Static Site Generator.

Time to take that damn line out of my README.

Now if you'll excuse me, I have some posts to write.

Afterword

I have to address the elephant in the room: hot-rebuilding isn't true hot-reloading. Getting the browser to magically inject CSS without a refresh requires WebSockets, background Node processes, and a lot of external baggage.

But frankly? I'm not writing web applications. I'm publishing notes, articles, and lore. Sacrificing the purity of a zero-dependency Lisp machine is a steep price to pay to save a browser-refresh keystroke.

There is, of course, a catch to the magic. To make the duck-typed router and the dependency graph work, the engine has to know the territory. You can't hot-reload an engine that's freezing cold. Before the watcher drops into incremental mode, it requires one full, synchronous master build to warm the cache, index the registry, and trace the bloodlines. Call it the ignition tax.
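That two-phase startup can be sketched in a few lines. This is illustrative only: `opd-build` and `opd--watch-incrementally` are hypothetical names standing in for whatever the real entry points are; the shape of the flow is what matters.

```elisp
;; Hypothetical sketch of the ignition tax.  A cold engine has
;; nothing to diff against, so one full synchronous build warms the
;; cache, indexes the registry, and traces the bloodlines before the
;; watcher is allowed to go incremental.
(defun opd-watch (site)
  "Run one full build of SITE, then rebuild incrementally on change."
  (opd-build site)                  ; pay the ignition tax up front
  (opd--watch-incrementally site))  ; from here on, only impacted routes
```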

I can already hear the systems-level pedants. "Why no parallelism? Why no asynchronous workers?" Believe me, the temptation to spin up headless Emacs subprocesses and blast the AST parsing across all my CPU cores was strong.

But Emacs threading is a cooperative illusion; threads yield like polite Victorian gentlemen, which during heavy CPU loads means they don't yield at all. True async means either serializing massive Lisp structures across IPC boundaries, with all the nightmarish overhead that entails, or rebuilding Emacs from the ground up around an event loop. Pick your poison.

I am a relentless engineer, but even I know when to stop. Bolting asynchronous IPC onto a static site generator just to shave off a few milliseconds crosses the line from optimization into ego-driven insanity.

I also need to confess a sin of omission regarding the reactive router. To kill that abominable :canonical flag, I implemented a duck-typed chunk inspector. The engine decides if a chunk is a 1:1 post or a 1:N aggregate entirely by looking for an abspath property at the top level of the data tree.

This is a beautiful, frictionless lie. It assumes you aren't going to do something stupid like manually burying the abspath in a custom opd-aggregate-each closure, or magically injecting abspath into the chunk of your RSS feed. If you violate this invisible, undocumented structural contract, the hot-rebuilding watcher will silently miscategorize your files, ignore your changes, and gaslight you into thinking you're losing your mind.
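For the curious, the duck-typed check is roughly this shape. A sketch only, assuming chunks are plists; the `abspath` property name comes from the text above, but the function name and everything else here is illustrative:

```elisp
;; Illustrative only: classify a chunk by sniffing for `abspath'
;; at the top level of its data tree.
(defun opd--chunk-post-p (chunk)
  "Return non-nil if CHUNK looks like a 1:1 post.
A top-level `abspath' property marks a chunk backed by a single
source file; its absence marks a 1:N aggregate (index, feed, ...).
Burying `abspath' deeper in the tree defeats this check."
  (and (plist-get chunk :abspath) t))
```

Nothing deeper than the top level is inspected, which is exactly why burying the property breaks the watcher.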

You have been warned. Treat the aggregation structure with respect, or the watcher will leave you for dead.

Is ossg perfect? Hell no. It's software. It has sharp edges, undocumented assumptions, and probably a few dormant bugs waiting to vomit a stack trace into your face. But it's fast, it's robust, it's mine, and it doesn't suck.

It lives entirely within my digital garden. You are free to take it. Fork it, rip its guts out, send patches, or use it to build your own unhinged Lisp compiler. Just don't open an issue complaining that it doesn't support your favorite JavaScript framework. I will close it out of spite.

Footnotes:

1. Sitemap in org-publish parlance refers to the index page listing all posts.

2. 🖖

3. If you know, you know.

4. Did I mention you could have multiple sites running in the same engine?

5. Case in point: this very link points to another route entirely. See what I did there?

6. If memory serves me right, Emacs Lisp has a hardcoded limit of 1600 recursion depth.
