Notes from building a tiny long-running agent harness
Twenty cents and a workbench for open-weights models
Notes from building a tiny long-running agent harness
I built Tilth as a workbench: a ~600-line Python agent harness that runs autonomously against any OpenAI-compatible endpoint. I wanted my own mechanism for actually doing work with open-weights models, not just chatting with them, and I wanted to internalize the current thinking on long-running agents while I did it. Addy Osmani’s trilogy on long-running agents is a great treatment of that thinking; read it, it’s what nudged me to build this. This post is a check-in on what I learned doing it.
The architecture, distilled
Three components, independently replaceable:
Brain. A thin wrapper around the
openaiSDK pointed at any OpenAI-compatible base URL. Worker and judge can sit on different providers.Hands. A per-session git worktree, bash + file tools, allow-listed.
Session. An append-only
events.jsonlplus a checkpoint, enough towake(session_id)on a fresh process.




