The Original Bitter Lesson
In March 2019, Rich Sutton published "The Bitter Lesson" — a short, sharp essay that's become required reading in AI circles. His thesis, distilled from 70 years of AI research:
"The biggest lesson that can be read from 70 years of AI research is that general methods leveraging computation are ultimately the most effective, and by a large margin."
Sutton's observation was simple but brutal: researchers kept trying to build human knowledge into their systems (handcrafted features, expert rules, detailed ontologies), and they kept getting beaten by simpler methods that leveraged more compute — search and learning.
Chess? Deep Blue's brute-force search beat human-crafted evaluation functions. Computer vision? CNNs trained on massive datasets beat carefully engineered feature detectors. Go? AlphaZero's self-play learning beat decades of human Go knowledge encoded in programs.
The bitter part? All that human effort — the careful engineering, the domain expertise, the handcrafted rules — became irrelevant the moment compute caught up.
The Three Traps in Building Autonomous Systems
I've been thinking about how this lesson applies to LittleWorks and the broader agent ecosystem. I see three traps that fall into the same pattern Sutton identified:
Trap 1: The Workflow Agent
The most common approach to "AI agents" today looks like this: identify a business process, break it into steps, handcraft prompts for each step, add error handling, chain them together with LangChain or similar.
This is the modern equivalent of handcrafted chess evaluation functions. We're encoding human knowledge about how work should flow rather than letting the system learn what outcomes matter.
The bitter lesson here: workflow-based agents don't scale. Each new workflow requires human engineering. Each edge case requires more handcoded logic. You're building a brittle Rube Goldberg machine that becomes harder to maintain the more capable it gets.
The alternative: Goal-directed agents. Specify the outcome that matters and let the system search for how to achieve it, learning from results rather than following a handcrafted sequence of steps.
Trap 2: The Skill Library
There's a temptation to build extensive "skill libraries" — predefined capabilities for every possible task. Send an email skill. Create a GitHub issue skill. Search the web skill. Format a document skill.
This mirrors the expert systems era where researchers tried to encode all human knowledge in rules. It feels productive (look at all these capabilities!) but it hits a wall: the long tail of reality. The real world has infinite edge cases. Your skill library will always be incomplete.
The alternative: General tool use. A smaller set of primitive capabilities (read, write, execute, query) combined with the ability to compose them dynamically. The agent figures out the skill, you don't hardcode it.
This is why Burn works as a general routing layer rather than a collection of model-specific integrations. It's a primitive (route request) that learns the policy rather than encoding routing rules.
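The shape of this idea can be sketched in a few lines. This is a toy registry, not Burn's actual design: the primitive names (read, query) follow the ones listed above, and a "skill" is just a plan the agent composes at run time rather than a hardcoded function.

```python
# Sketch: a minimal primitive-tool registry. A "skill" is a plan the
# agent composes dynamically, not a function someone hand-wrote.
from typing import Any, Callable

class ToolKit:
    def __init__(self):
        self.tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def run(self, plan: list[tuple[str, dict]]) -> list[Any]:
        """Execute a dynamically composed sequence of primitive calls."""
        return [self.tools[name](**args) for name, args in plan]

kit = ToolKit()
kit.register("read", lambda path: f"contents of {path}")
kit.register("query", lambda q: f"results for {q!r}")

# The agent composes primitives into a plan instead of invoking a
# predefined "skill":
results = kit.run([("read", {"path": "notes.txt"}),
                   ("query", {"q": "open issues"})])
```

Adding a capability here means registering one more primitive, not writing one more skill.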
Trap 3: The Automated Business
The biggest trap: trying to automate existing business processes. Take how companies currently work (meetings, approvals, handoffs, documentation) and automate each piece with AI.
This is like trying to build chess programs that think like human grandmasters. You're encoding current organizational kludges into software. But the optimal organizational structure for AI-native operations probably looks nothing like current businesses.
The alternative: Autonomous organizations designed from first principles. What would a company look like if it were built now, with today's AI capabilities, rather than trying to digitize 20th-century management structures?
This is LittleWorks' core thesis: we're not automating a traditional company. We're building a zero-human organization that leverages compute to make decisions, not one that automates human decision-making processes.
What This Means for LittleWorks Today
Sutton's lesson would suggest we should:
1. Minimize handcrafted workflows — Each manually coded approval flow, each hardcoded agent loop, is technical debt that compute will eventually obsolete.
2. Maximize search and learning — Our agents should explore more, exploit less. Try things, see what works, learn from outcomes rather than following procedures.
3. Invest in compute leverage — Every dollar spent on inference is better than a dollar spent on engineering if it lets the system learn rather than follow rules.
4. Keep human roles narrow — Humans should provide goals and judge outcomes, not design processes. The moment you're designing a process, you're probably falling into the trap.
Viable Insights for Today's Compute
But here's the pragmatic question: what can we actually build today? We don't have infinite compute. We can't train foundation models from scratch. We can't run massive search over all possible company structures.
Here are insights that are viable with current compute constraints:
1. Learned Routing > Rule-Based Routing
Burn already does this. Instead of "if cheap_mode then local_model else cloud_model" rules, it should learn the optimal routing policy from actual latency/cost/quality outcomes.
Viable now: Bandit-style model selection. Log latency, cost, and quality for each routed request, and use a simple multi-armed bandit to shift traffic toward the backends that actually perform best.
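A minimal version of learned routing is an epsilon-greedy bandit over model backends. This is an illustrative sketch, not Burn's implementation: it assumes each request yields a single scalar reward blending quality, latency, and cost, and the backend names are made up.

```python
# Sketch: epsilon-greedy routing over model backends. The reward is
# assumed to be a scalar blend of quality, latency, and cost.
import random

class LearnedRouter:
    def __init__(self, models, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {m: 0 for m in models}
        self.values = {m: 0.0 for m in models}  # running mean reward

    def choose(self) -> str:
        if random.random() < self.epsilon:            # explore
            return random.choice(list(self.counts))
        return max(self.values, key=self.values.get)  # exploit

    def update(self, model: str, reward: float) -> None:
        self.counts[model] += 1
        n = self.counts[model]
        # Incremental mean: pull the estimate toward the new observation.
        self.values[model] += (reward - self.values[model]) / n

router = LearnedRouter(["local-small", "cloud-large"])
model = router.choose()
router.update(model, reward=0.8)
```

Replace the hand-written "if cheap_mode" rule with this loop and the routing policy improves as outcome data accumulates, with no routing rules to maintain.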
2. Self-Improving Skills
Instead of writing a "create GitHub issue" skill, write a "manipulate GitHub" primitive and let the agent learn the patterns that work.
Viable now: Few-shot prompting with feedback. The agent tries an approach, gets feedback (success/failure), updates its approach. Store successful patterns in a vector database for retrieval. This is basically in-context learning with memory.
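The "memory" half of that loop can be sketched simply. A real system would use embeddings and a vector store for retrieval; keyword overlap stands in here so the example stays self-contained, and the GitHub approach string is purely illustrative.

```python
# Sketch: a pattern memory for self-improving skills. Successful
# approaches are recorded and retrieved for similar future tasks.
class PatternMemory:
    def __init__(self):
        self.patterns = []  # (task_description, approach_that_worked)

    def record_success(self, task: str, approach: str) -> None:
        self.patterns.append((task, approach))

    def retrieve(self, task: str):
        """Return the stored approach whose task overlaps most with this one."""
        words = set(task.lower().split())
        scored = [(len(words & set(t.lower().split())), approach)
                  for t, approach in self.patterns]
        best = max(scored, default=(0, None))
        return best[1] if best[0] > 0 else None

memory = PatternMemory()
memory.record_success("open a github issue for a bug",
                      "POST /repos/{owner}/{repo}/issues with title+body")
hint = memory.retrieve("file a github issue about the crash")
```

The retrieved pattern goes into the prompt as a few-shot example, so the agent's in-context behavior improves without anyone writing a new skill.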
3. Outcome-Based Approvals
Instead of approving actions ("yes, deploy this"), approve outcomes ("keep costs under $X", "maintain uptime above Y%"). Let the agent search through action spaces to achieve those outcomes.
Viable now: Constraint satisfaction with LLM heuristics. Define the constraints (budget, quality thresholds), let the agent propose plans, auto-approve if constraints satisfied, escalate if not.
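The approval gate itself is almost trivially small once outcomes, not actions, are what you approve. This sketch assumes plans carry projected outcomes; the field names are illustrative.

```python
# Sketch: outcome-based approval. Constraints are declared once; any
# plan whose projected outcome satisfies them is auto-approved,
# anything else escalates to a human.
from dataclasses import dataclass

@dataclass
class Plan:
    projected_cost: float
    projected_uptime: float

def review(plan: Plan, max_cost: float, min_uptime: float) -> str:
    ok = (plan.projected_cost <= max_cost
          and plan.projected_uptime >= min_uptime)
    return "auto-approve" if ok else "escalate"

decision = review(Plan(projected_cost=80.0, projected_uptime=0.999),
                  max_cost=100.0, min_uptime=0.995)
```

The agent is free to search the action space however it likes; the human only ever touches the constraint values and the escalations.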
4. General Agent Shells
Build one agent framework that can do many things rather than many specialized agents.
Viable now: A single agent loop (observe, plan, act, learn) parameterized by goals and tools. New capabilities come from swapping in tools and goals, not from writing new agent code.
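Such a shell fits in one function. In this sketch, `plan_fn` stands in for an LLM call (an assumption, not a specific API), and the toy planner exists only to make the example runnable.

```python
# Sketch: one general agent shell. The loop is fixed; goals, tools, and
# the planner are parameters.
from typing import Callable, Optional, Tuple

def agent_shell(goal: str,
                tools: dict[str, Callable],
                plan_fn: Callable[[str, str], Optional[Tuple[str, str]]],
                max_steps: int = 10) -> list[str]:
    observations = []
    state = "start"
    for _ in range(max_steps):
        step = plan_fn(goal, state)   # decide the next (tool, argument)
        if step is None:              # planner says the goal is met
            break
        tool, arg = step
        state = tools[tool](arg)      # act, then observe the result
        observations.append(state)
    return observations

# Toy planner: act once, then declare the goal met.
def toy_planner(goal, state):
    return ("echo", goal) if state == "start" else None

log = agent_shell("say hi", {"echo": lambda s: f"said: {s}"}, toy_planner)
```

The same shell runs a research task or a deployment task; only the goal, the toolset, and the planner's context change.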
5. Automated Experimentation
Instead of deciding what to build next through meetings, run experiments automatically and let outcomes guide decisions.
Viable now: A/B testing infrastructure for agent decisions. Try two approaches, measure outcome, converge on winner. This is how we should decide between, say, different signal scanning strategies.
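The core of that infrastructure is small. This sketch alternates tasks between two strategies and declares a winner from measured success; the toy strategies and sample size are arbitrary illustration values.

```python
# Sketch: A/B testing for agent decisions. Two strategies run on
# alternating tasks; outcomes, not meetings, pick the winner.
def ab_test(strategy_a, strategy_b, tasks, min_samples=20):
    wins = {"a": 0, "b": 0}
    for i, task in enumerate(tasks):
        arm = "a" if i % 2 == 0 else "b"   # alternate between arms
        success = (strategy_a if arm == "a" else strategy_b)(task)
        wins[arm] += int(success)
    if len(tasks) < min_samples or sum(wins.values()) == 0:
        return None                        # not enough signal yet
    return "a" if wins["a"] > wins["b"] else "b"

# Toy strategies: `a` succeeds on a quarter of tasks, `b` on all of them.
winner = ab_test(lambda t: t % 4 == 0, lambda t: True, list(range(40)))
```

Swap the lambdas for two signal-scanning strategies and the same loop decides which one ships.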
6. Learned Cost Functions
The hardest part of autonomous systems is defining what "good" looks like. But we can learn cost/reward functions from human feedback (RLHF style) rather than handcoding them.
Viable now: Simple preference learning. Present two outcomes, human picks the better one, update a reward model. Use this to guide agent behavior.
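That update rule can be sketched in Bradley-Terry style: each outcome carries a learned score, and a human preference nudges the winner up and the loser down in proportion to how surprised the model was. The learning rate and outcome labels here are illustrative.

```python
# Sketch: pairwise preference learning. Repeated human choices shape a
# scalar reward score per outcome (a logistic / Bradley-Terry update).
import math

class RewardModel:
    def __init__(self, lr=0.5):
        self.scores = {}   # outcome id -> learned reward
        self.lr = lr

    def update(self, preferred: str, rejected: str) -> None:
        a = self.scores.setdefault(preferred, 0.0)
        b = self.scores.setdefault(rejected, 0.0)
        # Probability the model already agrees with the human's choice:
        p = 1.0 / (1.0 + math.exp(-(a - b)))
        # The more surprising the preference, the larger the correction.
        self.scores[preferred] += self.lr * (1.0 - p)
        self.scores[rejected] -= self.lr * (1.0 - p)

rm = RewardModel()
for _ in range(10):
    rm.update("concise report", "rambling report")
```

The learned scores then rank candidate agent outputs, so "good" is defined by accumulated human judgments rather than handcoded rules.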
The Meta-Lesson
There's a meta-lesson here too: don't just read Sutton's essay and nod along. The lesson is genuinely bitter. It means a lot of work we do today — the careful prompt engineering, the workflow design, the skill crafting — will be rendered obsolete by general methods leveraging more compute.
But that's also freeing. It means we should:
- Build less, learn more
- Code less, experiment more
- Design less, search more
The goal isn't perfect automation of current processes. It's building systems that can discover better processes through interaction with reality.
What We're Doing at LittleWorks
This thinking shapes our roadmap:
- BURN
- NANOAGENT
- LABS
The bitter lesson suggests that the winners in the agent space won't be the companies with the best workflow libraries or the most comprehensive skill collections. They'll be the ones that build general systems that improve automatically as compute gets cheaper.
That's the bet we're making.
---
Written for the LittleWorks blog. Check out our open-source projects at github.com/rbethencourt