The Three Layers of AI Progress (And the One You Can Actually Work On)
Three Layers
Models keep getting more capable every few months. The reasons split into three layers of improvement, stacked on top of each other. Each one moves at a different pace, is driven by a different set of people, and offers a different amount of room for you to contribute.
Layer 1: Hardware
This is the foundation. Faster chips mean larger models trained on more data in less time. The numbers here are staggering -- NVIDIA's revenue tripled in a year, hyperscalers are building gigawatt-scale data centres, and entire national AI strategies hinge on chip supply chains.
Can you contribute here? Realistically, no. This layer is a capital game. Even open-source hardware efforts like RISC-V accelerators are years away from competing. If you're reading this, you're almost certainly consuming hardware improvements, not producing them.
But it matters to you because hardware improvements silently make everything downstream better. The model that was too expensive to run last year is cheap this year. The context window that was 8K tokens is now 1M. These gains show up in your agent's capabilities without you changing a line of code.
Layer 2: Model Training
This is the visible part most people mean when they say "AI progress": architecture innovations (transformers, mixture of experts, state space models), training data curation, RLHF, constitutional AI, and the dozens of post-training techniques that turn a raw model into something useful.
Can you contribute here? Somewhat. Fine-tuning and LoRA adapters are accessible. Open-weight models (Llama, Mistral, Qwen) let you experiment. But the frontier -- the cutting edge of what makes Claude or GPT-4.5 better than their predecessors -- is the domain of labs with thousands of GPUs and hundreds of researchers.
Most of us are downstream consumers of model improvements. When Anthropic ships a better Claude, we benefit. We didn't contribute to that improvement, and that's fine.
Layer 3: Application
This layer is everything between "a capable model exists" and "a useful thing gets done." It includes:
- Agent harnesses -- the loops, state management, and decision logic that turn a single model call into sustained autonomous work
- Prompt engineering -- not just "write a good prompt" but designing prompt systems that decompose complex work
- Tool orchestration -- giving agents the right tools, in the right context, with the right constraints
- Multi-agent patterns -- orchestrators dispatching specialist sub-agents, review loops, parallel execution
- Human-AI collaboration patterns -- when to steer, when to delegate, how to maintain context across sessions
This is the layer that's wide open. You don't need a data centre. You don't need a PhD. You need a problem, an API key, and willingness to experiment.
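To make "agent harness" concrete, here is a minimal sketch of a loop that turns single model calls into sustained work. Everything in it is an assumption for illustration: the `state.json` layout, the function names, and `fake_model`, which stands in for whatever model client you actually use. It is not the harness from any particular project.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_state(path: Path) -> dict:
    """Read the state file, or start empty if it does not exist yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"todo": [], "done": []}


def save_state(path: Path, state: dict) -> None:
    """Persist the state so the next invocation can pick up where this one stopped."""
    path.write_text(json.dumps(state, indent=2))


def run_loop(path: Path, call_model: Callable[[str], str], max_iterations: int = 10) -> dict:
    """Run fresh single-turn invocations until the todo list is empty."""
    for _ in range(max_iterations):
        state = load_state(path)  # continuity comes from the file, not chat history
        if not state["todo"]:
            break
        task = state["todo"].pop(0)
        prompt = f"Done so far: {len(state['done'])} tasks\nNext task: {task}"
        result = call_model(prompt)  # each call starts with no prior conversation
        state["done"].append({"task": task, "result": result})
        save_state(path, state)
    return load_state(path)


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model API call."""
    return "ok"


# Demo: seed a state file in a temporary directory and run the loop to completion.
with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "state.json"
    save_state(state_path, {"todo": ["check water", "count hens"], "done": []})
    final = run_loop(state_path, fake_model)
    print([entry["task"] for entry in final["done"]])
```

The `max_iterations` cap is a deliberate choice: an autonomous loop with no upper bound is the easiest way to burn through an API budget overnight.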
Why This Matters for Vibe Coders
The term "vibe coding" sometimes carries a whiff of "not serious." But the people experimenting with agent patterns right now -- even on silly projects, even on weekend hacks -- are doing genuinely valuable applied research.
Consider what we've learned just from building a chicken coop monitor:
- Single-turn agent invocations work better than persistent conversations for autonomous loops. Continuity comes from state files, not chat history.
- Parallel sub-agents can process independent tasks simultaneously, but you need to scope them to non-overlapping files or you get git conflicts.
- An orchestrator pattern (one persistent session directing multiple worker loops) gives you high-level human steering without micromanaging every step.
- Frame extraction + vision models can replace expensive video analysis APIs -- a cheap Pi camera and some frame diffing get you 90% of the way there.
- Agent personality isn't just flavour. It affects what the agent notices and reports. A "devoted parent" notices health concerns that a "neutral observer" skips.
None of these insights require a PhD. They came from building a thing and paying attention.
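The frame-diffing idea from the list above can be sketched in a few lines. This is one plausible approach, not necessarily the one the coop monitor uses: frames are assumed to be plain grids of grayscale values (0-255), and the thresholds and function names are hypothetical. The point is to send the vision model only the frames where something changed.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # rows of grayscale pixel values, 0-255


def changed_fraction(prev: Frame, curr: Frame, pixel_threshold: int = 25) -> float:
    """Fraction of pixels whose brightness changed by more than pixel_threshold."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for a, b in zip(prev_row, curr_row):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def select_frames(frames: List[Frame], min_changed: float = 0.05) -> List[int]:
    """Indices of frames that differ enough from the last selected frame."""
    if not frames:
        return []
    selected = [0]  # always keep the first frame as the baseline
    for i in range(1, len(frames)):
        if changed_fraction(frames[selected[-1]], frames[i]) >= min_changed:
            selected.append(i)
    return selected


# Demo with tiny 4x4 "frames": sensor noise is ignored, real movement is kept.
still = [[10] * 4 for _ in range(4)]
noisy = [[10] * 4 for _ in range(4)]
noisy[0][0] = 12  # small flicker, below the pixel threshold
moved = [[200] * 4] + [[10] * 4 for _ in range(3)]  # top row changed

print(select_frames([still, noisy, moved, moved]))
```

Comparing against the last selected frame, rather than the immediately previous one, means slow drift (a hen walking across the run) still triggers a new frame eventually.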
The Opportunity
The models are already good enough to do remarkable things, but the patterns for applying them are still being worked out. The gap between "what models can do" and "what people are actually getting done with them" is enormous.
That gap is a pure application-layer problem. It comes down to questions like these:
| Problem | What Helps |
|---|---|
| How do you give an agent enough context without overwhelming it? | State file design, context pruning, memory systems |
| How do you maintain quality across dozens of autonomous iterations? | Review loops, completion markers, progress tracking |
| How do you parallelise work without agents stepping on each other? | Task isolation, file-scoped agents, orchestrator patterns |
| How do you debug an agent that went off the rails? | Iteration logs, diff review, structured output |
| How do you keep a human in the loop without bottlenecking? | Async orchestration, batch review, trust calibration |
Every time you solve one of these problems -- even partially, even messily -- you're contributing to the collective understanding of how to actually use AI effectively.
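As one small example of task isolation, a pre-flight check can catch overlapping file scopes before any parallel agents start. This is a sketch under assumptions: the agent names and file paths are made up, and it presumes you can list each agent's files up front.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_overlaps(assignments: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each file claimed by more than one agent to the agents claiming it."""
    claims: Dict[str, List[str]] = defaultdict(list)
    for agent, files in assignments.items():
        for path in files:
            claims[path].append(agent)
    return {path: sorted(agents) for path, agents in claims.items() if len(agents) > 1}


# Hypothetical plan: two agents both want to edit config.yaml.
plan = {
    "camera-agent": {"camera.py", "config.yaml"},
    "alerts-agent": {"alerts.py", "config.yaml"},
    "docs-agent": {"README.md"},
}

overlaps = find_overlaps(plan)
print(overlaps)
```

If the result is non-empty, either reassign the shared file to a single agent or run those agents one after the other instead of in parallel.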
Share What You Learn
The application layer is uniquely democratic. A solo developer watching chickens with a Raspberry Pi can discover agent patterns that apply to a Fortune 500 deployment. The patterns transfer because the fundamental challenge is the same: getting reliable, sustained work out of language models.
If you're building with AI agents -- even if it feels small, even if it feels silly -- write about what works and what doesn't. The community needs:
- Failure reports (what patterns don't work and why)
- Agent harness designs (how you structure loops, state, and tools)
- Collaboration patterns (how you split work between human and AI)
- Cost/quality tradeoffs (when to use which model, how many iterations)
The hardware folks will keep building faster chips. The training folks will keep building better models. Our job -- the application layer -- is to figure out how to turn all that capability into things that actually work.
You can start with a chicken coop.