Building With LLMs in 2026: Lessons From the Past Year
We shipped over a dozen LLM-backed products. Which patterns scaled, which collapsed — field notes from production.
Through 2025 we built 14 LLM-powered products. This piece walks through which architectural choices scaled with users and which broke once products matured.
Locking into a single model is the most common scaling mistake we see.
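One way to avoid that lock-in is to keep application code behind a thin provider-agnostic interface. This is a minimal sketch, not the authors' actual architecture; the class and method names are illustrative, and the stub `complete` bodies stand in for real API calls.

```python
from abc import ABC, abstractmethod

class LLMClient(ABC):
    """Provider-agnostic interface; swap backends without touching call sites."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class AnthropicClient(LLMClient):
    def complete(self, prompt: str) -> str:
        # A real implementation would call the Anthropic API here.
        return f"[anthropic] {prompt}"

class OpenAIClient(LLMClient):
    def complete(self, prompt: str) -> str:
        # A real implementation would call the OpenAI API here.
        return f"[openai] {prompt}"

def answer(client: LLMClient, question: str) -> str:
    # Application code depends only on the interface, not on a vendor SDK.
    return client.complete(question)
```

Swapping models then becomes a one-line change at the composition root rather than a rewrite of every call site.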
RAG was a good starting point but rarely enough on its own. Production-grade quality needs reranking, query rewriting, and domain-specific filtering on top, and relying on a single embedding model is risky.
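The shape of that pipeline can be sketched as follows. This is a toy illustration, not production code: the query rewriter is a placeholder for an LLM-based rewrite step, and the lexical-overlap scorer stands in for a real cross-encoder reranker.

```python
def rewrite_query(query: str) -> str:
    # Placeholder: in production this would be an LLM call that expands
    # or rephrases the user query for better retrieval.
    return query.lower().strip()

def rerank(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    # Toy lexical-overlap scorer standing in for a cross-encoder reranker.
    q_terms = set(rewrite_query(query).split())

    def score(doc: str) -> int:
        return len(q_terms & set(doc.lower().split()))

    return sorted(docs, key=score, reverse=True)[:top_k]
```

The point of the structure, not the scoring function, is what matters: rewrite before retrieval, rerank after, and keep each stage swappable.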
Cost control is the most overlooked architectural concern. Smart model routing — Haiku for simple queries, Sonnet for complex reasoning — can halve your bill.
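A routing layer like that can be as simple as a heuristic classifier in front of the API call. This is a hedged sketch: the complexity markers are arbitrary examples, and the model identifiers are hypothetical, not confirmed product names.

```python
# Hypothetical markers of queries that need stronger reasoning.
COMPLEX_MARKERS = ("why", "explain", "compare", "analyze", "step by step")

def route_model(query: str) -> str:
    """Send cheap/simple queries to a small model, hard ones to a larger one."""
    q = query.lower()
    if len(q.split()) > 30 or any(marker in q for marker in COMPLEX_MARKERS):
        return "claude-sonnet"  # hypothetical id: larger model for complex reasoning
    return "claude-haiku"       # hypothetical id: smaller model for simple lookups
```

Real routers usually add a fallback path (escalate to the larger model when the small one's answer fails validation), but even this crude split captures most of the savings if traffic skews toward simple queries.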
Next.js 16, Tailwind v4, server components, the edge runtime — what should you actually build a web product with in 2026?
The technical details of dropping a $48K/month AWS bill to $15K.