Seismic Impact (30%)
8.0/10
How newsworthy is this in AI?
Ecosystem Relevance (70%)
9.0/10
How useful for your apps?
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow. [...]
Tags: andrej-karpathy, coding-agents, ai-assisted-programming, generative-ai, agentic-engineering, ai, llms
Karpathy's observation that coding agents crossed a reliability threshold in December supports the architecture of Zac's Claude-powered orchestrator, which delegates to specialized agents (rails-expert, test-engineer, investigator). The "long-term coherence and tenacity" he describes on large tasks bears on the orchestrator's ability to carry multi-step refactoring or test automation runs across the 20+ Rails apps without losing context or stalling. It is worth stress-testing the orchestrator on more ambitious multi-app workflows, such as coordinated deployments via Capistrano or cross-app code quality sweeps, since agent reliability may now support tasks that would have failed just months ago.