Patterns, frameworks, and tools from GitHub and RSS feeds
We thought about this carefully before choosing hyperbole, but it is warranted.
Our latest model, Claude Opus 4.7, is now generally available. Opus 4.7 is a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks.
Dreaming, outcomes, and multiagent orchestration are now available in Claude Managed Agents. Build agents that learn, meet a quality bar, and work in parallel.
Anthropic's latest AI model is better at coding, sustaining tasks for longer and creating high-quality professional work.
Claude Opus 4.6 is here. There is a lot to say.
We’re upgrading our smartest model. Across agentic coding, computer use, tool use, search, and finance, Opus 4.6 is an industry-leading model, often by a wide margin.
<p>I use the term <strong>agentic engineering</strong> to des...
Early results from MirrorCode benchmark with METR: AI agents can complete weeks-long coding tasks, including reimplementing a 16,000-line codebase.
<blockquote cite="https://twitter.com/karpathy/status/2026731645169185220"><p>It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over ...
The new SOTA model asserts its dominance.
The accidental "open sourcing" of Claude Code brings a ton of insights.
<blockquote cite="https://crawshaw.io/blog/eight-more-months-of-agents"><p>I am having more fun programming than I ever have, because so many more of the programs I wish I could find the time ...
<blockquote cite="https://lethain.com/company-ai-adoption/"><p>My experience is that <em>real</em> AI adoption on <em>real</em> problems is a complex blend of: domain context on the problem, d...
Anthropic’s new Cowork tool, Anthropic Raising $10 Billion at $350 Billion Value, Deep Delta Learning
<p>Last month I <a href="https://simonwillison.net/2025/Dec/15/porting-justhtml/">wrote about porting JustHTML from Python to JavaScript</a> using Codex CLI and GPT-5.2 in a few hours while al...
We are releasing Agent Skills for Postgres Best Practices to help AI coding agents write high quality, correct Postgres code.
Measuring the impact of Qwen, DeepSeek, Llama, GPT-OSS, Nemotron, and all of the new entrants to the ecosystem.
<p><strong><a href="https://www.kimi.com/blog/kimi-k2-5.html">Kimi K2.5: Visual Agentic Intelligence</a></strong></p> Kimi K2 landed <a href="https://simonwillison.net/2025/Jul/11/kimi-k2/">i...
A Claude Code skill for autonomous skill extraction and continuous learning. Have Claude Code get smarter as it works. - blader/Claudeception
<p><strong><a href="https://z.ai/blog/glm-5">GLM-5: From Vibe Coding to Agentic Engineering</a></strong></p> This is a <em>huge</em> new MIT-licensed model: 754B parameters and <a href="https...
<p><strong><a href="https://steve-yegge.medium.com/the-ai-vampire-eda6e4f07163">The AI Vampire</a></strong></p> <p>Steve Yegge's take on agent fatigue, and its relationship to burnout.</p> <bloc...
<blockquote cite="https://twitter.com/bcherny/status/2022762422302576970"><p>Someone has to prompt the Claudes, talk to customers, coordinate with other teams, decide what to build next. Engin...
<blockquote cite="https://www.robinsloan.com/winter-garden/agi-is-here/"><p><strong>AGI is here</strong>! When exactly it arrived, we’ll never know; whether it was one company’s Pro or another...
<p><strong><a href="https://mitchellh.com/writing/my-ai-adoption-journey">Mitchell Hashimoto: My AI Adoption Journey</a></strong></p> Some really good and unconventional tips in here for gett...
In our first episode of 2026, swyx sits down with the cofounders of Artificial Analysis to discuss the state of LLM Evals and Benchmarks, and the key trends and drivers of LLM progress for the year.
Built into the Claude Desktop app, Cowork lets users designate a specific folder where Claude can read or modify files, with further instructions given through the standard chat interface.
<blockquote cite="https://steve-yegge.medium.com/software-survival-3-0-97a2a6255f7b"><p>Getting agents using Beads requires much less prompting, because Beads now has 4 months of “Desire Paths...
<p><strong><a href="https://code.claude.com/docs/en/fast-mode">Claude: Speed up responses with fast mode</a></strong></p> New "research preview" from Anthropic today: you can now access a fas...
Coordinate multiple Claude Code instances working together as a team, with shared tasks, inter-agent messaging, and centralized management.
Coding agents cross a meaningful threshold with Opus 4.5.
<p><strong><a href="https://www.anthropic.com/news/claude-new-constitution">Claude's new constitution</a></strong></p> Late last year Richard Weiss <a href="https://www.lesswrong.com/pos...
<p>New from Fly.io today: <a href="https://sprites.dev">Sprites.dev</a>. Here's their <a href="https://fly.io/blog/code-and-let-live/">blog post</a> and <a href="https://www.youtube.com/watch?...
<p>Something I like about our weird new LLM-assisted world is the number of people I know who are coding again, having mostly stopped as they moved into management roles or lost their personal...
<p><strong><a href="https://www.dbreunig.com/2026/01/08/a-software-library-with-no-code.html">A Software Library with No Code</a></strong></p> Provocative experiment from Drew Breunig, who de...
<p>It genuinely feels to me like GPT-5.2 and Opus 4.5 in November represent an inflection point - one of those moments where the models get incrementally better in a way that tips across an in...
Anthropic is developing a new Customize section for Claude, centralizing Skills, Connectors, and upcoming Commands for Claude Code.
<p>New from Anthropic today is <a href="https://claude.com/blog/cowork-research-preview">Claude Cowork</a>, a "research preview" that they describe as "Claude Code for the rest of your work". ...
<p><strong><a href="https://cursor.com/blog/scaling-agents">Scaling long-running autonomous coding</a></strong></p> Wilson Lin at Cursor has been doing some experiments to see how far you can...
<p><strong><a href="https://www.promptarmor.com/resources/claude-cowork-exfiltrates-files">Claude Cowork Exfiltrates Files</a></strong></p> Claude Cowork defaults to allowing outbound HTTP tr...
Introducing Agent Readiness: Factory can now evaluate how well your codebase supports autonomous development. Run /readin...
<blockquote cite="https://twitter.com/dhh/status/2012543705161326941"><p><em>[On agents using CLI tools in place of REST APIs]</em> To save on context window, yes, but moreso to improve accura...
Open Standards for Rich generative UI is all you need.
<blockquote cite="https://twitter.com/rakyll/status/2007239758158975130"><p>I'm not joking and this isn't funny. We have been trying to build distributed agent orchestrators at Google since la...
How I use coding agents, and what I think they mean
<p>Two major new model releases today, within about 15 minutes of each other.</p> <p>Anthropic <a href="https://www.anthropic.com/news/claude-opus-4-6">released Opus 4.6</a>. Here's <a href="h...
<p>I was a speaker last month at the <a href="https://www.pragmaticsummit.com/">Pragmatic Summit</a> in San Francisco, where I participated in a fireside chat session about <a href="https://si...
<p>A key challenge working with coding agents is having them both test what they’ve built and demonstrate that software to you, their overseer. This goes beyond automated tests - we need artif...
We've been experimenting with running coding agents autonomously for weeks at a time.
The tools are getting so powerful that we need to change how we scope, manage, and approach our work.
<p>The defining characteristic of a coding agent is that it c...