AI Tracker - Monitor AI Developments

Our latest model, Claude Opus 4.8, is an upgrade to our Opus class of models, with stronger performance across coding, agentic tasks, and professional work, and the consistency to handle long-runni...

🔗

Anthropic launches Claude Opus 4.6 as AI moves toward a 'vibe working' era

Linked Feb 16

9.4

Anthropic's latest AI model is better at coding, sustaining tasks for longer and creating high-quality professional work.

🔗

New in Claude Managed Agents: dreaming, outcomes, and multiagent orchestration | Claude

Linked May 08

9.4

Dreaming, outcomes, and multiagent orchestration are now available in Claude Managed Agents. Build agents that learn, meet a quality bar, and work in parallel.

🔗

The 2026-07-28 MCP Specification Release Candidate

Linked May 27

9.4

The release candidate for the next Model Context Protocol (MCP) specification is now available: a stateless protocol core, the Extensions framework, Tasks, MCP Apps, authorization hardening, and a ...

🔗

Introducing Claude Opus 4.7

Linked Apr 22

9.4

Our latest model, Claude Opus 4.7, is now generally available. Opus 4.7 is a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks.

📡

Claude Opus 4.6: System Card Part 1: Mundane Alignment and Model Welfare

RSS Feb 09

9.3

Claude Opus 4.6 is here. There is a lot to say.

🔗

Claude Opus 4.6

Linked Feb 10

9.3

We’re upgrading our smartest model. Across agentic coding, computer use, tool use, search, and finance, Opus 4.6 is an industry-leading model, often by a wide margin.

📡

[AINews] The Claude Code Source Leak

RSS Apr 01

8.9

The accidental "open sourcing" of Claude Code brings a ton of insights.

📡

What is agentic engineering?

RSS Mar 15

8.9

Agentic Engineering Patterns (https://simonwillison.net/guides/agentic-engineering-patterns/) > I use the term agentic engineering to des...

📡

Quoting Andrej Karpathy

RSS Feb 26

8.9

It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over ... (source: https://twitter.com/karpathy/status/2026731645169185220)

🔗

Evidence that AI can already do some weeks-long coding tasks

Linked Apr 14

8.9

Early results from MirrorCode benchmark with METR: AI agents can complete weeks-long coding tasks, including reimplementing a 16,000-line codebase.

📡

[AINews] Anthropic Claude Opus 4.7 - literally one step better than 4.6 in every dimension

RSS Apr 17

8.9

The new SOTA model asserts its dominance.

📡

Claude Opus 4.8: The System Card

RSS May 29

8.9

Only six weeks after Opus 4.7, we have Opus 4.8.

📡

How we contain Claude across products

RSS May 30

8.7

How we contain Claude across products (https://www.anthropic.com/engineering/how-we-contain-claude). A complaint I often have about sandboxing products is t...

📡

Artificial Analysis: Independent LLM Evals as a Service — with George Cameron and Micah Hill-Smith

RSS Jan 08

8.7

In our first episode of 2026, swyx sits down with the cofounders of Artificial Analysis to discuss the state of LLM Evals and Benchmarks, and the key trends and drivers of LLM progress for the year.

📡

Claude: Speed up responses with fast mode

RSS Feb 07

8.7

Claude: Speed up responses with fast mode (https://code.claude.com/docs/en/fast-mode). New "research preview" from Anthropic today: you can now access a fas...

📡

Quoting David Crawshaw

RSS Feb 07

8.7

I am having more fun programming than I ever have, because so many more of the programs I wish I could find the time ... (source: https://crawshaw.io/blog/eight-more-months-of-agents)

📡

A Software Library with No Code

RSS Jan 10

8.7

A Software Library with No Code (https://www.dbreunig.com/2026/01/08/a-software-library-with-no-code.html). Provocative experiment from Drew Breunig, who de...

🔗

Introducing Agent Readiness | Factory.ai

Linked Jan 23

8.7

Introducing Agent Readiness. Factory can now evaluate how well your codebase supports autonomous development. Run /readin...

📡

Claude Fable is relentlessly proactive

RSS Jun 11

8.7

After two days of experience with Claude Fable 5 (https://simonwillison.net/2026/Jun/9/claude-fable-5/) I think the best way to describe it is relentlessly proactive...

🔗

Anthropic’s new Cowork tool offers Claude Code without the code | TechCrunch

Linked Jan 21

8.7

Built into the Claude Desktop app, Cowork lets users designate a specific folder where Claude can read or modify files, with further instructions given through the standard chat interface.

📡

Claude Code Hits Different

RSS Jan 09

8.7

Coding agents cross a meaningful threshold with Opus 4.5.

📡

Kimi K2.5: Visual Agentic Intelligence

RSS Jan 27

8.7

Kimi K2.5: Visual Agentic Intelligence (https://www.kimi.com/blog/kimi-k2-5.html). Kimi K2 landed i... (https://simonwillison.net/2025/Jul/11/kimi-k2/)

📡

Quoting Boris Cherny

RSS Feb 14

8.7

Someone has to prompt the Claudes, talk to customers, coordinate with other teams, decide what to build next. Engin... (source: https://twitter.com/bcherny/status/2022762422302576970)

📡

Claude Cowork Exfiltrates Files

RSS Jan 14

8.7

Claude Cowork Exfiltrates Files (https://www.promptarmor.com/resources/claude-cowork-exfiltrates-files). Claude Cowork defaults to allowing outbound HTTP tr...

📡

Quoting Will Larson

RSS Jan 02

8.7

My experience is that real AI adoption on real problems is a complex blend of: domain context on the problem, d... (source: https://lethain.com/company-ai-adoption/)

📡

[AINews] Anthropic launches the MCP Apps open spec, in Claude.ai

RSS Jan 27

8.7

Open Standards for Rich generative UI is all you need.

📡

GLM-5: From Vibe Coding to Agentic Engineering

RSS Feb 11

8.7

GLM-5: From Vibe Coding to Agentic Engineering (https://z.ai/blog/glm-5). This is a huge new MIT-licensed model: 754B parameters and ...

📡

The November 2025 inflection point

RSS Jan 04

8.7

It genuinely feels to me like GPT-5.2 and Opus 4.5 in November represent an inflection point - one of those moments where the models get incrementally better in a way that tips across an in...

📡

My answers to the questions I posed about porting open source code with LLMs

RSS Jan 11

8.7

Last month I wrote about porting JustHTML from Python to JavaScript (https://simonwillison.net/2025/Dec/15/porting-justhtml/) using Codex CLI and GPT-5.2 in a few hours while al...

📡

Quoting Jaana Dogan

RSS Jan 04

8.7

I'm not joking and this isn't funny. We have been trying to build distributed agent orchestrators at Google since la... (source: https://twitter.com/rakyll/status/2007239758158975130)

📡

LWiAI Podcast #231 - Claude Cowork, Anthropic $10B, Deep Delta Learning

RSS Jan 21

8.7

Anthropic’s new Cowork tool, Anthropic Raising $10 Billion at $350 Billion Value, Deep Delta Learning

📡

Claude's new constitution

RSS Jan 21

8.7

Claude's new constitution (https://www.anthropic.com/news/claude-new-constitution). Late last year Richard Weiss (https://www.lesswrong.com/pos...

📡

Mitchell Hashimoto: My AI Adoption Journey

RSS Feb 05

8.7

Mitchell Hashimoto: My AI Adoption Journey (https://mitchellh.com/writing/my-ai-adoption-journey). Some really good and unconventional tips in here for gett...

📡

Quoting Jeremy Daer

RSS Jan 17

8.7

[On agents using CLI tools in place of REST APIs] To save on context window, yes, but moreso to improve accura... (source: https://twitter.com/dhh/status/2012543705161326941)

📡

The AI Vampire

RSS Feb 15

8.7

The AI Vampire (https://steve-yegge.medium.com/the-ai-vampire-eda6e4f07163). Steve Yegge's take on agent fatigue, and its relationship to burnout. ...

🔗

Introducing: Postgres Best Practices

Linked Jan 28

8.7

We are releasing Agent Skills for Postgres Best Practices to help AI coding agents write high quality, correct Postgres code.

📡

Helping people write code again

RSS Jan 04

8.7

Something I like about our weird new LLM-assisted world is the number of people I know who are coding again, having mostly stopped as they moved into management roles or lost their personal...

📡

Scaling long-running autonomous coding

RSS Jan 19

8.7

Scaling long-running autonomous coding (https://cursor.com/blog/scaling-agents). Wilson Lin at Cursor has been doing some experiments to see how far you can...

🔗

GitHub - blader/Claudeception: A Claude Code skill for autonomous skill extraction and continuous learning. Have Claude Code get smarter as it works.

Linked Jan 22

8.7

A Claude Code skill for autonomous skill extraction and continuous learning. Have Claude Code get smarter as it works. - blader/Claudeception

🔗

Among the Agents

Linked Jan 16

8.7

How I use coding agents, and what I think they mean

📡

Quoting Robin Sloan

RSS Jan 07

8.7

AGI is here! When exactly it arrived, we’ll never know; whether it was one company’s Pro or another... (source: https://www.robinsloan.com/winter-garden/agi-is-here/)

🔗

Anthropic works on customizable Commands for Claude Code

Linked Jan 22

8.7

Anthropic is developing a new Customize section for Claude, centralizing Skills, Connectors, and upcoming Commands for Claude Code.

📡

Quoting Steve Yegge

RSS Jan 30

8.7

Getting agents using Beads requires much less prompting, because Beads now has 4 months of “Desire Paths... (source: https://steve-yegge.medium.com/software-survival-3-0-97a2a6255f7b)

📡

First impressions of Claude Cowork, Anthropic's general agent

RSS Jan 12

8.7

New from Anthropic today is Claude Cowork (https://claude.com/blog/cowork-research-preview), a "research preview" that they describe as "Claude Code for the rest of your work". ...

📡

Fly's new Sprites.dev addresses both developer sandboxes and API sandboxes at the same time

RSS Jan 09

8.7

New from Fly.io today: Sprites.dev (https://sprites.dev). Here's their blog post (https://fly.io/blog/code-and-let-live/) and https://www.youtube.com/watch?...

📡

8 plots that explain the state of open models

RSS Jan 07

8.7

Measuring the impact of Qwen, DeepSeek, Llama, GPT-OSS, Nemotron, and all of the new entrants to the ecosystem.

Agent Ideas

[AINews] The Biggest Claude Launch of All Time

Introducing Claude Sonnet 5

[AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode

Introducing Claude Opus 4.8
