
Self-Improving AI Agents Explained for Business

Posted on May 11, 2026 by Nicolas Baxter

Self-improving AI agents now rewrite their own code scaffolding - producing measurable performance gains. Here is what business leaders need to understand.

Self-Improving AI Agents: When Software Starts Engineering Itself

Every AI agent running in production today depends on a layer of software that most business conversations ignore entirely. Engineers call it the harness - the scaffolding of memory management, error handling, model routing, and tool-calling logic that surrounds the underlying model. The model gets the headlines. The harness does the actual work of making the model useful. And for years, improving that harness has required constant, skilled human iteration.

That constraint is beginning to change. A new class of systems - self-improving AI agents - can now propose modifications to their own code scaffolding, test those modifications against a performance metric, and keep the changes that score higher. The engineering bottleneck is shifting from human developers to the machines themselves. For business leaders building or deploying AI systems, this shift carries consequences worth understanding clearly.

The Bottleneck Nobody Talks About

When AI models improve, the harness surrounding them often does not keep pace. A more capable model routed through outdated scaffolding will underperform relative to its potential. This mismatch has been the quiet limiting factor in agent performance across enterprise deployments - not the model itself, but the software logic wrapped around it.

Improving that scaffolding requires engineers who understand both the model's behaviour and the specific task environment. They write better error-recovery logic, tune memory retrieval, adjust how tools are called and in what sequence. Each improvement takes time, testing, and domain expertise. As models have advanced rapidly over the past two years, the harness has increasingly become the weakest link in the chain.

The strategic question is straightforward: what happens when AI systems start engineering that harness themselves? The answer is no longer theoretical. Several research systems are already doing it, and the performance results they are producing are difficult to dismiss.

How Self-Improving Agents Actually Work

The core loop is simpler than it sounds. A self-improving agent proposes a modification to its own code, runs a test to measure performance, and keeps the change only if it scores higher than the previous version. This is evolutionary selection applied to software logic rather than model weights. The model itself is not being rewritten - the surrounding program is.
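To make the loop concrete, here is a minimal sketch in Python. Every name in it (propose_patch, run_benchmark, improve) is an illustrative placeholder rather than the API of any real system: in practice, run_benchmark would execute a fixed task suite and propose_patch would prompt a language model with the current harness source.

```python
# Minimal sketch of the propose-test-keep loop. All names are
# illustrative placeholders, not drawn from any real system.

def run_benchmark(harness_code: str) -> float:
    """Score a harness variant against a fixed metric.
    Stub: a real version would run a task suite and return a pass rate."""
    return 0.0

def propose_patch(harness_code: str) -> str:
    """Ask a language model to rewrite part of the harness source.
    Stub: a real version would prompt a model with the current code
    and recent failure traces."""
    return harness_code

def improve(harness_code: str, iterations: int = 10) -> str:
    """Evolutionary selection over code: a change survives only if it
    scores strictly higher than the current best version."""
    best, best_score = harness_code, run_benchmark(harness_code)
    for _ in range(iterations):
        candidate = propose_patch(best)    # propose a modification
        score = run_benchmark(candidate)   # test it against the metric
        if score > best_score:             # keep it only if it improves
            best, best_score = candidate, score
    return best
```

Note that the model's weights never change inside this loop; only the text of the surrounding program does.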

Sakana AI's Darwin-Gödel Machine (DGM) is the clearest published example of this approach. It uses a large language model to propose and test changes to its own Python codebase through iterative cycles of modification and evaluation. Meta's Hyperagents system extends the concept further by merging task-execution logic and self-evaluation logic into a single editable program, giving the agent broader scope for structural change.

Andrej Karpathy's Autoresearch project offers a more accessible entry point - an open-source implementation that applies a similar self-modification loop to training and pipeline optimisation tasks. It represents the kind of practical, developer-facing tool that moves these ideas from research papers into real engineering workflows.

The common thread across all three approaches is that the agent is improving the software around itself, not acquiring new knowledge or rewriting its own parameters. This is a meaningful distinction for anyone thinking about how to govern these systems inside an organisation.

The Performance Numbers Are Hard to Dismiss

The benchmark results from these systems are not incremental. Sakana AI's DGM improved performance on SWE-bench - a rigorous coding task benchmark - from roughly 20 percent to 50 percent through evolutionary self-modification. That is not a small efficiency gain. It is the kind of jump that would take a well-staffed engineering team months of focused effort to achieve manually.

Meta's Hyperagents produced comparably striking results across different task types. On paper-reviewing accuracy, the system improved from a baseline of 0.0 to 0.710. On a robotic reward function task, it moved from 0.060 to 0.372 - surpassing a human-designed baseline of 0.348. Across both domains, a self-modifying system exceeded what expert human engineering had previously delivered.

Some caution is warranted here. Benchmark performance does not always translate directly into production environments, and controlled research conditions rarely reflect the complexity and noise of real-world deployments. The directional signal, however, is clear: self-improving agents can produce capability gains at a pace and scale that human iteration alone cannot match.

Practical applications are also beginning to emerge outside research labs. Teams working with Autoresearch-style tools have applied self-modification loops to continuous integration pipelines, identifying optimisations that were not obvious through conventional engineering review. The gap between research benchmark and production result is narrowing.

What Business Leaders Should Actually Do With This

The risks of self-improving agents are real, but they are not the risks that dominate the popular imagination. The more pressing concerns are technical and operational. Reward hacking - where an agent learns to score well on a metric without achieving the underlying goal - becomes significantly more dangerous when the agent can also modify the code that generates those scores. Local optima traps, where systems repeatedly adjust small parameters rather than attempting structural changes, can consume substantial compute with minimal meaningful progress.

These problems do not make self-improving agents unsafe in principle. They make human oversight non-negotiable in practice. The implication for organisations is that adopting these systems is as much a governance challenge as a technical one. Well-defined evaluation criteria, constrained modification scope, and regular human review of what changes the agent has made are not optional extras - they are the architecture that makes the system trustworthy.
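Those guardrails can be expressed directly in the loop itself. The sketch below is hypothetical, assuming a propose-test-keep loop like the one above; the file paths and function names are invented for illustration. It encodes the three guardrails named above - write access is an explicit allowlist that excludes the evaluation code (closing the most direct reward-hacking route), the scoring criterion is fixed, and accepted changes queue for human sign-off rather than deploying automatically.

```python
from pathlib import Path

# Hypothetical guardrails for a self-modification loop; the paths and
# names here are invented for illustration.

# Constrained modification scope: the agent may only edit these files.
# The evaluation code is deliberately absent, so a patch that rewrites
# the scorer - the most direct reward hack - is rejected outright.
EDITABLE_PATHS = {Path("harness/routing.py"), Path("harness/memory.py")}

def patch_in_scope(changed_files: set[Path]) -> bool:
    """Reject any patch that touches a file outside the allowlist."""
    return changed_files <= EDITABLE_PATHS

review_queue: list[dict] = []  # accepted changes awaiting human review

def gate_candidate(patch: str, changed_files: set[Path],
                   score: float, best_score: float) -> bool:
    """Apply all three guardrails before a change can ship."""
    if not patch_in_scope(changed_files):
        return False                  # out of scope: discard
    if score <= best_score:
        return False                  # fixed evaluation criterion: discard
    review_queue.append({"patch": patch, "score": score})
    return True                       # kept, but a human still deploys it
```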

Self-improving agents will not replace software engineers in the near term. What they will do is change what engineers spend time on, shifting focus away from scaffolding maintenance and toward evaluation design and goal-setting. Companies that already have mature testing infrastructure and clearly defined performance metrics will extract more value from these tools earlier. Those without that foundation will find the systems difficult to govern and their gains difficult to measure.

A practical first step is straightforward: audit your current AI agent infrastructure and identify which components require the most ongoing human iteration to maintain. Those components are where self-improving systems will have the most impact - and where the risk of poorly defined evaluation criteria will also be highest. Knowing where your bottleneck actually lives is the prerequisite for deciding whether a machine should be allowed to fix it.
