Is recursive self-improvement closer than we think?
How OpenAI's Codex 5.3 and MiniMax's M2.7 helped build their own successors.
In my earlier posts, I argued that continual self-learning is likely the only thing standing between frontier models and AGI. Some very smart people recently resigned from senior positions at leading AI labs and claimed that recursive self-improvement is only six to twelve months away. In his latest podcast with Peter Diamandis [1], Elon Musk also hinted that recursive self-improvement is just around the corner. But hearing them discuss this only briefly, without offering any evidence to support their claims, makes you wonder whether the timeline is accurate.
Earlier this year, OpenAI released Codex 5.3. Along with Sonnet/Opus 4.5, it ushered in a new era for SOTA models. Those who have used them know what changed: these models are starting to become what I call omnipotent models, eating everything up. The reason I single out Codex 5.3 is that OpenAI publicly acknowledged, for the very first time, that Codex played an instrumental role in building its own next version [2]. Before that, models were routinely used during the training of newer models to generate synthetic data and AI-generated feedback, as in the reinforcement learning from AI feedback (RLAIF) loop. But what Codex 5.3 did was altogether different: it helped debug training runs, manage the model's deployment, analyze test results, write and run evals, and much more. That is an entirely different level of model involvement in creating a successor, and it hints at how close we are to fully recursive self-improving models.
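To make the distinction concrete, the older pattern (RLAIF) looks roughly like the following minimal sketch: a policy model generates candidate responses, an AI judge scores them, and the resulting preference pairs become the training signal. Every function here is an illustrative stub of my own, not any lab's actual pipeline.

```python
# Hypothetical sketch of an RLAIF-style preference loop. The policy and the
# judge are toy stand-ins; in practice both would be LLMs, and the judge
# would be prompted with a rubric or constitution.

import random


def policy_generate(prompt: str, n: int = 2) -> list[str]:
    # Stand-in for sampling n candidate responses from the policy model.
    return [f"{prompt} -> response_{i}" for i in range(n)]


def ai_judge_score(prompt: str, response: str) -> float:
    # Stand-in for an AI-feedback judge returning a scalar quality score.
    # Seeded so the same (prompt, response) always gets the same score.
    rng = random.Random(hash((prompt, response)) % (2**32))
    return rng.random()


def build_preference_pairs(prompts: list[str]) -> list[tuple[str, str, str]]:
    # For each prompt, keep (prompt, chosen, rejected) based on judge scores.
    # These pairs would then train a reward model or feed DPO-style updates.
    pairs = []
    for p in prompts:
        a, b = policy_generate(p)
        if ai_judge_score(p, a) >= ai_judge_score(p, b):
            pairs.append((p, a, b))
        else:
            pairs.append((p, b, a))
    return pairs


prompts = ["explain recursion", "summarize RLAIF"]
pairs = build_preference_pairs(prompts)
for prompt, chosen, _rejected in pairs:
    print(prompt, "| chosen:", chosen)
```

The point of the sketch is what it does *not* contain: the model only supplies data and labels, and never touches the training infrastructure itself. What Codex 5.3 reportedly did sits outside this loop entirely.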
Yesterday, MiniMax published an article about their latest SOTA-level model, titled "Early echoes of self-evolution" [3]. The title alone sent a chill down my spine. Buried within a series of spectacular benchmark results for M2.7 (their latest model) is a public acknowledgment of the model's self-evolution that should shake anyone who tracks continual learning to the core. Among other things, they report that the model was able to update its own memory, build dozens of complex skills in its harness, improve its learning process based on reinforcement-learning experiments, iterate multiple times over its own architecture, skills, and memory, and autonomously run repeated cycles to optimize itself and evaluate whether it was building a better version of itself. They call it a cycle of model self-evolution. I call it the model's ability to evolve itself to the point that it can build a better, more capable, more efficient version of itself, almost autonomously. If those aren't early signs of recursive self-improvement, I don't know what is.
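The cycle they describe, propose a change to yourself, evaluate it, keep it only if it scores better, can be sketched as a simple hill-climbing loop. The mutation and evaluation functions below are toy stand-ins of my own invention, not MiniMax's actual mechanism; the structure of the loop is what matters.

```python
# Hypothetical sketch of a self-evolution cycle: propose a modified version
# of the system, benchmark it, and keep it only on strict improvement.

import random


def propose_variant(config: dict, rng: random.Random) -> dict:
    # Stand-in for the model editing its own skills, memory, or harness.
    variant = dict(config)
    variant["skill_count"] = config["skill_count"] + rng.choice([-1, 1])
    return variant


def evaluate(config: dict) -> float:
    # Stand-in for running an eval suite; here, more skills score higher.
    return float(config["skill_count"])


def self_evolve(config: dict, cycles: int, seed: int = 0) -> dict:
    rng = random.Random(seed)
    best, best_score = config, evaluate(config)
    for _ in range(cycles):
        candidate = propose_variant(best, rng)
        score = evaluate(candidate)
        if score > best_score:  # keep only strict improvements
            best, best_score = candidate, score
    return best


evolved = self_evolve({"skill_count": 10}, cycles=20)
print("evolved skill_count:", evolved["skill_count"])
```

Because rejected variants are discarded, the score never regresses; the unsettling part of the MiniMax report is that the model itself was running all three steps, including deciding what counts as "better."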
When those researchers said recursive self-improvement was only six to twelve months out, they weren't exaggerating. Breakthroughs like these are as exciting as they are dangerous; they carry echoes of the fictional Skynet scenario we all fear. And yet, it's no time to stop.
Thanks for reading.
[1] https://www.mexc.com/news/912121?ref=aisecret.us
[2] https://thenewstack.io/openais-gpt-5-3-codex-helped-build-itself/
[3] https://www.minimax.io/news/minimax-m27-en

