A single 'agent' UI won't swallow every workflow. Users already route tasks to the model that wins a vertical, and interfaces will follow.
What happened
A prominent practitioner, mattshumer_, posted a revealing thread after testing Anthropic’s Opus 4.7. He asked, “Am I the only one having a good experience with Opus 4.7?” and noted a concrete split: “I still vastly prefer Codex for most things but Opus is absolutely nailing every UI task I give it.” In follow-ups he argued that harnesses like Codex and Claude Code are hitting practical limits and that each major capability leap forces a new interface paradigm. He predicted continuous churn, from GPT-3 forms to chatbots to workflows to agents, with each model generation outgrowing the previous generation’s harness. The thread pairs a hands-on observation, that models diverge in performance by task, with a broader claim: model improvements will repeatedly make old UIs feel inadequate.
“Am I the only one having a good experience with Opus 4.7?”
— x.com
Why it matters
Toolmakers build for what wins. When teams route UI-heavy work to Opus 4.7 and general coding or reasoning to Codex or Claude variants, product choices harden: which model to embed, which affordances to expose, which latency and cost trade-offs to accept. That fragmentation reinforces itself as teams design bespoke editors, playback tools, and prompt surfaces tuned to a model’s quirks and strengths (long context windows, multimodal inputs, or superior table and chart handling) instead of grafting one universal agent onto every problem. The likely outcome is a marketplace of model-aware micro-UIs: for example, one for rapid prototyping with Codex, another for deep-document workflows with Opus, a third for iterative architecture work with other models. The UI becomes a feature of the selected model, not a neutral shell around it.
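To make the routing idea concrete, here is a minimal sketch in Python of a per-category routing table with a generalist fallback. Everything in it is an assumption for illustration: the labels "opus-4.7" and "codex" are placeholders rather than real API model identifiers, and the Task type, the ROUTES table, and the pick_model function are hypothetical names, not anything described in the thread.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration only; not real API model identifiers.
GENERALIST = "codex"

# Assumed routing table: task category -> label of the model that "wins" it.
ROUTES = {
    "ui": "opus-4.7",
}


@dataclass(frozen=True)
class Task:
    category: str  # e.g. "ui", "refactor", "docs"
    prompt: str


def pick_model(task: Task, routes: dict[str, str] = ROUTES,
               default: str = GENERALIST) -> str:
    """Return the model label for a task, falling back to the generalist."""
    return routes.get(task.category.lower(), default)
```

A UI task would be sent to the specialist, while any category missing from the table falls through to the generalist. Under these assumptions, fragmentation shows up as the table gaining entries, and each entry is a place where a model-specific interface could plausibly follow.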
Context
mattshumer_’s thread connects a micro-observation (Opus winning UI tasks) to a macro-pattern: capability spikes force UI shifts, so single-agent fantasies underestimate practical friction and developer incentives.
“I still vastly prefer Codex for most things but Opus is absolutely nailing every UI task I give it.”
— x.com
Counterpoint
The thread itself contains a tempering note: the same user who praises Opus also writes, “I still vastly prefer Codex for most things.” Most teams need a reliable generalist, so fragmentation will be uneven: intense where a model’s advantage is large, muted where a dependable generalist remains the cheapest, lowest-friction choice.
What to watch
Which verticals will reach tipping points first: UI tooling, long-document reasoning, or code generation? How quickly will vendor APIs, latency, and pricing drive composable multi-model UIs? Watch latency and cost curves, and the first open-source models that match a vendor niche.