Ben Bausili

August 11, 2025

GPT-5 Is Here. So Why Am I Sticking With Claude and Gemini?

The AI world was anticipating a lot when GPT-5 launched last week. Probably too much. After the seismic shift we all felt when GPT-4 was originally released, many expected another giant leap forward for artificial intelligence. What we got, however, was something far more incremental, and the rollout itself left many users frustrated.

This wasn’t the revolution we were hoping for, but an evolution that builds upon the foundations of GPT-3, GPT-4, and the more recent 4o models. So, after a weekend of testing, what’s the verdict?

A Bumpy Landing

Putting aside the model’s actual capabilities for a moment, the initial rollout was polarizing. When GPT-5 first appeared in the ChatGPT app, it was the only model available. The option to switch back to GPT-4o and other predecessors vanished.

The backlash was swift, and OpenAI quickly reversed course, making GPT-4o available again. The reasons for the community’s reaction were varied but valid. Many businesses and developers have workflows that are highly sensitive to the specific model being used, with prompts and processes fine-tuned over months. For others, it was a matter of simple preference; they had grown attached to the “personality” and reliability of their favorite ChatGPT version. It was a clear lesson that in the world of generative AI, consistency and choice are paramount.

What GPT-5 Actually Gets Right

So, putting the hype and the chaotic launch aside, what did OpenAI actually accomplish?

In my testing, GPT-5 is undeniably a better model in several key areas that businesses, in particular, will appreciate:

  • It’s a much better coding assistant
  • It’s significantly more capable at using tools
  • It follows complex directions with greater precision
  • It has a noticeably lower hallucination rate

These are not trivial improvements. They make the model more reliable, more predictable, and more useful for day-to-day professional tasks. This is, without a doubt, a good thing.

The King Is Not Dethroned

However, these improvements don’t necessarily make GPT-5 the top model across the board. In my experience, the competitive landscape is more nuanced than ever:

  • For single-prompt power, I still find Gemini 2.5 Pro to be the overall powerhouse. Especially when tackling a complex problem that requires deep thinking or planning, Gemini consistently delivers the most insightful and comprehensive responses.
  • For coding, Anthropic’s Claude models remain my top choice. The experience of working with Claude Opus and Sonnet, especially within the dedicated Claude Code environment, feels like a true partnership.

Over the weekend, I ran some unscientific tests, switching between GPT-5 in its coding environment, Claude Opus and Sonnet in Claude Code, and Gemini via the web app. My takeaway was clear: Claude Code is still the best all-around agent for accomplishing standard web development tasks with straightforward prompting.

GPT-5 offers a novel user experience—the ability to get code suggestions from my phone is interesting—but it doesn’t feel as tightly integrated as the developer-focused experience Claude provides. And while I haven’t had as much success using Gemini as a multi-file coding agent, it remains the AI I turn to most when planning a new project or when I’m truly stuck on a difficult problem. Leveraging Google Search, Gemini often generates the single best webpage or Python script to break through a roadblock.

So after a weekend with GPT-5, my workflow remains fundamentally unchanged. I’ll be sticking with Claude Code for the bulk of my development, with Gemini acting as my brilliant occasional collaborator.

The Real News You Might Have Missed

Ironically, the biggest AI news of the week might not have been GPT-5 at all. It was likely the announcement of Genie 3 from Google DeepMind.

While it’s not a tool any of us will be using day-to-day just yet, Genie 3 is a generative world model that users can interact with, and it remembers the changes made to its environment. This is a monumental step toward a future where AI models can self-play, explore, and learn from their own experiences in a persistent world. It is the kind of foundational research that enables the giant leaps we were all expecting from GPT-5.

And if I were to guess, I suspect Google DeepMind has a few more big announcements waiting in the wings. Last week was interesting, but the most exciting developments may still be just around the corner.

Looking Forward

While GPT-5 represents solid incremental progress, it’s clear that the AI landscape is more competitive and nuanced than ever. Each platform has found its strengths: OpenAI’s reliability, Anthropic’s coding partnership, and Google’s breakthrough thinking.

The real winner is us. We can now choose the right tool for each specific task (or simply when we get stuck). The future isn’t about one AI to rule them all; it’s about having the right AI for the job at hand and the right scaffolding around it.