Treating Claude, GPT, and Gemini as interchangeable will eventually cost you. They have actual personalities now — strengths and weaknesses that matter for the kind of work you give them.
This is a working field guide, not a benchmark sheet.
Claude
Best when the task involves reading carefully and writing carefully. Long-context reasoning, code review, anything where tone and nuance matter. Claude tends to follow instructions about format and constraint more reliably than the others — useful when the output shape is part of the spec. Steering toward concision still takes a sentence.
GPT
Best when the task is fuzzy and you want it to commit to a direction. GPT is faster to "have an opinion" — useful for brainstorming, plan-of-attack, writing a first draft when there's nothing to react to yet. Multimodal handling (images, screenshots) is currently the strongest of the three.
Gemini
Best when context is enormous or the task involves reasoning over a tree of related artefacts. The 1M-context window changes what's possible — feeding it an entire codebase and asking real architectural questions is a use case the others can't quite match. Less reliable on instruction-following nuance than Claude.
How to actually decide
Pick the model whose default failure mode you can live with for this task. Claude over-explains; GPT confidently bullshits; Gemini occasionally loses the thread mid-task. None of them is "best" — they're all best at something different.