Smarter AI, dumber work: GPT-5 and the waste loop


Illustration: A small paint chip on a wall at mid-right. Centre and right show oversized road machinery entering a room. A small card near the chip reads: Fix here. A blue accent line sits under the card.

Lede

Smarter is not the same as useful.


What does not make sense

  • Fix the corner, don't repaint the house. Models often regenerate everything and ignore the small edit that was asked for.
  • Instruction soup. Add five constraints and watch the format buckle.
  • Long thinking sold as wisdom, billed as tokens.
  • Energy talk says efficient; the grid says bigger bills.
  • New model, same old hallucinations with nicer lighting.
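
To make the first point concrete, here is a minimal sketch of a corner fix in Python. The function names and the sample text are hypothetical, invented for illustration, and no particular product works this way. The idea is only that a targeted edit touches one span, refuses to guess when the target is ambiguous, and leaves everything else byte-for-byte alone.

```python
import difflib


def surgical_edit(text: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old`; refuse if it is missing or ambiguous."""
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match for {old!r}, found {count}")
    return text.replace(old, new, 1)


def diff_line_count(before: str, after: str) -> int:
    """Count removed plus added lines between two versions of a text."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


# Hypothetical document with one typo to fix.
doc = "Title\nThe wall is bleu.\nThe door is red.\n"
fixed = surgical_edit(doc, "bleu", "blue")

print(fixed)
print("diff lines:", diff_line_count(doc, fixed))
```

A full regeneration would have to be diffed against the original to prove it changed nothing else. The targeted edit makes that check trivial: one line out, one line in.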

Sense check / The numbers

  1. Data centres and AI power. IEA projects electricity for data centres rising from about 415 TWh in 2024 to around 945 TWh by 2030, more than double. AI is the main driver. Date: 10 Apr 2025. [IEA]
  2. Compute keeps exploding. Training compute for frontier models has grown about 4x to 5x per year since 2010. Top runs now reach around 3.5e26 FLOP. Date: 13 Jan 2025. [Epoch AI]
  3. Per-query energy. A typical GPT-4o-style query uses roughly 0.3 Wh, far below older viral claims, but at scale small watt-hours become big bills. Date: 7 Feb 2025. [Epoch AI]
  4. GPT-5 claims. OpenAI says GPT-5 is smarter across coding, maths, writing, health, and vision, and knows when to think longer. Date: 7 Aug 2025. [OpenAI]
  5. Hallucinations are still a class of failure, not a myth. Peer-reviewed work catalogues types and risks across domains. Date: 2024. [Nature]
  6. Instruction following under load. New 2025 benchmarks show accuracy dropping off as the number of simultaneous constraints rises. Dates: 15 Jul 2025, and about three months earlier for related work. [arXiv IFScale, ACL Findings]
  7. Chain of thought costs. Reports in 2025 show diminishing returns from verbose reasoning, while self-consistency methods improve accuracy at the cost of extra compute. Dates: 8 Jun 2025 and mid-2025. [Wharton report, ACL CISC]
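
For anyone who wants to redo the back-of-envelope sums in items 2 and 3, a rough sketch follows. The 0.3 Wh figure and the 4x to 5x growth rate are the ones quoted above. The daily query volume is a made-up round number, an assumption for illustration and not a figure from any of the sources.

```python
WH_PER_QUERY = 0.3              # per-query estimate quoted in item 3
ASSUMED_QUERIES_PER_DAY = 1e9   # hypothetical volume, NOT from the sources


def annual_energy_gwh(queries_per_day: float,
                      wh_per_query: float = WH_PER_QUERY) -> float:
    """Yearly energy in GWh for a given daily query volume."""
    wh_per_year = queries_per_day * wh_per_query * 365
    return wh_per_year / 1e9  # 1 GWh = 1e9 Wh


def compute_multiplier(growth_per_year: float, years: int) -> float:
    """Total growth factor after compounding a yearly rate."""
    return growth_per_year ** years


print(f"{annual_energy_gwh(ASSUMED_QUERIES_PER_DAY):.1f} GWh per year")
print(f"5 years at 4x: {compute_multiplier(4, 5):,.0f}x")
print(f"5 years at 5x: {compute_multiplier(5, 5):,.0f}x")
```

Under that assumed one billion queries a day, 0.3 Wh each comes to roughly 110 GWh a year. Five years of 4x to 5x compounding is a factor of about 1,000 to 3,100. Swap in your own volume; the point is the multiplication, not the made-up input.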

The sketch

Scene 1: Data centre corridor. A small tag on a server door: PATCH. A forklift delivers ten identical servers marked FULL REBUILD.
Scene 2: Office wall with sticky note: tiny bug. A robot arm slams a giant button labelled REGENERATE ALL.
Scene 3: User pointing at a single word on a document. Assistant hands over a stack of 10 versions labelled 1 to 10. None match the marked word.

What to watch, not the show

  • Incentives. Benchmarks reward showy reasoning, not fast, faithful edits.
  • Product defaults. One-shot regen trumps targeted tools.
  • Cost curves. Cheaper tokens invite brute force loops.
  • Governance by buzzword. Efficiency claims outpace metering.
  • Human factors. Multi-constraint asks are common, and models still flinch.

The Hermit take

Count fixes, not tokens.
If it cannot keep a thought for three steps, it is not wise. It is noisy.

Keep or toss

Verdict: Toss.
Keep: small, surgical edit tools and reliable instruction checks.
Toss: the regen fetish for every tiny change.
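
What a reliable instruction check could look like, as a minimal sketch. The constraint names, the check functions, and the sample outputs are all hypothetical. The only claim is that each constraint becomes a small test you can run against the output, so a miss is counted rather than eyeballed.

```python
from typing import Callable, Dict, List

Check = Callable[[str], bool]

# Hypothetical constraints for a short change note; each maps a label to a test.
CHECKS: Dict[str, Check] = {
    "at most 50 words": lambda s: len(s.split()) <= 50,
    "ends with a full stop": lambda s: s.rstrip().endswith("."),
    "mentions 'patch'": lambda s: "patch" in s.lower(),
    "no exclamation marks": lambda s: "!" not in s,
}


def failed_checks(output: str, checks: Dict[str, Check]) -> List[str]:
    """Return the labels of every constraint the output does not satisfy."""
    return [name for name, passes in checks.items() if not passes(output)]


good = "Apply the patch to line 3."
bad = "Regenerated everything!"

print(failed_checks(good, CHECKS))
print(failed_checks(bad, CHECKS))
```

An empty list means every constraint held. Anything else is a named failure you can count, which is the "count fixes, not tokens" idea in practice.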


Sources

IEA energy and AI overview – https://www.iea.org/reports/energy-and-ai/energy-demand-from-ai
IEA press note, AI driving data-centre demand, 10 Apr 2025 – https://www.iea.org/news/ai-is-set-to-drive-surging-electricity-demand-from-data-centres-while-offering-the-potential-to-transform-how-the-energy-sector-works
IEA energy supply for AI – https://www.iea.org/reports/energy-and-ai/energy-supply-for-ai
Epoch AI compute trends, 13 Jan 2025 – https://epoch.ai/trends
Epoch AI model scale list, 30 Jan 2025 – https://epoch.ai/data-insights/models-over-1e25-flop
Epoch AI estimate of per-query energy, 7 Feb 2025 – https://epoch.ai/gradient-updates/how-much-energy-does-chatgpt-use
OpenAI, Introducing GPT-5, 7 Aug 2025 – https://openai.com/index/introducing-gpt-5/
OpenAI, GPT-5 product page – https://openai.com/gpt-5/
OpenAI, o1 preview and learning to reason, 12 Sep 2024 – https://openai.com/index/learning-to-reason-with-llms/
OpenAI, o3 and o4-mini, 16 Apr 2025 – https://openai.com/index/introducing-o3-and-o4-mini/
Nature, AI hallucination classification, 2024 – https://www.nature.com/articles/s41599-024-03811-x
ACL Findings 2025, Confidence-Informed Self-Consistency – https://aclanthology.org/2025.findings-acl.1030.pdf
Wharton GAIL report, Diminishing returns from CoT, 8 Jun 2025 – https://gail.wharton.upenn.edu/research-and-insights/tech-report-chain-of-thought/
arXiv 2507.11538, How Many Instructions Can LLMs Follow at Once, 15 Jul 2025 – https://arxiv.org/html/2507.11538v1
ACL Findings 2025, Structured Flow Benchmark for Multi-turn Instruction – https://aclanthology.org/2025.findings-acl.486.pdf
Carbon Brief explainer, 15 Sep 2025 – https://www.carbonbrief.org/ai-five-charts-that-put-data-centre-energy-use-and-emissions-into-context/


Satire and commentary. Opinion pieces for discussion. Sources at the end. Not legal, medical, financial, or professional advice.

One response

  1. This really nails the paradox of progress — every smarter iteration of AI seems to make the actual *work* more convoluted, not less. The bit about ‘instruction soup’ hit home; constraint-heavy prompts often break the very structure they’re meant to clarify. It’s a good reminder that capability doesn’t automatically translate to usefulness without better design discipline on the human side.


