I like good tools as much as anyone, but the last couple of weeks around Anthropic’s Claude 4 family have been a reminder that you can’t build your working life on shifting sand. Models change, limits move, and entire features wobble without much notice. Useful? Absolutely. Dependable enough to be your only plan? Not even close.
What changed
If you’ve been anywhere near Claude lately you’ve probably felt the turbulence. Some days are fine; other days you’re staring at elevated errors, partial outages, or features that feel half-broken.
Claude Code in particular has been hot and cold: one session will cruise through a tricky refactor, and the next will cough, forget context, or hit a wall with token and usage limits. That volatility isn’t new in AI land, but the frequency and breadth of issues recently have been hard to ignore.
The status page tells a story
Anthropic’s own status history over the back half of August is a scroll of “elevated errors” across the Claude 4 line (Sonnet, Opus, and 4.1), plus repeated web app hiccups like trouble starting chats and connector regressions. There were also notes specifically calling out effects on Claude Code when Opus 4 blipped.
If you’re trying to get work done, that pattern matters: it’s not one bad afternoon, it’s an extended period of degraded reliability with multiple blast radii.
The Reddit barometer
Outside the official feed, the community signal is noisy but useful. Subreddits like r/ClaudeAI and r/Anthropic have been packed with posts about performance swings, memory regressions, artifacts misbehaving, and people cancelling subscriptions after running face-first into usage caps or broken flows.
You also see plenty of mixed experiences: one developer says it’s the best day ever, another says it forgot instructions from two messages ago. That split is exactly the problem. If your livelihood is tied to a specific tool, variability isn’t just annoying; it’s a risk.
People have been complaining about performance regressions and other issues since the early days of Claude, going back at least to Claude 3.5. But the recent dramatic uptick in complaints suggests something has changed: users are hitting walls more often, and once-reliable workflows have become unpredictable.
Anthropic’s growing pains
Claude Code landed with a lot of promise: a REPL for “vibe coding,” deep diffs, and a workflow that can actually move real projects forward. It can be fantastic. It can also burn time and money when it drifts, hits undefined limits, or stalls mid-change and starts again from scratch.
The last thing you want from an automated assistant is to babysit it while it rewrites files you didn’t ask it to touch, or blocks you for hours because some metering threshold moved. That’s been a recurring theme in the complaints lately.
Anthropic introduced new usage limits on August 28, 2025, largely because people were abusing Claude Code’s generous allowances. That raises the question of whether these limits are to blame for the noticeable brokenness lately.
On the surface, I get why Anthropic did this. People were boasting about some truly insane Claude Code usage, and it definitely was abuse. But it’s worth asking why Anthropic didn’t simply throttle usage instead of imposing hard limits.
The new limits seem to hint at constrained compute on Anthropic’s side, with the abuse serving as a convenient excuse to rein in usage without having to admit to any underlying issues.
You could argue Anthropic has been a victim of its own success: the rapid adoption of Claude Code created a perfect storm of demand that outstripped its ability to deliver consistent performance, and the original generous limits turned out to be too good to be true.
The transparency gap
I don’t expect a model vendor to publish every infrastructure detail, but quiet limit changes and vague incident notes make planning difficult. If caps tighten or behaviour shifts, users need clear guidance about what changed and what to expect.
When you’re on a deadline, “try again later” isn’t a plan. You need to know whether to switch models, change providers, or cut bait and code it by hand.
You end up in a situation where performance is noticeably worse, the status page says everything is fine, and discontent grows on Reddit and X as people share their frustrations and experiences.
Then Anthropic might admit to degraded performance or other issues, claim they’ve been fixed (the pattern we’ve seen recently), and yet nothing seems to have changed. It’s a vicious cycle of being at the whim of a company that can change its models, its infrastructure, and its pricing at any time without warning.
Back to the basics: why this is a developer story
None of this is an “anti‑AI” rant. I use these tools, and on a good day they’re ridiculously helpful. But the last few weeks have reinforced some boring, durable habits I’m glad I never dropped:
- Keep a mental model of your stack. If the assistant goes sideways, you still need to reason about the system.
- Treat AI like a junior engineer on trial. Give it crisp tasks, review diffs, and don’t merge blind. When things go off the rails, step in with enough understanding to fix it yourself.
- Write tests first when you’re about to let a tool touch lots of files. Green tests give you a panic button.
- Lean on version control like your career depends on it. Small commits; revert ruthlessly.
- Maintain fallbacks. Have at least two other models/providers set up and ready to go, and a path to do the task manually (a rough sketch of what that can look like follows this list).
- Budget time and money. When usage caps or token burn spike, you need a ceiling that forces you to stop and rethink.
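To make the fallback and budget points concrete, here’s a minimal sketch of what that wiring can look like. Nothing in it assumes a particular vendor SDK: the provider callables, the cost estimator, and the $5 ceiling are all placeholders you’d replace with your own.

```python
# Minimal sketch of a fallback chain with a hard spend ceiling.
# The (name, callable) pairs in `providers` and the `estimate_cost`
# heuristic are assumptions you wire up yourself; no real SDK is used here.

class BudgetExceeded(RuntimeError):
    """Raised when the session's cost ceiling is hit: time to stop and rethink."""


def run_with_fallback(prompt, providers, estimate_cost, ceiling_usd=5.00):
    """Try each provider in order; refuse any call that would bust the ceiling."""
    spent = 0.0
    last_error = None
    for name, call in providers:          # e.g. [("claude", ask_claude), ("gpt", ask_gpt)]
        cost = estimate_cost(name, prompt)
        if spent + cost > ceiling_usd:
            raise BudgetExceeded(f"{name} would push spend past ${ceiling_usd:.2f}")
        try:
            result = call(prompt)
            spent += cost
            return result
        except Exception as err:          # elevated errors, timeouts, rate limits...
            last_error = err
            spent += cost                 # failed calls still burn tokens
    raise RuntimeError("every provider failed; do it by hand") from last_error
```

The code itself is trivial; the point is the shape. The moment the ceiling trips or the last provider fails, you’re forced to stop and rethink instead of feeding more money into a flaky session.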
Some rules for the vibecoding era
- Minimise single‑vendor lock‑in. Where possible, use adapters, not vendor SDKs, and isolate model specifics behind a thin interface (see the sketch after this list).
- Keep context small and explicit. Long, dreamy “figure it out” prompts are where tools drift the most. Short briefs with files and acceptance criteria work better.
- Start new chats often. Models like Claude suffer from context bleed, where information from previous interactions leaks into new ones and confuses the model, as well as context rot, where the model forgets important details over time.
- Prefer predictable workflows. For big refactors, I still reach for a human‑first plan: write the TODO, sketch the diff, then let the assistant fill in the mechanical pieces.
- Don’t make the models do everything from scratch. Start with boilerplate, build the initial structure, create a README with clear instructions, and let the model fill in the gaps.
- Build “degrade gracefully” into your own process. If an assistant stalls, I switch to writing the next unit test, documenting assumptions, or refactoring a small module: useful work I control.
- Track incidents like a dependency. If the status page lights up for days at a time, I’ll pause tool‑heavy work and move to tasks that aren’t blocked by model reliability.
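To illustrate the adapter point from the first rule above, here’s what that thin interface can look like. The class and method names are hypothetical; the real vendor client calls would live inside the concrete adapters, and nothing else in the codebase would know they exist.

```python
from typing import Protocol


class CompletionProvider(Protocol):
    """The only model-facing surface the rest of the codebase gets to see."""

    def complete(self, prompt: str, max_tokens: int = 1024) -> str: ...


class ClaudeAdapter:
    """Hypothetical wrapper; the actual Anthropic client call would go here."""

    def complete(self, prompt: str, max_tokens: int = 1024) -> str:
        raise NotImplementedError("wire the Anthropic SDK in behind this method")


class GPTAdapter:
    """Hypothetical wrapper; the actual OpenAI client call would go here."""

    def complete(self, prompt: str, max_tokens: int = 1024) -> str:
        raise NotImplementedError("wire the OpenAI SDK in behind this method")


def summarise_diff(provider: CompletionProvider, diff: str) -> str:
    # Application code depends on the Protocol, not on a vendor, so switching
    # models is a change at construction time, not a codebase-wide rewrite.
    return provider.complete(f"Summarise this diff for a reviewer:\n{diff}")
```

It’s boring indirection, but it’s exactly what turns “maintain fallbacks” from a week of rework into an afternoon of rewiring.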
The uncomfortable truth
As developers we’ve always been at the mercy of platforms: cloud outages, breaking API changes, surprise deprecations. AI turns that dial up because the product is probabilistic and the vendor can (and does) swap out behaviour underneath you.
Even if the average trend line goes up and to the right, the day‑to‑day variance can wreck a sprint. Betting everything on one tool or one model family makes you fragile.
So, yes, lean on AI to go faster. Use it to write tedious code, scaffold tests, and think through hairy problems. But keep your core skills sharp: reading code, debugging, designing small steps, and knowing when to throw work away. Those are the things that keep you shipping when the model du jour is in a mood.
As for me, I’ve actually found Codex CLI by OpenAI to be a much more reliable and better option. Claude Code pioneered the whole CLI coding tool category, but Codex CLI has taken it to the next level with improved context handling and more robust performance with GPT-5.
Anthropic had an edge: people were willing to tolerate degraded performance because no worthy competitor offered anything comparable. But the improved capabilities of Codex CLI have made it the superior choice for my needs, and I think Anthropic needs to step up its game to stay competitive, especially on price and reliability.