I would take their displayed uptime with a huge grain of salt. The other day Claude Code and claude.ai web were completely unavailable for me for at least two hours (Claude Code got into a logged-out state and couldn’t even log back in), and they showed hours of “elevated errors”, yet not a single minute of downtime was recorded. And then there was yet another outage, finally with recorded downtime, a few hours later…
Honestly, my impression was that the “nines” of reliability just means how many nines your availability figure starts with, written as a decimal. I never thought much about it, though.
I will also say it’s amusing that the debate is between one and two nines. Neither is objectively great: even two nines (99%) allows about 3.65 days of downtime a year, and that wouldn’t be something you’d brag about in an interview.
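For context, the downtime budget implied by “N nines” is easy to work out. A quick sketch (assuming a 365-day year and treating “N nines” as an availability of 1 − 10⁻ᴺ):

```python
# Yearly downtime budget implied by "N nines" of availability.
# Assumes a 365-day year; "N nines" means availability = 1 - 10**-N.
def yearly_downtime_days(nines: int) -> float:
    unavailability = 10 ** -nines
    return 365 * unavailability

for n in range(1, 5):
    print(f"{n} nine(s): {yearly_downtime_days(n):.4f} days/year")
# 1 nine  -> 36.5 days/year
# 2 nines -> 3.65 days/year
# 3 nines -> ~8.76 hours/year
# 4 nines -> ~53 minutes/year
```

So one nine of uptime tolerates over a month of downtime a year, and two nines still tolerates the better part of a work week.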
Anthropic also fires off the alarm bells at seemingly any sign of an issue. I've personally only noticed an outage once, and the status page wasn't even showing it as down at the time. It eventually updated about 45 minutes later; I was back up and running another 15 minutes after that, but the "outage" stayed on the status page for another hour or so.
It's probably good to send alerts early, but they might be going a bit too early.
Anthropic is a great case study in why uptime doesn’t matter. The service is so valuable that you can have one-nine uptime and still add $9bil ARR in 3 months.
https://status.claude.com/
If AI could effectively replace people, you wouldn’t need CEOs to keep trying to convince people.