BrokenAI aggregates the official Statuspage.io feeds of each provider and computes a composite reliability score per time window. We do not run independent probes; all numbers reflect what each provider reports on its own status page.
Each factor is normalised 0–100 over the selected window, then combined:
score = 0.5 × uptime_factor
+ 0.2 × frequency_factor
+ 0.2 × severity_factor
+ 0.1 × mttr_factor

Linear rescale of the window's severity- and component-weighted uptime %, with the baseline at 90%:
uptime_factor = clamp(0, 100, (uptime% − 90) ÷ (100 − 90) × 100)
uptime% itself = 100 − Σ(overlap_min × impact_ratio × affected_components ÷ total_components) ÷ window_min × 100
where impact_ratio = 0.2 (minor) · 0.6 (major) · 1.0 (critical)

Why the 90% baseline: anything under 90% uptime is functionally broken, so we don't need resolution there. The interesting differences live in the 99.x range.
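The uptime computation above can be sketched as follows. This is a minimal illustration, not BrokenAI's actual implementation; the incident-tuple shape and function names are assumptions, while the constants (impact ratios, 90% baseline) come from the text.

```python
# Impact ratios from the methodology text.
IMPACT_RATIO = {"minor": 0.2, "major": 0.6, "critical": 1.0}

def uptime_percent(incidents, total_components, window_min):
    """incidents: list of (overlap_min, impact, affected_components) tuples.

    Each incident's minutes of overlap with the window are weighted by
    severity and by the fraction of components it affected.
    """
    weighted_down = sum(
        overlap * IMPACT_RATIO[impact] * affected / total_components
        for overlap, impact, affected in incidents
    )
    return 100.0 - weighted_down / window_min * 100.0

def uptime_factor(uptime_pct):
    # Linear rescale with the baseline at 90%, clamped to [0, 100].
    return max(0.0, min(100.0, (uptime_pct - 90.0) / 10.0 * 100.0))

# One 60-minute major incident hitting 1 of 4 components in a 30-day window:
pct = uptime_percent([(60, "major", 1)], total_components=4,
                     window_min=30 * 24 * 60)
# pct ≈ 99.98, uptime_factor(pct) ≈ 99.8
```

Note how the clamp gives the 90% baseline its meaning: any window at or below 90% weighted uptime scores a flat zero on this factor.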
Exponential decay against the count of incidents in the window. Never reaches zero, so two providers with chatty weeks still rank against each other:
frequency_factor = 100 × exp(−incident_count ÷ e_fold)
where e_fold[30d] = 100, e_fold[7d] = 40, e_fold[1h] = 6

Worked examples (30d, e_fold = 100): 20 incidents → 82, 100 incidents → 37, 200 incidents → 14.
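The frequency factor is one line of code; a sketch reproducing the worked examples above (function name is illustrative):

```python
import math

def frequency_factor(incident_count, e_fold):
    # Exponential decay; approaches but never reaches zero, so even
    # two very chatty providers still rank against each other.
    return 100.0 * math.exp(-incident_count / e_fold)

# Worked examples from the text (30d window, e_fold = 100):
[round(frequency_factor(n, 100)) for n in (20, 100, 200)]  # → [82, 37, 14]
```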
Same exponential shape as frequency, but the input is severity-minutes — a duration-and-impact-weighted sum:
severity_minutes = Σ min(duration, 360min cap) × impact_multiplier
where multipliers are minor 0.5, major 2, critical 6
severity_factor = 100 × exp(−severity_minutes ÷ scale)
where scale[30d] = 3000, scale[7d] = 1500, scale[1h] = 400

Two design choices worth flagging: the 360-minute cap stops a single multi-day incident from dominating the whole window, and the multipliers make one critical minute count 12× a minor one (6 vs 0.5).
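A sketch of the severity factor under the same assumptions as above (tuple shape and names are illustrative; cap, multipliers, and scales come from the text):

```python
import math

# Multipliers and cap from the methodology text.
IMPACT_MULTIPLIER = {"minor": 0.5, "major": 2.0, "critical": 6.0}
DURATION_CAP_MIN = 360

def severity_minutes(incidents):
    """incidents: list of (duration_min, impact) tuples.

    Per-incident duration is capped at 360 minutes before weighting,
    so one multi-day incident cannot dominate the window.
    """
    return sum(min(duration, DURATION_CAP_MIN) * IMPACT_MULTIPLIER[impact]
               for duration, impact in incidents)

def severity_factor(minutes, scale):
    # Same exponential shape as the frequency factor.
    return 100.0 * math.exp(-minutes / scale)

# A 2-hour critical incident in a 30-day window (scale = 3000):
sm = severity_minutes([(120, "critical")])   # 120 × 6 = 720 severity-minutes
sf = severity_factor(sm, 3000)               # ≈ 78.7
```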
Linear penalty against the median resolution time for the window. Hits zero at 120 minutes:
mttr_factor = max(0, 100 − median_minutes_to_resolve ÷ 1.2)
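With all four factors defined, the MTTR penalty and the composite score reduce to a few lines. A sketch, with illustrative names; the weights and the 1.2 divisor come from the formulas above:

```python
def mttr_factor(median_minutes):
    # Linear penalty; hits zero at 120 minutes (100 − 120/1.2 = 0).
    return max(0.0, 100.0 - median_minutes / 1.2)

def composite_score(uptime_f, frequency_f, severity_f, mttr_f):
    # Weights sum to 1, so a provider perfect on every factor scores 100.
    return (0.5 * uptime_f + 0.2 * frequency_f
            + 0.2 * severity_f + 0.1 * mttr_f)

mttr_factor(30)    # → 75.0
mttr_factor(150)   # → 0.0
```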
Feeds polled:

- https://status.claude.com/api/v2/summary.json
- https://status.openai.com/api/v2/summary.json

Polled every 60 seconds by our backend; your browser also fetches live status directly every 30 seconds.
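For readers who want to consume the same feeds, here is a minimal sketch of parsing a Statuspage v2 summary payload. The sample JSON is fabricated for illustration; the field names (`status.indicator`, per-incident `impact`) follow the public Statuspage API v2 schema:

```python
import json

# Fabricated sample payload mimicking a Statuspage v2 summary response.
sample = json.dumps({
    "status": {"indicator": "minor", "description": "Partial System Outage"},
    "incidents": [
        {"name": "Elevated error rates", "impact": "major",
         "created_at": "2024-01-01T10:00:00Z", "status": "investigating"}
    ],
})

payload = json.loads(sample)
indicator = payload["status"]["indicator"]             # overall page status
impacts = [i["impact"] for i in payload["incidents"]]  # per-incident severity
```

In a real poller you would fetch the URLs above instead of a literal string; the parsing is the same.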