How to write an outage email customers will respect
The outage email you send to customers will be read more carefully than anything else your company writes that quarter. Customers are deciding, in the time it takes them to read three paragraphs, whether to trust you. The cost of getting it wrong is churn; the cost of getting it right is a 15-minute writing exercise. This post is a working template for outage communications that customers will respect instead of resent.
When (and whether) to send one
Not every outage warrants email. Three thresholds we've seen work:
- Status page only: any customer-affecting issue, even a small one. The status page is where customers go to check.
- In-app banner + status page: when most active users are affected, even if briefly. A banner in the dashboard catches the people who don't check status proactively.
- Email + status + in-app: when impact is severe (full outage > 30 min, data loss risk, security incident, or anything that affects billing). The email is for the people who aren't using the product right now and would otherwise be surprised tomorrow.
Don't email customers for every blip. Doing so erodes the signal: if every minor status-page notification also turns into an email, customers will start routing your emails to spam.
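The thresholds above can be sketched as a small decision function, which is handy if you want the escalation rule written into a runbook rather than argued about mid-incident. This is a minimal sketch, not a definitive policy: the `Incident` fields and the `channels_for` name are hypothetical, and the 30-minute cutoff is simply the figure from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    STATUS_PAGE = "status page"
    IN_APP_BANNER = "in-app banner"
    EMAIL = "email"


@dataclass
class Incident:
    # Hypothetical fields; adapt them to whatever your incident tooling records.
    customer_affecting: bool
    most_active_users_affected: bool = False
    full_outage_minutes: int = 0  # 0 means there was no full outage
    data_loss_risk: bool = False
    security_incident: bool = False
    affects_billing: bool = False


def channels_for(incident: Incident) -> set[Channel]:
    """Pick communication channels using the three thresholds above."""
    if not incident.customer_affecting:
        return set()

    # Any customer-affecting issue goes on the status page.
    channels = {Channel.STATUS_PAGE}

    severe = (
        incident.full_outage_minutes > 30
        or incident.data_loss_risk
        or incident.security_incident
        or incident.affects_billing
    )

    # The banner is added for broad impact and for anything severe.
    if incident.most_active_users_affected or severe:
        channels.add(Channel.IN_APP_BANNER)

    # Email is reserved for severe impact only.
    if severe:
        channels.add(Channel.EMAIL)

    return channels
```

Under these assumptions, a brief, widespread blip gets the status page and a banner but no email, which is the point of the rule: email stays a rare, high-signal channel.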
The four parts of a credible outage email
1. The honest subject line
Use the word "outage" or "incident" if it was one. Don't bury it in "Important service update." Examples that work:
- "Service outage on May 12, 2026 — incident report"
- "[Resolved] Login outage on May 12 affecting EU customers"
- "Postmortem: 47-minute checkout outage on May 12"
Subject lines that don't work: "Update from the engineering team," "A note about yesterday," or anything that requires opening the email to figure out it's about an incident.
2. What happened, in two sentences
Lead with the facts: when, what, who was affected. Don't explain root cause in the first paragraph — that comes later.
Between 14:32 and 15:19 UTC on May 12, 2026, our checkout API returned 5xx errors for approximately 30% of requests. EU and US customers were affected; APAC was unaffected.
3. What we did, in two more sentences
Past tense. Active voice. What you actually did.
We identified the root cause as a database connection pool exhaustion triggered by a slow query in our 14:20 deploy. We rolled back the deploy at 15:12; service returned to normal within 7 minutes.
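Before sending, it's worth checking that the times quoted in the email agree with each other, because a summary whose numbers don't add up undermines everything else in it. Here is a minimal sketch using the example times above; the helper names `utc` and `minutes_between` are illustrative, not part of any incident tooling.

```python
from datetime import datetime, timezone


def utc(hhmm: str) -> datetime:
    """Parse an HH:MM time on the example incident date (May 12, 2026) as UTC."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2026, 5, 12, hour, minute, tzinfo=timezone.utc)


def minutes_between(start: datetime, end: datetime) -> int:
    """Whole minutes elapsed between two timestamps."""
    return int((end - start).total_seconds() // 60)


errors_began = utc("14:32")
rolled_back = utc("15:12")
errors_ended = utc("15:19")

# Total customer-visible impact, the number that belongs in the subject line.
outage_minutes = minutes_between(errors_began, errors_ended)

# Time from rollback to recovery, the "within 7 minutes" claim.
recovery_minutes = minutes_between(rolled_back, errors_ended)

print(f"Outage: {outage_minutes} min, recovery after rollback: {recovery_minutes} min")
```

With the example times, this gives a 47-minute outage and a 7-minute recovery, matching the "47-minute checkout outage" subject line earlier in the post.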
4. What we're changing, in one paragraph
The most important part. The reader is asking: "will this happen again?" Your answer needs to be specific. Vague promises ("we're investing in reliability") sound like marketing copy and make things worse.
To prevent recurrence: (1) we've added a slow-query linter to our deploy pipeline that blocks queries matching this pattern; (2) we've lowered the alert threshold on DB connection pool usage so we'll be paged about 90 seconds earlier next time; (3) we'll publish a full postmortem on our status page by May 19.
Sending an update before you know root cause
Often you need to communicate during an active incident, before the root cause is known. The temptation is to wait until you can explain. Don't wait. Send a stub that says what you do know:
Subject: Investigating: checkout API errors (May 12)

We're currently investigating an issue affecting checkout API requests. Approximately 30% of attempts are returning errors as of 14:32 UTC. Our engineering team is actively working on a fix.

Live updates: https://status.example.com

We'll send a follow-up email within 4 hours with a full post-incident summary, regardless of resolution time.
This gives you two things. First, customers don't have to guess whether you know about the problem. Second, by committing to a follow-up time, you defuse the "why aren't they responding" pressure.
Phrases to avoid (and what to write instead)
- "We sincerely apologize for any inconvenience this may have caused." → Cut the boilerplate. Apologize once, briefly, at the end. The body should be facts.
- "Some customers experienced issues." → Say how many, where, and doing what. A vague statement reads worse than an honest number.
- "Due to an issue with a third-party provider..." → Naming the provider is fine if true and relevant. But don't use third-party blame as your only explanation. Your customer chose to depend on you, not on AWS or Cloudflare.
- "Our team worked tirelessly..." → Cut. Nobody is moved by this. Show, don't tell — the timeline does the work.
- "We take reliability very seriously." → Cut. It is self-evident at best, and it replaces information with reassurance.
- "A series of unfortunate events..." → Real engineering language only. Florid phrasing reads as evasion.
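A list like this is easy to forget under stress, so one option is to run drafts through a small phrase check before a human reviews them. The sketch below is an assumption-heavy illustration: the `BOILERPLATE_PATTERNS` table and the `lint_draft` function are hypothetical, the patterns are seeded only from the phrases named in this post, and the third-party-blame case is left out because it needs human judgment.

```python
import re

# Hypothetical pattern table: regex -> advice, based on the bullets above.
BOILERPLATE_PATTERNS = {
    r"any inconvenience": "Cut the boilerplate; apologize once, briefly, at the end.",
    r"some customers experienced": "Say how many, where, and doing what.",
    r"worked tirelessly": "Cut; let the timeline show the work.",
    r"take reliability (very )?seriously": "Cut; it replaces information with reassurance.",
    r"series of unfortunate events": "Use plain engineering language.",
    r"sincerely regret": "Cut; you wouldn't say it to someone's face.",
    r"at this time": "Cut; you wouldn't say it to someone's face.",
}


def lint_draft(draft: str) -> list[tuple[str, str]]:
    """Return (matched phrase, advice) pairs for boilerplate found in a draft."""
    findings = []
    for pattern, advice in BOILERPLATE_PATTERNS.items():
        match = re.search(pattern, draft, flags=re.IGNORECASE)
        if match:
            findings.append((match.group(0), advice))
    return findings


draft = "Our team worked tirelessly. We take reliability very seriously."
for phrase, advice in lint_draft(draft):
    print(f'"{phrase}": {advice}')
```

A check like this only catches known phrasings. It is a prompt for the writer, not a substitute for reading the email out loud.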
Tone: technical, not corporate
The audience for an outage email is mostly the technical buyer: the developer, the IT manager, the founder. They've seen a thousand outages and they can smell PR language. Write the way you'd talk to a peer at another company — specific, plain-spoken, no fluff.
A test: read the email out loud. Would you say these sentences to someone's face? Phrases like "we sincerely regret" or "at this time" rarely survive being spoken aloud; if they appear in your draft, cut them.
Who signs the email
For major incidents (severe impact, security implications, anything board-level), the signature should be senior leadership — CEO, CTO, founder. Not "The Engineering Team." Personal accountability signals seriousness. The exception: a SaaS with a strong head of engineering or director of reliability — naming that person is fine and makes the email feel more direct.
For minor incidents, "The [Product] team" is okay. Don't over-escalate routine ones.
A complete template
Subject: [Resolved] [Service] outage on [Date] ([Duration])

On [Date], between [start time UTC] and [end time UTC], [Service] experienced [type of failure]. This affected approximately [%/region/segment] of users; the remaining users were unaffected.

What happened: [1-2 sentences. Technical but accessible. What broke and why.]

What we did: [1-2 sentences. The actions taken to restore service. Include rollback, mitigation, restart, etc.]

What we're changing to prevent recurrence:
1. [Specific change. Owner. Timeline.]
2. [Specific change. Owner. Timeline.]
3. [Specific change. Owner. Timeline.]

We'll publish a full postmortem on our status page by [date, typically within 1 week]. If you'd like to discuss the impact on your account, reply to this email and we'll route you to the right person.

We're sorry for the disruption.

[Name], [Title]
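If the template lives in a runbook, one way to avoid sending an email with a leftover bracketed placeholder is to render it from named fields and fail loudly when one is missing. The following is a minimal sketch using Python's standard `string.Template`; the abbreviated template text, the `$`-style field names, and the `render_outage_email` function are all illustrative stand-ins for the full template above.

```python
from string import Template

# Abbreviated version of the template; each $name stands in for a [bracketed] field.
OUTAGE_EMAIL = Template(
    "Subject: [Resolved] $service outage on $date ($duration)\n"
    "\n"
    "On $date, between $start_utc and $end_utc UTC, $service experienced $failure.\n"
    "This affected approximately $affected.\n"
    "\n"
    "What happened: $what_happened\n"
    "\n"
    "What we did: $what_we_did\n"
    "\n"
    "We're sorry for the disruption.\n"
    "\n"
    "$signer_name, $signer_title\n"
)


def render_outage_email(**fields: str) -> str:
    """Fill in the template; raises KeyError if any field is missing."""
    # substitute() is deliberately strict; safe_substitute() would silently
    # leave unfilled $placeholders in the outgoing email.
    return OUTAGE_EMAIL.substitute(**fields)


example_fields = {
    "service": "Checkout API",
    "date": "May 12, 2026",
    "duration": "47 minutes",
    "start_utc": "14:32",
    "end_utc": "15:19",
    "failure": "elevated 5xx errors",
    "affected": "30% of requests from EU and US customers",
    "what_happened": "A slow query in our 14:20 deploy exhausted the database connection pool.",
    "what_we_did": "We rolled back the deploy at 15:12; service recovered within 7 minutes.",
    "signer_name": "[Name]",
    "signer_title": "[Title]",
}

print(render_outage_email(**example_fields))
```

The strict `substitute()` call is the design choice that matters here: under stress, a hard error on a missing field is preferable to an email that ships with a hole in it.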
Where to go from here
Draft a template like this once. Save it in your incident management runbook. The next time you need it, you're editing a known-good document under stress instead of starting from a blank page. The 30 minutes you save when you actually need it are worth the hour to write the template now. For the underlying postmortem, see our incident postmortem template.