The week after a P1, the responders are tired, the customer-facing channels are still humming, and somewhere on a wiki page that nobody opens, a "TODO: write postmortem" sits and ages. Two weeks later it still sits there. Six weeks later the next P1 hits and you add a fresh TODO for what turns out to be the same incident, again, because the first postmortem was never written.
The bottleneck is almost never the analysis. The bottleneck is the blank page.
What the human actually does well
A postmortem has six sections that look the same in every framework — Google's, Etsy's, PagerDuty's, ours, yours. The summary, the timeline, the impact, the contributing factors, the root cause, the action items. Of those six, only two genuinely need the responder's judgement: the contributing factors ("we had no alert on the new queue lag, because the dashboard was templated from the old service") and the action items ("add a queue-lag SLI to the new service template; backfill the missing dashboards by 30 May").
The other four are reconstruction work. The timeline is in the incident's audit log. The impact is in the duration, the severity, and the affected services. The summary is the title plus two sentences. The root cause — for the kind of incident where there is a single one — usually appears verbatim in the last few comments before resolution.
Reconstruction is what models are good at. Judgement is what humans are good at. We split the postmortem along that line.
What Argus drafts
When you close an incident — or any time after — clicking "Draft postmortem" runs a pipeline that takes about thirty seconds end-to-end:
- The timeline comes from the audit log. Every status change, severity change, owner change, comment timestamp, related-change-request link. Rendered as a chronological narrative, not a CSV.
- The summary comes from the title, the first comment, and the resolution comment. Three sentences, factual, no marketing voice.
- The impact comes from created_at, resolved_at, and the severity field. Duration in minutes, services affected (from the linked-services list), and, if you have it wired up, the synthetic-monitor downtime overlap.
- The candidate root cause comes from the model reading the last 10 comments, the resolution comment, and any linked change requests. It is offered as a suggestion with a confidence score; it never lands in the final doc unless the human accepts it.
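To make the reconstruction steps concrete, here is a minimal Python sketch of the two most mechanical ones: rendering the timeline from audit-log events and deriving the impact line from the timestamps. It is an illustration, not Argus's actual pipeline. The `AuditEvent` shape and the function names are assumptions; only `created_at`, `resolved_at`, the severity field, and the linked-services list come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class AuditEvent:
    # Hypothetical shape of one audit-log entry; field names are illustrative.
    at: datetime
    actor: str
    detail: str  # e.g. "changed severity to P1", "linked change request"


def duration_minutes(created_at: datetime, resolved_at: datetime) -> int:
    """Incident duration in whole minutes, from the two timestamps."""
    if resolved_at < created_at:
        raise ValueError("resolved_at precedes created_at")
    return int((resolved_at - created_at).total_seconds() // 60)


def render_timeline(events: List[AuditEvent]) -> str:
    """Sort events chronologically and render one narrative line per event."""
    ordered = sorted(events, key=lambda e: e.at)
    return "\n".join(f"{e.at:%H:%M} {e.actor} {e.detail}" for e in ordered)


def impact_line(created_at: datetime, resolved_at: datetime,
                severity: str, services: List[str]) -> str:
    """One factual sentence: severity, duration, and affected services."""
    minutes = duration_minutes(created_at, resolved_at)
    affected = ", ".join(services) if services else "no linked services"
    return f"{severity} incident lasting {minutes} minutes; affected: {affected}."
```

Nothing here needs a model: the timeline and impact are deterministic functions of data the incident already holds, which is why they can be drafted without the responder.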
The responder lands on a draft that has the dull parts filled in and the two judgement sections marked as "needs your input." Average time from "click draft" to "ready to share" is around twenty minutes — most of which is the responder writing the contributing factors and action items, which is exactly the work that should not be automated.
Where the model is not in the loop
Three places, deliberately:
- The action items. The model does not invent fix-it tasks. It can recap the comments that mention follow-ups, but every action item is the human's call.
- The severity classification on the postmortem itself. The postmortem inherits the incident's final severity. The model does not get to relitigate it.
- The sharing. Drafts are never auto-sent. The responder reviews and clicks publish, every time.
The reason for each of these is the same. A model that invents an action item creates a follow-up task that nobody owns. A model that bumps a severity post-hoc undermines the human who ran the incident. A model that auto-shares ends up sending a half-finished doc to the customer because somebody closed the wrong tab. The Argus rule that AI is augmentation, not automation, and that humans stay in the decision loop, applies hardest here.
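One way to picture these guardrails is as checks in the data model rather than as conventions. The sketch below is a hypothetical illustration, not Argus's implementation: the class names, fields, and the 0.0 to 1.0 confidence scale are all assumptions. It shows a root-cause suggestion that only reaches the document through an explicit human acceptance, and a publish step that refuses to run without a named reviewer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RootCauseSuggestion:
    text: str
    confidence: float                  # assumed scale: 0.0 to 1.0
    accepted_by: Optional[str] = None  # set only by an explicit human action


@dataclass
class Postmortem:
    incident_severity: str             # inherited from the incident, never re-derived
    root_cause: str = ""
    action_items: List[str] = field(default_factory=list)  # human-written only
    reviewed_by: Optional[str] = None
    published: bool = False


def accept_root_cause(pm: Postmortem, suggestion: RootCauseSuggestion,
                      reviewer: str) -> None:
    """Copy the suggestion into the doc, recording who accepted it."""
    suggestion.accepted_by = reviewer
    pm.root_cause = suggestion.text


def publish(pm: Postmortem) -> None:
    """Publishing is a human action: refuse if nobody has reviewed the draft."""
    if pm.reviewed_by is None:
        raise PermissionError("a human must review the draft before publishing")
    pm.published = True
```

Note what is absent: there is no function that lets the model append to `action_items` or change `incident_severity`. In this sketch the model's only write path into the document is a suggestion that a person chooses to accept.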
What you see in the UI
The postmortem editor is a normal markdown editor with the six sections already laid out. Each AI-drafted section has a small Drafted by AI chip next to the heading; clicking it opens a sidebar with the source data the draft came from: the audit-log slice, the comment thread, the linked change requests. If you do not trust the draft, you can rebuild from a different starting point. If you do trust it, you keep typing.
The "needs your input" sections are not pre-filled. They have a placeholder that says what to write and an example, but the body is empty until the human writes it. The model knows when to stop talking.
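The split between drafted and human-only sections can be sketched as a small data structure. This is an assumed model for illustration, not the editor's real schema: `Section`, `build_skeleton`, and the placeholder wording are hypothetical, and the root-cause acceptance step is left out for brevity. The point it shows is that a judgement section stays empty even if a draft for it is supplied.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

SECTION_ORDER = ["Summary", "Timeline", "Impact",
                 "Contributing factors", "Root cause", "Action items"]
JUDGEMENT_SECTIONS = {"Contributing factors", "Action items"}


@dataclass
class Section:
    heading: str
    body: str = ""
    drafted_by_ai: bool = False      # drives the "Drafted by AI" chip
    sources: List[str] = field(default_factory=list)  # shown in the sidebar
    placeholder: str = ""            # hint text only, never document content


def build_skeleton(drafts: Dict[str, Tuple[str, List[str]]]) -> List[Section]:
    """Build the six sections; judgement sections are never pre-filled."""
    sections = []
    for heading in SECTION_ORDER:
        if heading in JUDGEMENT_SECTIONS:
            # Any draft supplied for a judgement section is ignored.
            hint = f"Needs your input: write the {heading.lower()} here."
            sections.append(Section(heading, placeholder=hint))
        elif heading in drafts:
            body, sources = drafts[heading]
            sections.append(Section(heading, body=body,
                                    drafted_by_ai=True, sources=sources))
        else:
            sections.append(Section(heading))
    return sections
```

Keeping `placeholder` separate from `body` means the hint can never be mistaken for content when the document is shared.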
The numbers from our own use
We dogfood Argus on Argus. Before the postmortem drafter shipped, our internal postmortems averaged four days from incident close to published doc; one in three was never written. With the drafter, the average is under an hour, and the "never written" rate has gone to zero — because the cost of starting one is no longer two hours of timeline reconstruction.
If your team is in the same place — postmortems written when somebody chases them, never written when nobody does — the fix is rarely "more discipline." It is removing the part that is genuinely hard to start. The judgement sections are still yours. The blank page does not have to be.