Why did SLA slip last week?
Start from SLA compliance, drill into carrier performance and failure patterns, propose what to investigate next.
This is the deeper-drilldown counterpart to Daily ops standup, called when a metric has moved and the user wants a root-cause narrative rather than a snapshot.
When to use
- An ops lead notices SLA dropped on a daily report.
- A customer success manager asks "why did our on-time rate fall last week?".
- An executive asks "what happened?" before a meeting.
Tool sequence
report_sla_compliance (the headline — by how much, in which stage?)
│
▼
report_carrier_performance (which carriers contributed?)
│
▼
report_failed_delivery_analysis (which failure types? where?)
│
▼
narrative root-cause story
Example agent prompts
"SLA looks like it slipped last week — what happened?"
"Why did our on-time delivery rate drop?"
Walkthrough
Step 1 — confirm and quantify the slip
report_sla_compliance(
    tenantId="…",
    from="<2 weeks ago>",
    to="<1 week ago>"
)
The first call covers the baseline week. A second call covers the week that slipped:
report_sla_compliance(
    tenantId="…",
    from="<1 week ago>",
    to="<today>"
)
The agent computes the delta and identifies which stage slipped (processing, collection, delivery, customer promise). That narrows the rest of the investigation.
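The delta computation can be sketched roughly as follows. This is a minimal Python illustration, assuming each report_sla_compliance response has already been reduced to a mapping from stage name to on-time rate. That shape and the helper names stage_deltas and worst_stage are hypothetical, not the tool's actual schema.

```python
from typing import Dict, Tuple

# Hypothetical, simplified view of a report_sla_compliance response:
# stage name -> on-time rate (0.0 to 1.0). The real response shape may differ.
Rates = Dict[str, float]


def stage_deltas(baseline: Rates, current: Rates) -> Dict[str, float]:
    """Return current minus baseline for every stage present in both weeks."""
    return {
        stage: round(current[stage] - baseline[stage], 4)
        for stage in baseline
        if stage in current
    }


def worst_stage(deltas: Dict[str, float]) -> Tuple[str, float]:
    """Return the stage with the largest drop (most negative delta)."""
    stage = min(deltas, key=lambda name: deltas[name])
    return stage, deltas[stage]


# Illustrative numbers only, not real data.
baseline_week = {"processing": 0.97, "collection": 0.95, "delivery": 0.91}
current_week = {"processing": 0.96, "collection": 0.95, "delivery": 0.84}

deltas = stage_deltas(baseline_week, current_week)
slipped_stage, slipped_by = worst_stage(deltas)
print(slipped_stage, slipped_by)
```

With these illustrative inputs the delivery stage shows the largest drop, which is the stage the later steps would focus on.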
Step 2 — carrier-level breakdown
report_carrier_performance(
    tenantId="…",
    from="<1 week ago>",
    to="<today>"
)
The agent looks for carriers whose success rate fell more than the overall average. Often the slip is concentrated in one or two carriers; surfacing that is the most actionable insight.
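One way to express that comparison is sketched below. It assumes the carrier report was also pulled for the baseline week (the walkthrough only shows the current-week call) and that each response has been reduced to a mapping from carrier name to success rate. The function name, the data shape, and "CarrierB" / "CarrierC" are hypothetical.

```python
from typing import Dict, List, Tuple


def carriers_behind_overall(
    baseline: Dict[str, float],
    current: Dict[str, float],
    overall_delta: float,
) -> List[Tuple[str, float]]:
    """List carriers whose week-over-week change is worse than the overall
    change, ordered from biggest drop to smallest."""
    flagged = []
    for carrier, rate in current.items():
        if carrier not in baseline:
            continue  # no baseline week to compare against
        delta = round(rate - baseline[carrier], 4)
        if delta < overall_delta:
            flagged.append((carrier, delta))
    return sorted(flagged, key=lambda item: item[1])


# Illustrative numbers only, not real data.
baseline_rates = {"Aramex": 0.89, "CarrierB": 0.93, "CarrierC": 0.92}
current_rates = {"Aramex": 0.78, "CarrierB": 0.92, "CarrierC": 0.93}

flagged = carriers_behind_overall(baseline_rates, current_rates, overall_delta=-0.07)
print(flagged)
```

Comparing each carrier against the overall delta, rather than against zero, keeps the list short when every carrier dipped slightly for a shared reason.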
Step 3 — failure breakdown
report_failed_delivery_analysis(
    tenantId="…",
    from="<1 week ago>",
    to="<today>",
    group_by="carrier",
    failed_delivery_group_by="reason"
)
Pulls the failure shape — by carrier and by reason. Pair this with the carrier-performance data from step 2 to triangulate cause.
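Pairing the two data sets might look like the sketch below: restrict the failure breakdown to the carriers flagged in step 2 and rank each carrier's failure reasons by share. The row shape (carrier, reason, count), the function name top_reasons, and the reason labels other than "customer not available" are assumptions for illustration, not the tool's real output.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def top_reasons(
    rows: List[dict],
    flagged_carriers: Set[str],
    top_n: int = 3,
) -> Dict[str, List[Tuple[str, float]]]:
    """For each flagged carrier, return its top failure reasons with the
    share of that carrier's failures each reason accounts for."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for row in rows:
        if row["carrier"] in flagged_carriers:
            counts[row["carrier"]][row["reason"]] += row["count"]

    result: Dict[str, List[Tuple[str, float]]] = {}
    for carrier, reasons in counts.items():
        total = sum(reasons.values())
        if total == 0:
            continue  # nothing to rank for this carrier
        ranked = sorted(reasons.items(), key=lambda kv: kv[1], reverse=True)
        result[carrier] = [
            (reason, round(count / total, 3)) for reason, count in ranked[:top_n]
        ]
    return result


# Illustrative rows only, not real data.
failure_rows = [
    {"carrier": "Aramex", "reason": "customer not available", "count": 60},
    {"carrier": "Aramex", "reason": "address issue", "count": 25},
    {"carrier": "Aramex", "reason": "damaged parcel", "count": 15},
    {"carrier": "CarrierB", "reason": "address issue", "count": 10},
]

shares = top_reasons(failure_rows, flagged_carriers={"Aramex"})
print(shares)
```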
Step 4 — narrate the root cause
The agent collapses the three responses into a story:
"On-time delivery slipped from 91% to 84% week-over-week — mostly in the delivery stage. Aramex drove most of the gap; their success rate fell from 89% to 78%, and within Aramex failures, 'customer not available' jumped 40% (concentrated in KSA). Suggest checking with Aramex KSA on capacity / first-attempt rates, and reviewing whether our customer-notification SMS is being received in that region."
The shape: lead with the headline delta, attribute it to a slice (carrier / region / reason), end with the recommended investigation.
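That three-part shape can be captured in a small template. The sketch below is only an illustration of the ordering (headline delta, then slice, then recommendation); the function name narrate and its parameters are hypothetical, and a real agent would phrase the story more freely.

```python
def narrate(
    metric: str,
    before: float,
    after: float,
    stage: str,
    slice_summary: str,
    next_step: str,
) -> str:
    """Build a narrative: headline delta, attributed slice, recommendation."""
    headline = (
        f"{metric} slipped from {before:.0%} to {after:.0%} week-over-week, "
        f"mostly in the {stage} stage."
    )
    # The recommendation is phrased as a suggestion, not as a verdict.
    return f"{headline} {slice_summary} Suggest checking {next_step}."


story = narrate(
    metric="On-time delivery",
    before=0.91,
    after=0.84,
    stage="delivery",
    slice_summary="Aramex drove most of the gap.",
    next_step="with Aramex KSA on capacity and first-attempt rates",
)
print(story)
```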
Variations
- Per-merchant: scope every call with merchant=<id> for brand-specific drilldowns.
- Per-region: scope with country_access for region-specific drilldowns. Useful when SLA is regional (e.g. KSA fell, UAE steady).
- Custom window: substitute "month over month" or "quarter over quarter" for the same shape.
- Pre-emptive variant: same three calls, run on a schedule (Slack bot every Monday). Only narrate when the delta exceeds a threshold; stay quiet otherwise.
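For the pre-emptive variant, the "stay quiet" rule could be sketched as a simple gate. The function name should_narrate and the default threshold of 3 percentage points are assumptions chosen for illustration; the right threshold depends on how noisy the metric normally is.

```python
def should_narrate(
    baseline_rate: float,
    current_rate: float,
    threshold: float = 0.03,  # hypothetical default: 3 percentage points
) -> bool:
    """Return True only when the week-over-week drop exceeds the threshold.

    Rates are fractions between 0.0 and 1.0. Improvements and small dips
    return False, so the scheduled job stays quiet.
    """
    drop = baseline_rate - current_rate
    return drop > threshold


# Illustrative numbers only, not real data.
print(should_narrate(0.91, 0.84))  # large drop
print(should_narrate(0.91, 0.90))  # small dip
print(should_narrate(0.91, 0.95))  # improvement
```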
Pitfalls
- Two reports' definitions of "success" can differ. report_sla_compliance measures against promised dates; report_carrier_performance includes carrier-side success signals. The agent should narrate which lens it's using and not cross-reference numbers naively.
- Sample size. A 7-day window for a small merchant might have too few shipments to draw conclusions. The agent should mention sample size when narrating, not just percentages.
- Don't confuse correlation with causation. Failure-reason data is a hypothesis, not a verdict. The agent's narrative should say "suggest checking…", not "the cause is…".
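The sample-size pitfall can be handled mechanically by always reporting counts next to percentages. The sketch below is one possible approach; the function name describe_rate and the minimum sample of 100 shipments are assumptions, not values from the reports.

```python
def describe_rate(on_time: int, total: int, min_sample: int = 100) -> str:
    """Format an on-time rate together with its sample size.

    Appends a caution note when the window holds fewer shipments than
    min_sample (a hypothetical cut-off; tune it per tenant).
    """
    if total == 0:
        return "no shipments in window"
    rate = on_time / total
    note = "" if total >= min_sample else " (small sample, treat with caution)"
    return f"{rate:.0%} on time ({on_time} of {total} shipments){note}"


# Illustrative numbers only, not real data.
print(describe_rate(840, 1000))
print(describe_rate(21, 25))
```

Both example calls work out to 84%, but only the second carries the caution note, which is the distinction the narrative should preserve.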
Related patterns
- Daily ops standup — the lighter cousin; surfaces problems for this pattern to drill into.
- Diagnose a stuck shipment — per-shipment drilldown when this pattern points to specific shipments worth investigating.