Retry and idempotency
Webhook delivery isn't fire-and-forget. When your endpoint fails to acknowledge (non-2xx response, timeout, network error), Carriyo retries on a fixed schedule of increasing delays. This is essential for reliability, but it means your handler will see the same event more than once in normal operation. Designing for idempotency isn't optional.
Retry behavior
Carriyo waits up to 30 seconds for a 2xx response on
each delivery attempt. If the response is non-2xx, the
connection times out, or the connection fails, the delivery is
treated as failed and retried.
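Because each attempt has a 30-second window, slow processing inside the handler is one avoidable cause of retries. A minimal sketch of one way to stay inside the window: store the event first, acknowledge, and process it later from the store. The names `makeReceiver` and `queue` are hypothetical, and `queue` stands in for whatever durable storage you use. Since a 2xx is treated as delivered, the sketch acknowledges only after the event has been stored.

```javascript
// Hypothetical receiver: store first, acknowledge second, process later.
// `queue` is an assumption: any object with an async push() backed by
// durable storage (a database table, a message queue).
function makeReceiver(queue) {
  return async function receive(req) {
    try {
      await queue.push({ eventId: req.headers["event-id"], body: req.body });
      return { status: 200 }; // stored, so it is safe to acknowledge
    } catch (err) {
      return { status: 503 }; // could not store: a non-2xx asks for a retry
    }
  };
}
```

A separate worker would then read from the queue and apply the idempotency checks described later on this page.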
Standard retries
By default, Carriyo retries a failed delivery 3 times with short delays:
| Retry | Delay (since previous attempt) | Cumulative delay (since original event) |
|---|---|---|
| 1 | 1 minute | 1 minute |
| 2 | 3 minutes | 4 minutes |
| 3 | 5 minutes | 9 minutes |
After the third retry, no further automatic delivery attempts are made under the standard schedule.
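The cumulative column is a running total of the per-retry delays. The helper below is an illustrative sketch (the function and constant names are not part of Carriyo) that reproduces that arithmetic, which can be handy when sizing a deduplication window.

```javascript
// Delays between attempts, in minutes, copied from the standard retry table.
const STANDARD_DELAYS_MIN = [1, 3, 5];

// Running total of the delays: when each retry fires relative to the
// original event, ignoring the time each attempt itself takes.
function cumulativeDelays(delays) {
  let total = 0;
  return delays.map((delay) => (total += delay));
}

console.log(cumulativeDelays(STANDARD_DELAYS_MIN)); // [ 1, 4, 9 ]
```

Each attempt can also wait up to 30 seconds for a response, so real timings may run slightly later than these totals.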
Extended retries
A tenant can enable extended retries on top of the standard ones. When enabled, Carriyo continues retrying for up to 30 hours after the original event:
| Retry | Delay (since previous attempt) | Cumulative delay (since original event) |
|---|---|---|
| 1 | 1 hour | 1 hour |
| 2 | 3 hours | 4 hours |
| 3 | 5 hours | 9 hours |
| 4 | 8 hours | 17 hours |
| 5 | 13 hours | 30 hours |
Extended retries can lead to out-of-order events. A newer event for the same shipment may be processed while an older failed one is still being retried hours later. Only enable extended retries if your handler can cope with events arriving out of sequence.
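One way to cope with out-of-sequence delivery is to compare a timestamp on the incoming event with the one already stored, and skip the update when the incoming event is older. The sketch below assumes both timestamps are ISO 8601 strings. Which payload field carries the event time is an assumption to check against the payload reference, so the function takes the two values as plain arguments.

```javascript
// Returns true when the incoming event is older than what is already stored.
// Both arguments are ISO 8601 timestamp strings (an assumption about your data).
function isStale(storedUpdatedAt, incomingUpdatedAt) {
  if (!storedUpdatedAt) return false; // nothing stored yet, so nothing to be stale against
  return Date.parse(incomingUpdatedAt) < Date.parse(storedUpdatedAt);
}
```

A handler would call `isStale` before applying an update and acknowledge with a 2xx without changing anything when it returns `true`.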
Manual retrigger
If both the standard and extended retries are exhausted, no further automatic attempts are made. The event sits in Settings → Integration Monitor and can be manually retriggered once the underlying issue is fixed:
- Replay one. Re-trigger delivery for a single event.
- Replay many. Bulk-replay events that failed during a known outage window.
The same is available via API (POST /webhook-events/retrigger).
Other behaviors worth knowing
- Same event id is reused across retries. The retry isn't a new event, it's another attempt at delivering the same one.
- 2xx is final. Anything that returned a 2xx is considered delivered, even if your handler later realized it shouldn't have processed it.
Why idempotency matters
A retry storm during a transient outage means your endpoint
sees the same shipment.status_updated event repeatedly.
Without idempotency, that means:
- Your customer gets the "your order has shipped" email three times.
- Your finance system fires three refunds for one return.
- Your OMS bumps a counter three times.
The fix is to design every webhook handler to be idempotent: applying the same event twice produces the same outcome as applying it once.
Idempotency strategies
The unique event id is sent in the event-id HTTP header
on every delivery. It's stable across retries of the same
event, which makes it the natural idempotency key.
Key on event id
Track processed event ids in a small table; check before processing, record after. Simple and reliable.
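    async function handle(req) {
      // The event id is stable across retries of the same event.
      const eventId = req.headers["event-id"];
      if (await processedEvents.exists(eventId)) {
        return ok(); // already processed, no-op
      }
      await applyTheEvent(req.body);
      // Record only after the event has been applied successfully.
      await processedEvents.record(eventId);
      return ok();
    }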
The processed-events table can have a TTL of a few days. That is long enough for retries (up to ~30 hours with extended retries enabled) to land within the window, but short enough to keep the table from growing unbounded.
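The check-then-record sequence leaves a small gap: if two deliveries of the same event overlap (for example, a manual retrigger during a slow attempt), both can pass the existence check. A possible variation is to claim the event id up front and release the claim if processing fails. The sketch below uses an in-memory map as a stand-in for the table. `ProcessedEvents`, `claim`, `release`, and `handleOnce` are hypothetical names, and in production the claim would be an insert against a unique key.

```javascript
// In-memory stand-in for a processed-events table with a TTL.
class ProcessedEvents {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.seen = new Map(); // event id -> expiry time (ms since epoch)
  }

  // Claims an event id. Returns true only for the first caller within
  // the TTL window; later callers get false.
  claim(eventId, now = Date.now()) {
    const expiry = this.seen.get(eventId);
    if (expiry !== undefined && expiry > now) return false;
    this.seen.set(eventId, now + this.ttlMs);
    return true;
  }

  // Releases a claim so that a later retry can process the event.
  release(eventId) {
    this.seen.delete(eventId);
  }
}

// Runs `apply` at most once per event id and returns the status to send back.
async function handleOnce(store, eventId, apply) {
  if (!store.claim(eventId)) return { status: 200 }; // duplicate, no-op
  try {
    await apply();
    return { status: 200 };
  } catch (err) {
    store.release(eventId); // processing failed: let the retry try again
    return { status: 500 };
  }
}
```

With a three-day TTL, `new ProcessedEvents(3 * 24 * 60 * 60 * 1000)` comfortably covers the 30-hour extended retry window.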
Key on entity state
When you don't want to maintain a processed-events table,
idempotency on the outcome works for many cases. Setting a
shipment's status to delivered twice produces the same final
state as setting it once. Don't do anything new when the
current state already matches what the webhook reports.
    async function handle(req) {
      // Shipment webhooks post the Shipment object directly.
      const shipment = req.body;
      const current = await shipments.getByRef(shipment.partner_shipment_reference);
      if (!current) {
        return ok(); // unknown reference: acknowledge and log, retrying won't help
      }
      if (current.status === shipment.status) {
        return ok(); // already at this state, no-op
      }
      await shipments.update(/* ... */);
      return ok();
    }
The body shape depends on the entity type. Shipment and
return-request webhooks post the entity object directly. Order
webhooks post {oldImage, newImage, trigger, operation}. See
Webhooks → what's in the payload.
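A handler shared across entity types can normalize the two body shapes before comparing state. This is a sketch under one assumption that should be checked against the payload reference: that `newImage` in the order envelope holds the entity as it is after the change. The function name `entityFrom` is hypothetical.

```javascript
// Returns the entity to act on, whichever body shape arrived.
// Assumption: in the order envelope, `newImage` is the post-change state.
function entityFrom(body) {
  const isEnvelope =
    body !== null &&
    typeof body === "object" &&
    "oldImage" in body &&
    "newImage" in body;
  return isEnvelope ? body.newImage : body;
}
```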
Use unique constraints
For side-effects that must happen once (sending an email, creating a refund), use a unique key in your downstream system to prevent duplicates.
    async function handle(req) {
      const eventId = req.headers["event-id"];
      const shipment = req.body;
      await emails.sendUnique({
        idempotency_key: `delivered:${eventId}`,
        template: "delivered",
        to: shipment.dropoff.contact_email,
      });
      return ok();
    }
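Check whether your email provider accepts an idempotency key on send; support varies between providers and API versions. Many payment processors accept one on refund requests. Where the downstream API has no such key, enforce uniqueness yourself with a unique constraint on the key in your own datastore.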
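A minimal sketch of a unique constraint guarding a side effect, using an in-memory set as a stand-in for a database table with a unique index on the key. `SentLog` and `sendOnce` are hypothetical names, and `send` is whatever function calls your provider.

```javascript
// In-memory stand-in for a table with a UNIQUE constraint on the key.
class SentLog {
  constructor() {
    this.keys = new Set();
  }

  // Mimics an INSERT against a unique key: returns false on a duplicate.
  insert(key) {
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }
}

// Performs the side effect at most once per key.
async function sendOnce(log, key, send) {
  if (!log.insert(key)) return false; // key already present: skip the send
  await send();
  return true;
}
```

The choice of key decides what counts as a duplicate. A key built from the event id, as in the handler above, deduplicates retries of one event. A key built from the shipment reference and template would also suppress a second email when two distinct events report the same status. Note that this sketch records the key before sending, so a failed send is not retried unless the key is removed again.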
Don't return 5xx for "I don't know what to do"
A common mistake: an event arrives for a
partner_shipment_reference your system doesn't recognize
(test data, off-by-one, sync gap), and the handler returns
500. Carriyo retries. The same unknown reference comes in
again. Same 500. Repeat.
Return 200 in that case (and log the unknown reference for investigation). 5xx is for "my system is broken, please retry", not "this reference is meaningless to me, please stop".
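The distinction can be made explicit in code by mapping each kind of failure to a status. This is a sketch: `UnknownReferenceError` and `statusFor` are hypothetical names, and how your lookup signals an unknown reference depends on your own system.

```javascript
// Hypothetical error type raised when a reference isn't known to your system.
class UnknownReferenceError extends Error {}

// Maps a processing outcome to the HTTP status returned to Carriyo.
function statusFor(err) {
  if (!err) return 200; // processed normally
  if (err instanceof UnknownReferenceError) return 200; // retrying won't help: acknowledge and log
  return 500; // something on our side is broken: ask for a retry
}
```

A handler would wrap its processing in `try`/`catch`, log the error, and respond with `statusFor(err)`.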
How it fits with other modules
- Webhooks. The parent module.
- Configurations. What's being retried against.
- Authentication. The same auth headers go on each retry.