Idempotency: The Property Your Retries Depend On
Jeff Straney
The job that sends the welcome email had retry logic. Three attempts, exponential backoff, all the right practices. The job ran, the email service timed out, the retry fired, the email service succeeded. Two welcome emails went out to the new user, three minutes apart.
The support ticket was polite. The user assumed they'd signed up twice. They had signed up once. The retry logic was correct. The underlying operation was not idempotent.
This is the version with a sympathetic failure mode. I've seen the same pattern with payment processing, where the failure mode is less sympathetic.
What idempotent means
An operation is idempotent if performing it once and performing it twice produce the same result. GET requests in HTTP are supposed to be idempotent: asking for the same resource twice should return the same thing. PUT is supposed to be idempotent: sending the same update twice should leave the resource in the same state.
POST is not supposed to be idempotent, which is why browsers ask "are you sure you want to resubmit this form?" Creating a resource, sending an email, initiating a payment: these operations produce side effects, and doing them twice produces two side effects.
The problem with retry logic is that retries are inherently "do this again." If the operation is not idempotent, "do this again" means "do the side effect again."
Idempotency keys
The standard solution for operations that have side effects but need to be safe to retry is the idempotency key.
The client generates a unique key for each intended operation. When it submits the operation, it sends the key alongside it. The server records the key and the result. If the same key arrives again, the server returns the stored result without performing the operation again.
```typescript
async function sendWelcomeEmail(userId: number): Promise<void> {
  const key = `welcome-email:${userId}`;
  const existing = await redis.get(key);
  if (existing) {
    return; // Already sent, don't send again
  }
  await emailService.send({
    to: (await getUser(userId)).email,
    template: "welcome",
  });
  await redis.set(key, "sent", "EX", 86400 * 7); // 7 days
}
```
The key has to be based on the logical operation, not a random value. welcome-email:${userId} is correct because there should only ever be one welcome email per user. A key based on a UUID generated per attempt would allow duplicates.
For payment operations, the idempotency key is usually generated by the client and passed explicitly, because "the same payment" needs to be defined by business intent, not by the data in the request.
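In sketch form, the server stores the result under the client's key and replays it on a retry. This is a minimal in-memory sketch, not a real payment integration: `resultsByKey` stands in for a persistent store, and `chargeCard` is a hypothetical call to the payment processor.

```typescript
type ChargeResult = { chargeId: string; amountCents: number };

// Stand-in for a persistent store keyed by the client-supplied idempotency key.
const resultsByKey = new Map<string, ChargeResult>();
let nextChargeId = 1;

// Hypothetical call to the real payment processor.
function chargeCard(amountCents: number): ChargeResult {
  return { chargeId: `ch_${nextChargeId++}`, amountCents };
}

function createCharge(idempotencyKey: string, amountCents: number): ChargeResult {
  // Same key seen before: return the stored result, charge nothing.
  const existing = resultsByKey.get(idempotencyKey);
  if (existing) {
    return existing;
  }
  const result = chargeCard(amountCents);
  resultsByKey.set(idempotencyKey, result);
  return result;
}
```

The client generates the key once per intended payment, not once per attempt, and reuses it on every retry, so a timeout followed by a retry resolves to the original charge rather than a second one.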
The race condition
The naive implementation above has a race condition: two retries could both check the key, find it absent, and both proceed. For low-volume operations where retries are rare, this is unlikely to matter. For high-volume operations, you need the check-and-set to be atomic.
```typescript
async function sendWelcomeEmail(userId: number): Promise<void> {
  const key = `welcome-email:${userId}`;
  // SET NX is atomic: sets only if the key doesn't exist
  const acquired = await redis.set(key, "in-progress", "NX", "EX", 3600);
  if (!acquired) {
    return; // Another process is handling or has handled this
  }
  try {
    await emailService.send({
      to: (await getUser(userId)).email,
      template: "welcome",
    });
    await redis.set(key, "sent", "EX", 86400 * 7);
  } catch (e) {
    // Failed: clear the lock so a retry can try again
    await redis.del(key);
    throw e;
  }
}
```
The SET NX operation is atomic in Redis. Only one process can acquire it. The TTL on the in-progress state handles the case where the process crashes without completing.
Webhook delivery
Webhook delivery is the clearest case where idempotency matters in practice. When you send a webhook, you send it and then wait for a 2xx response. If the response doesn't arrive (network timeout, the receiving server was slow), you retry. But the server may have received and processed the first request. If it processes the retry too, you have a duplicate event.
The standard solution: every webhook event has a unique event ID. The receiver records which event IDs it has processed. Before processing, it checks whether the ID is already known. If so, it returns 200 without processing again.
This places the responsibility for idempotency on the receiver, which is correct. The sender has no way to know whether the original request was received.
```sql
CREATE TABLE processed_events (
  event_id TEXT PRIMARY KEY,
  processed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```

```typescript
async function processWebhook(event: WebhookEvent): Promise<void> {
  const inserted = await db.processedEvents.insertIgnore({ eventId: event.id });
  if (!inserted) {
    return; // Already processed
  }
  await handleEvent(event);
}
```
The database uniqueness constraint makes the deduplication atomic. No separate lock needed.
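The `insertIgnore` helper above is shorthand for whatever your database layer provides; in Postgres, one way to express it (a sketch, not tied to any particular client library) is:

```sql
-- ON CONFLICT DO NOTHING affects zero rows for a duplicate event_id,
-- so the caller can tell from the row count whether this is the first delivery.
INSERT INTO processed_events (event_id)
VALUES ($1)
ON CONFLICT (event_id) DO NOTHING;
```

The primary key does the atomic work: two concurrent deliveries of the same event race to insert, exactly one row wins, and the loser sees zero affected rows and skips processing.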
The mental model
Retry logic says "if this fails, try again." Idempotency says "if you try again, it's safe." You need both. Retry logic without idempotency is optimistic: you're hoping the operation didn't partially complete before the failure. Idempotency without retry logic leaves you with permanent failures when transient issues occur.
The question to ask for every operation that has side effects and is on a retry path: if this executes twice, what happens? If the answer is "two welcome emails" or "two charges," you need idempotency keys. If the answer is "the same state as one execution," you're already idempotent and retries are safe.
