How do you handle missed webhooks in production?

4 · Author: everydaydev · 2 months ago
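I've worked at several companies where we'd discover hours later that critical webhooks from Stripe/Shopify had never arrived (because of a deployment, a timeout, a bug, etc.).

Every team ended up building the same solution: retry logic, a dead letter queue, and monitoring.

Curious how others handle this:
- Do you rely on the provider's retry policy?
- Did you build your own reliability layer?
- Do you use a service?
- Do you just manually reconcile when it happens?

(Context: I'm building https://relaehook.com to solve this, but I'm genuinely curious what the norm is.)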
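
For anyone unfamiliar with the pattern, here is a minimal sketch of the "retry logic, dead letter queue, monitoring" combination mentioned in the post. It assumes everything lives in memory and that each event carries a unique `id` field. All names (`WebhookProcessor`, `seen_ids`, `dead_letters`, `metrics`) are hypothetical and are not taken from Stripe, Shopify, or relaehook.com. A real setup would likely use a durable queue and a real metrics backend.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class WebhookProcessor:
    """Hypothetical in-memory reliability layer for incoming webhook events."""

    handler: Callable[[dict], None]   # business logic; raises on failure
    max_attempts: int = 3
    base_delay: float = 0.0           # seconds; 0 keeps the example fast
    seen_ids: Set[str] = field(default_factory=set)
    dead_letters: List[dict] = field(default_factory=list)
    metrics: Dict[str, int] = field(
        default_factory=lambda: {"ok": 0, "retried": 0, "dead": 0, "duplicate": 0}
    )

    def process(self, event: dict) -> bool:
        """Return True if the event was handled (now or previously)."""
        event_id = event["id"]  # assumption: every event has a unique id

        # Providers may redeliver, so skip events already handled successfully.
        if event_id in self.seen_ids:
            self.metrics["duplicate"] += 1
            return True

        for attempt in range(1, self.max_attempts + 1):
            try:
                self.handler(event)
            except Exception as exc:
                if attempt == self.max_attempts:
                    # Out of attempts: park the event for manual reconciliation.
                    self.dead_letters.append({"event": event, "error": repr(exc)})
                    self.metrics["dead"] += 1
                    return False
                self.metrics["retried"] += 1
                # Exponential backoff: base_delay, 2x, 4x, ...
                time.sleep(self.base_delay * 2 ** (attempt - 1))
            else:
                self.seen_ids.add(event_id)
                self.metrics["ok"] += 1
                return True
        return False  # only reached if max_attempts < 1
```

The `metrics` counters stand in for monitoring: alerting on a rising `dead` count is one way to avoid finding out hours later. Note that this only covers events that actually reach your endpoint. Webhooks that never arrive at all still need reconciliation against the provider's own event records.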