A logging loop in GKE cost me $1,300 in 3 days (9.2x my actual infrastructure)

3 points, by nthypes, about 1 month ago
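Last month, a single container in my GKE cluster (Sao Paulo region) entered an error loop, writing to stdout at ~2k logs/second. I discovered the hard way that GKE's default behavior is to ingest 100% of this into Cloud Logging with no rate limiting. My bill jumped nearly 1000% before alerts caught it.

Infrastructure (Compute): ~$140 (R$821 BRL)
Cloud Logging: ~$1,300 (R$7,554 BRL)
Ratio: Logging cost 9.2x the actual servers.

https://imgur.com/jGrxnkh

I fixed the loop and paused the `_Default` sink immediately.

I opened a billing ticket requesting a "one-time courtesy adjustment" for a runaway resource, which is standard practice for first-time anomalies on AWS/Azure.

I have been rejected twice. The latest response: "The team has declined the adjustment request due to our internal policies."

If you run GKE, the `_Default` sink in Log Router captures all container stdout/stderr. There is NO DEFAULT CAP on ingestion volume, which is absurd! A simple `while true; do echo "error"; done` can bankrupt a small project.

Go to Logging -> Log Router and edit the `_Default` sink. Add an exclusion filter: `resource.type="k8s_container" severity=INFO` (or exclude specific namespaces).

Has anyone successfully escalated a billing dispute past Tier 1 support recently?

It seems their policy is now to enforce full payment even on obvious runaway/accidental usage, which is absurd since it's LOGS! TEXT!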
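The ratio and log volume quoted in the post can be sanity-checked with a few lines of Python. This is only a rough sketch: the cost figures, the ~2k logs/second rate, and the 3-day duration come from the post and its title, while the assumption that the loop ran at a constant rate for the full 3 days is mine.

```python
# Figures quoted in the post (Brazilian reais).
COMPUTE_BRL = 821.0
LOGGING_BRL = 7554.0

LOGS_PER_SECOND = 2000  # "~2k logs/second" from the post
LOOP_DAYS = 3           # "3 days" from the title

SECONDS_PER_DAY = 24 * 60 * 60


def cost_ratio(logging_cost: float, compute_cost: float) -> float:
    """How many times larger the logging bill is than the compute bill."""
    return logging_cost / compute_cost


def total_entries(rate_per_second: int, days: int) -> int:
    """Log entries emitted, ASSUMING a constant rate for the whole period."""
    return rate_per_second * days * SECONDS_PER_DAY


ratio = cost_ratio(LOGGING_BRL, COMPUTE_BRL)
entries = total_entries(LOGS_PER_SECOND, LOOP_DAYS)

print(f"Logging / compute ratio: {ratio:.1f}x")
print(f"Entries at a constant rate: {entries:,}")
```

The 9.2x figure matches the BRL amounts; the rounded USD amounts (~$1,300 / ~$140) give roughly 9.3x, so the small difference is just rounding.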
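The exclusion filter suggested in the post can be assembled programmatically if you want to vary the severity or target specific namespaces. The sketch below only builds the filter string; it does not call any Google Cloud API. The helper name `build_exclusion_filter` and the namespace names are hypothetical, and `resource.labels.namespace_name` is, to my knowledge, the namespace label on the `k8s_container` resource type, so check it against your own log entries before relying on it.

```python
from typing import Iterable, Optional


def build_exclusion_filter(
    severity: Optional[str] = None,
    namespaces: Iterable[str] = (),
) -> str:
    """Build a Logging filter string for a sink exclusion (hypothetical helper).

    Always scopes the filter to GKE container logs, then optionally narrows
    it by severity and/or by namespace.
    """
    clauses = ['resource.type="k8s_container"']
    if severity:
        clauses.append(f"severity={severity}")
    ns_terms = [f'resource.labels.namespace_name="{ns}"' for ns in namespaces]
    if ns_terms:
        # Any of the listed namespaces should match, so OR them together.
        clauses.append("(" + " OR ".join(ns_terms) + ")")
    return " AND ".join(clauses)


# The filter from the post, with the AND written out explicitly.
print(build_exclusion_filter(severity="INFO"))

# Excluding specific (hypothetical) namespaces instead.
print(build_exclusion_filter(namespaces=["noisy-app", "batch-jobs"]))
```

The explicit `AND` should behave the same as the space-separated form in the post, since the Logging query language treats adjacent comparisons as an implicit AND. The resulting string is what you would paste into the exclusion filter field of the `_Default` sink.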