Show HN: Why is ML inference still so ad hoc in practice?

Author: krish678 · about 1 month ago · original post
Every place I’ve seen run more than a couple of ML models in production ends up with a mess of bespoke inference services: different APIs, different auth, different logging, half-working dashboards, and tribal knowledge holding it all together.

I’ve been building a small side project that tries to standardize just the serving part — a single gateway in front of heterogeneous models (local, managed cloud, different teams) that handles inference APIs, versioning/rollback, auth, basic metrics, and health checks. No training, no AutoML, no “end-to-end MLOps platform”.

Before I sink more time into it, I’m trying to figure out whether this is:

a real gap people quietly paper over with internal glue, or

something that sounds useful but collapses under real-world constraints.

For people actually running ML in prod:

Do you already have an internal inference layer like this?

Where does inference usually go wrong (deployments, versioning, debugging, compliance)?

At what scale does it stop being worth abstracting at all?

Not announcing anything — genuinely curious whether this resonates or if I’m just rediscovering why everyone rolls their own.
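To make the scope above concrete, here is a minimal sketch (stdlib-only Python) of the shape such a gateway could take. Every name in it (Gateway, Backend, register, rollback, predict, the alias and metrics fields) is hypothetical and only illustrates the idea of one registry fronting heterogeneous backends with auth, default-version aliasing, rollback, health checks, and a basic request counter; it is not the project itself.

```python
# A minimal sketch of the gateway idea: one registry mapping (model, version)
# to a backend, and one predict() path that handles auth, version aliasing,
# rollback, a health check, and a per-model request counter.
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class Backend:
    # A backend is anything that can answer a prediction: a local model,
    # a managed cloud endpoint, or another team's service behind HTTP.
    predict: Callable[[dict], dict]
    healthy: Callable[[], bool] = lambda: True


@dataclass
class Gateway:
    backends: Dict[Tuple[str, str], Backend] = field(default_factory=dict)
    aliases: Dict[str, str] = field(default_factory=dict)  # model name -> default version
    api_keys: Set[str] = field(default_factory=set)
    request_counts: Dict[str, int] = field(default_factory=dict)

    def register(self, name: str, version: str, backend: Backend,
                 make_default: bool = False) -> None:
        self.backends[(name, version)] = backend
        if make_default or name not in self.aliases:
            self.aliases[name] = version

    def rollback(self, name: str, version: str) -> None:
        # "Rollback" here is just repointing the default alias at an
        # older version that is still registered.
        if (name, version) not in self.backends:
            raise KeyError(f"{name}:{version} is not registered")
        self.aliases[name] = version

    def predict(self, api_key: str, name: str, payload: dict,
                version: Optional[str] = None) -> dict:
        if api_key not in self.api_keys:
            raise PermissionError("unknown API key")
        version = version or self.aliases[name]
        backend = self.backends[(name, version)]
        if not backend.healthy():
            raise RuntimeError(f"{name}:{version} failed its health check")
        start = time.monotonic()
        result = backend.predict(payload)
        key = f"{name}:{version}"
        self.request_counts[key] = self.request_counts.get(key, 0) + 1
        result["_latency_ms"] = round((time.monotonic() - start) * 1000, 2)
        return result


if __name__ == "__main__":
    gw = Gateway(api_keys={"team-a-key"})
    gw.register("fraud", "v2", Backend(predict=lambda p: {"score": 0.1}))
    gw.register("fraud", "v3", Backend(predict=lambda p: {"score": 0.2}), make_default=True)
    print(gw.predict("team-a-key", "fraud", {"amount": 42}))  # routed to v3
    gw.rollback("fraud", "v2")
    print(gw.predict("team-a-key", "fraud", {"amount": 42}))  # routed to v2
```

The design choice the sketch tries to surface is that "versioning/rollback" can be as simple as an alias table in front of immutable registered backends, while auth, metrics, and health checks live once in the gateway instead of being re-implemented per model service.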