Show HN: Why is ML inference still so ad hoc in practice?

Author: krish678 · about 1 month ago · original post
Every place I’ve seen run more than a couple of ML models in production ends up with a mess of bespoke inference services: different APIs, different auth, different logging, half-working dashboards, and tribal knowledge holding it all together.

I’ve been building a small side project that tries to standardize just the serving part — a single gateway in front of heterogeneous models (local, managed cloud, different teams) that handles inference APIs, versioning/rollback, auth, basic metrics, and health checks. No training, no AutoML, no “end-to-end MLOps platform”.

Before I sink more time into it, I’m trying to figure out whether this is:

a real gap people quietly paper over with internal glue, or

something that sounds useful but collapses under real-world constraints.

For people actually running ML in prod:

Do you already have an internal inference layer like this?

Where does inference usually go wrong (deployments, versioning, debugging, compliance)?

At what scale does it stop being worth abstracting at all?

Not announcing anything — genuinely curious whether this resonates or if I’m just rediscovering why everyone rolls their own.
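To make the scope above concrete, here is a minimal sketch (stdlib-only Python) of the shape such a gateway could take. Every name in it (Gateway, Backend, register, rollback, predict, the alias and metrics fields) is hypothetical and only illustrates the idea of one registry fronting heterogeneous backends with auth, default-version aliasing, rollback, health checks, and a basic request counter; it is not the project itself.

```python
# A minimal sketch of the gateway idea: one registry mapping (model, version)
# to a backend, and one predict() path that handles auth, version aliasing,
# rollback, a health check, and a per-model request counter.
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class Backend:
    # A backend is anything that can answer a prediction: a local model,
    # a managed cloud endpoint, or another team's service behind HTTP.
    predict: Callable[[dict], dict]
    healthy: Callable[[], bool] = lambda: True


@dataclass
class Gateway:
    backends: Dict[Tuple[str, str], Backend] = field(default_factory=dict)
    aliases: Dict[str, str] = field(default_factory=dict)  # model name -> default version
    api_keys: Set[str] = field(default_factory=set)
    request_counts: Dict[str, int] = field(default_factory=dict)

    def register(self, name: str, version: str, backend: Backend,
                 make_default: bool = False) -> None:
        self.backends[(name, version)] = backend
        if make_default or name not in self.aliases:
            self.aliases[name] = version

    def rollback(self, name: str, version: str) -> None:
        # "Rollback" here is just repointing the default alias at an
        # older version that is still registered.
        if (name, version) not in self.backends:
            raise KeyError(f"{name}:{version} is not registered")
        self.aliases[name] = version

    def predict(self, api_key: str, name: str, payload: dict,
                version: Optional[str] = None) -> dict:
        if api_key not in self.api_keys:
            raise PermissionError("unknown API key")
        version = version or self.aliases[name]
        backend = self.backends[(name, version)]
        if not backend.healthy():
            raise RuntimeError(f"{name}:{version} failed its health check")
        start = time.monotonic()
        result = backend.predict(payload)
        key = f"{name}:{version}"
        self.request_counts[key] = self.request_counts.get(key, 0) + 1
        result["_latency_ms"] = round((time.monotonic() - start) * 1000, 2)
        return result


if __name__ == "__main__":
    gw = Gateway(api_keys={"team-a-key"})
    gw.register("fraud", "v2", Backend(predict=lambda p: {"score": 0.1}))
    gw.register("fraud", "v3", Backend(predict=lambda p: {"score": 0.2}), make_default=True)
    print(gw.predict("team-a-key", "fraud", {"amount": 42}))  # routed to v3
    gw.rollback("fraud", "v2")
    print(gw.predict("team-a-key", "fraud", {"amount": 42}))  # routed to v2
```

The design choice the sketch tries to surface is that "versioning/rollback" can be as simple as an alias table in front of immutable registered backends, while auth, metrics, and health checks live once in the gateway instead of being re-implemented per model service.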