Ask HN: What inference server do you use to host text-to-speech (TTS) models?

1 point by samagra14 7 months ago
All the examples I have found are highly unoptimized. For example, Modal Labs uses FastAPI: https://modal.com/docs/examples/chatterbox_tts. BentoML also uses a FastAPI-like service: https://www.bentoml.com/blog/deploying-a-text-to-speech-application-with-bentoml

Even Chatterbox TTS ships only a very naive example: https://github.com/resemble-ai/chatterbox

The Triton Inference Server docs don't have a TTS example.

I am 100% certain that a highly optimized variant could be written with Triton Server, utilizing model concurrency and batching.

If someone has implemented a TTS service with Triton Server, or knows a better inference server alternative to deploy, please help me out here. I don't want to reinvent the wheel.
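For context, the Triton features the post refers to are configured per model in a `config.pbtxt`. A minimal sketch of what that might look like for a TTS model is below; the model name, tensor names, shapes, and batch limits are placeholders I chose for illustration, not taken from any published TTS example:

```protobuf
# config.pbtxt -- hypothetical TTS model; names, shapes, and limits are placeholders
name: "tts_model"
backend: "python"
max_batch_size: 8

input [
  {
    name: "TEXT"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "AUDIO"
    data_type: TYPE_FP32
    dims: [ -1 ]   # variable-length waveform
  }
]

# Coalesce requests that arrive within a short window into one batch
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 5000
}

# Run two copies of the model concurrently on the same GPU
instance_group [
  {
    count: 2
    kind: KIND_GPU
  }
]
```

`dynamic_batching` and `instance_group` are the two knobs that map to "batching" and "model concurrency" in the post; the backend itself (Python, ONNX Runtime, TensorRT, etc.) and the right batch sizes would depend on the specific TTS model.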
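To make the payoff of dynamic batching concrete without standing up a server, here is a toy, stdlib-only sketch of the idea: concurrent requests are queued and flushed to the model as one batched call, which is roughly what Triton's `dynamic_batching` does server-side. This is an illustration of the technique, not Triton code; `MicroBatcher` and `fake_tts` are invented names:

```python
import asyncio


class MicroBatcher:
    """Toy dynamic batcher: coalesces concurrent requests into one
    batched model call, mimicking server-side dynamic batching."""

    def __init__(self, batch_fn, max_batch=8, max_delay=0.005):
        self.batch_fn = batch_fn    # runs inference on a list of inputs
        self.max_batch = max_batch
        self.max_delay = max_delay  # seconds to wait for more requests
        self.queue = asyncio.Queue()
        self.worker = None

    async def infer(self, text):
        # Lazily start the background batching loop on first use
        if self.worker is None:
            self.worker = asyncio.create_task(self._loop())
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((text, fut))
        return await fut

    async def _loop(self):
        while True:
            # Block for the first request, then gather more until the
            # batch is full or the delay window expires
            batch = [await self.queue.get()]
            deadline = asyncio.get_running_loop().time() + self.max_delay
            while len(batch) < self.max_batch:
                timeout = deadline - asyncio.get_running_loop().time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            outputs = self.batch_fn([text for text, _ in batch])
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)


calls = []  # records the size of each batched model call


def fake_tts(texts):
    """Stand-in for a batched TTS forward pass."""
    calls.append(len(texts))
    return ["audio:" + t for t in texts]


async def main():
    batcher = MicroBatcher(fake_tts)
    # Six concurrent requests should collapse into far fewer model calls
    return await asyncio.gather(*(batcher.infer(f"utt{i}") for i in range(6)))


results = asyncio.run(main())
print(results)
print(calls)
```

The same coalescing happens inside Triton without any application code; the sketch just shows why throughput improves when the model can process a batch in one forward pass instead of one call per request.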