Ask HN: Hardware configuration for 1k RPS?
I ran an uncensored model on a CPU-only server. As expected, it's dead slow (a minute or two per query).

What kind of hardware (GPU) do I need to serve 1k RPS?

I couldn't find any APIs for uncensored models, which more or less forced me to run it locally.