Ask HN: Will transformer-based language models hit a ceiling on improvement?

1 point by jaguar75, 30 days ago
A fundamental question I have is whether the current transformer-based architectures are powerful enough to scale toward AGI, or whether there are ceilings here that are becoming more evident.

Though folks from the leading LLM companies are at the forefront of this, it's hard to take their word for it due to an obvious conflict of interest. And it's unclear whether academia has more theoretical knowledge here, since these commercial companies seem to be leading the research work as well.