Ask HN: Building your own large language model?

3 points by retube, about 1 month ago
The best way to really understand how something works is to build it yourself. So I am wondering if there are any good tutorials on building your own LLM from scratch, i.e. implementing tokenisation, embeddings, attention, and so on. I am not suggesting one could replicate ChatGPT; I mean a toy model trained on a much smaller corpus.
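
To make the scope concrete, here is a rough sketch of the three pieces named above, using only the Python standard library. Everything in it is an illustrative assumption rather than anything from a particular tutorial: the corpus `"hello world"`, the character-level tokeniser, the random (untrained) embedding table, `d_model = 4`, and an attention step that uses the embeddings directly as queries, keys, and values, with no learned projections.

```python
import math
import random

# Toy corpus (assumption); character-level tokenisation keeps the vocabulary tiny.
text = "hello world"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}

def encode(s):
    """Map each character to its integer token id."""
    return [stoi[ch] for ch in s]

# Embedding table: one random d_model-dimensional vector per token id.
# A real model would learn these values during training.
random.seed(0)
d_model = 4
embedding = [[random.gauss(0, 1) for _ in range(d_model)] for _ in vocab]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(x):
    """Single-head scaled dot-product attention with Q = K = V = x."""
    out, weights = [], []
    dim = len(x[0])
    for i, q in enumerate(x):
        # Causal mask: position i only attends to positions 0..i.
        scores = [
            sum(a * b for a, b in zip(q, x[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the visible vectors.
        out.append([sum(w[j] * x[j][k] for j in range(i + 1)) for k in range(dim)])
    return out, weights

ids = encode("hello")                # token ids
x = [embedding[i] for i in ids]      # embedding lookup
out, weights = causal_self_attention(x)
print(ids)
print(len(out), len(out[0]))
```

A trainable toy model would add learned query/key/value projections, a feed-forward layer, an output layer over the vocabulary, and a training loop on top of this.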