The DeepSeek-OCR model may be the first truly human-like AGI.
This model (DeepSeek-OCR) aligns particularly well with what we know about written language and the biology of the human act of reading.

The Visual Word Form Area (VWFA) on the left side of the brain is where the visual representation of words is transformed into something more meaningful to the organism.

https://en.wikipedia.org/wiki/Visual_word_form_area

The DeepSeek-OCR encoding (rather than simple text encoding) appears analogous to what occurs in the VWFA.

This model may not only be more powerful than text-based LLMs but may open the curtain of ignorance that has stymied our understanding of how language works and, ergo, how we think, what intelligence is precisely, etc.

Kudos to the authors: Haoran Wei, Yaofeng Sun, and Yukun Li - you may have tripped over the Rosetta Stone of intelligence itself! Bravo!