The DeepSeek-OCR model may be the first truly human-like AGI.
This model (DeepSeek-OCR) aligns particularly well with what we know about written language and the biology of the human act of reading.

The Visual Word Form Area (VWFA) on the left side of the brain is where the visual representation of words is transformed into something more meaningful to the organism.

https://en.wikipedia.org/wiki/Visual_word_form_area

The DeepSeek-OCR encoding (rather than simple text encoding) appears analogous to what occurs in the VWFA.

This model may not only be more powerful than text-based LLMs but may open the curtain of ignorance that has stymied our understanding of how language works and, ergo, how we think, what intelligence is precisely, etc.

Kudos to the authors: Haoran Wei, Yaofeng Sun, and Yukun Li - you may have tripped over the Rosetta Stone of intelligence itself! Bravo!