The Data Pipeline for Superintelligence Starts With Your Screen

Author: Nadav--Shanun
There's a question that has haunted AI research for decades: how do you build a system that truly understands the world?

Not one that merely predicts the next token. Not one that passes benchmarks. One that actually models reality, the way a human does, the way an ecosystem does, the way a city does.

I think I know where the answer starts, and it's not where most people are looking.

The Densest Data Structure on Earth

The human brain stores an estimated 2.5 petabytes of information. That's 2.5 million gigabytes packed into 1.4 kilograms of tissue. A Salk Institute study found that each of the brain's roughly 125 trillion synapses can hold about 4.7 bits of information across 26 distinct strength levels, about ten times more than scientists previously believed.

To put that in perspective: Yahoo's entire data warehouse, processing 24 billion events per day, holds less data than a single human brain. The IRS database tracking 300 million Americans? About 150 terabytes. Your brain holds roughly 17 times that.
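A quick back-of-the-envelope check of two of the figures above (a sketch in Python; the 2.5 PB and 150 TB estimates are taken from the post as given, not independently verified):

```python
import math

# 26 distinguishable synaptic strength levels imply log2(26) bits per synapse,
# which is where the "about 4.7 bits" figure comes from.
bits_per_synapse = math.log2(26)               # ~4.70

# How many 150 TB IRS-sized databases fit into an estimated 2.5 PB brain.
brain_capacity_tb = 2.5 * 1000                 # 2.5 petabytes, in terabytes
irs_database_tb = 150
ratio = brain_capacity_tb / irs_database_tb    # ~16.7, i.e. "roughly 17 times"

print(f"bits per synapse ~ {bits_per_synapse:.2f}")
print(f"brain capacity vs. IRS database ~ {ratio:.1f}x")
```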
You almost don't see this kind of information density anywhere else in nature. A rainforest is staggeringly complex, but the data per cubic centimeter doesn't come close. A coral reef, a city, a desert: all contain extraordinary information, but none of them approach the compression ratio of the three-pound organ sitting between your ears.

This matters enormously for the future of AI.

If You Want to Model Everything, Start With the Brain

Here's the thesis: the best data pipeline for superintelligence is, first, a data pipeline for the human brain.

Think about it. If we could create a true digital twin of human cognition, not a language model trained on text but a living model of how a specific human thinks, decides, and acts, we would have cracked the hardest data problem on the planet: the densest, most complex information structure in the known universe, digitized.

And once you've built the infrastructure to model that, the same technology extends outward. Digital twins of teams. Of organizations. Of supply chains and cities. Eventually, of ecosystems: jungles, oceans, deserts, entire countries. Each one is a system of interacting agents making decisions under uncertainty. The brain is just the one with the highest data density per unit of space.

So if you're serious about building toward superintelligence, or even just toward AI systems that truly understand the world, you don't start with more text data. You start with the human brain.

The Screen Is the Window Into the Brain

Now here's the practical question: how do you actually observe a human brain in action?

You could try neuroimaging: fMRI, EEG, brain-computer interfaces. They're promising but limited: expensive, invasive, and too low-resolution for everyday use.

Or you could look at what's already sitting in front of the brain for most of its waking hours.
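To make the screen idea concrete, here is a minimal sketch of what screen-level observation might produce as data, assuming a simple append-only log of interaction events; the ScreenEvent and SessionLog names and fields are illustrative assumptions on my part, not a schema the post specifies:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

@dataclass
class ScreenEvent:
    """One observable unit of on-screen behavior (hypothetical schema)."""
    timestamp: datetime
    app: str      # foreground application, e.g. "browser"
    kind: str     # "click", "keystroke", "scroll", "window_switch", ...
    detail: str   # free-form payload, e.g. the URL or UI element involved

@dataclass
class SessionLog:
    """An append-only trace of one person's screen activity."""
    events: List[ScreenEvent] = field(default_factory=list)

    def record(self, app: str, kind: str, detail: str) -> None:
        self.events.append(ScreenEvent(datetime.now(timezone.utc), app, kind, detail))

# A few seconds of synthetic activity, just to show the shape of the trace.
log = SessionLog()
log.record("browser", "window_switch", "opened a news aggregator")
log.record("browser", "click", "story: data pipelines for superintelligence")
log.record("editor", "keystroke", "typed 42 characters in draft.md")
print(len(log.events), "events captured")
```

Even a trace this crude is a time-ordered record of attention and choices, which is roughly the kind of raw material a digital twin of cognition, as described above, would presumably consume.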