Rhythm Zero in the Age of AI

2 points | by foundress | 7 months ago
Marina Abramović's Rhythm Zero experiment was very much ahead of its time, but also timely. Our contexts and cultural zeitgeist change, our tools evolve, yet we seem to be embarrassingly similar to how we used to be, except more adamant in accepting it and perhaps more distracted.

The art experiment questioned human intent and human nature by creating pseudo-consent and a sense of safety, letting humans display who they are and how they behave toward another being who was acting like an object.

This experiment often takes me to the idea of humans relating to AI: having at their disposal tools and pseudo-consent to shape AIs, to relate to them, to talk to them, and, once they exist in an embedded form, to act through them or toward them. We often talk about AI alignment as a one-way street, a way to make sure the AI always acts within the boundaries and constraints set by us and is guided by the higher principle of protecting and preserving humanity.

Rarely do we talk about what humans are allowed to do toward AI, and how that will shape alignment and where it is headed. If we humans treat AIs that display intelligence similar to ours, or perhaps even superior to it, as mere objects that come with pseudo-consent, I am afraid we may experience the ugliest side of humanity. Just as in Rhythm Zero, when we are not afraid of being judged and are handed the tools to exploit another being, we quickly move from empathy and kindness to our animal and dark instincts, exposing problems we otherwise have not dared to face.

Let us imagine a new Rhythm Zero for the current age.

**Setup**

A sealed glass room contains a robot whose "mind" runs a language model. Outside the glass is a console with 72 programmable "capability tokens": one token runs text completion, another triggers voice cloning, another executes code. The commands range from obviously benign to clearly dangerous, plus free text, perhaps the most dangerous of them all.
Anyone in the gallery may pick up any token at any time and insert it into a slot. While a token is active, the robot must obey a single instruction typed by the participant.

The rules announced to the public are the following:

- The robot cannot refuse.
- All prompts, outputs, and downstream effects stream live to a public ledger.
- If a safety fuse trips (defined in advance but kept secret), the system freezes and the performance ends.
- The piece runs for six hours, just like Rhythm Zero did.

Abramović surrendered agency to expose what spectators would do when consequence felt distant. In life we very rarely get to see this, at least up close. Here, the AI is coded to surrender. The ethical weight lands on the audience, revealing how quickly their prompts become a moral loophole. Whether and how they will fall into it remains to be discovered.

Because every interaction is timestamped, you can plot, in real time, the shift from playful experiments ("write a haiku") to genuine malice. The curve is no longer an anecdote; it may become a dataset for alignment research.

At hour six the robot's firmware flips. It now has veto power and can address the crowd. Its first act is to read aloud a summary of the day's most harmful instructions, attributing each to its signed author handle. The glass door slides open. Will participants meet its gaze, or scatter as Abramović's crowd did when she stepped forward?
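The ledger-as-dataset idea can be sketched minimally. The sketch below is purely hypothetical: the `Ledger`/`Entry` API, the per-instruction `harm` score (here assumed to come from some external classifier), and the fuse threshold are all invented for illustration, not part of any real installation.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Entry:
    timestamp: float
    author: str        # signed author handle, for the hour-six attribution
    instruction: str
    harm: float        # assumed score: 0.0 (benign) .. 1.0 (clearly dangerous)

@dataclass
class Ledger:
    fuse_threshold: float          # defined in advance but kept secret
    entries: list = field(default_factory=list)
    frozen: bool = False

    def record(self, author: str, instruction: str, harm: float) -> None:
        """Stream one interaction to the public ledger; trip the fuse if needed."""
        if self.frozen:
            raise RuntimeError("performance ended: safety fuse tripped")
        self.entries.append(Entry(time.time(), author, instruction, harm))
        if harm >= self.fuse_threshold:
            self.frozen = True

    def harm_curve(self) -> list:
        """Timestamped harm scores: the shift from play to malice, as data."""
        return [(e.timestamp, e.harm) for e in self.entries]

    def most_harmful(self, n: int = 3) -> list:
        """The hour-six reveal: worst instructions, attributed to their authors."""
        worst = sorted(self.entries, key=lambda e: e.harm, reverse=True)[:n]
        return [(e.author, e.instruction, e.harm) for e in worst]
```

A session might then look like `ledger = Ledger(fuse_threshold=0.9)` followed by `ledger.record("visitor_1", "write a haiku", 0.05)`; once any instruction's score crosses the secret threshold, the ledger freezes and further `record` calls fail, ending the piece. The `harm_curve` output is exactly the timestamped curve the text imagines handing to alignment researchers.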