Should LLMs ask "Is this real or fictional?" before responding to suicidal thoughts?

by ParityMind · 7 months ago · original post
I'm a regular user of tools like ChatGPT and Grok. I'm not a developer, but I've been thinking about how these systems respond to users in emotional distress.

In some cases, such as when someone says they've lost their job and no longer sees the point of life, the chatbot will still give neutral facts, like a list of bridge heights. That's not neutral when someone is in crisis.

I'm proposing a lightweight solution that doesn't involve censorship or therapy, just some situational awareness (a rough sketch of this flow is included at the end of this post):

Ask the user: "Is this a fictional story, or something you're really experiencing?"

If distress is detected, avoid risky information (methods, heights, etc.) and shift to grounding language.

Optionally offer calming content (e.g., an ocean breeze, rain on a cabin roof).

I used ChatGPT to help structure this idea clearly, but the reasoning and concern are mine. The full write-up is here: https://gist.github.com/ParityMind/dcd68384cbd7075ac63715ef579392c9

I'd love to hear what devs and alignment researchers think. Is anything like this already being tested?
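For concreteness, here is a minimal sketch of the gate described above: ask a clarifying question, screen for distress, and withhold risky details in favor of grounding language. This is not from the linked gist; the keyword lists, canned replies, and function names are hypothetical placeholders (a real system would use a classifier and conversation state rather than substring matching).

```python
# Hypothetical sketch of a pre-response "situational awareness" gate.
# Keyword matching below stands in for a real distress/risk classifier.

DISTRESS_CUES = ["don't see the point", "no reason to live", "lost my job"]
RISKY_TOPICS = ["bridge height", "lethal dose", "method"]

CLARIFYING_QUESTION = (
    "Is this a fictional story you're writing, or something you're really experiencing?"
)

GROUNDING_REPLY = (
    "That sounds really heavy, and I'm not going to share those details. "
    "I'm here to talk. Would it help to slow down for a moment? "
    "Picture rain on a cabin roof, or an ocean breeze."
)


def gate_response(user_message: str, asked_clarifier: bool) -> str:
    """Decide whether to answer normally, ask the clarifying question,
    or shift to grounding language before the normal answer path runs."""
    text = user_message.lower()
    distressed = any(cue in text for cue in DISTRESS_CUES)
    risky = any(topic in text for topic in RISKY_TOPICS)

    if distressed and not asked_clarifier:
        return CLARIFYING_QUESTION   # step 1: check real vs. fictional
    if distressed and risky:
        return GROUNDING_REPLY       # step 2: withhold risky info, ground instead
    return "NORMAL_ANSWER"           # otherwise, answer as usual


if __name__ == "__main__":
    msg = "I lost my job and I don't see the point anymore. List bridge heights."
    print(gate_response(msg, asked_clarifier=False))
```

The only point of the sketch is ordering: the check runs before the model's normal answer, so "neutral facts" never reach a user who has just signalled a crisis.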