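Ask HN: Are we at the point where software can improve itself?
Putting this out here to hear if you think this is feasible and/or useful, but also to find out if this is yesterday's news and everybody is already doing it.
So here is the gist of it: imagine a piece of user-facing software that is out there in the world, doing its thing. It doesn't have to be anything fancy for this example; let's say an inventory management system called Foo, used by several hundred people a day.
Now imagine you set up a kind of "loop" that works like this:
Every 24 hours, a coding agent launches with the following prompt: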
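"Here is the codebase for application Foo. Over there are all application and system logs that Foo produced over the past 24 hours. Over here are all emails that went to support@foo-app.com in the past 24 hours. Over here is all UI telemetry that Foo collected: where users clicked, where they scrolled to, etc. Over there is the current database snapshot of Foo. Here is the document that describes the business goals of Foo, and the basic limitations (legal, financial, etc.) within which it needs to operate.
This is your mission: analyze all these inputs, analyze the codebase, and then, based on your analysis, create pull requests for the three most pressing improvements that would make Foo better software for its users."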
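Yes, there are a ton of details I'm glossing over, and yet: shouldn't something like this lead to some kind of fully automatic improvement of Foo over time?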