My GLM-5.1 coding agent scored 94.3% (348/369) on LiveCodeBench Lite

3 points by univence 4 days ago
I've been building Univence, a custom autonomous coding agent platform powered by GLM-5.1.

We are building this to be a true Replit/Vercel competitor, but with zero vendor lock-in. You can build and develop entirely on our platform alongside our SOTA agent, but you own the code and can deploy it seamlessly to any third-party host such as DigitalOcean, Netlify, AWS, or your own VPS.

To prove the core agent's capability, we just ran it against the LiveCodeBench Lite dataset (Python split). Here is the breakdown over a blind 369-problem run:

  Total:  348/369 passed (94.3%)
  Easy:   138/141 passed (97.9%)
  Medium: 152/156 passed (97.4%)
  Hard:   58/72 passed (80.6%)

(Note: We reached 80.6% on Hard by engineering the agent's constraints to strictly prioritize optimal time complexities such as O(n log n) over brute-force O(n^2), avoiding the Time Limit Exceeded errors that usually trip up standard wrappers.)

But we aren't just building this for the tech. My co-founder is a Palestinian refugee currently living in the Gaza Strip, and we are launching this to drive immediate humanitarian impact. 100% of the platform's profits from 11 months of the year will be donated directly to support Palestinian refugees.

The agent is already this good, but I have a roadmap of architectural ideas to make it even better. Right now, I'm looking for fast angel funding, compute sponsorships, or strategic partners to help us scale this ASAP.

  Try it out: https://univence.com
  Raw JSONL trajectory logs: https://github.com/UnivenceAI/Univence-benchmarks/tree/main/Z%20AI/GLM-5.1
  Follow our progress & proof of donations: https://x.com/UnivenceAI

I would love any feedback on the platform or the agent architecture. If you are an investor or want to support the mission, my DMs are open on X, or you can reach us at univenceai@gmail.com.
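
For anyone who wants to sanity-check the breakdown, the percentages can be recomputed from the raw pass counts quoted in the post. This is a minimal Python sketch that assumes only those counts; the names `RESULTS`, `pass_rate`, and `summarize` are illustrative and are not part of Univence or LiveCodeBench.

```python
# Counts copied from the breakdown in the post: (passed, attempted) per difficulty.
# The names below are hypothetical helpers for illustration only.
RESULTS = {
    "Easy": (138, 141),
    "Medium": (152, 156),
    "Hard": (58, 72),
}


def pass_rate(passed: int, attempted: int) -> float:
    """Return the pass rate as a percentage rounded to one decimal place."""
    return round(100 * passed / attempted, 1)


def summarize(results):
    """Return total passed, total attempted, and a dict of pass rates.

    The dict holds one rate per difficulty plus an overall "Total" rate.
    """
    total_passed = sum(passed for passed, _ in results.values())
    total_attempted = sum(attempted for _, attempted in results.values())
    summary = {name: pass_rate(p, n) for name, (p, n) in results.items()}
    summary["Total"] = pass_rate(total_passed, total_attempted)
    return total_passed, total_attempted, summary


if __name__ == "__main__":
    passed, attempted, summary = summarize(RESULTS)
    print(f"Total: {passed}/{attempted} passed ({summary['Total']}%)")
    for name, (p, n) in RESULTS.items():
        print(f"{name}: {p}/{n} passed ({summary[name]}%)")
```

The per-difficulty counts sum to 348 of 369, so the headline 94.3% is consistent with the three sub-scores.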