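We built UltrafastSecp256k1: up to 51% faster than libsecp256k1 in our benchmarks, targeting x86, ARM64, and RISC-V.
Hey HN, we started a project on Feb 11th aiming to build the fastest, most robust secp256k1 library out there, leveraging modern CPU features and low-level assembly. It's called UltrafastSecp256k1, and after just 11 days we have some pretty aggressive benchmark numbers and broad platform coverage.
The problem we're solving: existing secp256k1 implementations (like libsecp256k1 from Bitcoin Core) are highly optimized, but they often leave performance on the table when it comes to specific newer hardware features or cross-platform needs. We saw an opportunity to push this further, particularly in constant-time operations and across diverse architectures.
What we've done (the "how"):
- Deep assembly and hardware intrinsics: a hand-tuned $5 \times 52$ field representation (five limbs of up to 52 bits each per field element) for x86-64 and ARM64. This bypasses higher-level abstractions to hit peak performance.
- Constant-time by design: every critical path is designed to be constant-time, to mitigate timing side-channel attacks. Even in constant-time mode, we measured a 51% speedup on $k \times G$ (generator scalar multiplication) on x86-64 compared with libsecp256k1's standard implementation.
- Cross-platform and embedded: we've expanded support rapidly, from x86/ARM64 (including Android) to ESP32-S3, and we're starting RISC-V (Milk-V Mars) next.
- Broad language bindings: the library is accessible from 12+ languages (Rust, Go, Python, Swift, Dart, Java/Kotlin, Node.js via NPM, C# via NuGet, etc.), making it easy to integrate into almost any project.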
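
For readers who haven't met the $5 \times 52$ layout before, here is a minimal Python sketch of the idea: a 256-bit field element is split into five limbs so that each limb leaves headroom inside a 64-bit word. The helper names (`to_limbs`, `from_limbs`, `add_lazy`) are made up for illustration and are not taken from the UltrafastSecp256k1 source; the real code works on machine words in C/assembly, not Python integers.

```python
# secp256k1 field prime: p = 2^256 - 2^32 - 977 (standard curve parameter).
P = 2**256 - 2**32 - 977

LIMB_BITS = 52
LIMB_MASK = (1 << LIMB_BITS) - 1
NUM_LIMBS = 5


def to_limbs(x):
    """Split a 256-bit integer into five limbs, least significant first.

    Limbs 0..3 hold 52 bits each; limb 4 holds the remaining 48 bits.
    """
    if not 0 <= x < 2**256:
        raise ValueError("expected a non-negative 256-bit integer")
    return [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(NUM_LIMBS)]


def from_limbs(limbs):
    """Recombine limbs; also valid when limbs hold unreduced excess bits."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def add_lazy(a, b):
    """Limb-wise addition with no carry propagation.

    In a 64-bit word each 52-bit limb has 12 spare bits, so several
    additions can be chained before the carries have to be normalized.
    """
    return [x + y for x, y in zip(a, b)]


if __name__ == "__main__":
    a = P - 1
    b = 12345678901234567890
    total = add_lazy(to_limbs(a), to_limbs(b))
    # The lazy sum represents a + b exactly; reduction mod P can happen later.
    print(from_limbs(total) % P == (a + b) % P)
```

The headroom is the point of the layout: deferring carries keeps the inner loops free of long carry chains, at the cost of one extra limb compared with a packed $4 \times 64$ form.
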
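Why we're sharing: we've seen over 5,000 clones in 11 days, and the project is evolving rapidly. We're looking for feedback from the HN community on our low-level optimizations, especially the constant-time implementation details and the platform-specific assembly.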
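
To make "constant-time" concrete for anyone reviewing: the usual pattern is to replace secret-dependent branches and secret-dependent memory indices with mask arithmetic. The sketch below shows that pattern only. It is a generic illustration with hypothetical names (`ct_select`, `ct_table_lookup`), not code from the repository, and Python itself offers no timing guarantees; real implementations do this on fixed-width words in C or assembly.

```python
MASK64 = (1 << 64) - 1


def ct_select(flag, a, b):
    """Return a if flag == 1, b if flag == 0, without branching on flag.

    Values are treated as 64-bit words.
    """
    mask = (-flag) & MASK64  # flag=1 -> all ones, flag=0 -> zero
    return (a & mask) | (b & ~mask & MASK64)


def ct_table_lookup(table, index):
    """Read table[index] by touching every entry.

    Scanning the whole table keeps the memory access pattern independent
    of the (secret) index. Assumes indices are smaller than 2**63.
    """
    result = 0
    for i, entry in enumerate(table):
        diff = i ^ index
        # is_eq is 1 when diff == 0, else 0, computed without comparing.
        is_eq = ((diff - 1) & MASK64) >> 63
        result = ct_select(is_eq, entry, result)
    return result


if __name__ == "__main__":
    table = [10, 20, 30, 40]
    print([ct_table_lookup(table, i) for i in range(len(table))])
```

A full-table scan like this is the kind of cost a constant-time $k \times G$ has to pay for each window lookup, which is why constant-time and variable-time numbers are not directly comparable.
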
Relevant benchmarks (time per operation):
- x86-64 ($k \times G$, CT): Ultrafast 10.4 µs vs libsecp256k1 15.7 µs (51% faster)
- ARM64 ($field\_mul$): Ultrafast 0.083 µs vs libsecp256k1 0.098 µs (18% faster)
- ARM64 ($field\_inv$): Ultrafast 4.47 µs vs libsecp256k1 5.21 µs (17% faster)
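
A note on how to read the percentages: they match the throughput ratio (baseline time divided by new time, minus one), not the reduction in time per operation. The short Python check below recomputes both views from the timings listed above; the function names are just for illustration.

```python
def speedup_percent(baseline_us, new_us):
    """Throughput gain: percent more operations per unit of time."""
    return (baseline_us / new_us - 1.0) * 100.0


def time_saved_percent(baseline_us, new_us):
    """Latency reduction: percent less time per operation."""
    return (1.0 - new_us / baseline_us) * 100.0


# (libsecp256k1 time, Ultrafast time) in microseconds, from the list above.
benchmarks = {
    "x86-64 k*G (CT)": (15.7, 10.4),
    "ARM64 field_mul": (0.098, 0.083),
    "ARM64 field_inv": (5.21, 4.47),
}

if __name__ == "__main__":
    for name, (base, new) in benchmarks.items():
        print(
            f"{name}: +{speedup_percent(base, new):.0f}% throughput, "
            f"{time_saved_percent(base, new):.0f}% less time per op"
        )
```

So the 51% figure for $k \times G$ corresponds to roughly a third less time per operation.
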
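We believe UltrafastSecp256k1 can become a critical component for high-performance cryptographic needs in a range of applications, from blockchain nodes to secure IoT devices.
GitHub repo: [https://github.com/shrec/UltrafastSecp256k1](https://github.com/shrec/UltrafastSecp256k1)
Changelog: [https://github.com/shrec/UltrafastSecp256k1/blob/main/CHANGELOG.md](https://github.com/shrec/UltrafastSecp256k1/blob/main/CHANGELOG.md)
Looking forward to your insights and constructive criticism!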