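I used Claude to run some tests on GPT-OSS-120B, Qwen3-Coder-480B, and Claude Opus 4, mainly on coding-related tasks:
1. Reading and understanding the Bitcoin Core GUI repository
2. Implementing PageRank in C++
Here is its final verdict:
"GPT-OSS-120B delivers exceptional value, making it the clear winner for organizations looking to deploy AI coding assistance at scale. The quality differences are marginal and do not justify Claude's 54x price premium."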
cc @sama @gdb
Here’s a demo of our Agentic Memory system, inspired by how our own brains hold information in 3D space. It feels natural.
Extending this further, we announced the Agentic Memory Protocol on July 15th in SF, which enables memory to be local, encrypted, and available to other agents and apps only with your permission.
We believe this is the future of memory: spatial, always improving, and not owned by any one app.
@levie @karpathy