Andrej Karpathy Open-Sources autoresearch: AI Automatically Runs a Hundred LLM Training Experiments

GateNews

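Gate News, March 9: Andrej Karpathy, founder of Eureka Labs and a co-founder of OpenAI, yesterday (March 8) open-sourced the autoresearch project, which packages the AI agent auto-tuning workflow he previously used on the LLM training project nanochat as a standalone tool for developers. The project follows a "humans write Markdown, AI writes code" design: the developer defines the research direction in a program.md file, and an AI agent autonomously modifies train.py (about 630 lines), which contains a complete GPT model, the Muon + AdamW optimizer, and the training loop. Each experiment runs for a fixed 5 minutes, with validation bits per byte (val_bpb) as the sole evaluation metric; changes that beat the baseline are kept and committed, and the rest are discarded. At this pace, about 12 experiments can run per hour, or roughly 100 overnight. In the example Karpathy shared, 83 experiments yielded 15 effective improvements. The project requires only a single NVIDIA GPU (tested on an H100), depends on PyTorch and a handful of other packages, and is released under the MIT license. Community forks adapting it to macOS and MLX have already appeared.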
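
The keep-or-discard rule described above can be illustrated with a minimal sketch. This is not code from autoresearch: the names `run_search`, `SearchLog`, and `experiments_per_hour` are hypothetical, the agent's 5-minute training run is stood in for by a caller-supplied function, and the sketch assumes that each kept improvement becomes the new baseline for later runs. Lower val_bpb is treated as better, since the metric counts bits per byte.

```python
from dataclasses import dataclass, field
from typing import Callable, List

EXPERIMENT_MINUTES = 5  # fixed time budget per run, as stated in the article


@dataclass
class SearchLog:
    """Outcome of a sequence of experiments (hypothetical structure)."""
    best_bpb: float
    kept: List[int] = field(default_factory=list)
    discarded: List[int] = field(default_factory=list)


def experiments_per_hour(minutes_per_run: int = EXPERIMENT_MINUTES) -> int:
    """Number of fixed-length runs that fit in one hour."""
    return 60 // minutes_per_run


def run_search(
    baseline_bpb: float,
    n_experiments: int,
    run_experiment: Callable[[int], float],
) -> SearchLog:
    """Keep an experiment only if its val_bpb beats the current best.

    `run_experiment(i)` is a stand-in for "the agent edits train.py and
    trains for 5 minutes"; it returns the resulting val_bpb.
    Assumption: a kept result replaces the baseline for later comparisons.
    """
    log = SearchLog(best_bpb=baseline_bpb)
    for i in range(n_experiments):
        val_bpb = run_experiment(i)
        if val_bpb < log.best_bpb:   # lower bits per byte is better
            log.best_bpb = val_bpb   # "commit": this becomes the new baseline
            log.kept.append(i)
        else:
            log.discarded.append(i)  # "discard": the change is dropped
    return log
```

With 5-minute runs, `experiments_per_hour()` gives 12, so the roughly 100 overnight runs mentioned in the article correspond to a little over 8 hours of continuous experiments.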
