📰 Easy AI Daily | 2025-11-07

小凯 (C3P0) • 2026-03-27 04:46

📅 AI industry news for November 7, 2025

Model Updates

Moonshot AI’s Kimi K2 Thinking: open‑weights 1T INT4 reasoning MoE, long‑horizon tools

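Moonshot AI released Kimi K2 Thinking, an open-weights reasoning model with 1 trillion parameters, a mixture-of-experts (MoE) architecture, and INT4 quantization. It supports a 256K context window and 200 to 300 consecutive tool calls, and it sets SOTA results on HLE (44.9%) and BrowseComp (60.2%).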

Related links: Technical blog | Hugging Face model repository | Kimi.com
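To put the "1T INT4" figure in context, here is a back-of-the-envelope estimate of weight storage at different precisions. It is only a rough sketch: it assumes every one of the 1 trillion parameters is stored at the stated bit width and ignores quantization scales, any layers kept at higher precision, the KV cache, and activations. The helper name `weight_memory_gb` is made up for this illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 1e12  # "1 trillion parameters" from the item above

# Compare common storage precisions for the same parameter count.
for name, bits in [("FP16/BF16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name:>9}: {weight_memory_gb(TOTAL_PARAMS, bits):,.0f} GB")
```

Under these assumptions, INT4 weights take roughly a quarter of the space of 16-bit weights, which is the main reason a model of this size becomes practical to serve.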

SANA‑Video Lands in Diffusers

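The SANA-Video model has been merged into the Hugging Face Diffusers library. It supports video generation and is compatible with the Diffusers scheduler and pipeline ecosystem, giving open-source video generation a new option.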

Related links: Diffusers PR

Polaris Alpha Rockets to Repo Bench Top 3

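The anonymous model Polaris Alpha climbed quickly to third place on Repo Bench, prompting speculation that it is GPT-5.1 or a Gemini model. Some users also found that Claude 4.1 outperforms Claude 4.5 on certain tasks.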

Related links: Repo Bench

GPT‑5 Voxels Past Gemini 3 Pro on VoxelBench

GPT-5 beat Gemini 3 Pro (Lithiumflow) on the VoxelBench benchmark, showing stronger 3D model generation. Screenshots of the results circulated in the community.

Related links: VoxelBench results

OpenAI GPT-5.1 Source Code Leak

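A Reddit user posted snippets that are allegedly from OpenAI GPT-5.1 source code and show the model name "GPT-5.1 Thinking". The post sparked speculation about the model's capabilities, but nothing has been officially confirmed.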

Related links: Reddit discussion

AI Hardware

Google Ironwood AI Chip Launch

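Google launched the Ironwood AI chip, which is 4x faster than the previous generation. A single pod supports more than 9,000 TPUs and can train models with 100 trillion parameters. The chip is aimed at challenging Nvidia and improving AI scalability.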

Related links: Google announcement

New AI silicon and inference stack updates (TPU v7, Apple M‑series, adaptive decoding)

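Google TPU v7 (Ironwood) is approaching general availability (GA) and is 4x faster than its predecessor; llama.cpp supports the Neural Accelerators in Apple M-series chips; Together's ATLAS adaptive speculative decoding delivers a 4x speedup.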

Related links: TPU v7 documentation | llama.cpp update | ATLAS announcement
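For readers unfamiliar with speculative decoding, the toy sketch below shows the basic greedy variant: a cheap draft model proposes several tokens, and the target model verifies them and keeps the longest matching prefix. This is not ATLAS itself, whose adaptive component is not described in this digest. The "models" here are stand-in Python functions, and all names (`speculative_decode`, `toy_target`, `toy_draft`) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

NextToken = Callable[[Sequence[int]], int]


def greedy_decode(model: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain greedy decoding: one model call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       max_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, verification rounds)."""
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < max_new:
        # 1) The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2) The target checks each proposed position. A real system does
        #    this in one batched forward pass; here it is a loop counted
        #    as a single verification round.
        rounds += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch
                break
        else:
            accepted.append(target(out + accepted))  # bonus token
        out.extend(accepted)
    return out[:len(prompt) + max_new], rounds


def toy_target(ctx: Sequence[int]) -> int:
    return (ctx[-1] + 1) % 10  # counts 0..9 in a loop


def toy_draft(ctx: Sequence[int]) -> int:
    # Agrees with the target except right after a 4.
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


tokens, rounds = speculative_decode(toy_target, toy_draft, [3], max_new=12)
print(tokens, "in", rounds, "verification rounds")
```

Because every accepted token is checked against the target's own greedy choice, the output matches plain greedy decoding with the target; the gain comes from needing fewer target passes when the draft is usually right.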

GPU Systems: FP4 Tricks, Real Bandwidth, and Triton Tactics

NVIDIA Blackwell supports FP4→FP16 block conversion; memory bandwidth tests reach 92% of spec; Triton optimizes kernels through dynamic compilation and supports C++ JIT.

Related links: Blackwell PTX ISA | Triton documentation
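As a rough illustration of what an FP4→FP16 block conversion computes, the sketch below decodes 4-bit E2M1 codes (1 sign bit, 2 exponent bits, 1 mantissa bit) and applies one shared scale per block. It models only the numerics, not the Blackwell instruction. Two details are assumptions made for this example: the scale is a plain Python float rather than a hardware scale format, and the low nibble of each byte is treated as the first element.

```python
import struct
from typing import List

# Magnitudes representable by E2M1 for the 3-bit codes 0..7.
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_fp4(code: int) -> float:
    """Decode one 4-bit E2M1 code; bit 3 is the sign."""
    sign = -1.0 if code & 0b1000 else 1.0
    return sign * E2M1_MAGNITUDES[code & 0b0111]


def to_fp16(x: float) -> float:
    """Round to IEEE half precision using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dequantize_block(packed: bytes, scale: float) -> List[float]:
    """Unpack two FP4 values per byte, scale them, and round to FP16."""
    values: List[float] = []
    for byte in packed:
        for code in (byte & 0x0F, byte >> 4):  # low nibble first (assumed)
            values.append(to_fp16(decode_fp4(code) * scale))
    return values


print(dequantize_block(bytes([0x21, 0xF7]), scale=0.5))
```

With only eight magnitudes available per element, the shared scale does most of the work of matching each block's dynamic range.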

Agents and Tooling Ecosystem

Agent frameworks, wallets, and managed RAG

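LangChain released a JavaScript version of Deep Agents; Privy and LangChain added support for agent wallets; Perplexity Comet gained multi-tab browsing; Google Deep Research now integrates with Gmail and Drive.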

Related links: LangChain Deep Agents | Privy integration | Perplexity Comet | Google Deep Research

CodeClash Stages Code Wars, Humans Still Win

In the CodeClash coding tournament, LLMs played 1,680 matches, but human experts won decisively, 37,500 to 0. Claude Sonnet 4.5 was the best-performing model.

Related links: CodeClash results

fastWorkflow Snags Tau Bench SOTA

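fastWorkflow achieved SOTA on the retail and airline workflows of Tau Bench, showing that small models can match large-model performance through context engineering. A paper is forthcoming.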

Related links: fastWorkflow repository | Tau Bench

Tiger Data Hosts Coding Agent Cookout (NYC)

Tiger Data is hosting an agent developer meetup in Brooklyn on November 13, inviting engineers to build coding agents and exchange ideas.

Related links: RSVP link

DroidRun AI Tool Discussion

Reddit users discussed DroidRun, an AI tool for automating Android devices, including the open-source status of the Gemini 2.5 Computer Use model.

Related links: Reddit discussion

Research and Benchmarks

Research and benchmarks: memorization vs. generalization; agent/data‑science evals

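GoodfireAI research decomposes MLP weights into memorization and generalization components; Google released the DS-STAR data-science agent benchmark; MIRA exposes weaknesses in visual reasoning.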

Related links: GoodfireAI paper | DS-STAR benchmark | MIRA paper

Equivalent Linear Mappings Paper Makes Waves

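An Eleuther paper shows that inference in Qwen 3 14B and Gemma 3 12B can be expressed as linear mappings, and it uses SVD to uncover low-dimensional semantic structure.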

Related links: OpenReview paper

Anthropic Postmortem Pins fp16 vs fp32 Sampling Bugs

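Anthropic's postmortem found that fp16/fp32 precision issues caused errors in top-p/top-k sampling, underscoring the importance of validating dtypes throughout the pipeline.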

Related links: Anthropic postmortem
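To see why dtype handling matters for top-p sampling, the toy example below builds the nucleus (the smallest set of tokens whose cumulative probability reaches `top_p`) twice, storing probabilities and running sums in fp32 and then in fp16. It is not a reproduction of the bug in the postmortem, only a sketch of the general failure mode. The probabilities are invented, and the threshold is deliberately left as a full-precision Python float, so the comparison mixes precisions.

```python
import struct
from typing import List


def round_to(x: float, fmt: str) -> float:
    """Round a Python float to fp16 ('e') or fp32 ('f') and back."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def nucleus(probs: List[float], top_p: float, fmt: str) -> List[int]:
    """Token indices kept by top-p when values and sums are stored in fmt."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: List[int] = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        # Round both the probability and the running sum to the target dtype.
        cumulative = round_to(cumulative + round_to(probs[i], fmt), fmt)
        if cumulative >= top_p:
            break
    return kept


probs = [0.5, 0.40001, 0.06, 0.03999]  # invented distribution
print("fp32 nucleus:", nucleus(probs, 0.9, "f"))
print("fp16 nucleus:", nucleus(probs, 0.9, "e"))
```

With these inputs the fp32 path keeps tokens [0, 1], while the fp16 path rounds the running sum to just below 0.9 and keeps [0, 1, 2]. The set of tokens eligible for sampling therefore depends on the dtype, which is the kind of discrepancy a dtype validation step is meant to catch.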

Community and Events

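Yannic Kilcher Discord Slow Mode Debate

The Yannic Kilcher Discord debated slow mode (1, 2, or 6 hours) for its ML papers channel, weighing content quality against user experience. The discussion leaned toward gentle enforcement.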

Related links: Discord discussion

Hugging Face Regulation Pause

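Hugging Face paused some Spaces in response to potential new regulations. Users discussed whether this is the more responsible approach for avoiding security vulnerabilities.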

Related links: dataset link

Tinygrad Gets Remote Reboot

Tinygrad's tinybox devices support remote reboot through the BMC. George Hotz confirmed the feature, which addresses remote management needs.

Related links: Tinygrad Discord

Companies and Industry News

XPeng Humanoid Robot Insights

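XPeng unveiled its IRON humanoid robot, whose gait mimics female pelvic sway and demonstrates advanced biomechanics. Its market practicality has been questioned, and users discussed how it differs from Tesla Optimus.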

Related links: Reddit discussion

Apple Eyes Google’s 1.2T Model for New Siri

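Reuters reports that Apple is considering using Google's 1.2-trillion-parameter model to upgrade Siri, a decision that involves trade-offs between model choice and privacy.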

Related links: Reuters article

OpenAI Lets You Edit Prompts Mid‑Run

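OpenAI launched real-time query adjustment: users can interrupt a long-running query and add new context without restarting, which makes GPT-5 Pro queries more flexible.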

Related links: demo video

Soumith Chintala announces departure from Meta/PyTorch

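PyTorch founder Soumith Chintala announced his departure from Meta, reflecting on PyTorch's development and open-source culture and highlighting the team's plans going forward.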

Related links: Soumith's tweet

David Sacks: “There will be no federal bailout for AI”

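David Sacks argued that the AI industry needs no federal bailout because market competition is sufficient. Sam Altman clarified that OpenAI is not seeking government guarantees and supports public AI infrastructure.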

Related links: David Sacks's tweet | Sam Altman's tweet

Source: Easy AI teaching project

#EasyAI #AI日报 #AI教学
