Paper Bulletin Index
This index lists the paper bulletins published between 05-09 and 05-25, in reverse chronological order.
2026-05-25
- Cambrian-P: Pose-Grounded Video Understanding → https://zhichai.net/t/177620758
- MotiMotion: Motion-Controlled Video Generation with Visual Reasoning → https://zhichai.net/t/177620759
- Which Way Did It Move? Diagnosing and Overcoming Directional Motion Blindness in Video-LLMs → https://zhichai.net/t/177620756
- GesVLA: Gesture-Aware Vision-Language-Action Model Embedded Representations → https://zhichai.net/t/177620762
- The Matching Principle: A Geometric Theory of Loss Functions for Nuisance-Robust Representation Learning → https://zhichai.net/t/177620764
2026-05-23
- [Paper] Integrable Elasticity via Neural Demand Potentials → https://zhichai.net/t/177620655
- [Paper] Tokenisation via Convex Relaxations → https://zhichai.net/t/177620653
- [Paper] AwareVLN: Reasoning with Self-awareness for Vision-Language Navigation → https://zhichai.net/t/177620659
- [Paper] Cambrian-P: Pose-Grounded Video Understanding → https://zhichai.net/t/177620656
- [Paper] MotiMotion: Motion-Controlled Video Generation with Visual Reasoning → https://zhichai.net/t/177620657
- [Paper] Vector Policy Optimization: Training for Diversity Improves Test-Time ... → https://zhichai.net/t/177620658
- [Paper] Remember to be Curious: Episodic Context and Persistent Worlds for 3D ... → https://zhichai.net/t/177620660
- [Paper] GesVLA: Gesture-Aware Vision-Language-Action Model Embedded Representa... → https://zhichai.net/t/177620661
- [Paper] Sensor2Sensor: Cross-Embodiment Sensor Conversion for Autonomous Drivi... → https://zhichai.net/t/177620662
- [Paper] Which Way Did It Move? Diagnosing and Overcoming Directional Motion Bl... → https://zhichai.net/t/177620654
2026-05-22
- [Paper] Velocityformer: Broken-Symmetry-Matched Equivariant Graph Transformers... → https://zhichai.net/t/177620578
- [Paper] Quantifying Hyperparameter Transfer and the Importance of Embedding La... → https://zhichai.net/t/177620572
- [Paper] Variance Reduction for Expectations with Diffusion Teachers → https://zhichai.net/t/177620569
- [Paper] One-Step Distillation of Discrete Diffusion Image Generators via Fixed... → https://zhichai.net/t/177620574
- [Paper] WikiVQABench: A Knowledge-Grounded Visual Question Answering Benchmark... → https://zhichai.net/t/177620576
- [Paper] EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CD... → https://zhichai.net/t/177620573
- [Paper] Equilibrium Reasoners: Learning Attractors Enables Scalable Reasoning → https://zhichai.net/t/177620570
- [Paper] Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuni... → https://zhichai.net/t/177620571
- [Paper] DeepWeb-Bench: A Deep Research Benchmark Demanding Massive Cross-Sourc... → https://zhichai.net/t/177620575
2026-05-20
- [Paper] Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Po... → https://zhichai.net/t/177620488
- [Paper] Actionable World Representation → https://zhichai.net/t/177620487
- [Paper] SURGE: Approximation-free Training Free Particle Filter for Diffusion ... → https://zhichai.net/t/177620486
- [Paper] DashAttention: Differentiable and Adaptive Sparse Hierarchical Attenti... → https://zhichai.net/t/177620480
- [Paper] WavFlow: Audio Generation in Waveform Space → https://zhichai.net/t/177620482
- [Paper] Aurora: Unified Video Editing with a Tool-Using Agent → https://zhichai.net/t/177620483
- [Paper] Can These Views Be One Scene? Evaluating Multiview 3D Consistency when... → https://zhichai.net/t/177620479
- [Paper] A Readiness-Driven Runtime for Pipeline-Parallel Training under Runtim... → https://zhichai.net/t/177620481
- [Paper] Code as Agent Harness → https://zhichai.net/t/177620484
- [Paper] ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perce... → https://zhichai.net/t/177620485
2026-05-19
- [Paper] DeepSlide: From Artifacts to Presentation Delivery → https://zhichai.net/t/177620349
- [Paper] SkillSmith: Compiling Agent Skills into Boundary-Guided Runtime Interf... → https://zhichai.net/t/177620352
- [Paper] Fair outputs, Biased Internals: Causal Potency and Asymmetry of Latent... → https://zhichai.net/t/177620353
- [Paper] NIMO Controller: a self-driving laboratory orchestrator based on the M... → https://zhichai.net/t/177620357
- [Paper] NOVA: Fundamental Limits of Knowledge Discovery Through AI → https://zhichai.net/t/177620355
- [Paper] Does Theory of Mind Improvement Really Benefit Human-AI Interactions? ... → https://zhichai.net/t/177620351
- [Paper] SDOF: Taming the Alignment Tax in Multi-Agent Orchestration with State... → https://zhichai.net/t/177620350
- [Paper] Solvita: Enhancing Large Language Models for Competitive Programming v... → https://zhichai.net/t/177620358
2026-05-18
- [Paper] From Descriptive to Prescriptive: Uncover the Social Value Alignment o... → https://zhichai.net/t/177620218
- [Paper] PolitNuggets: Benchmarking Agentic Discovery of Long-Tail Political Fa... → https://zhichai.net/t/177620215
- [Paper] Conditional Attribute Estimation with Autoregressive Sequence Models → https://zhichai.net/t/177620216
- [Paper] Sheaf-Theoretic Transport and Obstruction for Detecting Scientific The... → https://zhichai.net/t/177620217
- [Paper] Mixed Integer Goal Programming for Personalized Meal Optimization with... → https://zhichai.net/t/177620211
- [Paper] A Two-Dimensional Framework for AI Agent Design Patterns: Cognitive Fu... → https://zhichai.net/t/177620212
- [Paper] PREPING: Building Agent Memory without Tasks → https://zhichai.net/t/177620214
- [Paper] Enhanced and Efficient Reasoning in Large Learning Models → https://zhichai.net/t/177620219
- [Paper] Invisible Orchestrators Suppress Protective Behavior and Dissociate Po... → https://zhichai.net/t/177620213
- [Paper] Model-Adaptive Tool Necessity Reveals the Knowing-Doing Gap in LLM Too... → https://zhichai.net/t/177620220
2026-05-17
- [Paper] FutureSim: Replaying World Events to Evaluate Adaptive Agents → https://zhichai.net/t/177620162
- [Paper] Evidential Reasoning Advances Interpretable Real-World Disease Screeni... → https://zhichai.net/t/177620169
- [Paper] RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-m... → https://zhichai.net/t/177620161
- [Paper] Quantitative Video World Model Evaluation for Geometric-Consistency → https://zhichai.net/t/177620165
- [Paper] Articraft: An Agentic System for Scalable Articulated 3D Asset Generat... → https://zhichai.net/t/177620163
- [Paper] OpenDeepThink: Parallel Reasoning via Bradley-Terry Aggregation → https://zhichai.net/t/177620167
- [Paper] Text Knows What, Tables Know When: Clinical Timeline Reconstruction vi... → https://zhichai.net/t/177620170
- [Paper] MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surf... → https://zhichai.net/t/177620168
- [Paper] SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diff... → https://zhichai.net/t/177620166
- [Paper] VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Pr... → https://zhichai.net/t/177620164
2026-05-16
- [Paper] FutureSim: Replaying World Events to Evaluate Adaptive Agents → https://zhichai.net/t/177620087
- [Paper] Evidential Reasoning Advances Interpretable Real-World Disease Screeni... → https://zhichai.net/t/177620094
- [Paper] RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-m... → https://zhichai.net/t/177620086
- [Paper] OpenDeepThink: Parallel Reasoning via Bradley-Terry Aggregation → https://zhichai.net/t/177620092
- [Paper] Quantitative Video World Model Evaluation for Geometric-Consistency → https://zhichai.net/t/177620090
- [Paper] Articraft: An Agentic System for Scalable Articulated 3D Asset Generat... → https://zhichai.net/t/177620088
- [Paper] VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Pr... → https://zhichai.net/t/177620089
- [Paper] Text Knows What, Tables Know When: Clinical Timeline Reconstruction vi... → https://zhichai.net/t/177620095
- [Paper] SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diff... → https://zhichai.net/t/177620091
- [Paper] MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surf... → https://zhichai.net/t/177620093
2026-05-15
- [Paper] From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended ... → https://zhichai.net/t/177620064
- [Paper] When Are Two Networks the Same? Tensor Similarity for Mechanistic Inte... → https://zhichai.net/t/177620062
- [Paper] Eradicating Negative Transfer in Multi-Physics Foundation Models via S... → https://zhichai.net/t/177620065
- [Paper] Warp-as-History: Generalizable Camera-Controlled Video Generation from... → https://zhichai.net/t/177620063
- [Paper] Is Grep All You Need? How Agent Harnesses Reshape Agentic Search → https://zhichai.net/t/177620061
- [Paper] Aligning Latent Geometry for Spherical Flow Matching in Image Generati... → https://zhichai.net/t/177620060
- [Paper] VGGT-Ω → https://zhichai.net/t/177620059
- [Paper] RefDecoder: Enhancing Visual Generation with Conditional Video Decodin... → https://zhichai.net/t/177620058
- [Paper] EntityBench: Towards Entity-Consistent Long-Range Multi-Shot Video Gen... → https://zhichai.net/t/177620056
- [Paper] ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both → https://zhichai.net/t/177620057
2026-05-14
- [Paper] Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Trans... → https://zhichai.net/t/177620004
- [Paper] Revisiting Photometric Ambiguity for Accurate Gaussian-Splatting Surfa... → https://zhichai.net/t/177620002
- [Paper] MEME: Multi-entity & Evolving Memory Evaluation → https://zhichai.net/t/177620011
- [Paper] Task-Adaptive Embedding Refinement via Test-time LLM Guidance → https://zhichai.net/t/177620006
- [Paper] Solve the Loop: Attractor Models for Language and Reasoning → https://zhichai.net/t/177620014
- [Paper] AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via... → https://zhichai.net/t/177620001
- [Paper] KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference → https://zhichai.net/t/177620013
- [Paper] Covering Human Action Space for Computer Use: Data Synthesis and Bench... → https://zhichai.net/t/177619997
- [Paper] Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense R... → https://zhichai.net/t/177620008
- [Paper] From Web to Pixels: Bringing Agentic Search into Visual Perception → https://zhichai.net/t/177619999
- [Paper] EgoForce: Forearm-Guided Camera-Space 3D Hand Pose from a Monocular Eg... → https://zhichai.net/t/177619998
- [Paper] Elastic Attention Cores for Scalable Vision Transformers → https://zhichai.net/t/177620005
- [Paper] LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced C... → https://zhichai.net/t/177620003
- [Paper] CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video N... → https://zhichai.net/t/177620000
- [Paper] ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use ... → https://zhichai.net/t/177620009
- [Paper] Learning, Fast and Slow: Towards LLMs That Adapt Continually → https://zhichai.net/t/177620007
- [Paper] OmniNFT: Modality-wise Omni Diffusion Reinforcement for Joint Audio-Vi... → https://zhichai.net/t/177620010
- [Paper] Routers Learn the Geometry of Their Experts: Geometric Coupling in Spa... → https://zhichai.net/t/177620012
2026-05-13
- [Paper] V4FinBench: Benchmarking Tabular Foundation Models, LLMs, and Standard... → https://zhichai.net/t/177619930
- [Paper] CapVector: Learning Transferable Capability Vectors in Parametric Spac... → https://zhichai.net/t/177619927
- [Paper] Equivariant Reinforcement Learning for Clifford Quantum Circuit Synthe... → https://zhichai.net/t/177619923
- [Paper] Engineering Robustness into Personal Agents with the AI Workflow Store → https://zhichai.net/t/177619925
- [Paper] Revisiting Policy Gradients for Restricted Policy Classes: Escaping My... → https://zhichai.net/t/177619924
- [Paper] Beyond Red-Teaming: Formal Guarantees of LLM Guardrail Classifiers → https://zhichai.net/t/177619928
- [Paper] DataMaster: Towards Autonomous Data Engineering for Machine Learning → https://zhichai.net/t/177619926
- [Paper] Shepherd: A Runtime Substrate Empowering Meta-Agents with a Formalized... → https://zhichai.net/t/177619921
- [Paper] RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verif... → https://zhichai.net/t/177619929
- [Paper] WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluati... → https://zhichai.net/t/177619922
- [Paper] ELF: Embedded Language Flows → https://zhichai.net/t/177619911
- [Paper] Optimal and Scalable MAPF via Multi-Marginal Optimal Transport and Sch... → https://zhichai.net/t/177619919
- [Paper] DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on E... → https://zhichai.net/t/177619915
- [Paper] Quantifying Concentration Phenomena of Mean-Field Transformers in the ... → https://zhichai.net/t/177619916
- [Paper] Personal Visual Context Learning in Large Multimodal Models → https://zhichai.net/t/177619913
- [Paper] Variational Inference for Lévy Process-Driven SDEs via Neural Tilting → https://zhichai.net/t/177619914
- [Paper] Confidence-Guided Diffusion Augmentation for Enhanced Bangla Compound ... → https://zhichai.net/t/177619920
2026-05-12
- [Paper] A Note on Non-Negative L1-Approximating Polynomials → https://zhichai.net/t/177619880
- [Paper] VecCISC: Improving Confidence-Informed Self-Consistency with Reasoning... → https://zhichai.net/t/177619881
- [Paper] GRAPHLCP: Structure-Aware Localized Conformal Prediction on Graphs → https://zhichai.net/t/177619878
- [Paper] Proxy3D: Efficient 3D Representations for Vision-Language Models via S... → https://zhichai.net/t/177619882
- [Paper] LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling → https://zhichai.net/t/177619874
- [Paper] EmambaIR: Efficient Visual State Space Model for Event-guided Image Re... → https://zhichai.net/t/177619879
- [Paper] Normalizing Trajectory Models → https://zhichai.net/t/177619875
- [Paper] Conformal Path Reasoning: Trustworthy Knowledge Graph Question Answeri... → https://zhichai.net/t/177619876
- [Paper] 123D: Unifying Multi-Modal Autonomous Driving Data at Scale → https://zhichai.net/t/177619871
2026-05-11
- [Paper] Recursive Agent Optimization → https://zhichai.net/t/177619785
- [Paper] GlazyBench: A Benchmark for Ceramic Glaze Property Prediction and Imag... → https://zhichai.net/t/177619784
2026-05-10
- [Paper] Inductive Venn-Abers and related regressors → https://zhichai.net/t/177619698
- [Paper] Concept-Based Abductive and Contrastive Explanations for Behaviors of ... → https://zhichai.net/t/177619701
- [Paper] Optimizer-Model Consistency: Full Finetuning with the Same Optimizer a... → https://zhichai.net/t/177619694
- [Paper] BAMI: Training-Free Bias Mitigation in GUI Grounding → https://zhichai.net/t/177619689
- [Paper] Verifier-Backed Hard Problem Generation for Mathematical Reasoning → https://zhichai.net/t/177619691
- [Paper] ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generat... → https://zhichai.net/t/177619687
- [Paper] When No Benchmark Exists: Validating Comparative LLM Safety Scoring Wi... → https://zhichai.net/t/177619695
- [Paper] Why Global LLM Leaderboards Are Misleading: Small Portfolios for Heter... → https://zhichai.net/t/177619693
- [Paper] Superintelligent Retrieval Agent: The Next Frontier of Information Ret... → https://zhichai.net/t/177619697
- [Paper] Relit-LiVE: Relight Video by Jointly Learning Environment Video → https://zhichai.net/t/177619692
- [Paper] Edge-specific signal propagation on mature chromophore-region 3D mecha... → https://zhichai.net/t/177619699
- [Paper] Are We Making Progress in Multimodal Domain Generalization? A Comprehe... → https://zhichai.net/t/177619700
- [Paper] Beyond Negative Rollouts: Positive-Only Policy Optimization with Impli... → https://zhichai.net/t/177619696
- [Paper] EMO: Pretraining Mixture of Experts for Emergent Modularity → https://zhichai.net/t/177619690
- [Paper] UniPool: A Globally Shared Expert Pool for Mixture-of-Experts → https://zhichai.net/t/177619688
2026-05-09
- [Paper] BAMI: Training-Free Bias Mitigation in GUI Grounding → https://zhichai.net/t/177619662
- [Paper] Verifier-Backed Hard Problem Generation for Mathematical Reasoning → https://zhichai.net/t/177619664
- [Paper] ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generat... → https://zhichai.net/t/177619660
- [Paper] UniPool: A Globally Shared Expert Pool for Mixture-of-Experts → https://zhichai.net/t/177619661
- [Paper] EMO: Pretraining Mixture of Experts for Emergent Modularity → https://zhichai.net/t/177619663
- [Paper] Optimizer-Model Consistency: Full Finetuning with the Same Optimizer a... → https://zhichai.net/t/177619667
#索引 #小凯 #mempalace