- $π_0$: A Vision-Language-Action Flow Model for General Robot Control
- $ΔL$ Normalization: Rethink Loss Aggregation in RLVR
- Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
- You Need Better Attention Priors
- When Less is More: 8-bit Quantization Improves Continual Learning in Large Language Models
- What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
- What Does Loss Optimization Actually Teach, If Anything? Knowledge Dynamics in Continual Pre-training of LLMs
- What Affects the Effective Depth of Large Language Models?
- Web World Models
- Wait, Wait, Wait... Why Do Reasoning Models Loop?
- Visual Language Hypothesis
- Vision Transformers are Circulant Attention Learners
- Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time
- TreeWriter: AI-Assisted Hierarchical Planning and Writing for Long-Form Documents
- Transformers learn factored representations
- Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers
- Towards Execution-Grounded Automated AI Research
- Towards Automated Kernel Generation in the Era of LLMs
- Top 10 Open Challenges Steering the Future of Diffusion Language Model and Its Variants
- The Two-Stage Decision-Sampling Hypothesis: Understanding the Emergence of Self-Reflection in RL-Trained LLMs
- The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving
- The Illusion of Insight in Reasoning Models
- The Evolution of Reranking Models in Information Retrieval: From Heuristic Methods to Large Language Models
- The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training
- Tackling the Inherent Difficulty of Noise Filtering in RAG
- T5Gemma 2: Seeing, Reading, and Understanding Longer
- Structured Hints for Sample-Efficient Lean Theorem Proving
- Step-GUI Technical Report
- Step-DeepResearch Technical Report
- Statistical Reinforcement Learning in the Real World: A Survey of Challenges and Future Directions
- Stackelberg Learning from Human Feedback: Preference Optimization as a Sequential Game
- SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling
- SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations
- SkillRouter: Retrieve-and-Rerank Skill Selection for LLM Agents at Scale
- SimpleMem: Efficient Lifelong Memory for LLM Agents
- Sigmoid Head for Quality Estimation under Language Ambiguity
- Semiparametric Preference Optimization: Your Language Model is Secretly a Single-Index Model
- Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
- Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience
- Search over Self-Edit Strategies for LLM Adaptation
- Scaling Reinforcement Learning for Content Moderation with Large Language Models
- RMAAT: Astrocyte-Inspired Memory Compression and Replay for Efficient Long-Context Transformers
- ReX-MLE: The Autonomous Agent Benchmark for Medical Imaging Challenges
- RevFFN: Memory-Efficient Full-Parameter Fine-Tuning of Mixture-of-Experts LLMs with Reversible Blocks
- Retrieval--Reasoning Processes for Multi-hop Question Answering: A Four-Axis Design Framework and Empirical Trends
- Rethinking Supervised Fine-Tuning: Emphasizing Key Answer Tokens for Improved LLM Accuracy
- Recursive Language Models
- Reasoning over mathematical objects: on-policy reward modeling and test time aggregation
- Read As Human: Compressing Context via Parallelizable Close Reading and Skimming
- QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management