- Prompt Repetition Improves Non-Reasoning LLMs
- Prefill vs. Decode Bottlenecks: SRAM-Frequency Tradeoffs and the Memory-Bandwidth Ceiling
- Power-of-Two Quantization-Aware-Training (PoT-QAT) in Large Language Models (LLMs)
- Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers
- PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
- On the Convergence Rate of LoRA Gradient Descent
- NVIDIA Nemotron 3: Efficient and Open Intelligence
- NRGPT: An Energy-based Alternative for GPT
- NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation
- Nested Learning: The Illusion of Deep Learning Architectures
- MUSIC: MUlti-Step Instruction Contrast for Multi-Turn Reward Models
- Monitoring Monitorability
- Monadic Context Engineering
- MoEBlaze: Breaking the Memory Wall for Efficient MoE Training on Modern GPUs
- Modular Prompt Optimization: Optimizing Structured Prompts with Section-Local Textual Gradients
- Modeling Language as a Sequence of Thoughts
- Mixture-of-Depths Attention
- MiMo-V2-Flash Technical Report
- mHC: Manifold-Constrained Hyper-Connections
- MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild
- Mesh-Attention: A New Communication-Efficient Distributed Attention with Improved Data Locality
- MemRL: Self-Evolving Agents via Runtime Reinforcement Learning on Episodic Memory
- Memory in the Age of AI Agents
- Memorization Dynamics in Knowledge Distillation for Language Models
- Memoria: A Scalable Agentic Memory Framework for Personalized Conversational AI
- MemEvolve: Meta-Evolution of Agent Memory Systems
- Mechanisms of Introspective Awareness
- MAI-UI Technical Report: Real-World Centric Foundation GUI Agents
- LLM Router: Prefill is All You Need
- LLM-in-Sandbox Elicits General Agentic Intelligence
- LinMU: Multimodal Understanding Made Linear
- Let's (not) just put things in Context: Test-Time Training for Long-Context LLMs
- Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
- Learning to Discover at Test Time
- Learning from Synthetic Data: Limitations of ERM
- Large language models are not about language
- Large language models and the entropy of English
- Kling-Omni Technical Report
- Kimi K2.5: Visual Agentic Intelligence
- Kascade: A Practical Sparse Attention Method for Long-Context LLM Inference
- Increasing the Thinking Budget is Not All You Need
- Improving Recursive Transformers with Mixture of LoRAs
- How and Why LLMs Generalize: A Fine-Grained Analysis of LLM Reasoning from Cognitive Behaviors to Low-Level Patterns
- Hindsight is 20/20: Building Agent Memory that Retains, Recalls, and Reflects
- HiFi-RAG: Hierarchical Content Filtering and Two-Pass Generation for Open-Domain RAG
- GoAgent: Group-of-Agents Communication Topology Generation for LLM-based Multi-Agent Systems
- Geometric and Dynamic Scaling in Deep Transformers
- GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
- From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents
- Forgetful but Faithful: A Cognitive Memory Architecture and Benchmark for Privacy-Aware Generative Agents