- Expand Neurons, Not Parameters
- Executable Counterfactuals: Improving LLMs' Causal Reasoning Through Code
- Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models
- Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values
- Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
- Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
- Encoder-Decoder or Decoder-Only? Revisiting Encoder-Decoder Large Language Model
- Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents
- EmoRAG: Evaluating RAG Robustness to Symbolic Perturbations
- ELPO: Ensemble Learning Based Prompt Optimization for Large Language Models
- Efficient Streaming Language Models with Attention Sinks
- Efficient Reinforcement Learning for Large Language Models with Intrinsic Exploration
- Efficient Memory Management for Large Language Model Serving with PagedAttention
- Effective context engineering for AI agents
- Educational data mining and learning analytics: An updated survey
- Dynamic Speculative Agent Planning
- Dynamic Affective Memory Management for Personalized LLM Agents
- Dual-Weighted Reinforcement Learning for Generative Preference Modeling
- Dual LoRA: Enhancing LoRA with Magnitude and Direction Updates
- DRO-InstructZero: Distributionally Robust Prompt Optimization for Large Language Models
- DR. WELL: Dynamic Reasoning and Learning with Symbolic World Model for Embodied LLM-Based Multi-Agent Collaboration
- DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
- DoPE: Denoising Rotary Position Embedding
- Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
- DocReward: A Document Reward Model for Structuring and Stylizing
- Do Not Step Into the Same River Twice: Learning to Reason from Trial and Error
- Do Depth-Grown Models Overcome the Curse of Depth? An In-Depth Analysis
- DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- Diffusion Language Models are Super Data Learners
- Detecting Data Contamination in LLMs via In-Context Learning
- Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls
- DELTA: Decoupling Long-Tailed Online Continual Learning
- Defeating the Training-Inference Mismatch via FP16
- DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking
- DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
- DeepSeek-V3 Technical Report
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
- DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL
- DeepAgent: A General Reasoning Agent with Scalable Toolsets
- Deep sequence models tend to memorize geometrically; it is unclear why
- Deep Self-Evolving Reasoning
- Dataset Growth
- DataSage: Multi-agent Collaboration for Insight Discovery with External Knowledge Retrieval, Multi-role Debating, and Multi-path Reasoning
- Data-Efficient RLVR via Off-Policy Influence Guidance
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
- DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle