- formatting
- images
- links
- math
- code
- blockquotes
- external-services
- Less is More Tokens: Efficient Math Reasoning via Difficulty-Aware Chain-of-Thought Distillation
- Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents
- Learning to Reason: Training LLMs with GPT-OSS or DeepSeek R1 Reasoning Traces
- Learning to Focus: Focal Attention for Selective and Scalable Transformers
- Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks