Tutorials

Daily updates on the latest advances in AI.

How Does RL Post-training Induce Skill Composition? A Case Study on Countdown

1 min read · December 12, 2025

2025
Higher-order Linear Attention

2 min read · December 12, 2025

2025
Higher Embedding Dimension Creates a Stronger World Model for a Simple Sorting Task

2 min read · December 12, 2025

2025
HEAL: A Hypothesis-Based Preference-Aware Analysis Framework

2 min read · December 12, 2025

2025
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents

3 min read · December 12, 2025

2025