Tutorials

Daily updates on the latest advances in AI.

CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning

1 min read · March 29, 2026

2026
CreativityPrism: A Holistic Benchmark for Large Language Model Creativity

4 min read · March 29, 2026

2026
Cost-Aware Retrieval-Augmentation Reasoning Models with Adaptive Retrieval Depth

2 min read · March 29, 2026

2026
ConvergeWriter: Data-Driven Bottom-Up Article Construction

3 min read · March 29, 2026

2026
Controlling changes to attention logits

1 min read · March 29, 2026

2026
Continual Learning via Sparse Memory Finetuning

2 min read · March 29, 2026

2026
Connecting Jensen-Shannon and Kullback-Leibler Divergences: A New Bound for Representation Learning

2 min read · March 29, 2026

2026
Compress to Impress: Efficient LLM Adaptation Using a Single Gradient Step on 100 Samples

4 min read · March 29, 2026

2026
Collaboration and Conflict between Humans and Language Models through the Lens of Game Theory

2 min read · March 29, 2026

2026
Cognitive Foundations for Reasoning and Their Manifestation in LLMs

1 min read · March 29, 2026

2026
CogGuide: Human-Like Guidance for Zero-Shot Omni-Modal Reasoning

5 min read · March 29, 2026

2026
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

6 min read · March 29, 2026

2026
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

3 min read · March 29, 2026

2026
Causal Reasoning Favors Encoders: On The Limits of Decoder-Only Models

2 min read · March 29, 2026

2026
Capabilities of GPT-4 on Medical Challenge Problems

3 min read · March 29, 2026

2026
CAMformer: Associative Memory is All You Need

1 min read · March 29, 2026

2026
CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling

4 min read · March 29, 2026

2026
Budget-Aware Tool-Use Enables Effective Agent Scaling

1 min read · March 29, 2026

2026
BrowseConf: Confidence-Guided Test-Time Scaling for Web Agents

3 min read · March 29, 2026

2026
BroRL: Scaling Reinforcement Learning via Broadened Exploration

3 min read · March 29, 2026

2026
BridgeData V2: A Dataset for Robot Learning at Scale

5 min read · March 29, 2026

2026
BloombergGPT: A Large Language Model for Finance

7 min read · March 29, 2026

2026
Bias and Fairness in Large Language Models: A Survey

4 min read · March 29, 2026

2026
Bi-LoRA: Efficient Sharpness-Aware Minimization for Fine-Tuning Large-Scale Models

4 min read · March 29, 2026

2026
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

4 min read · March 29, 2026

2026
Beyond Two-Stage Training: Cooperative SFT and RL for LLM Reasoning

3 min read · March 29, 2026

2026
Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window

3 min read · March 29, 2026

2026
Beyond Pipelines: A Survey of the Paradigm Shift toward Model-Native Agentic AI

2 min read · March 29, 2026

2026
Beyond Patch Aggregation: 3-Pass Pyramid Indexing for Vision-Enhanced Document Retrieval

1 min read · March 29, 2026

2026
Better World Models Can Lead to Better Post-Training Performance

1 min read · March 29, 2026

2026
Behind RoPE: How Does Causal Mask Encode Positional Information?

2 min read · March 29, 2026

2026
BEFT: Bias-Efficient Fine-Tuning of Language Models

3 min read · March 29, 2026

2026
BEAM: Brainwave Empathy Assessment Model for Early Childhood

3 min read · March 29, 2026

2026
Batch Prompting Suppresses Overthinking Reasoning Under Constraint: How Batch Prompting Suppresses Overthinking in Reasoning Models

4 min read · March 29, 2026

2026
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

2 min read · March 29, 2026

2026
BaseReward: A Strong Baseline for Multimodal Reward Model

5 min read · March 29, 2026

2026
Balanced Actor Initialization: Stable RLHF Training of Distillation-Based Reasoning Models

3 min read · March 29, 2026

2026
BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data

7 min read · March 29, 2026

2026
Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics

2 min read · March 29, 2026

2026
Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling

4 min read · March 29, 2026

2026
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

1 min read · March 29, 2026

2026
Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

3 min read · March 29, 2026

2026
Artificial Hippocampus Networks for Efficient Long-Context Modeling

3 min read · March 29, 2026

2026
ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning

2 min read · March 29, 2026

2026
Are Large Language Models Sensitive to the Motives Behind Communication?

3 min read · March 29, 2026

2026
Are Agents Just Automata? On the Formal Equivalence Between Agentic AI and the Chomsky Hierarchy

2 min read · March 29, 2026

2026
Architecting Resilient LLM Agents: A Guide to Secure Plan-then-Execute Implementations

2 min read · March 29, 2026

2026
An Augmentation Overlap Theory of Contrastive Learning

2 min read · March 29, 2026

2026
AlphaResearch: Accelerating New Algorithm Discovery with Language Models

2 min read · March 29, 2026

2026
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

4 min read · March 29, 2026

2026