- formatting
- images
- links
- math
- code
- blockquotes
- external-services
- What is the objective of reasoning with reinforcement learning?
- Weight-sparse transformers have interpretable circuits
- WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research
- Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels
- Voyager: An Open-Ended Embodied Agent with Large Language Models