- formatting
- images
- links
- math
- code
- blockquotes
- external-services
- ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking
- Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
- PaLM-E: An Embodied Multimodal Language Model
- Outcome-based Exploration for LLM Reasoning
- ORION: Teaching Language Models to Reason Efficiently in the Language of Thought
- Optimizing Mixture of Block Attention
- OpenVLA: An Open-Source Vision-Language-Action Model
- OpenAssistant Conversations -- Democratizing Large Language Model Alignment
- Open Data Synthesis For Deep Research
- Opal: An Operator Algebra View of RLHF
- Online Process Reward Learning for Agentic Reinforcement Learning
- OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System
- On the Theoretical Limitations of Embedding-Based Retrieval
- On the Origin of Algorithmic Progress in AI
- On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
- On the Fundamental Limits of LLMs at Scale
- On-line Policy Improvement using Monte-Carlo Search
- On GRPO Collapse in Search-R1: The Lazy Likelihood-Displacement Death Spiral
- OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
- Octo: An Open-Source Generalist Robot Policy
- Object Recognition Datasets and Challenges: A Review
- Not All Parameters Are Created Equal: Smart Isolation Boosts Fine-Tuning Performance
- Natural Language Actor-Critic: Scalable Off-Policy Learning in Language Space
- Multimodal Deep Learning
- Multi-Phase Spacecraft Trajectory Optimization via Transformer-Based Reinforcement Learning
- Multi-Agent Evolve: LLM Self-Improve through Co-evolution
- MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems
- Model Compression using Progressive Channel Pruning
- MobileLLM-Pro Technical Report
- MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
- Mixture-of-Minds: Multi-Agent Reinforcement Learning for Table Understanding
- Mixture of Contexts for Long Video Generation
- Mixtral of Experts
- Mitigating Hallucination in Large Language Models (LLMs): An Application-Oriented Survey on RAG, Reasoning, and Agentic Systems
- Mistral 7B
- MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
- Midtraining Bridges Pretraining and Posttraining Distributions
- Mid-Training of Large Language Models: A Survey
- MeSH: Memory-as-State-Highways for Recursive Transformers
- Memory Retrieval and Consolidation in Large Language Models through Function Tokens
- Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
- MCP vs RAG vs NLWeb vs HTML: A Comparison of the Effectiveness and Efficiency of Different Agent Interfaces to the Web (Technical Report)
- MaxShapley: Towards Incentive-compatible Generative Search with Fair Context Attribution
- Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
- MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
- Mathematical Framing for Different Agent Strategies
- MARS: Optimizing Dual-System Deep Research via Multi-Agent Reinforcement Learning
- MAPEX: A Multi-Agent Pipeline for Keyphrase Extraction
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
- LORE: A Large Generative Model for Search Relevance