1. 📘 Topic and Domain: The paper focuses on scaling language models into agentic systems through continual pre-training in the domain of AI/ML, specifically addressing deep research agents capable of autonomous tool use and complex problem-solving.
2. 💡 Previous Research and New Ideas: Whereas prior work builds agentic capability solely through post-training (SFT and RL), the paper proposes a novel Agentic Continual Pre-training (Agentic CPT) framework as an intermediate stage between general pre-training and post-training.
3. ❓ Problem: The paper aims to address a limitation of post-training-only approaches: they force models to learn diverse agentic behaviors while simultaneously aligning to expert demonstrations, which creates optimization tension between the two objectives.
4. 🛠️ Methods: The authors developed AgentFounder using First-order Action Synthesis (FAS) and Higher-order Action Synthesis (HAS) for data generation, implemented through a two-stage training strategy with progressive context window expansion (32K to 128K).
5. 📊 Results and Evaluation: AgentFounder-30B achieved state-of-the-art performance across 10 benchmarks, notably scoring 39.9% on BrowseComp-en, 43.3% on BrowseComp-zh, and 31.5% Pass@1 on HLE, outperforming both open-source and some commercial models.
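The two-stage training strategy in point 4 can be sketched as a simple stage schedule. This is a minimal illustration, not the paper's released configuration: the stage names, field layout, and data-mixing details are assumptions; only the 32K→128K context expansion and the FAS/HAS data sources come from the summary above.

```python
from dataclasses import dataclass

@dataclass
class CPTStage:
    name: str                 # illustrative stage label (assumption)
    context_window: int       # max sequence length in tokens
    data_sources: tuple       # action-synthesis methods mixed into this stage

# Sketch of the progressive schedule: stage 1 trains at a 32K context
# window, stage 2 expands to 128K; HAS data is shown joining in the
# second stage as an assumption about the mixture, not a paper detail.
STAGES = [
    CPTStage("agentic-cpt-stage1", 32_768, ("FAS",)),
    CPTStage("agentic-cpt-stage2", 131_072, ("FAS", "HAS")),
]

def run_schedule(stages):
    # Placeholder for the actual continual pre-training loop.
    for stage in stages:
        print(f"{stage.name}: ctx={stage.context_window}, data={stage.data_sources}")

run_schedule(STAGES)
```

The key design point the sketch captures is that long-context capacity is grown progressively rather than trained at 128K from the start, which keeps early-stage training cheaper.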