2025-09-17 Papers


Paper 1

Scaling Agents via Continual Pre-training

Published: 2025-09-16

Link: http://arxiv.org/pdf/2509.13310

1. 📘 Topic and Domain: The paper focuses on scaling language models into agentic systems through continual pre-training in the domain of AI/ML, specifically addressing deep research agents capable of autonomous tool use and complex problem-solving.
2. 💡 Previous Research and New Ideas: Prior work relied on traditional post-training approaches (SFT and RL) to turn language models into agents; this paper proposes a novel Agentic Continual Pre-training (Agentic CPT) framework as an intermediate stage between pre-training and post-training.
3. ❓ Problem: The paper aims to solve the limitation of post-training approaches that force models to simultaneously learn diverse agentic behaviors while aligning them to expert demonstrations, creating optimization tensions.
4. 🛠️ Methods: The authors developed AgentFounder using First-order Action Synthesis (FAS) and Higher-order Action Synthesis (HAS) for data generation, implemented through a two-stage training strategy with progressive context window expansion (32K to 128K).
5. 📊 Results and Evaluation: AgentFounder-30B achieved state-of-the-art performance across 10 benchmarks, notably scoring 39.9% on BrowseComp-en, 43.3% on BrowseComp-zh, and 31.5% Pass@1 on HLE, outperforming both open-source and some commercial models.
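To make the FAS idea concrete, here is a minimal sketch of first-order action synthesis: one memory item becomes a question paired with a single planning action and a single reasoning action. All record fields and string formats are hypothetical illustrations, not the paper's actual data schema.

```python
from dataclasses import dataclass

@dataclass
class FASRecord:
    question: str   # synthesized question anchored on an entity
    planning: str   # problem decomposition + tool-invocation prediction
    reasoning: str  # step-by-step reasoning text

def synthesize_fas(entity: str, fact: str) -> FASRecord:
    # Turn one (entity, fact) item from an open-world memory into a
    # training record with first-order planning and reasoning actions.
    question = f"What is known about {entity}?"
    planning = f"decompose: identify '{entity}'; predict tool call: search('{entity}')"
    reasoning = f"Step 1: recall the stored note. Step 2: it states that {fact}."
    return FASRecord(question, planning, reasoning)
```

HAS would then expand such single-step records into multi-option, multi-step decision processes.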

Scaling Agents via Continual Pre-training

AgentFounder: Scaling Agents via Continual Pre-training

Data Collection
• Sources: web-crawled data, tool invocation records, Wikipedia data, discarded trajectories, search results
• Knowledge-to-question transformation: entity-anchored open-world memory, multi-style question synthesis

First-order Action Synthesis (FAS)
• Planning actions: problem decomposition, tool invocation prediction
• Reasoning actions: step-by-step reasoning

Higher-order Action Synthesis (HAS)
• Step-level scaling: multi-option generation, action-space exploration
• Decision synthesis: multi-step decision processes

Two-Stage Training
• Stage 1: 200B tokens, 32K context, FAS + short HAS
• Stage 2: 100B tokens, 128K context, high-quality HAS
• Output: AgentFounder-30B base model

Post-training
• SFT-A: two-stage ReAct • SFT-B: mixed training • SFT-C: summarized reasoning • General + agent data
• Final model: AgentFounder-30B deep research agent

Performance Results
• BrowseComp-en: 39.9% (SOTA among open-source models)
• BrowseComp-zh: 43.3%
• GAIA: 72.8% (highest single-agent accuracy)
• HLE: 31.5% Pass@1 (first open-source model above 30%)
• Xbench-DeepSearch: 73.0%

Scaling Properties
• Logarithmic scaling law holds, with consistent improvements up to 315B tokens
• Superior scaling efficiency vs. larger models
• Maintains general tool-use capabilities; adaptable to different post-training methods

Key Innovation: Agentic Continual Pre-training (Agentic CPT)
• Problem solved: traditional post-training forces models to learn capabilities and alignment simultaneously, creating optimization tensions and limiting performance
• Solution: Agentic CPT creates pre-aligned foundation models with inherent agentic behaviors before post-training
• Impact: enables effective downstream fine-tuning; reduces post-training complexity and improves convergence
• Scalability: offline synthesis without API costs; systematic and scalable data generation
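The two-stage training recipe can be written down as a simple schedule. The token and context figures follow the summary; the data structure itself is a hypothetical sketch, not the paper's training configuration.

```python
# Hypothetical schedule mirroring the two-stage recipe; the paper's actual
# batch/token bookkeeping may differ.
STAGES = [
    {"name": "stage1", "tokens": 200_000_000_000, "context": 32 * 1024,
     "data": ("FAS", "short HAS")},
    {"name": "stage2", "tokens": 100_000_000_000, "context": 128 * 1024,
     "data": ("high-quality HAS",)},
]

def total_tokens(stages) -> int:
    # Total continual pre-training budget across all stages.
    return sum(s["tokens"] for s in stages)
```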
Q1
1. What is the main innovation that AgentFounder introduces to address the limitations of traditional post-training approaches?
A larger context window of 128K tokens
Agentic Continual Pre-training (Agentic CPT) as an intermediate step
More efficient supervised fine-tuning methods
Q2
2. In the two-stage data synthesis approach of AgentFounder, what is the main advantage of Higher-order Action Synthesis (HAS) over First-order Action Synthesis (FAS)?
It requires fewer computational resources
It generates shorter training sequences
It transforms trajectories into multi-step decision-making processes with expanded exploration paths
Q3
3. What performance breakthrough did AgentFounder-30B achieve in the HLE benchmark?
It became the first open-source model to surpass the 30-point threshold with 31.5%
It matched the performance of commercial deep research products
It achieved perfect accuracy on academic questions

Paper 2

Towards General Agentic Intelligence via Environment Scaling

Published: 2025-09-16

Link: http://arxiv.org/pdf/2509.13311

1. 📘 Topic and Domain: The paper focuses on developing general agentic intelligence for Large Language Models through environment scaling and tool-learning capabilities.
2. 💡 Previous Research and New Ideas: Prior research used real-world APIs, LLM simulations, or manual environment construction for tool learning, while this paper proposes automatic environment construction and a two-phase agent training strategy.
3. ❓ Problem: The paper addresses the challenge of scaling up environments for training language agents' function-calling capabilities and effectively training agents in these environments.
4. 🛠️ Methods: The authors develop a scalable framework that automatically constructs heterogeneous environments through tool graph modeling and programmatic materialization, combined with a two-phase agent training approach (general foundation learning followed by domain specialization).
5. 📊 Results and Evaluation: Their AgentScaler models achieved state-of-the-art performance among open-source models under 1T parameters on τ-bench, τ2-Bench, and ACEBench benchmarks, with AgentScaler-30B-A3B performing comparably to trillion-parameter models.
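A minimal sketch of the tool-graph idea: assume tools are linked when they share parameter names, and use connected components as a crude stand-in for Louvain community detection (the paper's actual clustering is more sophisticated; all tool names here are made up).

```python
from collections import defaultdict

def build_tool_graph(tools):
    # tools: {tool_name: set(parameter_names)}; edge when two tools share
    # at least one parameter, a crude dependency signal.
    graph = defaultdict(set)
    names = list(tools)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if tools[a] & tools[b]:
                graph[a].add(b)
                graph[b].add(a)
    return graph

def partition_domains(tools):
    # Connected components as a stand-in for Louvain community detection:
    # each component becomes one "domain" of related tools.
    graph = build_tool_graph(tools)
    seen, domains = set(), []
    for t in tools:
        if t in seen:
            continue
        stack, comp = [t], set()
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            comp.add(n)
            stack.extend(graph[n] - seen)
        domains.append(comp)
    return domains
```

On a toy travel API set, tools that chain through shared parameters land in one domain while unrelated tools form their own.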

Towards General Agentic Intelligence via Environment Scaling

AgentScaler: Environment Scaling for General Agentic Intelligence

Phase 1: Environment Construction & Scaling
• Scenario collection: 30,000+ APIs (ToolBench, API-Gen)
• Tool dependency graph modeling: Louvain clustering (community detection) partitions tools into 1,000+ domains
• Function schema materialization: database operations
• Agentic task construction: tool sequences

Phase 2: Agent Experience Learning
• Experience collection via human-agent interplay
• Trajectory filtering: 3-stage funnel (validity + state + match)
• Stage 1 training: general domains, foundational skills
• Stage 2 training: vertical domains, specialization
• Models: AgentScaler 4B, 8B, 30B-A3B

Evaluation & Results
• Benchmarks: τ-bench (retail & airline), τ2-Bench (multi-domain), ACEBench (multi-category)
• State-of-the-art performance; strong cross-domain generalization; stability and consistency (Pass@k analysis)

Core Design Principles
• Function calls as read-write operations on a database D
• Tools grouped by domains with shared database schemas
• Verifiable environments through state tracking
• Two-stage learning: foundation → specialization
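The "function calls as read-write operations on a database D" principle can be sketched as follows; the toy retail tools and schema are hypothetical, but the point carries over: because every call reads or mutates shared state, a trajectory can be verified by checking the final database state rather than the text of the interaction.

```python
def call(db: dict, tool: str, **kwargs):
    # Hypothetical retail-domain tools operating on a shared database.
    if tool == "get_order":                        # read-only operation
        return db["orders"][kwargs["order_id"]]["status"]
    if tool == "cancel_order":                     # write operation
        db["orders"][kwargs["order_id"]]["status"] = "cancelled"
        return "ok"
    raise ValueError(f"unknown tool: {tool}")

def verify(db: dict, expected: dict) -> bool:
    # State-based verification: compare the final database state to the
    # expected end state instead of inspecting the trajectory itself.
    return db == expected
```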
Q1
1. What is the main limitation of the AgentScaler framework according to the paper?
Inability to handle multiple languages
Lack of reinforcement learning integration
Poor performance on simple tasks
Q2
2. In the two-phase agent training strategy, what is the focus of the first phase?
Domain-specific specialization
Fundamental tool usage skills across general domains
Real-world API integration
Q3
3. What interesting trend was observed regarding tool-calling complexity in the experimental analysis?
More tool calls led to higher accuracy
Tool call count had no impact on performance
There was a negative correlation between number of tool calls and task accuracy

Paper 3

WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents

Published: 2025-09-16

Link: http://arxiv.org/pdf/2509.13309

1. 📘 Topic and Domain: Development of an advanced AI research agent (WebResearcher) that can autonomously discover and synthesize knowledge from external sources through web search and tool use.
2. 💡 Previous Research and New Ideas: Builds on previous deep-research systems such as OpenAI's Deep Research and Google's Gemini Deep Research, but introduces a novel iterative paradigm in place of the traditional mono-contextual approach to information accumulation.
3. ❓ Problem: Addresses the limitations of current mono-contextual AI research agents that suffer from context suffocation and noise contamination when handling complex, long-horizon research tasks.
4. 🛠️ Methods: Implements IterResearch (an iterative deep-research paradigm), WebFrontier (a data synthesis engine), and a Research-Synthesis Framework using multiple parallel agents, with periodic consolidation of findings into evolving reports.
5. 📊 Results and Evaluation: Achieved state-of-the-art performance across 6 benchmarks, notably scoring 36.7% accuracy on Humanity's Last Exam (surpassing DeepSeek-V3.1's 29.8%) and 51.7% on BrowseComp-en (matching OpenAI's Deep Research system).
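A minimal sketch of the IterResearch loop, with hypothetical `think` and `act` callables standing in for the model and the tool layer: each round sees only the task and the evolving report, not the full interaction history, and findings are consolidated into the report between rounds.

```python
def iter_research(task, rounds, think, act):
    # The report is the agent's only carried-over memory ("cognitive
    # scratchpad"); the raw tool transcript is discarded each round.
    report = ""
    for _ in range(rounds):
        thought = think(task, report)
        action, result = act(thought)
        if action == "final_answer":
            return result
        # Periodic synthesis, radically simplified to an append + strip.
        report = (report + "\n" + result).strip()
    return report
```

The key contrast with a mono-contextual agent is that the context passed to `think` stays bounded by the report's size, not by the length of the whole trajectory.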

WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents

WebResearcher: Methodology Flow

IterResearch Paradigm
• Reformulates deep research as a Markov Decision Process (MDP)
• Think → Report → Action loop: the report serves as a cognitive scratchpad and central memory
• Each action is a tool call or a final answer; periodic synthesis reconstructs a clean workspace

WebFrontier Data Engine
• Scalable data synthesis: seed data generation → tool-augmented complexity escalation → quality control
• Multi-agent collaborative framework: knowledge expansion → abstraction; factual grounding → validation

Training Optimization
• Rejection-sampling fine-tuning; Group Sequence Policy Optimization
• Multi-round trajectory training; data amplification via decomposition; Markov property enforcement

Research-Synthesis Framework (Test-Time Scaling)
• n parallel research agents (Agent 1 … Agent n) run IterResearch concurrently, producing multiple final reports
• A synthesis agent consolidates the reports and generates the final answer

External Tools
• Search: web information retrieval
• Scholar: academic literature search
• Visit: web page content extraction
• Python: code execution & computation

Key Advantages
✓ Prevents context suffocation: maintains a focused workspace
✓ Eliminates noise contamination: periodic synthesis filtering
✓ Enables unbounded reasoning: arbitrary research depth
✓ Superior performance: state-of-the-art results across benchmarks
✓ Test-time scaling: parallel agent exploration
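The Research-Synthesis framework reduces to a map-then-consolidate pattern. This sketch runs the agents sequentially for clarity (the paper's agents run concurrently), and both the agent and synthesis callables are hypothetical stand-ins.

```python
def research_synthesis(task: str, agents, synthesize):
    # Each agent would run a full IterResearch loop on the same task;
    # here they are plain callables returning a final report.
    reports = [agent(task) for agent in agents]   # conceptually concurrent
    # The synthesis agent consolidates the n reports into one answer.
    return synthesize(task, reports)
```

Increasing the number of agents widens the explored search space at test time, at the cost of n independent research runs.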
Q1
1. What is the main limitation of mono-contextual approaches that WebResearcher aims to solve?
High computational costs and slow processing speed
Context suffocation and noise contamination in long-horizon tasks
Inability to access multiple web sources simultaneously
Q2
2. In WebResearcher's Research-Synthesis Framework, what happens when n (number of parallel research agents) increases?
Performance decreases due to conflicting information
No significant change in performance is observed
Performance improves but shows diminishing returns after n>8
Q3
3. On the BrowseComp benchmark, which tools were most frequently used by WebResearcher?
Scholar and Python tools
Search and Visit tools (96% of all tool invocations)
Python and Visit tools