1. 📘 Topic and Domain: The paper presents WebSailor-V2, an improved web agent for autonomous information seeking and research tasks, within artificial intelligence and natural language processing.
2. 💡 Previous Research and New Ideas: Building on the original WebSailor framework and the ReAct paradigm, the paper introduces SailorFog-QA-V2 (an enhanced dataset built from complex knowledge graphs) and a dual-environment reinforcement learning framework that combines simulated and real-world training.
3. ❓ Problem: The paper aims to close the performance gap between open-source and proprietary web agents while addressing challenges in data quality and training scalability for autonomous research agents.
4. 🛠️ Methods: The authors use a three-stage pipeline: (1) construction of the SailorFog-QA-V2 dataset from dense knowledge graphs, (2) Supervised Fine-Tuning for initial training, and (3) dual-environment Reinforcement Learning with both simulated and real-world components.
5. 📊 Results and Evaluation: WebSailor-V2 achieved state-of-the-art results on multiple benchmarks, scoring 35.3 on BrowseComp-EN, 44.1 on BrowseComp-ZH, and 30.6 on HLE, outperforming existing open-source agents and matching or exceeding some proprietary systems despite using a smaller model (30B parameters).
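
Item 2 mentions the ReAct paradigm, in which an agent alternates between a reasoning step, a tool call, and an observation until it decides to answer. The sketch below illustrates only that generic loop; it is not WebSailor-V2's implementation. The names `toy_search`, `toy_policy`, and `react_loop` are hypothetical, and the policy is a hard-coded stub standing in for a language model.

```python
from typing import Callable, Dict, List, Tuple

# One recorded step: (thought, action taken, observation returned by the tool).
Step = Tuple[str, str, str]


def toy_search(query: str) -> str:
    """Hypothetical search tool backed by a tiny in-memory corpus."""
    corpus = {"capital of France": "Paris is the capital of France."}
    return corpus.get(query, "no results")


def toy_policy(question: str, history: List[Step]) -> Tuple[str, str, str]:
    """Hypothetical policy returning (thought, action, argument).

    A real agent would query an LLM here; this stub searches once,
    then finishes with whatever the last observation was.
    """
    if not history:
        return ("I should look this up.", "search", question)
    last_observation = history[-1][2]
    return ("The observation answers the question.", "finish", last_observation)


def react_loop(
    question: str,
    policy: Callable[[str, List[Step]], Tuple[str, str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Run a think -> act -> observe loop until 'finish' or the step budget ends."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, arg = policy(question, history)
        if action == "finish":
            return arg
        observation = tools[action](arg)  # act, then observe the result
        history.append((thought, f"{action}({arg})", observation))
    return "no answer within step budget"


if __name__ == "__main__":
    print(react_loop("capital of France", toy_policy, {"search": toy_search}))
```

The step budget (`max_steps`) is an assumption added here to keep the loop bounded; the summary does not say how the paper limits episode length.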