2025-08-07 Papers

Paper 1

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Published: 2025-08-06

Link: http://arxiv.org/pdf/2508.04700

1. 📘 Topic and Domain: The paper presents SEAgent, a self-evolving computer use agent framework that enables autonomous learning and adaptation to unfamiliar software environments through experience.
2. 💡 Previous Research and New Ideas: Building on prior research in large vision-language models and computer use agents, which relied heavily on human-labeled data, this paper proposes a framework that lets agents learn autonomously through self-exploration and experiential learning.
3. ❓ Problem: The paper addresses the challenge of enabling computer use agents to effectively learn and adapt to new and specialized software environments without requiring human annotations or supervision.
4. 🛠️ Methods: The authors developed a framework combining a World State Model for trajectory assessment, a Curriculum Generator for creating progressively challenging tasks, and a reinforcement learning approach using both adversarial imitation for failure actions and Group Relative Policy Optimization for successful ones.
5. 📊 Results and Evaluation: The system raised the success rate from the UI-TARS baseline's 11.3% to 34.5% across five professional software applications, with the specialist-to-generalist strategy outperforming both pure-specialist and direct-generalist training.
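The two-part policy update in point 4 can be sketched in a few lines. This is an illustrative reconstruction, not the paper's code: `grpo_advantages` normalizes rewards within a rollout group (the core of GRPO), and `policy_loss` rewards successful actions by advantage-weighted log-probability while adding a term that pushes down the probability of labeled failure actions, standing in for the adversarial-imitation objective. Function names and the `beta` weight are assumptions.

```python
def grpo_advantages(rewards):
    """Group Relative Policy Optimization: normalize each reward
    against the mean and std of its rollout group."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    std = std if std > 0 else 1.0  # avoid division by zero for uniform groups
    return [(r - mean) / std for r in rewards]

def policy_loss(success_logprobs, advantages, failure_logprobs, beta=0.1):
    """Combined objective (illustrative): maximize advantage-weighted
    log-prob of successful actions; penalize log-prob of failure actions."""
    loss = -sum(a * lp for a, lp in zip(advantages, success_logprobs))
    loss += beta * sum(failure_logprobs)  # lowering loss lowers failure prob
    return loss
```

Successful actions with above-average group reward get a positive advantage and are reinforced; the failure term is minimized by making failure actions less likely, which is the intuition behind learning from mistakes without a human label.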

SEAgent workflow (from the paper's overview diagram):

- Initialization: GUI state parsing, initial task generation, and creation of the initial software guidebook (U₀) by the World State Model and Curriculum Generator.
- Autonomous exploration: the actor model π (built on UI-TARS) executes task instructions through trial-and-error learning.
- World state evaluation: step-wise trajectory assessment with success/failure labeling, performed by a fine-tuned Qwen2.5-VL.
- Curriculum update: Qwen2.5-72B generates more challenging tasks and updates the software guidebook.
- Policy update via reinforcement learning: GRPO on correct actions (aᵀ) with verifiable, distance-based rewards; adversarial imitation on failure actions (aᶠ) via a negative KL-divergence term, so the agent learns from its mistakes.
- Specialist-to-generalist training: train software specialists individually, distill their knowledge via SFT on specialist trajectories, then run multi-software RL to obtain a unified generalist with superior performance.

Performance results: UI-TARS baseline 11.3%; SEAgent (specialist) 32.2%; SEAgent (generalist) 34.5%, a 23.2-point improvement in success rate.

Key components: World State Model (Qwen2.5-VL-7B), Curriculum Generator (Qwen2.5-72B), actor model (UI-TARS-7B-DPO), software guidebook memory, step-wise reward signals.

Software environments: VSCode (development), GIMP (image editing), LibreOffice Impress (presentations), VLC (media playback), LibreOffice Writer (documents).

Key innovations: autonomous exploration without human supervision, self-evolving curriculum learning, experience-based policy updates, the specialist-to-generalist strategy, and iterative self-evolution.
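The iterative self-evolution loop described above (explore, evaluate, update the curriculum, update the policy) can be caricatured with toy stand-ins. Every class below is an assumption made for illustration; in the paper the components are large models (Qwen2.5-VL, Qwen2.5-72B, UI-TARS), not scalar counters.

```python
class ToyActor:
    """Stand-in for the UI-TARS-based actor; `skill` abstracts its policy."""
    def __init__(self):
        self.skill = 0.0

    def execute(self, task):
        # autonomous exploration: attempt the task, record success/failure
        return {"task": task, "success": self.skill >= task["difficulty"]}

    def update(self, trajectory):
        # stand-in for GRPO on successes + adversarial imitation on failures
        self.skill += 0.05 if trajectory["success"] else 0.1

class ToyCurriculum:
    """Stand-in for the Curriculum Generator: tasks get progressively harder."""
    def __init__(self):
        self.difficulty = 0.0

    def generate(self, n=3):
        tasks = [{"difficulty": self.difficulty}] * n
        self.difficulty += 0.1  # next phase is more challenging
        return tasks

def self_evolve(actor, curriculum, phases=5):
    for _ in range(phases):
        for task in curriculum.generate():    # curriculum update
            trajectory = actor.execute(task)  # autonomous exploration
            actor.update(trajectory)          # policy update from experience
    return actor
```

The point of the sketch is the control flow: no human labels appear anywhere in the loop; the evaluator and curriculum close the feedback cycle on their own.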
Q1. What is the main innovation of SEAgent compared to previous computer use agents?

- It uses more sophisticated vision-language models
- It learns autonomously through self-exploration without human supervision
- It can only work with specialized software applications

Q2. In the specialist-to-generalist training strategy, what is the correct sequence of steps?

- Train generalist model first, then specialize for each software
- Train multiple generalist models and combine them together
- Train specialist agents first, then distill into a generalist model

Q3. What was the main performance improvement achieved by SEAgent?

- Improved success rate from 11.3% to 34.5% across five software applications
- Reduced training time by 50% compared to baseline models
- Achieved 100% accuracy on simple computer tasks

Paper 2

Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

Published: 2025-08-04

Link: http://arxiv.org/pdf/2508.02193

1. 📘 Topic and Domain: The paper introduces Seed Diffusion Preview, a large-scale discrete-state diffusion language model for code generation with high-speed inference capabilities.
2. 💡 Previous Research and New Ideas: Building on previous work in discrete diffusion models and non-autoregressive generation, the paper proposes new techniques that balance generation quality and speed while addressing the limitations of traditional token-by-token decoding.
3. ❓ Problem: The paper aims to solve the challenges of slow inference speed in language models while maintaining competitive performance, particularly addressing the inefficiencies in token-by-token generation.
4. 🛠️ Methods: The paper employs a two-stage curriculum (TSC) for diffusion training, constrained-order diffusion training, on-policy diffusion learning, and block-level parallel sampling with system optimizations.
5. 📊 Results and Evaluation: The model achieves 2,146 tokens/second inference speed on H20 GPUs while maintaining competitive performance across various code benchmarks, establishing new state-of-the-art on the speed-quality trade-off frontier.
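The two forward processes named in point 4 can be made concrete. This is a hedged reconstruction from the formulas in the paper's diagram (the `MASK` sentinel, token ids, and function names are illustrative): stage 1 masks each token independently, while stage 2 applies k_t = ⌊|x₀|·α_t⌋ random substitutions, bounding the Levenshtein distance from x₀ by k_t.

```python
import random

MASK = -1  # illustrative mask-token id

def mask_forward(x0, alpha_t, rng):
    """Stage-1 forward process: each token is masked independently
    with probability alpha_t (a factorized q_mask)."""
    return [MASK if rng.random() < alpha_t else tok for tok in x0]

def edit_forward(x0, alpha_t, rng, vocab_size=100):
    """Stage-2 forward process: apply k_t = floor(|x0| * alpha_t) random
    substitutions, so the Levenshtein distance to x0 is at most k_t."""
    k_t = int(len(x0) * alpha_t)
    x_t = list(x0)
    for i in rng.sample(range(len(x0)), k_t):
        x_t[i] = rng.randrange(vocab_size)
    return x_t
```

The edit-based stage matters for the model's editing ability: unlike masking, it forces the denoiser to detect and repair wrong tokens, not just fill blanks.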

Seed Diffusion methodology flow (from the paper's diagram):

- Two-Stage Curriculum (TSC): stage 1 (80% of training) uses a mask-based forward process; stage 2 (20%) uses an edit-based forward process.
- Trajectory space tailoring: constrained-order diffusion with ELBO-based trajectory selection and fine-tuning.
- On-policy diffusion learning: minimize trajectory length under a verifier constraint, using a progressive surrogate loss.
- Block-level inference: semi-autoregressive parallel sampling with KV-caching.

Key mathematical components:

- Mask-based process: q_mask(x_t | x_0) = ∏_i q_mask(x_t[i] | x_0[i])
- Edit-based process: k_t = ⌊|x_0| · α_t⌋ (controls the Levenshtein distance)
- Combined loss: L_diff(θ) = −E_{q_edit,t} log p_θ(x_0 | x_t) − E_{q_mask,t} [weighted reconstruction]

Performance achievements: 2,146 tokens/s inference on H20 GPUs, competitive results on code benchmarks, a superior speed-quality Pareto frontier, and strong performance on editing tasks.

Evaluation benchmarks: HumanEval, MBPP, BigCodeBench, LiveCodeBench, MBXP, NCB, Aider/CanItEdit.

System infrastructure: a specialized framework for diffusion sampling, with block-size optimization and a KV-caching implementation, balancing computation latency against token generation rate.

Key innovations: the two-stage (mask + edit) curriculum, constrained-order trajectory filtering, on-policy learning for speedup, and block-level parallel inference.
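Block-level inference can be sketched as follows. The denoiser here is a placeholder callable (an assumption, not the paper's model), but the control flow shows the idea: blocks are committed left to right, so earlier blocks can be KV-cached like an autoregressive prefix, while positions inside the active block are filled in parallel at each diffusion step.

```python
MASK = None  # placeholder for a still-masked position

def block_parallel_decode(denoise, length, block_size):
    """Semi-autoregressive sampling sketch: left-to-right over blocks,
    parallel denoising of all masked positions within each block."""
    seq = [MASK] * length
    for start in range(0, length, block_size):
        block = range(start, min(start + block_size, length))
        while any(seq[i] is MASK for i in block):
            # one parallel step: propose tokens for every masked position at once
            proposals = {i: denoise(seq, i) for i in block if seq[i] is MASK}
            for i, tok in proposals.items():
                seq[i] = tok
    return seq
```

This is where the speed-quality trade-off lives: a larger block size means more tokens per parallel step but weaker left-to-right conditioning, which the paper's system work tunes against hardware latency.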
Q1. What is the main innovation that allows Seed Diffusion to achieve high-speed inference?

- Using larger GPU clusters
- Non-sequential, parallel generation through discrete diffusion
- Reducing the model size and parameters

Q2. In the Two-Stage Curriculum (TSC) training, what is the ratio between mask-based and edit-based forward processes?

- 50% mask-based, 50% edit-based
- 90% mask-based, 10% edit-based
- 80% mask-based, 20% edit-based

Q3. What inference speed did Seed Diffusion Preview achieve on H20 GPUs?

- 1,489 tokens/s
- 2,146 tokens/s
- 737 tokens/s

Paper 3

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Published: 2025-08-05

Link: http://arxiv.org/pdf/2508.03680

1. 📘 Topic and Domain: The paper presents Agent Lightning, a framework for applying Reinforcement Learning (RL) to train Large Language Models (LLMs) in any AI agent system.
2. 💡 Previous Research and New Ideas: Previous work focused on static, single-call RL tasks, while this paper proposes a novel framework that decouples agent execution from RL training to enable seamless integration with any existing agent.
3. ❓ Problem: The paper addresses the challenge of applying RL to complex AI agents, which currently lack mechanisms for automated optimization and struggle with reliability in real-world tasks.
4. 🛠️ Methods: The authors formulate agent execution as a Markov Decision Process, introduce a unified data interface for RL training, and develop the LightningRL algorithm together with a Training-Agent Disaggregation architecture.
5. 📊 Results and Evaluation: The framework demonstrated stable performance improvements across three different tasks (text-to-SQL, retrieval-augmented generation, and math QA) implemented with different agent frameworks (LangChain, OpenAI Agents SDK, and AutoGen).
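The unified data interface in point 4 can be illustrated with a minimal sketch (the field names and the zero-reward placeholder are assumptions): each LLM call in an agent run becomes one (input, output, reward) transition, rather than all turns being concatenated into one long training sequence. Here the terminal task reward is naively assigned to the last call; LightningRL's credit assignment distributes reward more carefully.

```python
def decompose_trajectory(calls, task_reward):
    """Turn one agent run into per-call RL transitions: the state is the
    agent's snapshot (the prompt at call time), the action is the LLM
    output, and intermediate calls get a placeholder reward of 0.0."""
    transitions = []
    for i, call in enumerate(calls):
        transitions.append({
            "input": call["prompt"],      # state: agent snapshot at call time
            "output": call["response"],   # action: the LLM's output
            "reward": task_reward if i == len(calls) - 1 else 0.0,
        })
    return transitions
```

Because each transition is independent of the others, sequence length stays bounded by a single call's context regardless of how many turns the agent took, which is how the framework sidesteps the long-sequence problem in multi-turn interactions.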

Agent Lightning training workflow (from the paper's diagram):

- Agent execution: LangChain, OpenAI Agents SDK, AutoGen, or custom agents run with zero code changes.
- Data collection: a unified data interface records states, LLM calls, and rewards via OpenTelemetry.
- MDP formulation: the state is an agent snapshot, the action is the LLM output, and the reward measures task quality; trajectories are decomposed into (input, output, reward) transitions.
- LightningRL algorithm: a hierarchical RL approach with credit assignment and token-level optimization, compatible with GRPO, PPO, and REINFORCE++.
- Training-Agent Disaggregation: a Lightning Server performs training and model updates while a Lightning Client runs the agent, yielding continuous and stable performance improvement.

Key features: complete decoupling of agent and training, a unified data interface for any agent, hierarchical RL with credit assignment, and Automatic Intermediate Rewarding (AIR).

Applications: text-to-SQL (LangChain), RAG (OpenAI Agents SDK), math QA with tools (AutoGen), and multi-agent scenarios.

Benefits: framework-agnostic, scalable and robust, suited to real-world deployment, with observability integration.
Q1. What is the key innovation in Agent Lightning's architecture that differentiates it from existing RL frameworks?

- Its ability to handle multiple LLMs simultaneously
- Complete decoupling between agent execution and RL training
- The use of a new type of neural network architecture

Q2. In the experimental evaluation, which combination of task and framework was NOT tested?

- Text-to-SQL with LangChain
- Math QA with AutoGen
- Image generation with Stable Diffusion

Q3. How does Agent Lightning handle the challenge of long sequences in multi-turn interactions?

- By using advanced compression algorithms
- By organizing data as individual transitions rather than concatenated turns
- By limiting the maximum context length