1. 📘 Topic and Domain: A benchmark for evaluating language agents' ability to use tools and execute complex, real-world tasks across multiple software applications.
2. 💡 Previous Research and New Ideas: Previous benchmarks focused on narrow domains or simplified tasks; this paper proposes a more comprehensive benchmark with diverse applications, realistic environments, and complex multi-step workflows.
3. ❓ Problem: Existing language-agent benchmarks lack the diversity, realism, and long-horizon complexity needed to evaluate real-world performance.
4. 🛠️ Methods: Introduced the Tool Decathlon (TOOLATHLON), a benchmark spanning 32 software applications and 604 tools across 108 tasks that require multi-step execution; each task starts from a realistic environment state and is graded by a verifiable evaluation script (see the checker sketch after this list).
5. 📊 Results and Evaluation: The best model (Claude-4.5-Sonnet) achieved only a 38.6% success rate, averaging 20.2 tool-calling turns per task, while the top open-source model (DeepSeek-V3.2-Exp) reached 20.1%, leaving substantial room for improvement (a metric-aggregation sketch follows below).
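
To make the evaluation design concrete, here is a minimal sketch of what a state-based verification script could look like. The task, file layout, database schema, and helper name (`verify_task`) are illustrative assumptions, not TOOLATHLON's actual code; the source only establishes that each task's final environment state is checked programmatically.

```python
import sqlite3
from pathlib import Path

# Hypothetical checker for a task like "export all overdue invoices to CSV".
# Paths, schema, and pass criteria are assumptions for illustration;
# the benchmark's real scripts verify each task's final environment state.

def verify_task(workdir: Path) -> bool:
    """Return True iff the agent left the environment in the goal state."""
    export = workdir / "overdue_invoices.csv"
    if not export.exists():  # the required artifact must exist at all
        return False

    # Ground truth comes from the task's seeded database state.
    with sqlite3.connect(workdir / "billing.db") as db:
        expected = {
            row[0]
            for row in db.execute(
                "SELECT invoice_id FROM invoices WHERE status = 'overdue'"
            )
        }

    # The export must list exactly the overdue invoice IDs (header skipped).
    lines = export.read_text().splitlines()[1:]
    exported = {line.split(",")[0] for line in lines if line.strip()}
    return exported == expected

if __name__ == "__main__":
    print("PASS" if verify_task(Path("/tmp/task_env")) else "FAIL")
```

Checking the resulting state rather than the agent's transcript keeps the grading robust: any sequence of tool calls that reaches the goal passes, and partial or cosmetic progress does not.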
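The headline numbers reduce to simple per-task aggregates. Below is a sketch of that computation, assuming each run yields a record with `success` and `turns` fields; the field names and sample values are made up for illustration.

```python
from statistics import mean

# Assumed per-task log records: one dict per benchmark task.
results = [
    {"task": "t1", "success": True,  "turns": 18},
    {"task": "t2", "success": False, "turns": 25},
    {"task": "t3", "success": True,  "turns": 17},
]

success_rate = mean(r["success"] for r in results) * 100  # % of tasks passed
avg_turns = mean(r["turns"] for r in results)             # mean tool-calling turns

print(f"success rate: {success_rate:.1f}%, avg turns: {avg_turns:.1f}")
```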