2026-02-27 Papers


Paper 1

From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors

Published: 2026-02-25

Link: http://arxiv.org/pdf/2602.21778

1. 📘 Topic and Domain: The paper focuses on physics-aware image editing in computer vision, specifically addressing how to generate physically plausible edits that obey natural laws.
2. 💡 Previous Research and New Ideas: The paper builds on instruction-based image editing methods like Qwen-Image-Edit and proposes reformulating editing as continuous physical state transitions rather than discrete mappings, introducing learnable transition queries to capture dynamics from video data.
3. ❓ Problem: Current image editing models achieve high semantic fidelity but frequently violate physical principles (e.g., incorrect refraction, implausible material deformation), treating editing as a black-box transformation without considering underlying physical laws.
4. 🛠️ Methods: The authors construct PhysicTran38K (a 38K-sample, video-derived dataset organized by physics categories), build the PhysicEdit framework with a dual-thinking mechanism (a frozen Qwen2.5-VL for textual reasoning plus learnable transition queries for implicit visual guidance), and apply timestep-aware modulation to the diffusion generator.
5. 📊 Results and Evaluation: PhysicEdit achieves 64.86% on PICABench (5.9% improvement over baseline) and 72.16% on KRISBench (10.1% improvement), outperforming all evaluated open-source models and remaining competitive with proprietary models in physical realism and knowledge-grounded editing.

From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors

[Workflow figure] Editing is formulated as a continuous physical state transition, S_final = S_0 + ∫ Φ(S_t, T_edit; Ω) dt, i.e., the image state evolves under physical laws Ω rather than jumping discretely to the edited result. PhysicTran38K construction: a hierarchical physics taxonomy (5 domains, 46 transitions), structured video generation with Wan2.2-T2V-A14B, camera and principle filtering via ViPE + GPT-5-mini, and constraint-aware reasoning generation, yielding 38K video-instruction pairs with physical transition supervision. PhysicEdit framework: a textual-visual dual-thinking mechanism pairs physically grounded reasoning (frozen Qwen2.5-VL) with implicit visual thinking through learnable transition queries, dual encoders (DINOv2 + VAE), and timestep-aware dynamic modulation of MMDiT generation. Training extracts features from video keyframes under a transition loss; at inference, the learned queries guide physics-aware edits directly. Key innovation: video supervision distilled into latent transition priors, without explicit frame generation.
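The state-transition formulation S_final = S_0 + ∫ Φ(S_t, T_edit; Ω) dt can be made concrete with a forward-Euler integration of a latent state. This is an illustrative sketch, not the paper's implementation: `phi` stands in for the learned transition field, and the toy target-pulling field below is invented for the example.

```python
import numpy as np

def euler_transition(s0, phi, t_edit, n_steps=10, dt=0.1):
    """Approximate s_final = s_0 + integral of phi(s_t, t_edit) dt
    with forward Euler steps. `phi` is any callable mapping
    (state, edit) -> state derivative."""
    s = s0.copy()
    for _ in range(n_steps):
        s = s + dt * phi(s, t_edit)
    return s

# Toy transition field: pull the latent state toward the edit target.
phi = lambda s, target: target - s
s_final = euler_transition(np.zeros(4), phi, np.ones(4))
```

With this toy field the state converges geometrically toward the target, which is the qualitative behavior a "continuous state evolution" view of editing implies.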
Q1
1. What fundamental paradigm shift does PhysicEdit introduce to address the physical implausibility in current image editing models?
Replacing diffusion models with GANs to achieve more realistic texture synthesis
Reformulating image editing as continuous physical state transitions rather than discrete mappings
Using larger training datasets with more diverse semantic categories
Q2
2. In the PhysicTran38K dataset construction pipeline, what unique filtering strategy is employed to ensure physical correctness of the generated videos?
Manual annotation by physics experts who verify each video frame-by-frame
Training a separate neural network to classify physically valid vs invalid videos
Principle-driven verification where GPT-5-mini proposes transition-specific principles and classifies them as align/contradict/unknown
Q3
3. How does PhysicEdit's implicit visual thinking mechanism differ from ChronoEdit's explicit approach, and what advantage does this provide?
PhysicEdit generates full intermediate video frames while ChronoEdit only uses text descriptions
PhysicEdit encodes dynamics into learnable transition queries avoiding pixel-level synthesis and error accumulation
PhysicEdit requires multiple forward passes through the model while ChronoEdit processes everything in one pass

Paper 2

From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models

Published: 2026-02-26

Link: http://arxiv.org/pdf/2602.22859

1. 📘 Topic and Domain: The paper focuses on improving Large Multimodal Models (LMMs) through diagnostic-driven iterative training in the domain of multimodal reasoning and reinforcement learning.
2. 💡 Previous Research and New Ideas: The paper builds on self-evolving training frameworks and reinforcement learning methods for LMMs, proposing Diagnostic-driven Progressive Evolution (DPE) that uses explicit failure attribution and targeted data generation instead of heuristic signals.
3. ❓ Problem: The paper aims to solve the limitations of static training data and fixed recipes that create capability blind spots and prevent dynamic, targeted reinforcement in LMM training.
4. 🛠️ Methods: The authors use a closed-loop framework with diagnostic agents that analyze failure patterns, multi-agent systems with tools for image search/editing to generate targeted training data, and reinforcement learning (GRPO) for model updates.
5. 📊 Results and Evaluation: DPE achieved consistent improvements across 11 benchmarks on Qwen models, surpassing baselines like VisPlay with only 1000 training examples, demonstrating stable gains in STEM, OCR, visual math, and hallucination mitigation tasks.
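The GRPO update mentioned in point 4 relies on group-relative advantages: each sampled response's reward is normalized against its group's mean and standard deviation. A minimal sketch (the function name and the stabilizing epsilon are illustrative, not from the paper):

```python
import numpy as np

def grpo_advantages(group_rewards):
    """Group-relative advantages as used in GRPO-style training:
    normalize each sampled completion's reward by the mean and
    std of its rollout group."""
    r = np.asarray(group_rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Four sampled answers to one question, with scalar rewards.
adv = grpo_advantages([1.0, 0.0, 0.5, 0.5])
```

Because advantages are computed within each group, no separate value network is needed; completions are simply pushed above or below their group's average.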

From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models

[Framework figure] DPE (Diagnostic-driven Progressive Evolution) runs a closed loop: the current model π_θ(k) is analyzed by a diagnostic module A_diag, producing a report R(k) with per-category accuracy Acc_c, failure patterns F_c, category proportions α(k), 12 capability dimensions, error attribution, and actionable instructions H_c. A multi-agent questioner system (planner, image selector, question generator, and validation agents) then enforces category quotas, retrieves and edits images, generates Q&A pairs focused on diagnosed weaknesses, and gates quality and verifiability, yielding training set T(k) = {(I_j, q_j, a_j, c_j)}. GRPO training updates the model, θ(k+1) = A_RL(θ(k); T(k)), and the loop repeats (k → k+1). Key features: explicit failure attribution, tool-use data evolution, category quota control, and quality validation.
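The diagnose → generate → train loop can be sketched as a skeleton. All four callables are hypothetical stand-ins for the paper's diagnostic module, multi-agent data generator, and GRPO trainer; only the loop structure follows the description above.

```python
def dpe_loop(model, diagnose, generate_data, rl_update, n_iters=3):
    """Skeleton of the DPE closed loop. Each iteration:
    diagnose the model -> generate targeted data from the
    report -> update the model with RL on that data."""
    for k in range(n_iters):
        report = diagnose(model)           # R(k): accuracies, failure patterns
        dataset = generate_data(report)    # T(k): targeted training items
        model = rl_update(model, dataset)  # theta(k+1) = A_RL(theta(k); T(k))
    return model

# Toy run: integers stand in for model weights, reports, and datasets.
final = dpe_loop(0,
                 diagnose=lambda m: m,
                 generate_data=lambda r: r + 1,
                 rl_update=lambda m, d: m + d)
```

The point of the structure is that T(k) is regenerated every round from the latest failure report, rather than being fixed up front.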
Q1
1. What educational psychology principle inspired the DPE framework's core mechanism?
The 'diagnose-and-correct' mechanism where targeted correction based on failure diagnosis improves learning efficiency
The 'repetitive practice' principle where doing the same task multiple times leads to mastery
The 'immersive learning' approach where students learn by being surrounded by diverse examples
Q2
2. How does DPE's data efficiency compare to traditional static training methods according to the experiments?
DPE requires 47,000 samples to match the performance of static training
DPE achieves superior performance using only ~3,000 samples compared to 47,000 in static training
DPE needs exactly the same amount of data but processes it more efficiently
Q3
3. What happens to the model's performance on CharXiv when the diagnostic module is removed from DPE?
Performance improves dramatically from 36.8 to 45.2 due to more random exploration
Performance remains stable around 36.7-37.5 with minimal improvement and exhibits an 'improve then drop' pattern
The model completely fails to process OCR tasks and drops to 0% accuracy

Paper 3

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Published: 2026-02-26

Link: http://arxiv.org/pdf/2602.23008

1. 📘 Topic and Domain: The paper focuses on reinforcement learning for large language model (LLM) agents in multi-step embodied reasoning tasks.
2. 💡 Previous Research and New Ideas: The paper builds on prior work such as GRPO, Reflexion, and memory-augmented LLMs, proposing EMPO², which combines parametric updates (model parameters) and non-parametric updates (an external memory) through hybrid on-policy and off-policy optimization.
3. ❓ Problem: The paper addresses the exploration bottleneck in LLM agents trained with RL, where agents struggle to discover novel states and rely too heavily on pretrained knowledge rather than systematic exploration.
4. 🛠️ Methods: EMPO² uses memory-augmented prompting with self-generated tips, implements both on-policy and off-policy learning modes, and employs intrinsic rewards for encouraging exploration of novel states.
5. 📊 Results and Evaluation: On the ScienceWorld and WebShop benchmarks, EMPO² achieved 128.6% and 11.3% improvements over GRPO, respectively, and adapted to new tasks with only a few trials and no parameter updates.

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

[Framework figure] EMPO² couples a memory buffer M = {tip₁, tip₂, ...} with the LLM policy π_θ acting in ScienceWorld/WebShop. Rollout phase: with probability p the agent acts without memory, a ~ π_θ(·|s, u); with probability 1−p it conditions on retrieved tips, a ~ π_θ(·|s, u, tips), collecting trajectories τ = {u, a₁, r₁, s₁, ...}. Update phase: regular on-policy (no tips in rollout or update), on-policy with tips (probability 1−q), or off-policy with tips removed from the update prompt (probability q). The hybrid objective combines the GRPO loss with token masking and a KL term; self-generated tips and intrinsic rewards for novel states drive exploration.
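The two stochastic switches in the framework, the rollout mode (no-memory with probability p, per the diagram's Mode 1) and the off-policy removal of tips at update time (probability q), can be sketched as follows. Function names and the prompt format are illustrative assumptions, not the paper's code.

```python
import random

def choose_rollout_mode(p, rng=random):
    """Sample the rollout mode: with probability p act without
    memory; otherwise condition the policy on retrieved tips."""
    return "no_memory" if rng.random() < p else "with_tips"

def make_update_prompt(state, tips, q, rng=random):
    """Build the prompt used at update time. With probability q the
    tips are dropped (off-policy relative to a tip-conditioned
    rollout); otherwise they are kept (on-policy with tips)."""
    if rng.random() < q:
        return state                      # off-policy: tips removed
    return state + "\n" + "\n".join(tips)  # on-policy with tips
```

Keeping the two probabilities independent lets training interpolate between pure on-policy learning and memory-distilling off-policy updates.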
Q1
1. What is the key innovation in EMPO² that distinguishes it from traditional online RL approaches for LLM agents?
It uses a larger model size and more computational resources
It combines memory-based exploration with hybrid on-policy and off-policy optimization
It relies on human-designed heuristics and GPT-4 for trajectory generation
Q2
2. In the ScienceWorld 'turn on the red light bulb' example, why did the GRPO-trained agent fail to complete the task?
The agent couldn't generate grammatically correct actions
The agent tried to focus on a red light bulb that wasn't in the current room and didn't explore to find it
The agent ran out of computational budget during training
Q3
3. How does EMPO² handle the stability issues in off-policy training?
By completely avoiding off-policy updates and using only on-policy learning
By increasing the batch size and using more GPUs
By masking tokens with probability below a threshold to prevent unbounded likelihood ratios