1. 📘 Topic and Domain: The paper introduces Multi-Agent Test-Time Reinforcement Learning (MATTRL) for improving collaborative reasoning among large language models across medicine, math, and education domains.
2. 💡 Previous Research and New Ideas: The paper builds on recent work in multi-agent LLM systems and reinforcement learning for reasoning, proposing a novel framework that injects structured textual experience into multi-agent deliberation at inference time rather than requiring expensive training.
3. ❓ Problem: The paper targets the cost and instability of multi-agent reinforcement learning: training is resource-intensive, and it is destabilized by non-stationarity from co-adapting teammates and by sparse, high-variance rewards.
4. 🛠️ Methods: MATTRL forms teams of specialized expert agents for multi-turn discussion, uses credit assignment strategies to build an experience pool from high-value interactions, and injects these experiences at test time to improve collaborative reasoning.
5. 📊 Results and Evaluation: Across medical diagnosis, math problem-solving, and educational tasks, MATTRL improved accuracy by an average of 3.67% over multi-agent baselines and 8.67% over single-agent approaches, with detailed ablation studies validating different credit assignment schemes.
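
The pipeline in item 4 (score interactions, pool the high-value ones, inject them at inference) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: the names `Experience`, `assign_credit`, `build_pool`, `retrieve`, and `inject` are hypothetical, the equal-split credit rule stands in for the paper's credit assignment schemes, and word-overlap retrieval stands in for whatever retrieval MATTRL actually uses.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    text: str      # textual lesson taken from one agent utterance
    credit: float  # share of the team reward assigned to that utterance


def assign_credit(utterances, team_reward):
    """Naive equal-split credit: every utterance gets the same share.

    Assumption: a placeholder, not one of the paper's schemes.
    """
    if not utterances:
        return []
    share = team_reward / len(utterances)
    return [Experience(text=u, credit=share) for u in utterances]


def build_pool(episodes, min_credit):
    """Keep only utterances whose credit reaches the threshold."""
    pool = []
    for utterances, team_reward in episodes:
        scored = assign_credit(utterances, team_reward)
        pool.extend(e for e in scored if e.credit >= min_credit)
    return pool


def _tokens(s):
    return set(s.lower().split())


def _overlap(query, text):
    """Jaccard similarity between the word sets of two strings."""
    q, t = _tokens(query), _tokens(text)
    union = q | t
    return len(q & t) / len(union) if union else 0.0


def retrieve(pool, query, k=2):
    """Return up to k pooled experiences that share words with the query."""
    ranked = sorted(pool, key=lambda e: _overlap(query, e.text), reverse=True)
    return [e for e in ranked[:k] if _overlap(query, e.text) > 0]


def inject(prompt, experiences):
    """Prepend retrieved experiences to an agent prompt as plain text."""
    if not experiences:
        return prompt
    notes = "\n".join(f"- {e.text}" for e in experiences)
    return f"Relevant prior experience:\n{notes}\n\n{prompt}"


# Two toy episodes: (agent utterances, team reward).
episodes = [
    (["check units before adding fractions", "restate the question"], 1.0),
    (["guess the answer"], 0.0),
]
pool = build_pool(episodes, min_credit=0.5)
hits = retrieve(pool, "adding fractions with different units", k=1)
augmented = inject("Solve: 1/2 m + 30 cm", hits)
```

No model weights change in this sketch; only the prompt does, which is the sense in which the adaptation happens at test time rather than through training.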