2026-01-15 Papers

Paper 1

MAXS: Meta-Adaptive Exploration with LLM Agents

Published: 2026-01-14

Link: http://arxiv.org/pdf/2601.09259

1. 📘 Topic and Domain: The paper proposes MAXS, a meta-adaptive exploration framework for Large Language Model (LLM) Agents to improve multi-tool reasoning and decision-making.
2. 💡 Previous Research and New Ideas: Building on Chain-of-Thought (CoT), Tree-of-Thoughts (ToT), and Monte Carlo Tree Search (MCTS) methods, it introduces a novel lookahead strategy and a value-estimation mechanism for more efficient reasoning.
3. ❓ Problem: The paper addresses two key issues in LLM Agent reasoning: locally myopic generation (lack of foresight in decision-making) and trajectory instability (where small early errors can lead to divergent reasoning paths).
4. 🛠️ Methods: MAXS employs a lookahead strategy to simulate future steps, combines three metrics (advantage score, step consistency variance, and inter-step trend slopes) for value estimation, and uses a trajectory convergence mechanism to control computational costs.
5. 📊 Results and Evaluation: Tested across five datasets and three base models, MAXS consistently outperformed existing methods in both accuracy and efficiency, showing particular strength on MathVista (85.5% accuracy) while using significantly fewer tokens than alternatives like MCTS.
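The lookahead-with-convergence loop from point 4 can be sketched in a few lines. Here `rollout` and `score` are hypothetical stand-ins for the LLM's step simulation and value estimate (not the paper's code); K = 4 and δ = 0.002 are the values reported by the paper, while the variance window of 3 is my own simplification:

```python
import statistics

# Sketch of MAXS-style lookahead with variance-based early stopping:
# roll out up to K future steps, score each partial trajectory, and stop
# once the variance of recent rewards falls below the delta threshold.
K, DELTA = 4, 0.002

def lookahead(rollout, score, state, window=3):
    """Return per-step rewards, stopping early once Var(recent rewards) <= DELTA."""
    rewards = []
    for _ in range(K):
        state = rollout(state)          # simulate one future step
        rewards.append(score(state))    # value estimate for the trajectory so far
        if len(rewards) >= window and statistics.pvariance(rewards[-window:]) <= DELTA:
            break                       # trajectory has converged; save tokens
    return rewards
```

With a constant value estimate the loop stops as soon as the window fills, which is exactly the token-saving behavior the convergence mechanism is meant to provide.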

[Figure: MAXS framework overview]
- Lookahead strategy: rolls out K = 4 future steps via Bellman recursion over R(s₀, s≤ᵢ, s>ᵢ) to mitigate locally myopic generation.
- Value estimation: advantage score Aᵢ = Fᵢ − Fᵢ₋₁, step-level variance (Lyapunov stability), and slope-level variance (Lipschitz continuity), combined for trajectory stability.
- Combined reward: R(s₀, s≤ᵢ, s>ᵢ) = (1 − α − β)·Norm(R^adv_i) + α·Norm(R^step_i) + β·Norm(R^slope_i), with α = 0.3, β = 0.2.
- Step selection: ŝᵢ ~ πθ(sᵢ | s₀, s<ᵢ)·e^(R/τ) (softmax selection, τ = 0.6), iterated until convergence or a maximum of 13 steps.
- Tool integration: search and code execution, invoked dynamically.
- Trajectory convergence: early stopping once Var(Rᵢ) ≤ δ, with δ = 0.002, for efficiency.
- Reported highlights: ~1000× fewer tokens than MCTS; 63.46% average accuracy with MiMo-VL-7B; superior efficiency-performance trade-off, consistent across 5 benchmarks.
Q1. What is the primary innovation of MAXS compared to previous methods?
  a) It uses a larger language model than previous approaches
  b) It combines lookahead strategy with value estimation for balanced exploration
  c) It completely eliminates the need for external tools in reasoning
Q2. According to the ablation studies, which component removal caused the largest performance drop in MAXS?
  a) Removing the lookahead module
  b) Removing the slope variance score
  c) Removing the trajectory convergence mechanism
Q3. What was the computational efficiency advantage of MAXS compared to MCTS?
  a) It used about 10 times fewer tokens
  b) It used about 100 times fewer tokens
  c) It used about 1000 times fewer tokens

Paper 2

A^3-Bench: Benchmarking Memory-Driven Scientific Reasoning via Anchor and Attractor Activation

Published: 2026-01-14

Link: http://arxiv.org/pdf/2601.09274

1. 📘 Topic and Domain: A benchmark dataset (A³-Bench) for evaluating memory-driven scientific reasoning in the math, physics, and chemistry domains.
2. 💡 Previous Research and New Ideas: Building on existing memory and scientific-reasoning benchmarks, it proposes a novel dual-scale memory framework using anchors (foundational knowledge) and attractors (experience-based templates).
3. ❓ Problem: Existing benchmarks only evaluate final answers or step-by-step coherence, without evaluating how models activate and utilize memory during scientific reasoning.
4. 🛠️ Methods: Created 2,198 scientific problems using SAPM process (subject benchmarking, anchor/attractor development, problem reconstruction, memory mapping) and introduced AAUI metric to measure memory activation rates.
5. 📊 Results and Evaluation: Models showed improved accuracy under memory activation (13.48% average increase), with strongest gains on difficult problems, while maintaining reasonable token costs and achieving AAUI scores up to 0.97.

[Figure: A³-Bench memory-driven scientific reasoning workflow]
- SAPM annotation process: Subject benchmarking → Anchor & attractor development → Problem reconstruction → Memory mapping.
- Memory architecture: anchors (concepts, principles, formulas) and attractors (abstract schemas, solution templates, episodic exemplars).
- Memory activation objective: m* = arg min_m [D_KL(q ‖ p(m|A)) + H(m)].
- HybridRAG framework: a twin-needle activator (vector needle for semantics, graph needle for logic) with activation z* ≈ Φ_hybrid(x) = V(x) ⊕ G(V(x)), plus a context composer weaving the query with the activated state into the final context.
- Evaluation settings: Vanilla (no memory, parametric knowledge only), Full Memory (anchor & attractor activation), and Gold Memory (human-labeled annotated subset); the AAUI (Anchor-Attractor Utilization Index) metric measures memory activation.
- Dataset: 2,198 problems — math 998 (45.4%), physics 600 (27.3%), chemistry 600 (27.3%).
- Key findings: memory augmentation yields a +13.48% average improvement across all LLMs, most beneficial on hard problems; higher AAUI correlates with better accuracy; gains generalize beyond the source datasets.
Q1. What is the main innovation of A³-Bench compared to existing scientific reasoning benchmarks?
  a) It includes more diverse scientific problems
  b) It evaluates memory activation and utilization during reasoning
  c) It focuses only on final answer accuracy
Q2. In the SAPM process used to create A³-Bench, what is the maximum number of anchors and attractors allowed per question?
  a) 4 anchors and 2 attractors
  b) 3 anchors and 3 attractors
  c) 6 anchors and 4 attractors
Q3. When comparing memory activation paradigms, which observation was made about model performance?
  a) Attractor-only activation consistently performed worse than anchor-only activation
  b) The combination of anchors and attractors showed no significant improvement over single memory type
  c) Most models performed better with attractor-only activation compared to anchor-only activation

Paper 3

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Published: 2026-01-14

Link: http://arxiv.org/pdf/2601.09708

1. 📘 Topic and Domain: The paper focuses on efficient Vision-Language-Action (VLA) reasoning for robotic tasks, specifically developing a framework called Fast-ThinkAct to improve the speed and effectiveness of robots' decision-making processes.
2. 💡 Previous Research and New Ideas: Based on previous chain-of-thought (CoT) reasoning approaches in VLA models, this paper proposes a novel method of compressing lengthy reasoning chains into compact latent representations while maintaining performance.
3. ❓ Problem: The paper addresses the high inference latency in current VLA systems that use lengthy reasoning traces, which hampers real-time performance in robotic applications requiring rapid decision-making.
4. 🛠️ Methods: The authors implement a teacher-student framework with preference-guided distillation that compresses linguistic and visual planning into compact continuous latents, using a verbalizer LLM to ensure the latent representations remain interpretable.
5. 📊 Results and Evaluation: The framework achieved up to 89.3% reduction in inference latency compared to state-of-the-art reasoning VLAs while maintaining or improving performance across various robotic manipulation and reasoning benchmarks.

[Figure: Fast-ThinkAct framework overview]
- Phase 1, verbalizable latent planning: given an observation o_t and instruction l, a GRPO-trained teacher VLM F^T_θ generates CoT traces τ (~250 tokens); the student VLM F_θ distills them into latent reasoning tokens z = {z_m} (M = 6), plus spatial tokens encoding K waypoints of the visual trajectory.
- Phase-1 losses: L_distill (preference-guided trajectory distillation), L_verb (a verbalizer V_ψ decodes latents back to text, keeping them interpretable), and L_ans (waypoint supervision).
- Phase 2, reasoning-enhanced policy learning: the frozen student's visual planning context c_t (from the KV cache) conditions an action model π_φ (DiT-Policy / RDT) trained with an imitation loss L_IL to generate robot actions a_t in real time.
- Key achievements: 89.3% latency reduction, superior performance, long-horizon planning, few-shot adaptation.
Q1. What is the main innovation of Fast-ThinkAct compared to previous VLA systems?
  a) It introduces a new type of robotic hardware
  b) It compresses lengthy reasoning chains into compact latent representations
  c) It completely eliminates the need for reasoning in robotic tasks
Q2. How much inference latency reduction did Fast-ThinkAct achieve compared to state-of-the-art reasoning VLAs?
  a) Up to 50%
  b) Up to 75%
  c) Up to 89.3%
Q3. What unique component does Fast-ThinkAct use to ensure its latent representations remain interpretable?
  a) A verbalizer LLM
  b) A neural network translator
  c) A binary encoder-decoder