2025-09-09 Papers


Paper 1

Reverse-Engineered Reasoning for Open-Ended Generation

Published: 2025-09-07

Link: http://arxiv.org/pdf/2509.06160

1. 📘 Topic and Domain: The paper introduces a new paradigm called "Reverse-Engineered Reasoning" (REER) for improving open-ended text generation capabilities in large language models.
2. 💡 Previous Research and New Ideas: Previous research relied on reinforcement learning and instruction distillation for reasoning capabilities; this paper proposes a novel "backwards" approach of discovering reasoning processes from known good solutions.
3. ❓ Problem: The paper aims to solve the challenge of instilling deep reasoning capabilities in language models for open-ended, creative generation tasks where traditional methods fail due to lack of verifiable rewards or high costs.
4. 🛠️ Methods: The authors use a gradient-free local search algorithm that iteratively refines reasoning trajectories by minimizing the perplexity of the known good output given the trajectory, producing DeepWriting-20K, a dataset of 20,000 deep reasoning examples used to train their DeepWriter-8B model.
5. 📊 Results and Evaluation: DeepWriter-8B outperformed open-source baselines and achieved performance competitive with proprietary models like GPT-4o and Claude 3.5 on benchmarks like LongBench, HelloBench, and WritingBench.
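The REER search loop can be sketched as generic gradient-free local search over trajectories. The `perplexity` and `propose_edits` callables below are hypothetical stand-ins: in the paper, an LLM scores PPL(y|x,z) and proposes segment-wise edits, while here toy functions keep the sketch runnable.

```python
import random

def reer_local_search(x, y, z0, perplexity, propose_edits, n_iters=10, seed=0):
    """Gradient-free local search for a reasoning trajectory z (REER-style).

    Starting from an initial trajectory z0, repeatedly propose segment-wise
    edits and keep any candidate that lowers PPL(y | x, z), approximating
    the paper's objective z* = argmin_z PPL(y | x, z).
    """
    rng = random.Random(seed)
    z, best = z0, perplexity(x, z0, y)
    for _ in range(n_iters):
        for cand in propose_edits(z, rng):
            score = perplexity(x, cand, y)
            if score < best:  # lower perplexity = better trajectory
                z, best = cand, score
    return z, best

# Toy stand-ins: a real system would use an LLM both to score
# PPL(y | x, z) and to propose candidate edits to the trajectory.
TARGET_WORDS = {"plan", "outline", "draft", "revise"}

def toy_ppl(x, z, y):
    # Stand-in scorer: fraction of target "thinking step" words missing from z.
    return 1.0 - len(TARGET_WORDS & set(z.split())) / len(TARGET_WORDS)

def toy_edits(z, rng):
    # Candidate trajectories: append one more "thinking step" word.
    return [(z + " " + w).strip() for w in sorted(TARGET_WORDS)]

z_star, ppl = reer_local_search("prompt", "essay", "", toy_ppl, toy_edits)
```

The accept-if-better rule makes the search monotone in perplexity, which is why no gradient or verifiable reward is needed: the known good solution y itself anchors the objective.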

Figure: Reverse-Engineered Reasoning (REER) Workflow

- Data sourcing: public writing platforms, Project Gutenberg, public datasets.
- REER process: ① initialize trajectory z⁽⁰⁾ → ② segment-wise expansion and refinement → ③ evaluate PPL(y|x,z) and select; repeat in an iterative loop. Optimization objective: z* = arg min PPL(y|x,z); lower perplexity indicates a better reasoning trajectory.
- Context engineering: meta-structure for segment-wise edits, human-like thinking patterns ("Hmm...", "Wait..."), self-reflection mechanisms.
- DeepWriting-20K: 20,000 trajectory triples (x, z*, y) across 25 categories, with quality filtering (end-of-thinking filter, repetition filter, heuristic quality checks).
- Mixed training: 20K REER trajectories plus public datasets on a Qwen3-8B base, producing DeepWriter-8B, which is competitive with GPT-4o and Claude 3.5 on writing tasks.
- Evaluation: LongBench, HelloBench, WritingBench.
- Key innovation: working "backwards" from known good solutions. Instead of building reasoning "forwards" through trial-and-error or distillation, REER discovers the latent thinking process that could have produced high-quality outputs.
Q1. What is the key innovation of REER compared to traditional approaches?
- It uses reinforcement learning to generate better responses
- It works backwards from good solutions to discover reasoning processes
- It distills knowledge from larger proprietary models

Q2. How does REER evaluate the quality of a reasoning trajectory?
- By comparing it to human-written examples
- By measuring the perplexity score of the known good solution
- By using reinforcement learning rewards

Q3. What unique aspect of the DeepWriting-20K dataset creation process helps ensure high quality?
- It uses only human-written examples
- It relies on expensive proprietary models
- It injects human-like thinking patterns and self-reflection tokens

Paper 2

WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

Published: 2025-09-08

Link: http://arxiv.org/pdf/2509.06501

1. 📘 Topic and Domain: The paper focuses on developing WebExplorer, a web agent training system in the domain of Large Language Models (LLMs) and information retrieval.
2. 💡 Previous Research and New Ideas: Previous research used graph-based and evolution-based approaches for web navigation data construction, while this paper introduces a novel model-based exploration and long-to-short query evolution approach.
3. ❓ Problem: The paper addresses the scarcity of challenging data for training web agents in complex information-seeking tasks.
4. 🛠️ Methods: The method combines model-based exploration to construct information spaces, iterative query evolution to increase difficulty, supervised fine-tuning for initialization, and reinforcement learning with GRPO algorithm for optimization.
5. 📊 Results and Evaluation: WebExplorer-8B achieved state-of-the-art performance across multiple benchmarks, including 15.7% on BrowseComp-en and 32.0% on BrowseComp-zh, outperforming larger models like WebSailor-72B despite having only 8B parameters.
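The long-to-short query evolution step can be sketched as iterative removal or obfuscation of salient details. In the paper an LLM performs the rewriting over 5 cycles; the fixed string substitutions and the example question below are illustrative stand-ins for that step.

```python
def evolve_query(query, substitutions, n_rounds=5):
    """Long-to-short query evolution (WebExplorer-style sketch).

    Each round removes or obfuscates one salient piece of the question,
    making the evolved query harder to answer by direct lookup while
    keeping the answer unchanged. Returns the full evolution trace.
    """
    evolved = [query]
    for old, new in substitutions[:n_rounds]:
        if old in evolved[-1]:
            evolved.append(evolved[-1].replace(old, new))
    return evolved

# Hypothetical example: each step strips one identifying detail.
steps = evolve_query(
    "Which 2015 paper by the DeepMind team introduced the DQN agent for Atari?",
    [
        ("2015 ", ""),                              # drop the year
        ("the DeepMind team", "a well-known lab"),  # obfuscate the author
        ("the DQN agent", "a value-based agent"),   # obfuscate the method
    ],
)
```

Keeping the whole trace (rather than only the final query) matches the reported quality analysis: one can measure how solver accuracy falls and tool-call turns rise as each salient detail is removed.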

Figure: WebExplorer Data Generation and Training Pipeline

- Phase 1 (data generation): model-based exploration starts from a seed entity, iteratively searches and browses to construct an information space, and generates initial QA pairs (high accuracy, ~7.9 turns). Iterative query evolution (remove salient information, strategic obfuscation, long-to-short rewriting, 5 iteration cycles) yields WebExplorer-QA (~40K samples, ~9.9 turns). Quality analysis: Claude-4-Sonnet accuracy drops from 86.6% to 67.1% and average turns rise from 7.9 to 9.9 after evolution.
- Phase 2 (training): Qwen3-8B base model; supervised fine-tuning (~13K samples, 4 epochs); reinforcement learning with GRPO (~12K samples); progressive scaling (64K→96K→128K context, 50→75→100 turns) with format plus correctness rewards; result: WebExplorer-8B (128K context, 100 turns).
- Phase 3 (evaluation): information seeking: BrowseComp-en 15.7%, BrowseComp-zh 32.0%, WebWalkerQA 62.7%, FRAMES 75.7%. Performance: outperforms WebSailor-72B, SOTA at the 8B scale, with an average of 16+ tool calls and 40K+ token trajectories. Generalization: HLE 17.3%, GAIA 50.0%, XBench-DS 53.7%, showing strong cross-domain transfer.
- Key innovations: model-based exploration, long-to-short evolution, strategic obfuscation, long-horizon reasoning.
Q1. What makes WebExplorer's query evolution approach different from previous methods?
- It adds more information to make queries longer
- It removes salient information to increase difficulty
- It translates queries into multiple languages

Q2. What impressive achievement did WebExplorer-8B demonstrate despite its smaller size?
- It processed queries faster than all other models
- It achieved perfect accuracy on all benchmarks
- It outperformed WebSailor-72B despite being 9x smaller

Q3. During reinforcement learning training, what significant change was observed in WebExplorer's behavior?
- Average number of tool calls increased from 11 to over 16
- Response time decreased by 50%
- Memory usage was reduced by 75%

Paper 3

Does DINOv3 Set a New Medical Vision Standard?

Published: 2025-09-08

Link: http://arxiv.org/pdf/2509.06467

1. 📘 Topic and Domain: Evaluating DINOv3, a self-supervised vision transformer trained on natural images, for medical imaging tasks including 2D/3D classification and segmentation.
2. 💡 Previous Research and New Ideas: Previous work spans the DINO series and medical vision models like BiomedCLIP; this paper proposes using the natural-image-trained DINOv3 as a universal encoder for medical imaging without domain-specific pre-training.
3. ❓ Problem: Investigating whether DINOv3's visual features trained on natural images can effectively transfer to specialized medical imaging tasks without medical domain pre-training.
4. 🛠️ Methods: Conducted comprehensive benchmarking across multiple medical imaging tasks using linear probing, k-NN evaluation, and multiple instance learning, testing different model sizes (DINOv3-S/B/L) and input resolutions.
5. 📊 Results and Evaluation: DINOv3 showed strong performance on X-ray and CT tasks but struggled with specialized domains like pathology slides and PET scans, with inconsistent scaling benefits across different medical tasks and modalities.
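The linear-probing protocol used for the classification benchmarks can be sketched as fitting only a linear head on frozen encoder features. The closed-form ridge head and the synthetic features below are illustrative stand-ins, assuming nothing about the actual DINOv3 feature dimensions.

```python
import numpy as np

def fit_linear_probe(features, labels, l2=1e-3):
    """Fit a ridge-regression linear probe on frozen encoder features.

    Mirrors the evaluation protocol at a sketch level: the backbone
    (e.g. DINOv3) stays frozen, and only this linear head is trained
    on its output features.
    """
    X = np.hstack([features, np.ones((len(features), 1))])  # add bias column
    Y = np.eye(int(labels.max()) + 1)[labels]               # one-hot targets
    # Closed-form ridge solution: (X^T X + l2*I)^-1 X^T Y
    return np.linalg.solve(X.T @ X + l2 * np.eye(X.shape[1]), X.T @ Y)

def probe_accuracy(W, features, labels):
    X = np.hstack([features, np.ones((len(features), 1))])
    return float((np.argmax(X @ W, axis=1) == labels).mean())

# Synthetic "frozen features": the class is a threshold on feature 0,
# so a linear probe should recover it almost perfectly.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 16))
labels = (feats[:, 0] > 0).astype(int)
W = fit_linear_probe(feats, labels)
acc = probe_accuracy(W, feats, labels)
```

Because only the head is trained, probe accuracy directly measures how linearly separable the medical classes are in the frozen feature space, which is exactly the transfer question the paper investigates.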

Figure: DINOv3 Medical Vision Benchmark Workflow

- Medical imaging datasets: 2D classification (NIH-14, RSNA-Pneumonia, Camelyon16/17, BCNB); 3D classification (CT-RATE); 3D segmentation (MSD, CREMI, AC3/4, AutoPET-II, HECKTOR).
- Model variants: DINOv3-S (22M), DINOv3-B (86M), DINOv3-L (304M).
- Data preprocessing: grayscale to 3-channel conversion, slice-wise extraction for 3D volumes, WSI patch tiling.
- Task adaptation: classification via linear probing and k-NN (CT-RATE); multiple instance learning with attention-based aggregation (ABMIL) for WSIs; 3D segmentation via slice-wise feature extraction with a 3D decoder and segmentation head.
- Evaluation metrics: AUC, accuracy, F1 score; Dice score, HD95; VOI and ARAND for EM.
- Key findings: strong performance on X-ray and CT classification, outperforming medical-specific models; poor performance on WSI, EM, and PET domains due to large domain shift; scaling behavior is inconsistent in the medical setting and task-dependent.
Q1. What was the most surprising finding about DINOv3's performance scaling in medical imaging tasks?
- Larger models always performed better than smaller ones
- Performance did not reliably increase with larger models or higher resolutions
- The model only worked with low resolution medical images

Q2. In which type of medical imaging task did DINOv3 perform the worst?
- Chest X-ray classification
- CT scan analysis
- PET scan tumor segmentation

Q3. What makes DINOv3 particularly interesting as a medical imaging model?
- It was pre-trained specifically on medical images
- It outperformed medical-specific models despite being trained only on natural images
- It was designed exclusively for 3D medical image analysis