1. 📘 Topic and Domain: The paper focuses on improving reinforcement learning exploration strategies for Large Language Models (LLMs) in mathematical reasoning tasks.
2. 💡 Previous Research and New Ideas: Building on policy-optimization methods such as PPO and GRPO, the paper introduces FR3E, a novel framework that adapts the "First Return, Then Explore" principle to LLMs and pairs it with entropy-based exploration.
3. ❓ Problem: The paper addresses unstable exploration and ineffective credit assignment in Reinforcement Learning from Verifiable Rewards (RLVR) for LLMs during mathematical reasoning tasks.
4. 🛠️ Methods: FR3E identifies high-uncertainty decision points in reasoning trajectories, performs targeted rollouts from those points, and uses entropy-based signals to guide exploration while preserving semantic coherence.
5. 📊 Results and Evaluation: FR3E achieves improved performance across multiple mathematical reasoning benchmarks, exhibiting more stable training dynamics, longer coherent responses, and a higher proportion of fully correct solutions than baselines such as GRPO++.
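The core of the methods point, locating high-entropy (high-uncertainty) positions in a generated reasoning trajectory to serve as restart points for targeted rollouts, can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function names, the top-k selection rule, and the use of raw per-token logits are all assumptions made for clarity.

```python
import numpy as np

def token_entropy(logits):
    """Shannon entropy of the softmax distribution at each token position.

    `logits` has shape (seq_len, vocab_size); returns shape (seq_len,).
    """
    p = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def select_restart_points(trajectory_logits, k=3):
    """Pick the k highest-entropy positions as candidate restart points.

    Hypothetical helper: FR3E would then launch targeted rollouts from
    each returned position rather than always restarting from the prompt.
    Returns position indices, most uncertain first.
    """
    h = token_entropy(trajectory_logits)
    return np.argsort(h)[-k:][::-1].tolist()

# Example: a short synthetic trajectory over a 5-token vocabulary.
rng = np.random.default_rng(0)
logits = rng.normal(size=(10, 5))
restart_points = select_restart_points(logits, k=3)
```

The intuition is that low-entropy positions are already settled, so exploration budget is better spent branching where the policy is genuinely uncertain.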