2025-11-18 Papers


Paper 1

P1: Mastering Physics Olympiads with Reinforcement Learning

Published: 2025-11-17

Link: http://arxiv.org/pdf/2511.13612

1. 📘 Topic and Domain: Development of large language models (P1) specialized in physics reasoning and solving Physics Olympiad problems, in the domain of artificial intelligence and scientific reasoning.
2. 💡 Previous Research and New Ideas: Building on recent advances in LLMs for scientific reasoning, the paper introduces reinforcement learning techniques for physics problem-solving and proposes a multi-stage training framework with adaptive learnability adjustment.
3. ❓ Problem: Addresses the challenge of developing open-source language models capable of mastering complex physics problems at the Olympiad level, requiring deep scientific reasoning rather than simple pattern matching.
4. 🛠️ Methods: Employs reinforcement learning with Group Sequence Policy Optimization (GSPO), adaptive learnability adjustment, and test-time scaling through an agentic framework called PhysicsMinions.
5. 📊 Results and Evaluation: P1-235B-A22B achieved gold-medal performance at IPhO 2025 and gold-medal scores in 12 of the 13 competitions evaluated, while P1-30B-A3B reached silver-medal performance, surpassing most open-source models; combining P1 with PhysicsMinions yields further improvements.
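The GSPO update at the heart of the training recipe can be sketched in a few lines. This is an illustrative reimplementation of the published GSPO idea (sequence-level, length-normalized importance ratios with group-normalized advantages and clipping), not the authors' code; the function name and argument layout are our own.

```python
import math

def gspo_objective(seq_logps_new, seq_logps_old, seq_lens, rewards, eps=0.2):
    """Sketch of a Group Sequence Policy Optimization (GSPO) objective.

    seq_logps_*: summed token log-probs of each sampled response under the
    new / behavior policy; seq_lens: response lengths in tokens; rewards:
    binary rewards r in {0, 1} for the group sampled from one problem.
    """
    G = len(rewards)
    # Group-normalized advantage (relative to the group's mean reward).
    mean_r = sum(rewards) / G
    std_r = (sum((r - mean_r) ** 2 for r in rewards) / G) ** 0.5
    objs = []
    for lp_new, lp_old, n, r in zip(seq_logps_new, seq_logps_old,
                                    seq_lens, rewards):
        adv = (r - mean_r) / (std_r + 1e-8)
        # Sequence-level importance ratio, length-normalized:
        # s = (pi_new(y|x) / pi_old(y|x)) ** (1 / |y|)
        s = math.exp((lp_new - lp_old) / n)
        clipped = max(min(s, 1 + eps), 1 - eps)
        objs.append(min(s * adv, clipped * adv))
    return sum(objs) / G  # maximize this (negate for a loss)
```

With identical old and new policies the ratios are all 1 and the group-mean baseline makes the objective vanish, which is a quick sanity check on any implementation.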


Figure overview:

- Physics dataset: 5,065 problems (81% Olympiads, 19% textbooks) with rule-verifiable answers and expert solutions.
- RL formulation: MDP (S, A, P, r) with binary reward r ∈ {0, 1}; policy-gradient optimization via the GSPO algorithm.
- Multi-stage RL training: Stage 1 (G=16, 48k tokens), Stage 2 (G=32, 48k tokens), Stage 3 (G=32, 64k tokens), Stage 4 (235B only, 80k tokens).
- Adaptive learnability adjustment: pass-rate filtering (0 < pass ≤ 0.7), group-size expansion, generation-window expansion.
- Training stabilization: truncated importance sampling (TIS), train-inference mismatch mitigation.
- P1 models: P1-30B-A3B (silver medal), P1-235B-A22B (gold medal).
- PhysicsMinions agentic framework: Logic Studio (solver + introspector), Review Studio (physics + general verifier), iterative refinement (CV=2).
- Evaluation on the HiPhO benchmark (13 physics Olympiads, 2024-2025):
  - P1-235B-A22B: 21.2/30 at IPhO 2025; 12 gold, 1 silver; first open-source gold medal; rank #3 globally.
  - P1-30B-A3B: 18.5/30 at IPhO 2025; 8 gold, 4 silver, 1 bronze; silver-medal performance; rank #8 overall.
  - P1 + PhysicsMinions: 23.2/30 at IPhO 2025; overall #1 position, ahead of Gemini-2.5-Pro and GPT-5; average score 38.4.
- Generalizability: superior performance on math, STEM, and coding tasks vs. base models; transferable reasoning.
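The adaptive learnability adjustment listed above keeps only problems in a "learnable" pass-rate band. A minimal sketch of that filtering step (function and argument names are ours):

```python
def filter_by_learnability(problems, pass_rates, low=0.0, high=0.7):
    """Keep only problems in the 'learnable' band: the policy sometimes
    succeeds (pass rate > low) but has not saturated (pass rate <= high).

    Problems the model always fails give no positive signal under a
    binary reward; problems it nearly always solves give little gradient.
    """
    return [p for p, rate in zip(problems, pass_rates)
            if low < rate <= high]
```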
Q1
1. What unique approach did P1 take in its training process compared to other language models?
- It was trained exclusively using traditional supervised learning
- It was trained entirely through reinforcement learning with adaptive learnability adjustment
- It was trained using a combination of supervised learning and imitation learning
Q2
2. When the P1-235B-A22B model was integrated with PhysicsMinions, what was its most significant achievement?
- It reached bronze medal performance in Physics Olympiads
- It matched human performance but couldn't exceed it
- It achieved the overall No. 1 position across all models, outperforming leading closed-source models
Q3
3. What unexpected benefit was discovered about P1's training approach?
- It improved the model's performance only in physics
- It made the model perform worse in other domains
- It enhanced the model's reasoning abilities across multiple domains, including mathematics and coding

Paper 2

Part-X-MLLM: Part-aware 3D Multimodal Large Language Model

Published: 2025-11-17

Link: http://arxiv.org/pdf/2511.13647

1. 📘 Topic and Domain: A part-aware 3D multimodal large language model (Part-X-MLLM) for understanding and manipulating 3D shapes at the part level in computer vision and graphics.
2. 💡 Previous Research and New Ideas: Previous research focused on scene-level 3D understanding or holistic shape generation, while this paper proposes a unified framework that treats parts as first-class citizens and provides a single executable interface for part-based operations.
3. ❓ Problem: Addresses the lack of a native language model that can understand, name, and manipulate 3D object parts while providing precise spatial grounding and executable programs for downstream geometry tasks.
4. 🛠️ Methods: Uses a dual-encoder architecture (structure + semantics) with an autoregressive decoder to generate structured programs containing part-level bounding boxes and edit commands, followed by specialized geometry modules for execution.
5. 📊 Results and Evaluation: Achieved superior performance on UniPart-Bench across 11 task families, with significant improvements in bounding box generation (74.11% voxel recall, 48.74% voxel IoU) and consistent gains in part-level Q&A and grounding tasks over baseline models.
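The voxel recall and IoU figures above compare predicted and ground-truth occupancy. A minimal sketch of how such metrics can be computed from sets of occupied voxel indices (our own helper, not the benchmark's evaluation code):

```python
def voxel_recall_iou(pred_voxels, gt_voxels):
    """Compute (recall, IoU) between two occupancy grids given as
    iterables of occupied (x, y, z) voxel index tuples."""
    pred, gt = set(pred_voxels), set(gt_voxels)
    inter = len(pred & gt)
    recall = inter / len(gt) if gt else 0.0          # fraction of GT covered
    union = len(pred | gt)
    iou = inter / union if union else 0.0            # overlap / union
    return recall, iou
```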


Figure overview:

- Input data: point clouds of shape (40960, 6) with XYZ + normals and (10240, 6) with XYZ + RGB, plus a natural-language prompt.
- Structure encoder: XYZ + normals → geometry features; semantic encoder: XYZ + RGB → appearance features; the two streams are fused.
- Autoregressive decoder: Qwen2.5-VL model generating structured programs with <box> and <edit> tokens.
- Training: Stage 1 geometry pretraining (3.6M objects, 10 epochs, bounding-box prediction task, structure encoder only); Stage 2 full instruction tuning (dual encoder + LLM, 85,771 annotated objects, 11 task families).
- Part-aware generation: bounding-box-guided synthesis, OmniPart backend, high-fidelity meshes, semantic granularity control.
- Grounded Q&A: bounding-box token embedding, persistent references, part-level reasoning, spatial understanding.
- Localized editing: cuboid mask generation, Nano3D/VoxHammer backends, add/delete/modify operations, precise spatial control.
- Advanced applications: confidence-aware face segmentation, part clustering, hierarchical structure.
- Structured-planning grammar: <boxs>...<boxe> for bounding boxes, <adds>/<dels>/<mods> for edit operations, quantized coordinates (128 bins).
- UniPart-Bench: 30k entries across 11 task families; bounding-box IoU 42.55%, voxel recall 74.11%, SBERT 78.98.
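The grammar above quantizes box coordinates into 128 discrete bins so they can be emitted as tokens. A sketch of the round trip, assuming coordinates normalized to [0, 1] (the paper's exact binning convention may differ):

```python
def quantize(coord, bins=128):
    """Map a normalized coordinate in [0, 1] to a discrete bin index
    in {0, ..., bins - 1}, suitable for emission as a coordinate token."""
    return min(int(coord * bins), bins - 1)

def dequantize(idx, bins=128):
    """Recover the bin-center coordinate from a bin index."""
    return (idx + 0.5) / bins
```

With 128 bins the round-trip error is bounded by half a bin width (1/256), which is why coarse token grids can still ground boxes usefully.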
Q1
1. What is the key innovation in Part-X-MLLM's architecture that helps it better understand both structure and appearance of 3D objects?
- A single unified encoder for all features
- A dual-encoder design separating structure and semantics
- A triple-encoder system with dedicated texture analysis
Q2
2. How does Part-X-MLLM handle different levels of part detail in 3D objects?
- By using fixed predefined part categories
- Through manual user intervention
- By clustering part bounding boxes based on text similarity
Q3
3. What percentage improvement did Part-X-MLLM achieve in voxel recall compared to the PartField baseline?
- About 4.5%
- About 5.8%
- About 7.1%

Paper 3

PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image

Published: 2025-11-17

Link: http://arxiv.org/pdf/2511.13648

1. 📘 Topic and Domain: Generating simulation-ready 3D physical assets from single images, in the domain of computer vision and 3D graphics.
2. 💡 Previous Research and New Ideas: Based on previous 3D generation and physical modeling research, introduces a novel VLM-based approach with a new efficient 3D representation that reduces tokens by 193x while maintaining structural information.
3. ❓ Problem: Existing 3D generation methods lack physical and articulation properties needed for simulation, limiting their use in embodied AI applications.
4. 🛠️ Methods: Uses a multi-round VLM conversation to generate physical descriptions and geometry, with a controllable flow transformer for fine-grained details, and introduces PhysX-Mobility dataset with rich physical annotations.
5. 📊 Results and Evaluation: Achieves superior performance across geometric and physical metrics compared to state-of-the-art methods, with 99% improvement in absolute scale accuracy and strong generalization to real-world images.
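To make "simulation-ready" concrete: the pipeline exports URDF, the XML robot-description format consumed by common physics simulators. Below is a toy emitter for that format; it is not the paper's exporter, and the link/joint names in the usage are illustrative only.

```python
def make_urdf(object_name, parts, joints):
    """Emit a minimal URDF string for an articulated object.

    parts: list of link names; joints: list of
    (name, parent, child, joint_type) tuples, e.g. a revolute lid hinge.
    Real assets would also carry geometry, inertia, and joint limits.
    """
    lines = [f'<robot name="{object_name}">']
    for link in parts:
        lines.append(f'  <link name="{link}"/>')
    for name, parent, child, jtype in joints:
        lines.append(f'  <joint name="{name}" type="{jtype}">')
        lines.append(f'    <parent link="{parent}"/>')
        lines.append(f'    <child link="{child}"/>')
        lines.append('  </joint>')
    lines.append('</robot>')
    return "\n".join(lines)

urdf = make_urdf("box", ["base", "lid"],
                 [("hinge", "base", "lid", "revolute")])
```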


Figure overview:

- Input: a single real-world image.
- VLM: multi-round conversation with a fine-tuned Qwen2.5.
- Physical representation: overall information (structure, scale, materials) plus geometry information on a 32³ voxel grid (193x compression).
- Controllable flow transformer: fine-grained geometry synthesis.
- Physical representation decoder: mesh segmentation and format conversion.
- Simulation-ready outputs: 3D meshes, URDF files, XML files, 3D Gaussians, radiance fields, part-level meshes.
- PhysX-Mobility dataset: 47 categories, 2K+ objects, rich physical annotations, 2x category expansion.
- Applications: robotic policy learning, physics simulation, embodied AI.
- Key innovations: 193x token compression, no special tokens, VLM-based pipeline.
- Evaluation: PhysX-Mobility benchmark, in-the-wild testing, user studies.
- Pipeline: voxel tokenization → multi-round dialogue → flow transformer → format decoder → URDF/XML export.
- Performance highlights: 99% improvement in absolute scale, direct simulator deployment, contact-rich manipulation tasks.
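The token compression comes from representing geometry on a coarse 32³ occupancy grid and merging neighboring occupied indices. A sketch of such run-length merging over flattened voxel indices (our own illustration of the idea; the paper's exact merging scheme may differ):

```python
def merge_runs(occupied_indices):
    """Merge consecutive flattened voxel indices into (start, length) runs,
    so a long contiguous occupied stretch costs one (start, length) pair
    instead of one token per voxel."""
    runs = []
    for idx in sorted(occupied_indices):
        if runs and idx == runs[-1][0] + runs[-1][1]:
            # Extends the current run by one voxel.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((idx, 1))
    return runs
```

Solid objects voxelized at 32³ are dominated by long occupied runs, which is where the bulk of the savings comes from.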
Q1
1. What is the main innovation in PhysX-Anything's token compression strategy?
- Using a new tokenizer and special tokens
- Converting meshes to a 32³ voxel grid and merging neighboring indices
- Applying vertex quantization only
Q2
2. How does PhysX-Anything generate the geometric information for different parts of an object?
- Generates all parts simultaneously using parallel processing
- Uses the previous part's information to generate the next part
- Generates each part independently based only on shared overall information
Q3
3. What unique capability does PhysX-Anything demonstrate in the experimental results?
- Fastest processing speed among all 3D generation methods
- Direct deployment in physics simulators for robotic policy learning
- Perfect photorealistic rendering of 3D objects