
Segment Any 3D-Part from a Sentence

Hongyu Wu1,†, Pengwan Yang1,†, Yuki M. Asano1,2,‡, Cees G. M. Snoek1,‡
1University of Amsterdam     2Technical University of Nuremberg

We propose the first large-scale 3D dataset with dense part annotations, based on an innovative approach for constructing 3D scene data with detailed part annotations (left). Using this new dataset, we introduce the 3D part understanding task and a method that enables flexible part segmentation and identification from any sentence query (right).

Abstract

This paper aims to achieve the segmentation of any 3D part based on natural language descriptions, extending beyond traditional object-level 3D scene understanding and addressing both data and methodological challenges. Existing datasets and methods are predominantly limited to object-level comprehension. To overcome the limitations of data availability, we introduce the first large-scale 3D dataset with dense part annotations, created through an innovative and cost-effective method for constructing synthetic 3D scenes with fine-grained part-level annotations, paving the way for advanced 3D part understanding. On the methodological side, we propose OpenPart3D, a 3D-input-only framework to effectively tackle the challenges of part-level segmentation. Extensive experiments demonstrate the superiority of our approach in open-vocabulary 3D understanding tasks at both the part and object levels, with strong generalization capabilities across various 3D datasets.

3D-PU Data


3D-PU dataset scene examples. Each 3D scene includes a point cloud, mesh, textured mesh, and detailed part annotations for all objects.
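As a rough illustration of this layout, the sketch below shows one way a 3D-PU scene and its part annotations could be represented in Python. All class and field names here (Scene3DPU, PartAnnotation, point_indices, and so on) are hypothetical; the page does not specify the dataset's actual on-disk format.

```python
# A minimal, hypothetical sketch of a 3D-PU scene record.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class PartAnnotation:
    object_id: int             # which object in the scene this part belongs to
    part_label: str            # e.g. "chair leg" (example label, not from the paper)
    point_indices: np.ndarray  # indices into the scene point cloud

@dataclass
class Scene3DPU:
    points: np.ndarray         # (N, 3) point cloud coordinates
    colors: np.ndarray         # (N, 3) per-point RGB
    mesh_path: str             # path to the untextured mesh file
    textured_mesh_path: str    # path to the textured mesh file
    parts: list[PartAnnotation] = field(default_factory=list)

def part_mask(scene: Scene3DPU, part: PartAnnotation) -> np.ndarray:
    """Return a boolean mask over the scene's points for one annotated part."""
    mask = np.zeros(len(scene.points), dtype=bool)
    mask[part.point_indices] = True
    return mask
```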

Method: OpenPart3D


First, the Room-Tour Snap Module captures multi-view images of the 3D scene by strategically positioning cameras at optimized locations and orientations. These images are then processed by a 2D open-vocabulary model to generate 2D part masks corresponding to the given text query. Finally, the View-Weighted 3D-Part Grouping Module integrates the 2D part masks across views, assigning an adaptive weight to each view, to extract geometrically consistent regions from the scene's point cloud and aggregate them into precise 3D parts.
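To make the fusion step concrete, here is a minimal sketch in Python of how weighted multi-view mask aggregation could work. The 2D segmentation model is passed in as a callable, the pixel-to-point maps are assumed to come from the renderer, and the per-view weight and final threshold are simple stand-ins for the paper's (unspecified) adaptive weighting scheme; every name below is hypothetical, not the authors' implementation.

```python
import numpy as np
from typing import Callable

def segment_part(
    points: np.ndarray,     # (N, 3) scene point cloud
    text_query: str,        # e.g. "the armrest of the sofa"
    views: list[dict],      # one dict per snapped view: {"image", "pix_to_point"}
    open_vocab_segment: Callable[[np.ndarray, str], np.ndarray],
) -> np.ndarray:
    """Fuse per-view 2D part masks into a single 3D part mask over `points`."""
    votes = np.zeros(len(points))
    norm = np.zeros(len(points))
    for view in views:  # views produced by the camera-placement stage
        # The 2D open-vocabulary model returns an (H, W) boolean part mask.
        mask_2d = open_vocab_segment(view["image"], text_query)
        # Stand-in for the adaptive view weight: fraction of the image the
        # part covers, so views that see the part clearly count for more.
        w = float(mask_2d.mean()) + 1e-6
        p2p = view["pix_to_point"]   # (H, W) int map to point indices; -1 = no hit
        visible = p2p >= 0
        idx = p2p[visible]
        # np.add.at accumulates correctly when several pixels hit one point.
        np.add.at(votes, idx, w * mask_2d[visible].astype(float))
        np.add.at(norm, idx, w)
    score = np.divide(votes, norm, out=np.zeros_like(votes), where=norm > 0)
    return score > 0.5               # decision threshold is an assumption
```

Accumulating weighted votes per point and then normalizing makes the result robust to views where the part is occluded or missed by the 2D model, which is the intuition behind weighting views rather than treating them equally.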

Visualization (3D-PU)


Visualization (Other datasets)


BibTeX