Rectified Point Flow:
Generic Point Cloud Pose Estimation

Tao Sun¹*, Liyuan Zhu¹*, Shengyu Huang², Shuran Song¹, Iro Armeni¹
¹Stanford University, ²NVIDIA Research
*Equal contribution

TL;DR: A point cloud generative model that turns unposed parts into assembled shapes.

The model samples new assemblies each time. They show meaningful variation, particularly for symmetric parts.


Abstract

We introduce Rectified Point Flow, a unified parameterization that formulates pairwise point cloud registration and multi-part shape assembly as a single conditional generative problem. Given unposed point clouds, our method learns a continuous point-wise velocity field that transports noisy points toward their target positions, from which part poses are recovered. In contrast to prior work that regresses part-wise poses with ad-hoc symmetry handling, our method intrinsically learns assembly symmetries without symmetry labels.

Together with a self-supervised encoder focused on overlapping points, our method achieves new state-of-the-art performance on six benchmarks spanning pairwise registration and shape assembly. Notably, our unified formulation enables effective joint training on diverse datasets, which facilitates the learning of shared geometric priors and consequently boosts accuracy.
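As a rough illustration of how a point-wise velocity field can transport noisy points toward their targets, here is a minimal Euler-integration sketch. It is not the authors' implementation. It assumes the time convention in which \(t = 1\) is Gaussian noise and \(t = 0\) is the assembled state, and a straight-line path between the two. The names `euler_sample` and `oracle_velocity` are hypothetical; in practice the velocity would come from a trained network conditioned on the unposed parts.

```python
import numpy as np

def euler_sample(velocity_fn, noise, num_steps=20):
    """Integrate dx/dt = v(x, t) from t = 1 (noise) down to t = 0 (target)."""
    x = np.array(noise, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = 1.0 - k * dt
        # Step backward in time along the predicted velocity.
        x = x - dt * velocity_fn(x, t)
    return x

# Toy stand-in for the learned network (assumption): on a straight-line path
# x(t) = (1 - t) * target + t * noise, the velocity is the constant noise - target.
rng = np.random.default_rng(0)
target = rng.uniform(-1.0, 1.0, size=(256, 3))   # pretend assembled points
noise = rng.standard_normal((256, 3))            # Gaussian starting points

def oracle_velocity(x, t):
    return noise - target

assembled = euler_sample(oracle_velocity, noise, num_steps=10)
```

With this oracle the integration lands on `target` up to floating-point error. A learned field is only approximately straight, so the number of steps becomes a trade-off between accuracy and speed.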

Framework

Rectified Point Flow supports shape assembly and pairwise registration in a single framework. Given a set of unposed part point clouds \(\{\bar{X}_i\}_{i\in\Omega}\), it predicts each part's point cloud at the target assembled state, \(\{\hat{X}_i(0)\}_{i\in\Omega}\). We then solve the Procrustes problem via SVD between the condition point cloud \(\bar X_i\) and the estimated point cloud \(\hat X_i(0)\) to recover the rigid transformation \(\hat T_i\) for each non-anchored part.
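The pose-recovery step can be sketched with the standard SVD-based (Kabsch) solution to the orthogonal Procrustes problem. This is a generic sketch, not the authors' code: it assumes point-to-point correspondence between the unposed part and its predicted assembled points, and the function name `fit_rigid_transform` is hypothetical.

```python
import numpy as np

def fit_rigid_transform(source, target):
    """Least-squares rigid (R, t) such that target ≈ source @ R.T + t.

    source, target: (N, 3) arrays with row-wise correspondence.
    """
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (source - src_centroid).T @ (target - tgt_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that det(R) = +1 (rotation, not reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centroid - R @ src_centroid
    return R, t
```

For example, with `X_bar` as an unposed part and `X_hat` as its predicted points at the assembled state, `R, t = fit_rigid_transform(X_bar, X_hat)` gives an estimate of that part's pose. Because the fit is a least-squares projection onto rigid motions, it also absorbs small per-point errors in the prediction.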

Rectified Point Flow method overview figure

Multi-part Shape Assembly

We evaluate our method on the multi-part shape assembly task, where the goal is to estimate the poses of multiple parts given their unposed point clouds.

Columns show objects with an increasing number of parts (left to right). Rows display (1) colored input point clouds of each part, (2) GARF outputs (dashed boxes indicate samples limited to 20 parts by GARF’s design, for which the top 20 parts by volume are selected), (3) Rectified Point Flow outputs, and (4) ground-truth assemblies. Compared to GARF, our method estimates poses more accurately for most parts, especially as the number of parts increases.

Comparison with other methods

Linear Interpolation in Noise Space

We visualize the linear interpolation in the noise space by generating the assembled point cloud from \( Z(s) \), where \( Z(s) \) interpolates linearly between two Gaussian noise vectors \( Z_0 \) and \( Z_1 \). We observe a continuous, semantically meaningful mapping from Gaussian noise to valid assemblies.
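A minimal sketch of the noise-space interpolation \(Z(s) = (1 - s) Z_0 + s Z_1\) follows. The point count and the five-step sweep are arbitrary choices for illustration, and `interpolate_noise` is a hypothetical helper; each interpolated noise sample would then be passed to the generative model in place of a freshly drawn one.

```python
import numpy as np

def interpolate_noise(z0, z1, s):
    """Linear interpolation Z(s) = (1 - s) * Z0 + s * Z1 for s in [0, 1]."""
    return (1.0 - s) * z0 + s * z1

rng = np.random.default_rng(0)
z0 = rng.standard_normal((1024, 3))  # first Gaussian noise sample
z1 = rng.standard_normal((1024, 3))  # second Gaussian noise sample

# Sweep s from 0 to 1; the endpoints reproduce Z0 and Z1 exactly.
sweep = [interpolate_noise(z0, z1, s) for s in np.linspace(0.0, 1.0, 5)]
```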

Part Interchanging

Panels: generated from \(Z_0\); generated from \(Z(s) = (1 - s) Z_0 + s Z_1\); generated from \(Z_1\).

Structural Changing

Panels: generated from \(Z_0\); generated from \(Z(s) = (1 - s) Z_0 + s Z_1\); generated from \(Z_1\).

Generalization to Unseen Assemblies

Parts from Same Categories

We test the model’s ability to generalize to unseen assemblies composed of parts from two different objects within the same category. Our results show that the model captures the underlying geometry of the category and can successfully re-target parts to construct a coherent shape belonging to that category.

Comparison with other methods

Parts from Different Categories

Surprisingly, our method can also generalize to certain parts from different categories, which is particularly challenging. This indicates that the model can reason about part compositionality and re-target parts to produce a plausible final shape, even when some parts originate from completely different categories.

Comparison with other methods

Concurrent Works

We are pleased to see several concurrent works that explore flow matching for pose estimation. Check them out as well!
- GARF: Learning Generalizable 3D Reassembly for Real-World Fractures combines fracture-aware pretraining with a flow matching model to predict SE(3) poses for parts.
- Equivariant Flow Matching for Point Cloud Assembly handles part symmetry as our method does, but uses an equivariant flow model built on top of an SE(3)-equivariant encoder.

BibTeX

@inproceedings{sun2025_rpf,
      author = {Sun, Tao and Zhu, Liyuan and Huang, Shengyu and Song, Shuran and Armeni, Iro},
      title = {Rectified Point Flow: Generic Point Cloud Pose Estimation},
      booktitle = {arXiv preprint arXiv:2506.05282},
      year = {2025},
}