TriSplat

Simulation-Ready Feed-Forward 3D Scene Reconstruction

Weijie Wang¹,* Zimu Li¹,* Jinchuan Shi¹ Zeyu Zhang¹ Botao Ye²,³ Marc Pollefeys²,⁴ Donny Y. Chen⁵ Bohan Zhuang¹

¹ Zhejiang University ² ETH Zurich ³ ETH AI Center ⁴ Microsoft ⁵ Monash University

* Equal contribution.

Why triangle primitives?

Sparse-view 3D reconstruction is increasingly handled by feed-forward splatting networks, but Gaussian primitives expose surfaces only indirectly. Turning them into meshes still requires expensive TSDF fusion or Poisson reconstruction, which breaks the feed-forward promise and makes downstream simulation cumbersome.

TriSplat represents scenes with oriented triangle primitives. It predicts local 3D point maps, triangle attributes, camera poses, and optional intrinsics from sparse inputs, then anchors triangle orientation to geometry normals refined with image-conditioned cues and a mono-normal bootstrap schedule. Because the rendering primitives are already triangles, the output can be loaded directly into physics engines, collision detectors, and standard renderers.

Interactive Demo

Drive multiple agents on exported mesh surfaces and watch the active agent through a live first-person camera. For GitHub Pages delivery, this demo uses edge-cropped and randomly sampled web mesh chunks, so web-only holes may appear; original TriSplat exports keep the full surface density.

Controls: use WASD or the arrow keys to drive, and Q/E to rotate the active agent. Add agents, then select a roster chip to switch control. Choose a scene or mark the current position.

Exported Triangle Mesh Gallery

These viewers load edge-cropped and randomly sampled web chunks for GitHub Pages. Offline exports keep the complete triangle surfaces without the web-only holes.

Triangle-native reconstruction pipeline

TriSplat couples point-map geometry, normal-anchored local frames, and progressive surface sharpening so hard-edged triangle primitives can train stably and export cleanly.

Figure: TriSplat model pipeline. From sparse unposed images, the network predicts geometry, camera pose, and oriented triangle attributes for direct mesh rendering and export.

Point maps and poses

A DINOv2-backed transformer decoder predicts dense local 3D point maps, relative camera poses, optional intrinsics, and per-pixel triangle attributes.
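As a rough illustration of how local point maps and relative poses fit together, the sketch below moves one view's point map into a shared reference frame. It assumes each pose is a 4x4 rigid camera-to-reference matrix; that convention, and the name `to_reference_frame`, are assumptions for illustration and are not taken from the TriSplat code.

```python
import numpy as np

def to_reference_frame(local_points: np.ndarray, cam_to_ref: np.ndarray) -> np.ndarray:
    """Map an (H, W, 3) local point map into the reference frame.

    cam_to_ref is assumed to be a 4x4 rigid transform (hypothetical convention).
    """
    rotation = cam_to_ref[:3, :3]
    translation = cam_to_ref[:3, 3]
    # Row-vector form of p_ref = R @ p_local + t, applied to every pixel.
    return local_points @ rotation.T + translation

# Usage: a 90 degree rotation about z followed by a shift along x.
pose = np.eye(4)
pose[:3, :3] = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
pose[:3, 3] = [2.0, 0.0, 0.0]
local = np.zeros((2, 2, 3))
local[..., 0] = 1.0  # every pixel sees the point (1, 0, 0)
shared = to_reference_frame(local, pose)
```

Points from several views transformed this way land in one coordinate system, which is what lets per-pixel triangles from different images form a single scene.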

Normal-anchored triangles

Finite-difference geometry normals are refined by an image-conditioned normal head and converted into tangent frames that orient each triangle primitive.
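The geometric half of this step can be sketched with NumPy: normals come from the cross product of finite differences over the point map, and a tangent frame is built around each normal. This is a minimal sketch under assumptions, not the authors' implementation; the learned image-conditioned refinement is omitted, the normal's sign depends on the coordinate convention, and the function names are hypothetical.

```python
import numpy as np

def finite_difference_normals(points: np.ndarray) -> np.ndarray:
    """Estimate unit normals from an (H, W, 3) point map."""
    du = np.gradient(points, axis=1)  # change along image columns
    dv = np.gradient(points, axis=0)  # change along image rows
    normals = np.cross(du, dv)
    length = np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals / np.clip(length, 1e-12, None)

def tangent_frames(normals: np.ndarray):
    """Return unit tangent and bitangent fields with tangent x bitangent = normal."""
    # Pick a helper axis that is never parallel to the normal.
    use_z = np.abs(normals[..., 2:3]) < 0.9
    helper = np.where(use_z, [0.0, 0.0, 1.0], [1.0, 0.0, 0.0])
    tangent = np.cross(helper, normals)
    tangent /= np.linalg.norm(tangent, axis=-1, keepdims=True)
    bitangent = np.cross(normals, tangent)
    return tangent, bitangent

# Usage: a flat point map lying in the z = 0 plane.
u, v = np.meshgrid(np.arange(5.0), np.arange(4.0))
plane = np.stack([u, v, np.zeros_like(u)], axis=-1)
normals = finite_difference_normals(plane)
tangent, bitangent = tangent_frames(normals)
```

The tangent and bitangent span the plane in which a triangle's three vertices would be placed, so the primitive's orientation follows the estimated surface rather than being a free parameter.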

Progressive sharpening

Opacity and edge blur schedules start with forgiving soft footprints and gradually converge to crisp, mesh-ready surface elements.
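One way to picture such a schedule is an annealed blur width that controls how softly a triangle's edge fades. The sketch below is illustrative only: the cosine schedule, the sigmoid edge falloff, and the start and end values are all assumptions, since the page does not specify the schedules TriSplat actually uses.

```python
import math

def anneal(step: int, total_steps: int, start: float, end: float) -> float:
    """Cosine interpolation from start (step 0) to end (step >= total_steps)."""
    t = min(max(step / total_steps, 0.0), 1.0)
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * t))

def edge_weight(signed_distance: float, blur: float) -> float:
    """Soft inside/outside indicator; positive distances are inside the triangle."""
    x = signed_distance / blur
    # Numerically stable sigmoid so a tiny blur does not overflow exp().
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# Usage: hypothetical blur values, wide early in training and narrow late.
early_blur = anneal(0, 1000, start=0.1, end=0.001)
late_blur = anneal(1000, 1000, start=0.1, end=0.001)
soft = edge_weight(-0.01, early_blur)   # slightly outside, still visible
crisp = edge_weight(-0.01, late_blur)   # slightly outside, nearly zero
```

With a wide blur, pixels just outside a triangle still receive gradient, which is the "forgiving" regime; as the blur shrinks, the footprint approaches a hard-edged triangle.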

Direct mesh export

Low-opacity triangles are discarded, face winding is corrected, nearby vertices are merged, and the native triangle primitives become a standard mesh.
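The three export steps can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the TriSplat exporter: the thresholds `opacity_min` and `merge_tol` are hypothetical, winding is checked against a per-triangle reference normal, and vertices are merged by snapping to a grid, which can miss pairs that straddle a cell boundary.

```python
import numpy as np

def export_mesh(tri_vertices, opacity, ref_normals, opacity_min=0.5, merge_tol=1e-4):
    """Turn (T, 3, 3) triangle primitives into indexed vertices and faces."""
    # 1. Discard low-opacity triangles.
    keep = opacity >= opacity_min
    tris = tri_vertices[keep].copy()
    ref = ref_normals[keep]

    # 2. Flip winding where the face normal disagrees with the reference normal.
    face_n = np.cross(tris[:, 1] - tris[:, 0], tris[:, 2] - tris[:, 0])
    flip = np.einsum("ij,ij->i", face_n, ref) < 0
    tris[flip] = tris[flip][:, [0, 2, 1]]

    # 3. Merge nearby vertices by quantizing positions to a grid of size merge_tol.
    flat = tris.reshape(-1, 3)
    keys = np.round(flat / merge_tol).astype(np.int64)
    _, first, inverse = np.unique(keys, axis=0, return_index=True, return_inverse=True)
    vertices = flat[first]
    faces = inverse.reshape(-1, 3)

    # Drop faces that collapsed to an edge or a point after merging.
    valid = (faces[:, 0] != faces[:, 1]) & (faces[:, 1] != faces[:, 2]) & (faces[:, 0] != faces[:, 2])
    return vertices, faces[valid]

# Usage: two triangles forming a unit square (one wound backwards) plus a faint outlier.
tri_vertices = np.array([
    [[0, 0, 0], [1, 0, 0], [1, 1, 0]],
    [[0, 0, 0], [0, 1, 0], [1, 1, 0]],
    [[5, 5, 5], [6, 5, 5], [6, 6, 5]],
], dtype=float)
opacity = np.array([0.9, 0.8, 0.1])
ref_normals = np.tile([0.0, 0.0, 1.0], (3, 1))
vertices, faces = export_mesh(tri_vertices, opacity, ref_normals)
```

The returned `vertices` and `faces` arrays are the indexed form that common mesh formats and physics engines expect, so they can be written out with any standard mesh writer.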

Fast feed-forward reconstruction without mesh post-processing

The interactive viewers above already show the quality of the exported meshes, so this section reports only the paper's inference-efficiency result.

Figure: Efficiency comparison for feed-forward mesh reconstruction across input-view counts. TriSplat exports simulation-ready triangle meshes in under 1.3 seconds, while Gaussian-to-mesh baselines require tens to hundreds of seconds.

BibTeX

@techreport{wang2026trisplat,
  title        = {TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction},
  author       = {Wang, Weijie and Li, Zimu and Shi, Jinchuan and Zhang, Zeyu and Ye, Botao and Pollefeys, Marc and Chen, Donny Y. and Zhuang, Bohan},
  institution  = {Zhejiang University},
  year         = {2026},
  month        = {May},
  url          = {https://lhmd.top/trisplat}
}