Robots Learn to “See” With Language in Real Time Using 3D Gaussian Splatting

This article presents a robotic system that incrementally reconstructs large indoor environments using multi-camera Gaussian Splatting and global bundle adjustment, while embedding language-aligned features to localize objects from open-vocabulary natural-language queries in real time.

