LEGS Trains 3.5x Faster Than LERF in Large-Scale Indoor Mapping | HackerNoon

News Room · Published 26 February 2026 (last updated 2:54 PM)

Table of Links

  • Abstract
  • Introduction
  • Related work
  • Problem statement
  • Methods
  • Experiments
  • Limitations
  • Conclusion

LIMITATIONS

We assume a static environment in which objects do not move during traversal. This limits the scope of this work, as many applications involve dynamic scenes with moving objects; in future work, we will adapt our method to handle them. The motion of the Fetch mobile base can have a large effect on LEGS reconstruction quality: high stiction between the robot’s caster wheels and the environment introduces jolts, causing camera pose inaccuracies and image blur. In the future, we hope to correct this with a new mobile base whose trajectory is autonomously determined by a frontier-based exploration algorithm.

Although autonomous navigation and obstacle avoidance have been extensively studied [57], obstacles can pose a problem for the 3D Gaussian map if they are visible in only a few of the ground-truth images. 3D Gaussians are initialized at the deprojected points from these few images, but there are not enough views to refine and properly train them; the result is oddly colored floaters that obstruct parts of the static scene. When performing natural language queries, LEGS inherits the limitations of LERF-style CLIP distillation into 3D described in similar works [1]. In our experiments, we find that a large-scale environment brings additional challenges in querying, particularly with 1) small or far-field objects in the training views and 2) similar item-background color features, such as white objects on a white background. Language-embedded Gaussian splats can also produce false positives when querying an object that is not in the scene: visually or semantically similar objects may be incorrectly matched to the query.
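The floater failure mode above stems from how Gaussians are seeded: a pixel with an observed depth is back-projected through the camera intrinsics into a 3D point, and a Gaussian initialized there can only be refined if enough other views see it. A minimal pinhole-model sketch of that deprojection step (the intrinsics and function name here are illustrative, not the LEGS implementation):

```python
import numpy as np

def deproject(u, v, depth, K):
    """Back-project pixel (u, v) with metric depth into a camera-frame 3D
    point via the pinhole model: X = depth * K^{-1} [u, v, 1]^T."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Hypothetical intrinsics for a 640x480 camera.
K = np.array([[600.0,   0.0, 320.0],
              [0.0,   600.0, 240.0],
              [0.0,     0.0,   1.0]])

# A pixel at the principal point lands on the optical axis.
p = deproject(320.0, 240.0, 2.0, K)
print(p)  # [0. 0. 2.]
```

A Gaussian seeded this way from only one or two views keeps roughly this initial position and a poorly constrained color, which is what appears as a floater in the rendered scene.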

CONCLUSION

In this work, we introduce Language-Embedded Gaussian Splats (LEGS), a system that trains Gaussian Splats online with CLIP embeddings for large-scale indoor scenes. Because pose error accumulates over large scenes, we use incremental bundle adjustment to improve pose fidelity for Gaussian Splat training. Results suggest LEGS trains 3.5x faster than LERF with comparable object recall.
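The open-vocabulary querying that LEGS enables can be pictured as a cosine-similarity search between a text embedding and per-Gaussian feature embeddings. The sketch below is a minimal illustration under invented names and dimensions, not the authors' implementation:

```python
import numpy as np

def relevancy(query_emb, gaussian_embs):
    """Cosine similarity between one text-query embedding and a stack of
    per-Gaussian feature embeddings; high scores flag candidate matches."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gaussian_embs / np.linalg.norm(gaussian_embs, axis=1, keepdims=True)
    return g @ q  # one score per Gaussian

rng = np.random.default_rng(0)
embs = rng.normal(size=(5, 8))            # 5 Gaussians, toy 8-dim features
query = embs[2] + 0.1 * rng.normal(size=8)  # query near Gaussian 2's feature
scores = relevancy(query, embs)
print(int(scores.argmax()))  # 2
```

Note that the argmax always returns some Gaussian even when the queried object is absent from the scene, which mirrors the false-positive failure mode discussed in the limitations.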

REFERENCES

[1] J. Kerr, C. M. Kim, K. Goldberg, A. Kanazawa, and M. Tancik, “Lerf: Language embedded radiance fields,” in IEEE/CVF ICCV, 2023.

[2] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “Nerf: Representing scenes as neural radiance fields for view synthesis,” Communications of the ACM, vol. 65, 2021.

[3] Y. Ze et al., “Gnfactor: Multi-task real robot learning with generalizable neural feature fields,” in CoRL, PMLR, 2023, pp. 284–301.

[4] W. Shen, G. Yang, A. Yu, J. Wong, L. P. Kaelbling, and P. Isola, “Distilled feature fields enable few-shot language-guided manipulation,” in 7th Annual Conference on Robot Learning, 2023.

[5] A. Rashid et al., “Language embedded radiance fields for zero-shot task-oriented grasping,” in 7th Annual CoRL, 2023.

[6] K. Jatavallabhula et al., “Conceptfusion: Open-set multimodal 3d mapping,” Robotics: Science and Systems (RSS), 2023.

[7] N. M. M. Shafiullah, C. Paxton, L. Pinto, S. Chintala, and A. Szlam, Clip-fields: Weakly supervised semantic fields for robotic memory, 2023.

[8] C. Huang, O. Mees, A. Zeng, and W. Burgard, “Visual language maps for robot navigation,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), London, UK, 2023.

[9] S. Kobayashi, E. Matsumoto, and V. Sitzmann, “Decomposing nerf for editing via feature field distillation,” NeurIPS, vol. 35, pp. 23311–23330, 2022.

[10] V. Tschernezki, I. Laina, D. Larlus, and A. Vedaldi, “Neural feature fusion fields: 3d distillation of self-supervised 2d image representations,” in 2022 3DV, IEEE, 2022.

[11] A. Meuleman et al., “Progressively optimized local radiance fields for robust view synthesis,” in Proceedings of the IEEE/CVF CVPR, 2023, pp. 16539–16548.

[12] P. Wang et al., “F2-nerf: Fast neural radiance field training with free camera trajectories,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 4150–4159.

[13] M. Tancik et al., “Block-nerf: Scalable large scene neural view synthesis,” in CVPR, 2022.

[14] S. Peng, K. Genova, C. Jiang, A. Tagliasacchi, M. Pollefeys, and T. Funkhouser, Openscene: 3d scene understanding with open vocabularies, 2023.

[15] M. Bajracharya et al., “Demonstrating mobile manipulation in the wild: A metrics-driven approach,” RSS, 2023.

[16] B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis, “3d gaussian splatting for real-time radiance field rendering,” ACM Transactions on Graphics, vol. 42, no. 4, 2023.

[17] X. Zuo, P. Samangouei, Y. Zhou, Y. Di, and M. Li, Fmgs: Foundation model embedded 3d gaussian splatting for holistic 3d scene understanding, 2024.

[18] M. Qin, W. Li, J. Zhou, H. Wang, and H. Pfister, Langsplat: 3d language gaussian splatting, 2024.

[19] T. Müller, A. Evans, C. Schied, and A. Keller, “Instant neural graphics primitives with a multiresolution hash encoding,” ACM Trans. Graph., vol. 41, no. 4, 102:1–102:15, Jul. 2022.

[20] K. O. Arras, “Feature-based robot navigation in known and unknown environments,” 2003.

[21] R. Chatila and J. Laumond, “Position referencing and consistent world modeling for mobile robots,” in Proceedings. 1985 IEEE International Conference on Robotics and Automation, IEEE, vol. 2, 1985, pp. 138–145.

[22] G. Jiang, L. Yin, S. Jin, C. Tian, X. Ma, and Y. Ou, “A simultaneous localization and mapping (slam) framework for 2.5 d map building based on low-cost lidar and vision fusion,” Applied Sciences, vol. 9, no. 10, p. 2105, 2019.

[23] H. Choset and K. Nagatani, “Topological simultaneous localization and mapping (slam): Toward exact localization without explicit localization,” IEEE Transactions on robotics and automation, vol. 17, no. 2, pp. 125–137, 2001.

[24] A. Tapus, “Topological slam: Simultaneous localization and mapping with fingerprints of places,” 2005.

[25] B. Alsadik and S. Karam, “The simultaneous localization and mapping (slam)-an overview,” Journal of Applied Science and Technology Trends, vol. 2, no. 02, pp. 147–158, 2021.

[26] S. Kohlbrecher, O. Von Stryk, J. Meyer, and U. Klingauf, “A flexible and scalable slam system with full 3d motion estimation,” in 2011 IEEE international symposium on safety, security, and rescue robotics, IEEE, 2011, pp. 155–160.

[27] W. Hess, D. Kohler, H. Rapp, and D. Andor, “Real-time loop closure in 2d lidar slam,” in 2016 ICRA, 2016.

[28] L. Huang, “Review on lidar-based slam techniques,” in 2021 International Conference on Signal Processing and Machine Learning (CONF-SPML), IEEE, 2021, pp. 163–168.

[29] M. T. Lázaro, R. Capobianco, and G. Grisetti, “Efficient long-term mapping in dynamic environments,” in 2018 IROS, IEEE, 2018.

[30] Z. Zhu et al., “Nice-slam: Neural implicit scalable encoding for slam,” in Proceedings of the IEEE/CVF CVPR, 2022.

[31] A. Rosinol, J. J. Leonard, and L. Carlone, “Nerf-slam: Real-time dense monocular slam with neural radiance fields,” in 2023 IROS, IEEE, 2023.

[32] L. Roldao, R. De Charette, and A. Verroust-Blondet, “3d semantic scene completion: A survey,” International Journal of Computer Vision, vol. 130, no. 8, pp. 1978–2005, 2022.

[33] A. Nüchter and J. Hertzberg, “Towards semantic maps for mobile robots,” Robotics and Autonomous Systems, vol. 56, no. 11, 2008.

[34] H. A. Kestler et al., “Concurrent object identification and localization for a mobile robot,” Künstliche Intelligenz, vol. 14, no. 4, pp. 23–29, 2000.

[35] K. Genova et al., “Learning 3d semantic segmentation with only 2d image supervision,” in 2021 International Conference on 3D Vision (3DV), IEEE, 2021, pp. 361–372.

[36] V. Vineet et al., “Incremental dense semantic stereo fusion for large-scale semantic scene reconstruction,” in 2015 ICRA, IEEE, 2015.

[37] A. Brohan et al., “Do as i can, not as i say: Grounding language in robotic affordances,” in Conference on robot learning, PMLR, 2023.

[38] A. Brohan et al., “Rt-1: Robotics transformer for real-world control at scale,” arXiv preprint arXiv:2212.06817, 2022.

[39] B. Zitkovich et al., “Rt-2: Vision-language-action models transfer web knowledge to robotic control,” in CoRL, PMLR, 2023, pp. 2165–2183.

[40] K. M. Jatavallabhula et al., “Conceptfusion: Open-set multimodal 3d mapping,” RSS, 2023.

[41] A. Radford et al., “Learning transferable visual models from natural language supervision,” in ICML, PMLR, 2021, pp. 8748–8763.

[42] Q. Gu et al., “Conceptgraphs: Open-vocabulary 3d scene graphs for perception and planning,” arXiv, 2023.

[43] W. Huang, C. Wang, R. Zhang, Y. Li, J. Wu, and L. Fei-Fei, “Voxposer: Composable 3d value maps for robotic manipulation with language models,” 2023.

[44] N. Keetha et al., “Splatam: Splat, track & map 3d gaussians for dense rgb-d slam,” CVPR, 2023.

[45] M. Li, S. Liu, H. Zhou, G. Zhu, N. Cheng, and H. Wang, Sgs-slam: Semantic gaussian splatting for neural dense slam, 2024.

[46] T. Chen, O. Shorinwa, W. Zeng, J. Bruno, P. Dames, and M. Schwager, Splat-nav: Safe real-time robot navigation in gaussian splatting maps, 2024.

[47] A. Dai, A. X. Chang, M. Savva, M. Halber, T. Funkhouser, and M. Nießner, “Scannet: Richly-annotated 3d reconstructions of indoor scenes,” in (CVPR), IEEE, 2017.

[48] C. Yeshwanth, Y.-C. Liu, M. Nießner, and A. Dai, Scannet++: A high-fidelity dataset of 3d indoor scenes, 2023.

[49] T. Schöps, T. Sattler, and M. Pollefeys, “BAD SLAM: Bundle adjusted direct RGB-D SLAM,” in CVPR, 2019.

[50] X. Zhou, Z. Lin, X. Shan, Y. Wang, D. Sun, and M.-H. Yang, Drivinggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes, 2024.

[51] S. Agarwal et al., “Building rome in a day,” Communications of the ACM, vol. 54, no. 10, 2011.

[52] M. Tancik et al., “Nerfstudio: A modular framework for neural radiance field development,” in ACM SIGGRAPH 2023, 2023, pp. 1– 12.

[53] Z. Teed and J. Deng, Droid-slam: Deep visual slam for monocular, stereo, and rgb-d cameras, 2021.

[54] K. Shankar, M. Tjersland, J. Ma, K. Stone, and M. Bajracharya, “A learned stereo depth system for robotic manipulation in homes,” IEEE Robotics and Automation Letters, vol. 7, no. 2, 2022.

[55] J.-C. Shi, M. Wang, H.-B. Duan, and S.-H. Guan, “Language embedded 3d gaussians for open-vocabulary scene understanding,” arXiv preprint arXiv:2311.18482, 2023.

[56] A. Topiwala, P. Inani, and A. Kathpal, “Frontier based exploration for autonomous robot,” arXiv preprint arXiv:1806.03581, 2018.

[57] A. Pandey, S. Pandey, and D. Parhi, “Mobile robot navigation and obstacle avoidance techniques: A review,” Int Rob Auto J, vol. 2, no. 3, p. 00022, 2017.

:::info
Authors:

  1. Justin Yu
  2. Kush Hari
  3. Kishore Srinivas
  4. Karim El-Refai
  5. Adam Rashid
  6. Chung Min Kim
  7. Justin Kerr
  8. Richard Cheng
  9. Muhammad Zubair Irshad
  10. Ashwin Balakrishna
  11. Thomas Kollar
  12. Ken Goldberg

:::


:::info
This paper is available on arXiv under the CC BY 4.0 DEED (Attribution 4.0 International) license.

:::
