Toward Learning-Based Visuomotor Navigation With Neural Radiance Fields
Abstract
Creating memory representations is essential for developing viable navigation strategies for intelligent agents. Although neural radiance fields (NeRFs) have shown great promise as a novel form of spatial representation, their potential as a memory structure for learning-based navigation has been largely overlooked in the existing literature. In this article, we introduce a navigation pipeline that incorporates NeRF into visuomotor navigation. First, we present a derivative radiance field that enables one-shot pose and depth estimation from a single query image. By treating density as equivalent to space occupancy, we generate a geometric accessibility map from an offline-constructed NeRF prior. Using this information, we design a global planner that decomposes long-horizon tasks through waypoint estimation and rendering. Finally, we employ an imitation-learned local controller to obtain a reliable navigation policy. Our pipeline exploits NeRF's compact spatial representation for task decomposition and action generation, enabling efficient navigation. Experimental results demonstrate its advantages over recent implicit and explicit memory approaches on image-goal navigation tasks. Moreover, we conduct interpretability studies and deploy the algorithm in real-world scenarios to further attest to its practicality and effectiveness.
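To make the density-to-occupancy step concrete, the sketch below illustrates one way a trained NeRF's density field could be thresholded into a 2D accessibility map. This is a hypothetical illustration, not the paper's implementation: `density_fn`, the grid bounds, the swept height range, and `sigma_threshold` are all assumptions chosen for exposition.

```python
# Minimal sketch of turning a NeRF density field into an accessibility map,
# assuming density above a threshold implies occupied space. All names and
# parameters here are illustrative assumptions, not the paper's method.
import numpy as np

def accessibility_map(density_fn, bounds, resolution=0.05,
                      height_range=(0.1, 1.5), sigma_threshold=10.0):
    """Build a 2D accessibility (free-space) map from a NeRF density field.

    density_fn: callable mapping an (N, 3) array of xyz points to (N,) densities.
    bounds: ((x_min, x_max), (y_min, y_max)) map extent in meters.
    """
    (x_min, x_max), (y_min, y_max) = bounds
    xs = np.arange(x_min, x_max, resolution)
    ys = np.arange(y_min, y_max, resolution)
    zs = np.arange(height_range[0], height_range[1], resolution)

    occupied = np.zeros((len(xs), len(ys)), dtype=bool)
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    for z in zs:
        # Query one horizontal slice of the volume at height z.
        pts = np.stack([gx.ravel(), gy.ravel(),
                        np.full(gx.size, z)], axis=-1)
        sigma = density_fn(pts).reshape(len(xs), len(ys))
        # Treat density above the threshold as occupied, accumulating
        # occupancy over the full height range the robot would sweep.
        occupied |= sigma > sigma_threshold

    return ~occupied  # True where a cell is traversable
```

A map produced this way could then serve the global planner described above, e.g., as the graph over which waypoints are estimated.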