
Access Type

WSU Access

Date of Award

January 2025

Degree Type

Dissertation

Degree Name

Ph.D.

Department

Computer Science

First Advisor

Farshad Fotouhi

Abstract

In dynamic environments, moving objects pose significant challenges for visual SLAM systems, causing inconsistent trajectory estimation and mapping inaccuracies. Traditional SLAM systems are designed primarily for static environments in which scene elements remain stationary. In real-world scenarios, however, dynamic objects such as pedestrians and vehicles frequently occlude environmental features, disrupting scene continuity and reducing mapping accuracy. The unpredictable motion of these objects further compounds the problem, introducing motion blur and inconsistent feature matching. Addressing these issues requires dynamic object handling mechanisms that can effectively segment, track, and inpaint occluded regions, ensuring seamless scene reconstruction and robust trajectory estimation.

We propose DynaGaussian-SLAM, a novel dense visual SLAM system designed for highly dynamic environments by integrating dynamic object handling, inpainting, and memory-based optimization. Building upon the foundations of REDO-SLAM and GAN-SLAM, our approach enhances robustness in dynamic scenes while maintaining computational efficiency. REDO-SLAM’s motion-aware dynamic object segmentation, leveraging a combination of optical flow and Deep Reinforcement Learning (DRL), enables efficient mask prediction, while GAN-SLAM’s flow-guided video inpainting recovers occluded backgrounds to improve scene consistency.

REDO-SLAM's motion-aware dynamic object segmentation technique efficiently detects and segments dynamic objects by analyzing motion patterns using optical flow. This is further enhanced with DRL, enabling adaptive mask prediction that can generalize to various dynamic scenarios. By integrating this segmentation framework, DynaGaussian-SLAM effectively tracks and segments moving objects in real-time, ensuring consistent trajectory estimation even in highly dynamic environments.
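The core idea of motion-aware segmentation can be illustrated with a minimal sketch: a pixel is labeled dynamic when its observed optical-flow vector deviates from the flow predicted by camera ego-motion alone. This is an assumption-laden simplification of the dissertation's method (which additionally uses DRL for adaptive mask prediction); the function name, grid representation, and threshold are hypothetical.

```python
import math

def dynamic_mask(flow, ego_flow, threshold=1.5):
    """Illustrative sketch: label a pixel dynamic when its optical-flow
    vector deviates from the camera's ego-motion flow by more than
    `threshold` pixels. `flow` and `ego_flow` are HxW grids of (dx, dy)
    tuples; the returned HxW mask holds True for dynamic pixels."""
    mask = []
    for flow_row, ego_row in zip(flow, ego_flow):
        mask_row = []
        for (dx, dy), (ex, ey) in zip(flow_row, ego_row):
            # Residual motion not explained by the camera's own movement
            residual = math.hypot(dx - ex, dy - ey)
            mask_row.append(residual > threshold)
        mask.append(mask_row)
    return mask
```

In the full system, a learned policy would adapt the threshold and refine the mask per scene rather than using a fixed constant.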

Building on GAN-SLAM's flow-guided video inpainting, DynaGaussian-SLAM restores occluded backgrounds, ensuring seamless scene reconstruction. GAN-SLAM utilizes optical flow to guide the inpainting process, maintaining temporal consistency and generating realistic textures for occluded regions. This method significantly enhances scene continuity by accurately reconstructing background elements hidden by dynamic objects. By leveraging this approach, DynaGaussian-SLAM not only improves visual fidelity but also enhances mapping accuracy by maintaining consistent environmental features, even in the presence of moving objects.

To further optimize computational efficiency, we introduce a hierarchical memory-based transformer that leverages both long-term and short-term memory to minimize redundant computations. Long-term memory stores keyframe information, preventing repeated model inference for previously observed scenes, while short-term memory retains adjacent frame details for rapid data retrieval. This dual-memory architecture ensures efficient data processing, reducing latency and enhancing real-time performance. By strategically managing memory usage, DynaGaussian-SLAM achieves faster computation speeds while maintaining high accuracy in trajectory estimation.
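The dual-memory caching pattern described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the class name, the keyframe-keyed cache, and the bounded recent-frame buffer are assumptions introduced for clarity.

```python
from collections import deque

class HierarchicalMemory:
    """Sketch of the dual-memory idea: long-term memory caches per-keyframe
    results so previously observed scenes skip re-inference, while a bounded
    short-term buffer keeps the most recent frames for fast retrieval."""

    def __init__(self, short_term_size=5):
        self.long_term = {}  # keyframe_id -> cached inference result
        self.short_term = deque(maxlen=short_term_size)  # recent (id, result)

    def process(self, frame_id, features, infer, is_keyframe=False):
        # Long-term hit: avoid repeated model inference for a known scene
        if frame_id in self.long_term:
            return self.long_term[frame_id]
        result = infer(features)
        if is_keyframe:
            self.long_term[frame_id] = result
        # Short-term buffer evicts the oldest entry automatically
        self.short_term.append((frame_id, result))
        return result
```

Revisiting a cached keyframe returns the stored result without invoking the model again, which is the mechanism behind the latency reduction claimed for the dual-memory design.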

One of the key innovations in DynaGaussian-SLAM is the adaptive fusion of dynamic object segmentation and inpainting, creating a cohesive pipeline for consistent scene reconstruction. Unlike conventional SLAM systems that treat segmentation and inpainting as separate modules, DynaGaussian-SLAM dynamically fuses these processes to achieve seamless scene consistency. This adaptive fusion is guided by flow information, allowing real-time adjustments in segmentation masks and inpainting regions based on object motion and occlusions. This integrated approach ensures that the reconstructed scene remains coherent, enhancing mapping accuracy and robustness in dynamic environments.
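One simple way to picture flow-guided coupling of segmentation and inpainting is to propagate the current dynamic mask along the optical flow, so the inpainting region also covers pixels the moving object is about to occlude. The sketch below is an illustrative assumption, not the system's actual fusion module; the grid representation and integer flow are simplifications.

```python
def fuse_mask_and_inpaint(mask, flow):
    """Illustrative sketch: extend the dynamic mask along optical flow so
    the inpainting region covers both currently and soon-to-be occluded
    pixels. `mask` is an HxW grid of bools; `flow` holds integer (dx, dy)
    displacements per pixel."""
    h, w = len(mask), len(mask[0])
    fused = [row[:] for row in mask]  # start from the segmentation mask
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                dx, dy = flow[y][x]
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    fused[ny][nx] = True  # mark the predicted next position
    return fused
```

In the full pipeline this coupling runs in both directions, with segmentation masks and inpainting regions adjusted jointly from the same flow field rather than in a single forward pass.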

Our experimental evaluation on the TUM and KITTI datasets demonstrates that DynaGaussian-SLAM outperforms state-of-the-art dynamic SLAM systems in terms of robustness, scalability, and computational efficiency. The system achieves higher accuracy in dynamic object segmentation and trajectory estimation while maintaining real-time performance. Comparative analysis with REDO-SLAM and GAN-SLAM shows that DynaGaussian-SLAM improves mapping consistency and reduces processing latency. Furthermore, the hierarchical memory-based transformer enhances computational efficiency, resulting in lower energy consumption and faster processing speeds.
