Access Type

WSU Access

Date of Award

January 2024

Degree Type

Dissertation

Degree Name

Ph.D.

Department

Computer Science

First Advisor

Dongxiao Zhu

Abstract

Medical image segmentation is a crucial step in medical imaging analysis, enabling precise delineation of anatomical structures and pathological regions. This dissertation explores the evolution and application of advanced deep learning models, focusing on the integration of transformers and convolutional neural networks (CNNs) for enhanced medical image segmentation. The primary goal is to improve segmentation accuracy and efficiency in clinical settings, particularly for CT and MRI images.

The dissertation is structured around three key innovations. First, we introduce FocalUNETR, a novel transformer-based architecture designed to address the limitations of traditional CNNs in capturing long-range dependencies and global context in 2D CT-based prostate segmentation. FocalUNETR employs focal self-attention mechanisms and incorporates an auxiliary boundary-aware regression task to enhance segmentation precision, particularly in cases with unclear boundaries. Second, we present SwinAttUNet, a hybrid architecture combining CNNs and Swin Transformers for automatic 3D multi-organ segmentation on CT images. This approach leverages the local feature recognition capabilities of CNNs and the global contextual understanding of transformers. Third, we develop MulModSeg, a multi-modal segmentation strategy aimed at improving the segmentation of unpaired CT and MRI images. MulModSeg enhances feature extraction and model robustness by incorporating modality-conditioned text embedding and an alternating training procedure.
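
To make the alternating training idea behind MulModSeg more concrete, the following minimal PyTorch sketch shows one way a training step over unpaired CT and MRI batches with a modality-conditioned embedding could be organized. It is an illustrative sketch only: the module and function names (ToySegNet, alternating_step) are hypothetical, and a learned embedding stands in for the modality-conditioned text embedding; it is not the dissertation's implementation.

```python
# Illustrative sketch (not the dissertation's code): alternating updates over
# unpaired CT and MRI batches, each conditioned on its modality.
import torch
import torch.nn as nn

class ToySegNet(nn.Module):
    """Minimal segmentation net that accepts a per-modality condition vector."""
    def __init__(self, n_classes: int = 2, cond_dim: int = 16):
        super().__init__()
        self.cond = nn.Embedding(2, cond_dim)   # 0 = CT, 1 = MRI (placeholder for a text embedding)
        self.enc = nn.Conv2d(1, 32, 3, padding=1)
        self.fuse = nn.Linear(cond_dim, 32)     # inject the modality condition into features
        self.dec = nn.Conv2d(32, n_classes, 1)

    def forward(self, x, modality_id):
        h = torch.relu(self.enc(x))
        c = self.fuse(self.cond(modality_id))   # (B, 32)
        h = h + c[:, :, None, None]             # broadcast condition over H, W
        return self.dec(h)

def alternating_step(model, opt, ct_batch, mri_batch, loss_fn):
    """One alternating update: a CT step followed by an MRI step."""
    for (imgs, masks), mod_id in ((ct_batch, 0), (mri_batch, 1)):
        opt.zero_grad()
        ids = torch.full((imgs.size(0),), mod_id, dtype=torch.long)
        loss = loss_fn(model(imgs, ids), masks)
        loss.backward()
        opt.step()

# Toy usage with random, unpaired CT and MRI batches
model = ToySegNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ct = (torch.randn(2, 1, 64, 64), torch.randint(0, 2, (2, 64, 64)))
mri = (torch.randn(2, 1, 64, 64), torch.randint(0, 2, (2, 64, 64)))
alternating_step(model, opt, ct, mri, nn.CrossEntropyLoss())
```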

Extensive experiments on private and public datasets validate the effectiveness of these proposed methods. FocalUNETR achieves superior performance in 2D prostate segmentation, while SwinAttUNet outperforms state-of-the-art 3D segmentation models in both quantitative and qualitative evaluations. MulModSeg shows marked improvements in multi-modal segmentation tasks, highlighting its potential for clinical applications. This dissertation provides comprehensive frameworks for developing more accurate, efficient, and robust segmentation models, paving the way for future advancements in medical imaging and diagnostics.
