Access Type

WSU Access

Date of Award

January 2016

Degree Type

Dissertation

Degree Name

Ph.D.

Department

Computer Science

First Advisor

Chandan K. Reddy

Abstract

Predicting time-to-event from longitudinal data, where different events occur at different time points, is an important problem in domains such as healthcare, economics, social networks and seismology. A unique challenge in this problem is building predictive models from right censored data (also called survival data). Right censoring occurs when the event of interest for an instance is not observed within the given observation time window; such instances are considered to be right censored. Effective models that accurately predict time-to-event labels from such right censored data can have a significant impact in these domains.

However, existing methods in the literature cannot capture various complexities present in real-world survival data, such as feature groups and intra-event and inter-event correlations. To address these challenges, this dissertation makes the following major contributions: (i) modeling intra-event correlations in survival data using structured sparsity-based regularizers, (ii) learning novel representations for survival data by inferring inter-event and intra-event correlations, (iii) extending linear regression-based methods to learn predictive models from right censored data and (iv) using active learning to identify the censored instances and events that contribute most to learning a model, so that fewer training instances are needed. We present optimization-based algorithms corresponding to each of these contributions, utilizing diverse techniques such as regularization, representation learning and active learning. Our methods are tested on different real-world longitudinal datasets such as electronic health records (EHRs), crowdfunding data and gene-expression data, as well as several publicly available synthetic survival datasets. The results demonstrate the effectiveness of these methods when compared to state-of-the-art survival analysis, classification and regression methods from the literature.
