Statistical analysis of truncated data


Wei-Yann Tsai
2010-01-08  12:30 - 14:30
Room 405, Mathematics Research Center Building (orig. New Math. Bldg.)

Survival analysis is complicated by censoring, where an event time is known only to fall within a certain interval, and by truncation, where individuals enter the study only if they survive for a sufficient length of time, or are included in the study only if the event has occurred by a given date. This talk will survey nonparametric and semiparametric analysis of truncated data and will be divided into four major themes. The first theme introduces the history of the literature, the basic concepts, and the terminology. We will analyze three examples of truncated data sets (the first from economics, the second from astronomy, and the third from HIV/AIDS research) and discuss truncation in detail. The second theme is the estimation of summary statistics from truncated data (the one-sample problem). We will look at nonparametric estimation of the survival function, the cumulative hazard function, the mean, and the median. In addition, we will touch on the identifiability of the survival function and on testing independence and quasi-independence of the truncation time and the event time. The third theme is hypothesis testing and regression analysis, with a focus on the log-rank test, truncated linear regression, and the proportional hazards model. The final theme is the application of these methods to biased-sampling and missing-data problems. We will discuss how to embed missing data and/or biased-sampling data into the truncated-data structure and how to use truncated-data techniques for statistical inference. I will conclude the talk with topics for future research.