Multiple Imputation (MI) Abstract

PharmaSUG May 2017
Scott Kosten, DataCeutics
Senior Consultant, Clinical SAS Programmer Analyst

Multiple Imputation (MI) is a technique for handling missing data, and it is increasingly used in sensitivity analyses to assess the impact of missing data. The statistical theory behind MI is a demanding and evolving field of research for statisticians. It is therefore important, from a programmer's perspective, that we understand the method well enough to collaborate with statisticians on choosing the right MI technique. In SAS®, MI is performed using two procedures, PROC MI and PROC MIANALYZE, in conjunction with other standard analysis procedures (e.g., the FREQ, GENMOD, or MIXED procedures). We describe the 3-step process for performing MI analyses: impute the missing values, analyze each imputed data set, and pool the results. Our goal is to remove some of the mystery behind these procedures and to address typical misunderstandings of the MI process. Through an example-driven discussion, we also illustrate how multiply imputed data can be represented using the ADaM standards and principles. Lastly, we run a run-time simulation to determine how the number of imputations influences the MI process. SAS v9.4 with SAS/STAT® v13.2 was used in the examples presented, and we call out any version dependencies throughout the text. This paper is written for SAS users of all levels. Although we present a statistical programmer’s perspective, an introductory understanding of statistics, including p-values, hypothesis testing, confidence intervals, mixed models, and regression, is beneficial.