Performance Assessment of High-dimensional Variable Estimation

Speaker: Yi Yang (McGill)

Time: 2:30-3:30, Nov 25, 2016

Room: KED B004

Title: Performance Assessment of High-dimensional Variable Estimation

Abstract:

Because model selection is ubiquitous in data analysis, reproducible statistical analysis demands a reality check of the model selection method used, regardless of the good properties claimed for it. Instability measures have been proposed for evaluating model selection uncertainty. However, low instability does not necessarily indicate that the selected model is trustworthy, since it can also arise when a method tends to select an overly parsimonious model. In this work, we propose an estimation method, based on the F and G measures, for evaluating the accuracy of variable selection methods in terms of model identification (not prediction). We show that our approach provides reliable estimates of the true F and G measures of the selected models. This gives the data analyst a valuable tool for comparing different model selection methods based on the data at hand. Extensive simulations show that the approach performs well in finite samples. We further demonstrate the application of our methods on several microarray gene expression data sets.
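
For readers unfamiliar with the F and G measures, the following is a minimal sketch, assuming the standard definitions: F is the harmonic mean and G the geometric mean of precision and recall, computed here for a selected variable set against a known true set. The true set is only available in simulation; the talk concerns estimating these quantities from data, which this sketch does not attempt. The function name f_and_g_measures and the example index sets are hypothetical.

```python
from math import sqrt


def f_and_g_measures(selected, true_support):
    """Return (F, G) for a selected variable set against a known true set.

    Assumes the standard definitions: precision = TP / |selected|,
    recall = TP / |true_support|, F = harmonic mean, G = geometric mean.
    """
    selected, true_support = set(selected), set(true_support)
    true_positives = len(selected & true_support)
    if true_positives == 0:
        # No correctly identified variable: both measures are zero.
        return 0.0, 0.0
    precision = true_positives / len(selected)
    recall = true_positives / len(true_support)
    f_measure = 2 * precision * recall / (precision + recall)
    g_measure = sqrt(precision * recall)
    return f_measure, g_measure


# Hypothetical example: the true model contains variables 0-3, but an
# overly parsimonious method selects only variables 0 and 1.
f_value, g_value = f_and_g_measures({0, 1}, {0, 1, 2, 3})
print(f_value, g_value)
```

In this example every selected variable is correct (precision 1), yet half of the true variables are missed (recall 0.5), so both measures fall well below 1. This is the situation the abstract describes: a method that stably picks the same small model is not thereby accurate.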