Efficient Estimation with Smooth Penalization


This dissertation proposes an oracle-efficient estimator in the context of a sparse linear model. Chapter 1 introduces the penalty and the estimator that minimizes a penalized least squares objective. Unlike existing methods, the penalty is once differentiable, and hence the estimator does not perform model selection. This feature allows the estimator to reduce bias relative to a popular oracle-efficient method, SCAD, when small, but not insignificant, coefficients are present; consequently, the estimator delivers a lower realized squared error for coefficients of interest. Furthermore, the objective function with the proposed penalty is shown to be convex; paired with differentiability, this ensures good computational properties of the estimator. Simulation evidence illustrates the increased robustness of the estimator with the smooth penalty in the presence of small, but nonzero, coefficients.

Chapter 2 focuses on understanding the asymptotic properties of the proposed penalized estimator when the standard asymptotic approximation may be unsatisfactory, and on leveraging that understanding to improve inference. Conventional asymptotic analysis of efficient penalized estimators typically rules out coefficients whose magnitude falls in a certain intermediate range relative to the sampling error, and it is well understood that allowing for such coefficients can lead to slower rates of convergence of such estimators. I derive the asymptotic distribution of the penalized estimator with the once-differentiable penalty while allowing for coefficients in this range. The analysis is conducted under standard conditions on the tuning parameters, as well as under an alternative asymptotic framework that preserves nonlocal properties of these estimators.
Inference by a modified bootstrap procedure is shown to be consistent both under the standard assumptions on tuning parameters that ensure oracle efficiency and under an alternative asymptotic view that excludes intermediate-magnitude coefficients but allows for nonnormal asymptotic distributions arising from penalization. Simulation evidence shows that the proposed smooth penalty, paired with bootstrap inference, provides good coverage with smaller confidence intervals even under violations of exact sparsity that degrade the performance of model-selection-based estimators.

Finally, Chapter 3 applies the proposed approach to reevaluate the effect of location-specific human capital on agricultural output using the data and framework of Bazzi et al. (2016). The authors consider a relocation program carried out in Indonesia that created exogenous variation in where migrants were settled. A key estimate in that work is that a one-standard-deviation increase in a measure of agricultural similarity between migrants’ origins and destinations produced a 20% increase in rice productivity. That estimate comes from a regression with a relatively small set of controls, and I find that a more plausible estimate of the effect is 11%. The authors’ original result appears to be driven by omitted-variable bias from excluding a small number of important controls, most notably the education level of the local population, as measured by the average years of schooling of locals. This finding fits well with the human capital transmission mechanism envisioned in the original paper, in which interactions with more experienced locals would improve migrants’ productivity. Because agricultural similarity correlates with local education levels in the dataset under study, omitting schooling from the regression inflates the estimate on agricultural similarity.
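The abstract does not state the penalty's functional form. As a purely illustrative sketch, the following uses a Huber-type penalty as a stand-in, since it shares the properties claimed above (convex and once continuously differentiable), to show how smoothness lets a plain gradient method optimize the penalized least squares objective without a model-selection step. All function names, the penalty choice, and the parameter values here are hypothetical, not the dissertation's.

```python
import numpy as np


def huber_penalty(b, delta=0.5):
    """Convex, once-differentiable penalty: quadratic near zero, linear in the tails.

    A stand-in for the dissertation's (unspecified) smooth penalty."""
    a = np.abs(b)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta)).sum()


def huber_grad(b, delta=0.5):
    # The penalty is C^1, so this gradient exists everywhere -- the feature
    # the abstract credits for the estimator's good computational properties.
    return np.where(np.abs(b) <= delta, b, delta * np.sign(b))


def penalized_ls(X, y, lam=1.0, delta=0.5, step=1e-3, iters=5000):
    """Gradient descent on 0.5 * ||y - X b||^2 + lam * penalty(b).

    Smoothness of the penalty makes plain gradient descent applicable;
    no coefficient is thresholded exactly to zero (no model selection)."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = -X.T @ (y - X @ b) + lam * huber_grad(b, delta)
        b -= step * grad
    return b


# Toy data with large, exactly zero, and small-but-nonzero coefficients,
# mimicking the near-sparse settings discussed above.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
beta = np.array([2.0, 0.0, 0.1, 0.0, 1.0])
y = X @ beta + rng.standard_normal(200)
b_hat = penalized_ls(X, y, lam=2.0)
```

Because the penalty's derivative is bounded, the shrinkage bias on large coefficients stays small, which is the bias-reduction property the abstract contrasts with hard model-selection methods.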
