Results below are based on the search criterion 'model selection'
Total number of records returned: 5
Penalized Regression, Standard Errors, and Bayesian Lassos
Bayesian hierarchical models
Penalized regression methods for simultaneous variable selection and coefficient estimation, especially those based on the lasso of Tibshirani (1996), have received a great deal of attention in recent years, mostly through frequentist models. Properties such as consistency have been studied, and are achieved by different lasso variations. Here we look at a fully Bayesian formulation of the problem, which is flexible enough to encompass most versions of the lasso that have been previously considered. The advantages of the hierarchical Bayesian formulations are many. In addition to the usual ease-of-interpretation of hierarchical models, the Bayesian formulation produces valid standard errors (which can be problematic for the frequentist lasso), and is based on a geometrically ergodic Markov chain. We compare the performance of the Bayesian lassos to their frequentist counterparts using simulations and data sets that previous lasso papers have used, and see that in terms of prediction mean squared error, the Bayesian lasso performance is similar to and, in some cases, better than, the frequentist lasso.
Nonnested Model Testing for World Politics: Assessing Binary Choice Models
Clarke, Kevin A.
nonnested hypothesis testing
The major goal of this project is to introduce and develop a methodology of nonnested hypothesis testing that researchers in world politics will find useful. I make use of both the Cox test for nonnested hypotheses and the Vuong test for nonnested model selection. I argue for a sequential approach where the Vuong test will be used depending upon the outcome of the Cox test. In keeping with the goal of making this methodology useful for world politics research, I discuss both tests in the context of binary choice models, specifically probits. I apply the methodology developed to the problem of testing alternative models of the escalation of great power militarized disputes.
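As a rough illustration of the Vuong test mentioned in this abstract, the sketch below computes the basic (uncorrected) Vuong statistic from the per-observation log-likelihoods of two competing models. It is not the paper's procedure: the function name `vuong_statistic` and the example numbers are hypothetical, no adjustment for the number of parameters is applied, and the Cox test step of the sequential approach is not shown. In practice the inputs would be the pointwise log-likelihoods of two fitted probit models.

```python
import math


def vuong_statistic(loglik_a, loglik_b):
    """Uncorrected Vuong z statistic for two nonnested models.

    loglik_a, loglik_b: per-observation log-likelihoods of models A and B,
    evaluated on the same observations in the same order.
    Returns (z, two_sided_p). Large positive z favors A, large negative z
    favors B; under the null of equally close models z is roughly N(0, 1).
    """
    if len(loglik_a) != len(loglik_b):
        raise ValueError("both models must be evaluated on the same observations")
    n = len(loglik_a)
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    mean_diff = sum(diffs) / n
    # Sample variance of the pointwise log-likelihood ratios (1/n version).
    variance = sum(d * d for d in diffs) / n - mean_diff ** 2
    if variance <= 0:
        raise ValueError("log-likelihood differences have zero variance")
    z = math.sqrt(n) * mean_diff / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# Hypothetical pointwise log-likelihoods, for illustration only.
ll_a = [-0.5, -0.7, -0.2, -0.9]
ll_b = [-0.6, -0.9, -0.1, -1.2]
z, p = vuong_statistic(ll_a, ll_b)
z_rev, p_rev = vuong_statistic(ll_b, ll_a)
```

Swapping the two models flips the sign of the statistic and leaves the two-sided p-value unchanged, which is a quick sanity check on any implementation.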
Power-law distributions in empirical data
likelihood ratio test
Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the tail of the distribution. In particular, standard methods such as least-squares fitting are known to produce systematically biased estimates of parameters for power-law distributions and should not be used in most circumstances. Here we describe statistical techniques for making accurate parameter estimates for power-law data, based on maximum likelihood methods and the Kolmogorov-Smirnov statistic. We also show how to tell whether the data follow a power-law distribution at all, defining quantitative measures that indicate when the power law is a reasonable fit to the data and when it is not. We demonstrate these methods by applying them to twenty-four real-world data sets from a range of different disciplines. Each of the data sets has been conjectured previously to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law is ruled out.
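The maximum likelihood and Kolmogorov-Smirnov ideas in this abstract can be sketched for the continuous case as follows. This is a minimal sketch under stated assumptions, not the authors' code: the lower cutoff `xmin` is taken as known here, the function names are hypothetical, and the data are synthetic draws generated by inverse-transform sampling. The estimator used is the standard continuous power-law MLE, alpha = 1 + n / sum(ln(x_i / xmin)).

```python
import math
import random


def fit_power_law(data, xmin):
    """Fit a continuous power law p(x) ~ x^(-alpha) to the values >= xmin.

    Returns (alpha_hat, ks_distance), where ks_distance is the largest gap
    between the empirical CDF of the tail and the fitted power-law CDF.
    """
    tail = sorted(x for x in data if x >= xmin)
    n = len(tail)
    if n == 0:
        raise ValueError("no observations at or above xmin")
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    ks = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (-(alpha - 1.0))
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        ks = max(ks, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return alpha, ks


def sample_power_law(n, alpha, xmin, rng):
    """Draw n synthetic values from a continuous power law by inverse transform."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


rng = random.Random(0)
data = sample_power_law(20000, 2.5, 1.0, rng)  # synthetic data, true alpha = 2.5
alpha_hat, ks = fit_power_law(data, 1.0)
```

With a known cutoff the estimate should land close to the true exponent and the KS distance should be small. When the cutoff is unknown, one common choice is to repeat the fit over candidate values of `xmin` and keep the one with the smallest KS distance.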
Statistical Inference After Model Selection
Conventional statistical inference requires that a model of how the data were generated be known before the data are analyzed. Yet in criminology, and in the social sciences more broadly, a variety of model selection procedures are routinely undertaken followed by statistical tests and confidence intervals computed for a "final" model. In this paper, we examine such practices and show how they are typically misguided. The parameters being estimated are no longer well defined, and post-model-selection sampling distributions are mixtures with properties that are very different from what is conventionally assumed. Confidence intervals and statistical tests do not perform as they should. We examine in some detail the specific mechanisms responsible. We also offer some suggestions for better practice.
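A toy simulation can make the distortion described in this abstract concrete. The sketch below is an assumption-laden simplification rather than the paper's regression setting: it estimates a single mean with known unit variance and "selects" an estimate only when it is significant at the 5% level. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_selection(true_mean=0.2, n=25, reps=5000, seed=1):
    """Compare all estimates with those kept after a significance screen.

    Each replication draws n observations from N(true_mean, 1) and records
    the sample mean. The estimate is "kept" only if |estimate / se| > 1.96,
    mimicking a report that conditions on a selection step.
    """
    rng = random.Random(seed)
    se = 1.0 / math.sqrt(n)  # standard error with known unit variance
    all_estimates, kept_estimates = [], []
    for _ in range(reps):
        est = sum(rng.gauss(true_mean, 1.0) for _ in range(n)) / n
        all_estimates.append(est)
        if abs(est / se) > 1.96:
            kept_estimates.append(est)
    return all_estimates, kept_estimates


all_est, kept_est = simulate_selection()
mean_all = sum(all_est) / len(all_est)
mean_kept = sum(kept_est) / len(kept_est)
```

The unconditional average stays near the true value of 0.2, while every kept estimate exceeds 1.96 standard errors in absolute value, so the average of the kept estimates is pushed well above the truth. Intervals computed as if no screening had occurred are centered on these inflated estimates.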
Spike and Slab Prior Distributions for Simultaneous Bayesian Hypothesis Testing, Model Selection, and Prediction, of Nonlinear Outcomes
Spike and Slab Prior
Bayesian Model Selection
Bayesian Model Averaging
Adaptive Rejection Sampling
Generalized Linear Model
A small body of literature has used the spike and slab prior specification for model selection with strictly linear outcomes. In this setup a two-component mixture distribution is stipulated for coefficients of interest with one part centered at zero with very high precision (the spike) and the other as a distribution diffusely centered at the research hypothesis (the slab). With the selective shrinkage, this setup incorporates the zero coefficient contingency directly into the modeling process to produce posterior probabilities for hypothesized outcomes. We extend the model to qualitative responses by designing a hierarchy of forms over both the parameter and model spaces to achieve variable selection, model averaging, and individual coefficient hypothesis testing. To overcome the technical challenges in estimating the marginal posterior distributions, possibly with a dramatic ratio of density heights of the spike to the slab, we develop a hybrid Gibbs sampling algorithm using an adaptive rejection approach for various discrete outcome models, including dichotomous, polychotomous, and count responses. The performance of the models and methods is assessed with both Monte Carlo experiments and empirical applications in political science.