Search Results


Results below are based on the search criterion 'selection'
Total number of records returned: 33

1
Paper
Sink or Swim: What Happened to California's Bilingual Students after Proposition 227?
Bali, Valentina A.

Uploaded 08-21-2000
Keywords initiative
bilingual education
California
Proposition 227
Heckman selection
Abstract Proposition 227, passed in California in 1998, aimed to dismantle bilingual programs in the state's public schools. Using individual level data from a southern California school district, I find that in 1998, before Proposition 227, limited-English-proficient (LEP) students enrolled in bilingual classes had lower scores in reading than LEP students not enrolled in bilingual classes: 2.4 points less on a scale from 1 to 99. In math these bilingual students scored 0.5 points higher than non-bilingual LEPs. But in 1999, after Proposition 227 the same set of students had scores no worse than non-bilingual LEP students in reading and were still 0.5 points higher in math. Proposition 227, which interrupted bilingual programs early and emphasized English instruction, then, did not set bilingual students back relative to non-bilingual LEP and may have even benefitted them.

2
Paper
Power-law distributions in empirical data
Clauset, Aaron
Shalizi, Cosma
Newman, Mark

Uploaded 06-11-2007
Keywords Power-law distributions
Pareto
Zipf
maximum likelihood
heavy-tailed distributions
likelihood ratio test
model selection
Abstract Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the tail of the distribution. In particular, standard methods such as least-squares fitting are known to produce systematically biased estimates of parameters for power-law distributions and should not be used in most circumstances. Here we describe statistical techniques for making accurate parameter estimates for power-law data, based on maximum likelihood methods and the Kolmogorov-Smirnov statistic. We also show how to tell whether the data follow a power-law distribution at all, defining quantitative measures that indicate when the power law is a reasonable fit to the data and when it is not. We demonstrate these methods by applying them to twenty-four real-world data sets from a range of different disciplines. Each of the data sets has been conjectured previously to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law is ruled out.
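The maximum likelihood step mentioned in the abstract can be sketched as follows. This is a minimal illustration for continuous data, assuming the lower cutoff `xmin` is already known (choosing the cutoff is not covered here). The function names `fit_power_law_tail` and `synthetic_power_law` are hypothetical and not taken from the authors' software. The exponent estimate is 1 + n / Σ ln(x_i / xmin), and the Kolmogorov-Smirnov distance compares the empirical and fitted tail CDFs.

```python
import math

def fit_power_law_tail(data, xmin):
    """Return (alpha_hat, ks_distance) for the observations with x >= xmin.

    Assumes a continuous power law p(x) ~ x**(-alpha) above a known xmin.
    """
    tail = sorted(x for x in data if x >= xmin)
    n = len(tail)
    log_sum = sum(math.log(x / xmin) for x in tail)
    if n == 0 or log_sum <= 0.0:
        raise ValueError("need at least one observation strictly above xmin")
    # Closed-form maximum likelihood estimate of the exponent.
    alpha_hat = 1.0 + n / log_sum
    # KS distance: largest gap between the empirical step function and the
    # fitted CDF, checked on both sides of each jump.
    ks = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha_hat)
        ks = max(ks, abs((i + 1) / n - model_cdf), abs(i / n - model_cdf))
    return alpha_hat, ks

def synthetic_power_law(n, alpha, xmin):
    """Quantile-spaced values from the inverse CDF (for illustration only)."""
    return [xmin * (1.0 - (i + 0.5) / n) ** (-1.0 / (alpha - 1.0))
            for i in range(n)]

if __name__ == "__main__":
    sample = synthetic_power_law(2000, alpha=2.5, xmin=1.0)
    print(fit_power_law_tail(sample, xmin=1.0))
```

A small KS distance only says the fitted curve is close to the data; deciding whether a power law is plausible at all requires the goodness-of-fit and comparison tests the abstract refers to.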

3
Paper
Two-Stage Estimation of Stochastic Truncation Models with Limited Dependent Variables
Boehmke, Frederick

Uploaded 04-13-2000
Keywords selection bias
stochastic truncation
maximum likelihood
simulation
monte carlo
initiative
interest groups
Abstract Recent work has made progress in estimating models involving selection bias of a particularly strong nature: all nonrespondents are unit nonresponders, meaning that no data is available for them. These models are reasonably successful in circumstances where the dependent variable of interest is continuous, but they are less practical empirically when it is latent and only discrete outcomes or choices are observed. I develop a method in this paper to estimate these models that is much more practical in terms of estimation. The model uses a small amount of auxiliary information to estimate the selection equation and these parameters are then used to estimate the equation of interest in a maximum likelihood setting. After presenting Monte Carlo analysis to support the model, I apply the technique to a substantive problem: which interest groups are likely to turn to the initiative process to achieve their policy goals.

4
Paper
Non-parametric Mechanisms and Causal Modeling
Glynn, Adam
Quinn, Kevin

Uploaded 07-15-2007
Keywords Neyman-Rubin model
non-parametric structural equations
causal inference
covariate selection
unmeasured confounding
Abstract Political scientists tend to think about causality in terms of mechanisms. In this paper we argue that non-parametric structural equation models are consistent with how many empirical political scientists think about causality and are consistent with the powerful and well-respected Neyman-Rubin Causal Model. Furthermore, using examples we demonstrate that two important practical questions are more easily addressed within the mechanistic framework: What (if any) set or sets of conditioning variables will allow the identification of average causal effects in a regression or matching model? When unmeasured confounding is present, what (if any) adjustment will non-parametrically identify the average causal effect?

5
Paper
Logistic Regression in Rare Events Data (revised)
King, Gary
Zeng, Langche

Uploaded 07-09-1999
Keywords rare events
logit
logistic regression
binary dependent variables
bias correction
case-control
choice-based
endogenous selection
selection bias
Abstract This paper is for the methods conference; it is a revised version of a paper that was previously sent to the paper server.

6
Paper
Endogeneity in Probit Response Models
Freedman, David
Sekhon, Jasjeet

Uploaded 05-28-2008
Keywords Bivariate probit
sample selection
identification
indefinite Hessian
optimization
Abstract In this paper, we look at conventional methods for removing endogeneity bias in regression models, including the linear model and the probit model. The usual Heckman two-step procedure should not be used in the probit model: from a theoretical perspective, this procedure is unsatisfactory, and likelihood methods are superior. However, serious numerical problems occur when standard software packages try to maximize the biprobit likelihood function, even if the number of covariates is small. The log likelihood surface may be nearly flat, or may have saddle points with one small positive eigenvalue and several large negative eigenvalues. We draw conclusions for statistical practice. Finally, we describe the conditions under which parameters in the model are identifiable; these results appear to be new.

7
Paper
Logistic Regression in Rare Events Data
King, Gary
Zeng, Langche

Uploaded 05-20-1999
Keywords rare events
logit
logistic regression
binary dependent variables
bias correction
case-control
choice-based
endogenous selection
selection bias
Abstract Rare events are binary dependent variables with dozens to thousands of times fewer ones (events, such as wars, vetoes, cases of political activism, or epidemiological infections) than zeros ("nonevents"). In many literatures, rare events have proven difficult to explain and predict, a problem that seems to have at least two sources. First, popular statistical procedures, such as logistic regression, can sharply underestimate the probability of rare events. We recommend corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. Second, commonly used data collection strategies are grossly inefficient for rare events data. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables, such as in international conflict data with more than a quarter million dyads, only a few of which are at war. As it turns out, easy procedures exist for making valid inferences when sampling all available events (e.g., wars) and a tiny fraction of non-events (peace). This enables scholars to save as much as 99% of their (non-fixed) data collection costs, or to collect much more meaningful explanatory variables. We provide methods that link these two results, enabling both types of corrections to work simultaneously, and software that implements the methods developed.
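One widely used procedure for samples that keep all events and only a fraction of non-events is prior correction of the logit intercept. The abstract does not say which procedures the paper's software implements, so the sketch below is an assumption-laden illustration, not the authors' code. The names `prior_corrected_intercept` and `logistic` are hypothetical, `tau` stands for the assumed population fraction of events, and `ybar` stands for the fraction of events in the sample.

```python
import math

def prior_corrected_intercept(b0, tau, ybar):
    """Adjust a logit intercept estimated on an event-enriched sample.

    b0   : intercept estimated from the sample
    tau  : assumed fraction of events in the population (0 < tau < 1)
    ybar : fraction of events in the sample (0 < ybar < 1)
    """
    if not (0.0 < tau < 1.0 and 0.0 < ybar < 1.0):
        raise ValueError("tau and ybar must lie strictly between 0 and 1")
    # Subtract the log of the ratio between sample odds and population odds.
    return b0 - math.log(((1.0 - tau) / tau) * (ybar / (1.0 - ybar)))

def logistic(z):
    """Inverse logit, used to turn a linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

if __name__ == "__main__":
    # Balanced sample (half events) drawn from a population with 1% events.
    corrected = prior_corrected_intercept(b0=0.0, tau=0.01, ybar=0.5)
    print(corrected, logistic(corrected))
```

Slope coefficients are left untouched by this adjustment; only the intercept, and therefore the predicted probabilities, change.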

8
Paper
Modeling Sample Selection for Durations with Time-Varying Covariates
Boehmke, Frederick

Uploaded 07-02-2008
Keywords selection
selection bias
duration
time-varying covariates
event history
exchange rates
Abstract We extend previous estimators for duration data that suffer from non-random sample selection to allow for time-varying covariates. Rather than a continuous-time duration model, we propose a discrete-time alternative that models the (constant) effects of sample selection at the time of selection across all years of the resulting spell. Properties of the estimator are compared to those of a naive discrete duration model through Monte Carlo analysis and indicate that our estimator outperforms the naive model when selection is non-trivial. We then apply this estimator to the question of the duration of monetary regimes.

9
Paper
Inference from Response-Based Samples with Limited Auxiliary Information
King, Gary
Zeng, Langche

Uploaded 07-09-1999
Keywords rare events
logit
logistic regression
binary dependent variables
bias correction
case-control
choice-based
endogenous selection
selection bias
epidemiology
Abstract This paper is for the methods conference; it is related to "Logistic Regression in Rare Events Data," also by us; the conference presentation will be based on both papers. We address a disagreement between epidemiologists and econometricians about inference in response-based (a.k.a. case-control, choice-based, retrospective, etc.) samples. Epidemiologists typically make the rare event assumption (that the probability of disease is arbitrarily small), which makes the relative risk easy to estimate via the odds ratio. Econometricians do not like this assumption since it is false and implies that attributable risk (a.k.a. a first difference) is zero, and they have developed methods that require no auxiliary information. These methods produce bounds on the quantities of interest that, unfortunately, are often fairly wide and always encompass a conclusion of no treatment effect (relative risks of 1 or attributable risks of 0) no matter how strong the true effect is. We simplify the existing bounds for attributable risk, making it much easier to estimate, and then suggest one possible resolution of the disagreement by providing a method that allows researchers to include easily available information (such as that the fraction of the population with the disease falls within at most [.001,.05]); this method considerably narrows the bounds on the quantities of interest. We also offer software to implement the methods suggested. We would very much appreciate any comments you might have!
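The rare event assumption described in the abstract can be illustrated with a little arithmetic. The sketch below computes the relative risk, the odds ratio, and the attributable risk (first difference) from two event probabilities. The function name `risk_measures` and the example probabilities are hypothetical and chosen only to show the pattern.

```python
def risk_measures(p1, p0):
    """Return (relative_risk, odds_ratio, attributable_risk).

    p1 : probability of the event among the exposed
    p0 : probability of the event among the unexposed
    """
    if not (0.0 < p0 < 1.0 and 0.0 < p1 < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    relative_risk = p1 / p0
    odds_ratio = (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))
    attributable_risk = p1 - p0
    return relative_risk, odds_ratio, attributable_risk

if __name__ == "__main__":
    # Common event: the odds ratio overstates the relative risk of 2.
    print(risk_measures(0.4, 0.2))
    # Rare event: the odds ratio is close to the relative risk of 2,
    # while the attributable risk shrinks toward zero.
    print(risk_measures(0.002, 0.001))
```

As both probabilities shrink, the odds ratio approaches the relative risk and the attributable risk approaches zero, which is the trade-off at the center of the disagreement the abstract describes.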

10
Paper
The Trouble with Tobit: A District-Level Sample Selection Model of Voting for Extreme Right Parties in Europe, 1980-2004
Bowyer, Benjamin

Uploaded 07-07-2008
Keywords Tobit
Heckman sample selection
censored data
aggregate data
extreme right parties
Abstract The growing electoral success of extreme right parties (ERPs) in many European countries has sparked academic interest in explaining variation in extreme right success. However, much of the extant research on the electoral success of extreme right parties suffers from at least two types of selection bias. The first involves the selection of cases and occurs when only those national elections that were contested by extreme right parties are included in the cross-national analysis. To address this problem, a growing number of scholars of ERP electoral support employ Tobit models to analyze national-level election results pooled across countries and election years. However, this approach conceals a second source of selection bias: ERPs are extremely selective about which election districts within a country they choose to contest. The correct specification of this process of self-selection requires the recognition of two fundamental points. First, the causal factors that determine whether an extreme right party contests an election are not identical to those that influence its share of the vote if it does appear on the ballot. Second, this decision about when and where to field candidates is one that is observable at the level of the election district. This paper argues that the appropriate way to model this process is as a Heckman sample selection model estimated at the level of electoral district. I present a preliminary analysis of a dataset that pools district-level election results for eighteen European countries from 1980-2004 (N=12,050), the results of which demonstrate the value of this approach.
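The correction term at the heart of a Heckman sample selection model can be sketched in a few lines. Under the textbook setup with jointly normal errors, the expected outcome among selected cases equals the linear predictor plus ρσ times the inverse Mills ratio of the selection index. The code below is an illustration of that textbook formula only, not the paper's specification; all function and parameter names are hypothetical.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills_ratio(z):
    """phi(z) / Phi(z); numerically unsafe for extremely negative z."""
    return normal_pdf(z) / normal_cdf(z)

def expected_outcome_if_selected(xb, rho, sigma, wg):
    """E[y | selected] = xb + rho * sigma * IMR(wg).

    xb    : linear predictor of the outcome equation
    rho   : correlation between outcome and selection errors
    sigma : standard deviation of the outcome error
    wg    : linear predictor of the selection equation
    """
    return xb + rho * sigma * inverse_mills_ratio(wg)

if __name__ == "__main__":
    # A district unlikely to be contested (wg = -1) carries a larger
    # correction than one likely to be contested (wg = 1).
    print(expected_outcome_if_selected(1.0, 0.5, 2.0, -1.0))
    print(expected_outcome_if_selected(1.0, 0.5, 2.0, 1.0))
```

When ρ is zero the correction vanishes, which is the case in which ignoring the selection step does no harm.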

11
Paper
Selection Bias in a Model of Candidate Entry Decisions
Kanthak, Kristen
Morton, Becky
Gerber, Elisabeth R.

Uploaded 07-13-1999
Keywords selection bias
Poisson estimation
population uncertainty
Abstract In recent years, several states have changed or considered changing their laws regulating how political parties nominate candidates for office. We focus on one potentially important consequence of these changes: How do primary election laws affect candidate entry decisions? We have constructed and solved a formal model of individual candidate behavior in which potential candidates can choose to: 1) enter the electoral competition as major party candidates; 2) enter as minor party candidates; 3) enter as independents; or 4) not enter. Based on our analysis of the model, we hypothesize that the expected utility of each choice is a function, in part, of a state's primary election laws. We test our hypotheses with data on candidate choice from recent US Congressional elections. Estimation of our model is complicated, however, by the fact that we do not observe the choices of potential candidates who choose not to enter (i.e., the sample is truncated) and the observed dependent variable (i.e., candidate choices to run as major party, minor party, or independent candidates) is measured as a discrete, unordered polychotomous choice. We employ a two-stage Heckman (1979)-type estimation procedure that utilizes a Poisson framework for estimating candidate entry rates. We find that our estimates of the effects of electoral institutions on the partisan affiliation decisions of independent candidates are unaffected by sample selection. Our estimates of the partisan affiliation decisions of minor party candidates, however, change when we account for non-random sample selection.

12
Paper
Estimating Treatment Effects in the Presence of Noncompliance and Nonresponse: The Generalized Endogenous Treatment Model
Esterling, Kevin
Neblo, Michael
Lazer, David

Uploaded 02-14-2008
Keywords Average Treatment Effects
Principal Stratification
Selection on Unobservables
Latent Variable Models
Deliberation Experiment
Political Efficacy
Abstract If ignored, non-compliance with a treatment and nonresponse on outcome measures can bias estimates of treatment effects in a randomized experiment. To identify treatment effects in the case where compliance and response are conditioned on unobservables, we propose the parametric generalized endogenous treatment (GET) model. As a multilevel random effect model, GET improves on current approaches to principal stratification by incorporating behavioral responses within an experiment to measure each subject's latent compliance type. We use Monte Carlo methods to show GET has a lower MSE for treatment effect estimates than existing approaches to principal stratification that impute, rather than measure, compliance type for subjects assigned to the control. In an application, we use data from a recent field experiment to assess whether exposure to a deliberative session with their member of Congress changes constituents' levels of internal and external efficacy. Since it conditions on subjects' latent compliance type, GET is able to test whether exposure to the treatment is ignorable after balancing on covariates via matching methods. We show that internally efficacious subjects disproportionately select into the deliberative sessions, and that matching apparently does not break the latent dependence between treatment compliance and outcome. The results suggest that exposure to the deliberative sessions improves external, but not internal, efficacy.

13
Paper
The Influence of the Initiative Process on Interest Groups and Lobbying Techniques
Boehmke, Frederick

Uploaded 09-22-1999
Keywords Initiative
direct democracy
survey analysis
interest groups
lobbying
selection
bias
Abstract I use survey data on interest groups and their activities drawn from four state populations to test hypotheses about the implications of direct democracy for the characteristics and strategic choices of interest groups. I use this data to test predictions about direct democracy's effect for group populations, confirming previous work (Boehmke 1999b) and extending it by exploring more detailed characteristics such as membership and resources. I then link these characteristics to lobbying techniques to test if the initiative process has an impact at the group level. As expected, groups involved in initiative campaigns tend to accentuate outside lobbying strategies, but even groups not currently involved in initiatives are influenced by the possibility of its use. This is because the initiative process alters the characteristics that can be effectively used when attempting to influence policy. The analysis makes use of a technique to correct for heterogeneous response rates across group types. By gathering information about a high percentage of an additional, smaller sample, I am able to correct for this response rate differential through a weighting procedure. The correction is found to have a substantial effect on the results: its absence would leave the researcher to conclude that the initiative plays little role in state interest group activities. This data will also be used to test and correct for possible sample selection bias.

14
Paper
Statistical Inference After Model Selection
Berk, Richard
Brown, Lawrence
Zhao, Linda

Uploaded 04-29-2009
Keywords Statistical Inference
Model Selection
Abstract Conventional statistical inference requires that a model of how the data were generated be known before the data are analyzed. Yet in criminology, and in the social sciences more broadly, a variety of model selection procedures are routinely undertaken followed by statistical tests and confidence intervals computed for a "final" model. In this paper, we examine such practices and show how they are typically misguided. The parameters being estimated are no longer well defined, and post-model-selection sampling distributions are mixtures with properties that are very different from what is conventionally assumed. Confidence intervals and statistical tests do not perform as they should. We examine in some detail the specific mechanisms responsible. We also offer some suggestions for better practice.

15
Paper
Voting, Abstention, and Individual Expectations in the 1992 Presidential Election
Herron, Michael C.

Uploaded 04-07-1998
Keywords voting
abstention
selection bias
1992 election
Abstract This paper develops and applies to the 1992 presidential election a statistical model of voting and abstention in three-candidate elections. The model allows us to estimate key preference-related covariates in 1992, the extent to which abstention rates were correlated with political preferences, and the impact on abstention rates of expectations regarding the election winner. Throughout this paper, we contrast our results with those in Alvarez and Nagler (1995), a study of the 1992 election that does not incorporate abstention, and in so doing we illustrate the selection bias risked by presidential election voting research that ignores abstention. Our results highlight the importance of retrospective voting in 1992, and we identify numerous policy issues, for example, the death penalty, environmental spending, and social security, that individuals used to distinguish the three candidates in the 1992 election. Abortion, we find, played only a minor role in candidate choice. We find support for the angry voting hypothesis, namely, that angry individuals often supported the independent candidate, Ross Perot. Concerning abstention, we find that supporters of the Democratic challenger Bill Clinton abstained at higher rates than supporters of Perot and the incumbent president George Bush. And, we find that expectations concerning the likelihood that Clinton was going to be victorious in 1992 influenced abstention rates. Namely, Clinton supporters who believed that Clinton was likely to win voted at higher rates than individuals who believed otherwise. The opposite relation holds for Bush supporters: such individuals, when they predicted a Clinton victory, frequently abstained from voting. The results in this paper suggest that empirical voting studies should explicitly model the impact of expectations on voting and abstention and, more generally, should model abstention as a viable, individual-level

16
Paper
Spike and Slab Prior Distributions for Simultaneous Bayesian Hypothesis Testing, Model Selection, and Prediction, of Nonlinear Outcomes
Pang, Xun
Gill, Jeff

Uploaded 07-13-2009
Keywords Spike and Slab Prior
Hypothesis Testing
Bayesian Model Selection
Bayesian Model Averaging
Adaptive Rejection Sampling
Generalized Linear Model
Abstract A small body of literature has used the spike and slab prior specification for model selection with strictly linear outcomes. In this setup a two-component mixture distribution is stipulated for coefficients of interest with one part centered at zero with very high precision (the spike) and the other as a distribution diffusely centered at the research hypothesis (the slab). With the selective shrinkage, this setup incorporates the zero coefficient contingency directly into the modeling process to produce posterior probabilities for hypothesized outcomes. We extend the model to qualitative responses by designing a hierarchy of forms over both the parameter and model spaces to achieve variable selection, model averaging, and individual coefficient hypothesis testing. To overcome the technical challenges in estimating the marginal posterior distributions possibly with a dramatic ratio of density heights of the spike to the slab, we develop a hybrid Gibbs sampling algorithm using an adaptive rejection approach for various discrete outcome models, including dichotomous, polychotomous, and count responses. The performance of the models and methods is assessed with both Monte Carlo experiments and empirical applications in political science.

17
Paper
The Two Faces of Public Opinion
Berinsky, Adam

Uploaded 04-13-1998
Keywords public opinion
selection bias
item non-response
social desirability
bivariate probit
Abstract In this paper I trace out the aggregate effects of the social forces in the survey interview that might influence the opinions which individuals express. First, I advance the "Mediated Communication" theory of the survey response, which builds on existing models of public opinion in the political science literature by accounting for effects related to the social context of the survey setting. I then discuss how the aggregation process could compound these individual-level effects to create an opinion signal which is a poor representation of the collective public's policy preferences. As an illustration of these effects and the resulting difficulties involved in measuring aggregate opinion on socially sensitive issues, I use National Elections Study (NES) data from 1990-1994 to show that public opinion polls overstate support for school integration. Specifically, individuals who harbor anti-integration sentiments are likely to hide their socially unacceptable opinions behind a mask of indifference. Finally, in order to confirm the validity of these findings, I show that the same methods which predict that opinion polls understate true opposition to government involvement in school integration also predict the results of the 1989 New York City mayoral election -- an election where the charged racial atmosphere made accurate polling difficult, if not impossible -- more accurately than the marginals of the pre-election polls taken in the weeks leading to the election. All told, these results suggest that survey questions on school integration -- and more generally questions on racial attitudes -- may provide an inaccurate picture of true public sentiment on such sensitive issues.

18
Paper
Penalized Regression, Standard Errors, and Bayesian Lassos
Kyung, Minjung
Gill, Jeff
Ghosh, Malay
Casella, George

Uploaded 02-23-2010
Keywords model selection
lassos
Bayesian hierarchical models
LARS algorithm
EM/Gibbs sampler
Geometric Ergodicity
Gibbs Sampling
Abstract Penalized regression methods for simultaneous variable selection and coefficient estimation, especially those based on the lasso of Tibshirani (1996), have received a great deal of attention in recent years, mostly through frequentist models. Properties such as consistency have been studied, and are achieved by different lasso variations. Here we look at a fully Bayesian formulation of the problem, which is flexible enough to encompass most versions of the lasso that have been previously considered. The advantages of the hierarchical Bayesian formulations are many. In addition to the usual ease-of-interpretation of hierarchical models, the Bayesian formulation produces valid standard errors (which can be problematic for the frequentist lasso), and is based on a geometrically ergodic Markov chain. We compare the performance of the Bayesian lassos to their frequentist counterparts using simulations and data sets that previous lasso papers have used, and see that in terms of prediction mean squared error, the Bayesian lasso performance is similar to and, in some cases, better than, the frequentist lasso.
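The way the lasso performs variable selection and estimation at once is easiest to see in a special case. Assuming orthonormal predictors and the objective (1/2)·||y − Xb||² + λ·Σ|b_j|, the lasso solution is the least-squares coefficient shrunk toward zero by λ and set exactly to zero when it is smaller than λ in absolute value. The sketch below covers only this special case of the frequentist lasso, not the Bayesian formulation of the paper; the name `soft_threshold` is hypothetical.

```python
def soft_threshold(b_ols, lam):
    """Lasso coefficient for one orthonormal predictor.

    b_ols : least-squares coefficient for that predictor
    lam   : non-negative penalty weight
    """
    if lam < 0.0:
        raise ValueError("penalty weight must be non-negative")
    if b_ols > lam:
        return b_ols - lam
    if b_ols < -lam:
        return b_ols + lam
    # Coefficients within [-lam, lam] are dropped from the model.
    return 0.0

if __name__ == "__main__":
    ols_coefficients = [3.0, -0.5, 0.2, -3.0]
    print([soft_threshold(b, 1.0) for b in ols_coefficients])
```

Because the point estimate is exactly zero over a whole interval, standard errors are awkward to define, which is the difficulty for the frequentist lasso that the abstract mentions.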

19
Paper
Nonnested Model Testing for World Politics: Assessing Binary Choice Models
Clarke, Kevin A.

Uploaded 08-17-1998
Keywords nonnested hypothesis testing
Cox test
model selection
Abstract The major goal of this project is to introduce and develop a methodology of nonnested hypothesis testing that researchers in world politics will find useful. I make use of both the Cox test for nonnested hypotheses and the Vuong test for nonnested model selection. I argue for a sequential approach where the Vuong test will be used depending upon the outcome of the Cox test. In keeping with the goal of making this methodology useful for world politics research, I discuss both tests in the context of binary choice models, specifically probits. I apply the methodology developed to the problem of testing alternative models of the escalation of great power militarized disputes.
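The Vuong test named in the abstract compares two nonnested models through their observation-level log-likelihoods. A minimal sketch of the uncorrected statistic follows, assuming the two lists hold each observation's log-likelihood contribution under the two fitted models. The function name `vuong_statistic` is hypothetical, and no adjustment for differing numbers of parameters is applied.

```python
import math

def vuong_statistic(loglik_model_1, loglik_model_2):
    """Return sqrt(n) * mean(m) / sd(m), with m_i = l1_i - l2_i.

    Large positive values favor model 1, large negative values favor
    model 2; the statistic is compared with a standard normal.
    """
    if len(loglik_model_1) != len(loglik_model_2) or len(loglik_model_1) < 2:
        raise ValueError("need two equally long lists with at least 2 entries")
    diffs = [a - b for a, b in zip(loglik_model_1, loglik_model_2)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Variance with divisor n, as in the usual form of the statistic.
    variance = sum((d - mean) ** 2 for d in diffs) / n
    if variance == 0.0:
        raise ValueError("log-likelihood differences have zero variance")
    return math.sqrt(n) * mean / math.sqrt(variance)

if __name__ == "__main__":
    model_1 = [-1.0, -1.2, -0.9, -1.1]
    model_2 = [-1.1, -1.5, -1.1, -1.5]
    print(vuong_statistic(model_1, model_2))
```

Swapping the two arguments flips the sign of the statistic, so the ordering of the models only affects which direction counts as support for which model.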

20
Paper
The "Miracle" Revisited: An Examination of The Micro-Foundations of Aggregate Public Opinion
Berinsky, Adam

Uploaded 08-18-1997
Keywords public opinion
heteroskedastic probit
ordered probit
selection bias
item non-response
Abstract One of the best-known findings in the public opinion literature is that individual responses to survey questions, by and large, both exhibit little constraint and are highly unstable over time. One response to this bleak finding has been to search for coherence and stability at the aggregate level. Scholars who adopt this approach -- most notably Page and Shapiro (1992) -- argue that though most individuals are poorly informed about politics and may have unstable attitudes, the "miracle TRUNCATED.

21
Paper
Modeling History Dependence in Network-Behavior Coevolution
Franzese, Robert
Hays, Jude
Kachi, Aya

Uploaded 07-21-2010
Keywords path dependence
history dependence
network
coevolution
spatial econometrics
selection
homophily
SIENA
RSIENA
markov chain
logit
p-star
military alliance
conflict behavior
Abstract Spatial interdependence--the dependence of outcomes in some units on those in others--is substantively and theoretically ubiquitous and central across the social sciences. Spatial association is also omnipresent empirically. However, spatial association may arise from three importantly distinct processes: common exposure of actors to exogenous external and internal stimuli, interdependence of outcomes/behaviors across actors (contagion), and/or the putative outcomes may affect the variable along which the clustering occurs (selection). Accurate inference about any of these processes generally requires an empirical strategy that addresses all three well. From a spatial-econometric perspective, this suggests spatiotemporal empirical models with exogenous covariates (common exposure) and spatial lags (contagion), with the spatial weights being endogenous (selection). From a longitudinal network-analytic perspective, we can identify the same three processes as potential sources of network effects and network formation. From that perspective, actors' self-selection into networks (by, e.g., behavioral homophily) and actors' behavior that is contagious through those network connections likewise demands theoretical and empirical models in which networks and behavior coevolve over time. This paper begins building such modeling by, on the theoretical side, extending a Markov type-interaction model to allow endogenous tie-formation, and, on the empirical side, merging a simple spatial-lag logit model of contagious behavior with a simple p-star logit model of network formation, building this synthetic discrete-time empirical model from the theoretical base of the modified Markov type-interaction model. One interesting consequence of network-behavior coevolution--identically: endogenous patterns of spatial interdependence--emphasized here is how it can produce history-dependent political dynamics, including equilibrium phat and path dependence (Page 2006). 
The paper explores these implications, and then concludes with a preliminary demonstration of the strategy applied to alliance formation and conflict behavior among the great powers in the first half of the twentieth century.

22
Paper
The Etiology of Public Support for the Designated Hitter Rule
Zorn, Christopher
Gill, Jeff

Uploaded 03-21-2004
Keywords baseball
designated hitter
public opinion
ideology
selection model
Abstract Since its introduction in 1973, major league baseball’s designated hitter (DH) rule has been the subject of continuing controversy. Here, we investigate the political and socio-demographic determinants of public opinion towards baseball’s DH rule, using data from a nationwide poll conducted during September, 1997. Our findings suggest that, while both self-proclaimed Democrats and Republicans are more likely to follow baseball than are political independents, it is Democrats, not Republicans, who tend to favor the DH. In addition, older respondents were more likely to oppose the rule, while respondents from the Midwest tended to favor it.

23
Paper
The Spatial Theory of Voting and the Presidential Election of 1824
Jenkins, Jeffery A.
Sala, Brian R.

Uploaded 08-15-1997
Keywords spatial voting theory
ideological voting
presidential selection
Nominate scores
Abstract One recent analysis claims that in at least five presidential contests since the end of World War II a relatively minor vote shift in a small number of states would have produced Electoral College deadlock, leading to a House election for president (Longley and Peirce 1996). A presidential contest in the House would raise fundamental questions from agency theory - do members "shirk" the collective preferences of their constituent-principals on highly salient votes and, if so, what explains the choices they do make? Can vote choices be rationalized in a theory of ideological voting, or are legislators highly susceptible to interest-group pressures and enticements? We apply a spatial-theoretic model of voting to the House balloting for president in 1825 in order to test competing hypotheses about how MCs would likely vote in a presidential ballot. We find that a sincere voting model based on ideal points for MCs and candidates derived from Nominate scores closely matches the choices made by MCs in 1825.

24
Paper
An Alternative Solution to the Heckman Selection Problem: Selection Bias as Functional Form Misspecification
Kenkel, Brenton
Signorino, Curtis

Uploaded 07-18-2012
Keywords selection models
functional form misspecification
nonparametric models
polynomial regression
Abstract The "selection problem" is typically seen as a form of omitted variable bias. We recast the problem as one of functional form misspecification and examine two situations in which flexible or nonparametric estimation techniques may be used as a complement or alternative to traditional selection models. First, we show that such techniques can allow a researcher to recover the conditional relationship between covariates and the expected outcome, even if data on the probability of selection into the subsample is unavailable. We demonstrate the validity of this approach analytically and using Monte Carlo simulations. Second, we show that flexible methods can be used to validate or improve a linear selection model specification when a researcher does possess the prior-stage data. We illustrate this process with an application to data from Mroz (1987) on women's wages.

25
Paper
Selection Bias and Continuous-Time Duration Models: Consequences and a Proposed Solution
Boehmke, Frederick
Morey, Daniel
Shannon, Megan

Uploaded 07-15-2003
Keywords duration
selection bias
exponential
monte carlo
Abstract In this paper we explore the consequences of non-random sample selection for continuous time duration analysis. While the consequences of selectivity are reasonably well-understood in linear regression and common discrete choice models, we have little or no understanding of how it affects duration models. In this paper we study this issue by conducting a series of Monte Carlo analyses that estimate common duration models on data that suffer from selectivity. Our findings indicate that the consequences are severe: both coefficients and standard errors may be biased in an unknown direction. In addition, we find that selection bias may create the appearance of (non-existent) duration dependence. Given these difficulties, we develop a solution for self-selectivity bias in duration models and present evidence that demonstrates its superiority to models that ignore the problem.

26
Paper
Legislative Entrepreneurship and Campaign Finance
Wawro, Gregory

Uploaded 07-21-1997
Keywords campaign finance
fixed effects
panel data
selection bias
Abstract Drawing on models of service-induced and investor PAC campaign contributions, I analyze the role that legislative entrepreneurship plays in PACs' contribution decisions. I explore the possibility that PACs use campaign contributions to invest in members of Congress with the expectation that members will reciprocate by engaging in entrepreneurial behavior to the benefit of PACs. To determine whether a relationship exists between legislative entrepreneurship and PAC contributions I compute measures of entrepreneurial behavior for individual members of the U.S. House using detailed data on bill sponsorship and congressional hearings from the 97th through the 101st Congress. In order to cleanly estimate the effects of legislative entrepreneurship, we need to account for unobservable member-specific factors that enter into the PAC contribution calculus. To account for such factors I employ panel data methods which require very few assumptions about the data and provide a way to test whether the manipulations of the data that are required for a panel analysis introduce bias.

27
Paper
Primaries and Turnout
Kanthak, Kristen
Morton, Becky

Uploaded 07-09-2003
Keywords primaries
turnout
bivariate probit
selection model
treatment effects
Abstract We consider the effects of differences in primary systems on voter turnout in primaries as well as the effect of holding primaries on general election turnout and support for candidates chosen in primaries. The analysis is based on a group majority voting model of turnout where candidates from two major parties simultaneously make strategic entry decisions and mobilize voters strategically in primaries and general elections if they choose to enter. We evaluate the model's predictions using data from midterm Congressional primaries and general elections in the 1980s. We use a two-stage estimation process. First, the model's predictions concerning the effects of primary system differences on whether primaries occur and the vote totals in primaries are estimated using a maximum likelihood bivariate probit selection model. We find that primary system variables do have significant effects on whether primaries are held and to some extent affect vote totals in primaries, although there are interesting party specific differences suggesting that Republicans see advantages from mobilizing voters in open primary systems while Democrats benefit in semi-closed primary systems. Second, the estimated vote totals in the primaries are used as treatment variables via an instrumental variables approach in a simultaneous equation system with two dependent variables: general election vote totals and the vote share of the Democratic party's candidate. We find that voting in primaries has a positive and significant effect on voting in general elections and significantly increases the vote share of the party holding the primary, suggesting that the arguments that primaries by their existence decrease voter turnout and hurt parties holding them have no support.

28
Paper
Heterogeneity and Bias in Models of Vote Choice
Berinsky, Adam

Uploaded 04-21-1997
Keywords voting models
selection bias
heteroskedasticity
missing data
Abstract Voters in the United States do not behave in a homogeneous manner. Voting models typically account for such heterogeneity by seeking to decompose the process of vote choice into a number of distinct components. By examining voting choice data in this way, researchers are able to ascertain reasonable estimates of the average effect of various socio-economic and political variables on the candidate selection process. Models of this sort, while plausible, may not properly reflect the true heterogeneity of the American voter. At their core, simple models assume that voters use a common and uniform decision rule when deciding between candidates. But it is possible, if not likely, that different groups and classes of citizens use differently structured processes to determine their choice of candidates. Researchers have attempted to account for this heterogeneity in a variety of ways. Rivers (1988) and Jackson (1992), for example, have accounted for differences in the voting behavior of individuals by allowing the mean effect of theoretically important variables to vary across individuals. While these approaches are extremely promising, in this paper I will take a different approach and examine three more subtle forms of heterogeneity in the vote choice process: (1) heterogeneity induced by non-random selection from the full population of citizens into the vote choice model sample; (2) heterogeneity due to the interaction of selection bias and non-constant variance; and (3) heterogeneity in the patterns of missing data across groups of the respondents. While much of the discussion in the paper is focused on the first two forms of heterogeneity, it is the third form of heterogeneity - one not typically addressed in the political science literature - that is the most important determinant of the degree of bias in vote choice models. Thus, heterogeneity within the sample of respondents affects the vote choice model estimates, just not in the way I originally envisioned.
It is not just heterogeneity in the variance term, or in the selection into the vote choice process that poses a threat to accurate estimates of the power of the predictors in our vote choice models. Rather, it is the failure to preserve or account for the heterogeneity of the paths by which people answer survey questions that is the real bogeyman of vote choice models.

29
Paper
An Estimator for Some Binary-Outcome Selection Models without Exclusion Restrictions
Sartori, Anne E.

Uploaded 07-09-2001
Keywords selection bias
discrete choice
small-sample properties
Abstract This paper provides a new estimator for selection models with dichotomous dependent variables when identical factors affect the selection equation and the equation of interest. Such situations arise naturally in game-theoretic models where selection is typically nonrandom and identical explanatory variables influence all decisions under investigation. When its own identifying assumption is reasonable, the estimator allows the researcher to avoid the painful choice among identifying from functional form alone (using a Heckman-type estimator), adding a theoretically unjustified variable to the selection equation in a mistaken attempt to "boost" identification, or giving up on estimation entirely. The paper compares the small-sample properties of the estimator with those of the Heckman-type estimator and ordinary probit using Monte Carlo methods. A brief analysis of the causes of enduring rivalries and war, following Lemke and Reed (2001),

30
Paper
Death by Survey: Estimating Adult Mortality without Selection Bias
King, Gary
Gakidou, Emmanuela

Uploaded 07-14-2005
Keywords surveys
selection bias
mortality data
extrapolation
international relations
Abstract The widely used methods for estimating adult mortality rates from sample survey responses about the survival of siblings, parents, spouses, and others depend crucially on an assumption that we demonstrate does not hold in real data. We show that when this assumption is violated -- so that the mortality rate varies with sibship size -- mortality estimates can be massively biased. By using insights from work on the statistical analysis of selection bias, survey weighting, and extrapolation problems, we propose a new and relatively simple method of recovering the mortality rate with both greatly reduced potential for bias and increased clarity about the source of necessary assumptions.

31
Paper
Using Auxiliary Data to Estimate Selection Bias Models
Boehmke, Frederick

Uploaded 07-06-2001
Keywords selection bias
two-stage estimation
survey design
initiative
interest groups
Abstract Recent work has made progress in estimating models involving selection bias of a particularly strong nature: all nonrespondents are unit nonresponders, meaning that no data is available for them. These models are reasonably successful in circumstances where the dependent variable of interest is continuous, but they are less practical empirically when it is latent and only discrete outcomes or choices are observed. I develop a method in this paper to estimate these models that is much more practical in terms of estimation. The model uses a small amount of auxiliary information to estimate the selection equation parameters which are then held fixed to estimate the equation of interest parameters in a maximum likelihood setting. After presenting Monte Carlo analysis to support the model, I apply the technique to a substantive problem: which interest groups are likely to be involved in support of potential initiatives to achieve their policy goals.

32
Paper
Does Private Money Buy Public Policy? Campaign Contributions and Regulatory Outcomes in Telecommunications
de Figueiredo, Rui

Uploaded 10-06-2006
Keywords campaign contributions
regulation
selection bias
omitted variable bias
Abstract To what extent can market participants affect the outcomes of regulatory policy? In this paper, we study the effects of one potential source of influence – campaign contributions – from competing interests in the local telecommunications industry, on regulatory policy decisions of state public utility commissions. Using a unique new data set, we find, in contrast to much of the literature on campaign contributions, that there is a significant effect of private money on regulatory outcomes. This result is robust to numerous alternative model specifications. We also assess the extent of omitted variable bias that would have to exist to obviate the estimated result. We find that for our result to be spurious, omitted variables would have to explain more than five times the variation in the mix of private money as is explained by the variables included in our analysis. We consider this to be very unlikely.

33
Poster
Weighted Estimation for Analyses with Missing Data
Samii, Cyrus

Uploaded 07-21-2010
Keywords missing data
doubly robust
inverse probability weighting
semi-parametric
post-treatment
regression
sample selection
Abstract Missing data plague data analyses in political science. The recent applied statistics literature reflects renewed interest in weighting methods for missing data problems. Three properties are stressed in this literature: (i) robustness, (ii) the ability to use post-treatment information in causal analysis, and (iii) methods to gain efficiency. I present these results, hoping to show the potential in using refashioned weighting methods for political science research.

