Search Results


Results below are based on the search criterion 'selection bias'
Total number of records returned: 16

1
Paper
Logistic Regression in Rare Events Data (revised)
King, Gary
Zeng, Langche

Uploaded 07-09-1999
Keywords rare events
logit
logistic regression
binary dependent variables
bias correction
case-control
choice-based
endogenous selection
selection bias
Abstract This paper is for the methods conference; it is a revised version of a paper that was previously sent to the paper server.

2
Paper
Logistic Regression in Rare Events Data
King, Gary
Zeng, Langche

Uploaded 05-20-1999
Keywords rare events
logit
logistic regression
binary dependent variables
bias correction
case-control
choice-based
endogenous selection
selection bias
Abstract Rare events are binary dependent variables with dozens to thousands of times fewer ones (events, such as wars, vetoes, cases of political activism, or epidemiological infections) than zeros ("nonevents"). In many literatures, rare events have proven difficult to explain and predict, a problem that seems to have at least two sources. First, popular statistical procedures, such as logistic regression, can sharply underestimate the probability of rare events. We recommend corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. Second, commonly used data collection strategies are grossly inefficient for rare events data. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables, such as in international conflict data with more than a quarter million dyads, only a few of which are at war. As it turns out, easy procedures exist for making valid inferences when sampling all available events (e.g., wars) and a tiny fraction of non-events (peace). This enables scholars to save as much as 99% of their (non-fixed) data collection costs, or to collect much more meaningful explanatory variables. We provide methods that link these two results, enabling both types of corrections to work simultaneously, and software that implements the methods developed.

3
Paper
Inference from Response-Based Samples with Limited Auxiliary Information
King, Gary
Zeng, Langche

Uploaded 07-09-1999
Keywords rare events
logit
logistic regression
binary dependent variables
bias correction
case-control
choice-based
endogenous selection
selection bias
epidemiology
Abstract This paper is for the methods conference; it is related to "Logistic Regression in Rare Events Data," also by us; the conference presentation will be based on both papers. We address a disagreement between epidemiologists and econometricians about inference in response-based (a.k.a. case-control, choice-based, retrospective, etc.) samples. Epidemiologists typically make the rare event assumption (that the probability of disease is arbitrarily small), which makes the relative risk easy to estimate via the odds ratio. Econometricians do not like this assumption since it is false and implies that attributable risk (a.k.a. a first difference) is zero, and they have developed methods that require no auxiliary information. These methods produce bounds on the quantities of interest that, unfortunately, are often fairly wide and always encompass a conclusion of no treatment effect (relative risks of 1 or attributable risks of 0) no matter how strong the true effect is. We simplify the existing bounds for attributable risk, making it much easier to estimate, and then suggest one possible resolution of the disagreement by providing a method that allows researchers to include easily available information (such as that the fraction of the population with the disease falls within at most [.001,.05]); this method considerably narrows the bounds on the quantities of interest. We also offer software to implement the methods suggested. We would very much appreciate any comments you might have!

4
Paper
Selection Bias in a Model of Candidate Entry Decisions
Kanthak, Kristen
Morton, Becky
Gerber, Elisabeth R.

Uploaded 07-13-1999
Keywords selection bias
Poisson estimation
population uncertainty
Abstract In recent years, several states have changed or considered changing their laws regulating how political parties nominate candidates for office. We focus on one potentially important consequence of these changes: How do primary election laws affect candidate entry decisions? We have constructed and solved a formal model of individual candidate behavior in which potential candidates can choose to: 1) enter the electoral competition as major party candidates; 2) enter as minor party candidates; 3) enter as independents; or 4) not enter. Based on our analysis of the model, we hypothesize that the expected utility of each choice is a function, in part, of a state's primary election laws. We test our hypotheses with data on candidate choice from recent US Congressional elections. Estimation of our model is complicated, however, by the fact that we do not observe the choices of potential candidates who choose not to enter (i.e., the sample is truncated) and the observed dependent variable (i.e., candidate choices to run as major party, minor party, or independent candidates) is measured as a discrete, unordered polychotomous choice. We employ a two-stage Heckman (1979)-type estimation procedure that utilizes a Poisson framework for estimating candidate entry rates. We find that our estimates of the effects of electoral institutions on the partisan affiliation decisions of independent candidates are unaffected by sample selection. Our estimates of the partisan affiliation decisions of minor party candidates, however, change when we account for non-random sample selection.

5
Paper
Voting, Abstention, and Individual Expectations in the 1992 Presidential Election
Herron, Michael C.

Uploaded 04-07-1998
Keywords voting
abstention
selection bias
1992 election
Abstract This paper develops and applies to the 1992 presidential election a statistical model of voting and abstention in three-candidate elections. The model allows us to estimate key preference-related covariates in 1992, the extent to which abstention rates were correlated with political preferences, and the impact on abstention rates of expectations regarding the election winner. Throughout this paper, we contrast our results with those in Alvarez and Nagler (1995), a study of the 1992 election that does not incorporate abstention, and in so doing we illustrate the selection bias risked by presidential election voting research that ignores abstention. Our results highlight the importance of retrospective voting in 1992, and we identify numerous policy issues, for example, the death penalty, environmental spending, and social security, that individuals used to distinguish the three candidates in the 1992 election. Abortion, we find, played only a minor role in candidate choice. We find support for the angry voting hypothesis, namely, that angry individuals often supported the independent candidate, Ross Perot. Concerning abstention, we find that supporters of the Democratic challenger Bill Clinton abstained at higher rates than supporters of Perot and the incumbent president George Bush. And, we find that expectations concerning the likelihood that Clinton was going to be victorious in 1992 influenced abstention rates. Namely, Clinton supporters who believed that Clinton was likely to win voted at higher rates than individuals who believed otherwise. The opposite relation holds for Bush supporters: such individuals, when they predicted a Clinton victory, frequently abstained from voting. The results in this paper suggest that empirical voting studies should explicitly model the impact of expectations on voting and abstention and, more generally, should model abstention as a viable, individual-level

6
Paper
The Two Faces of Public Opinion
Berinsky, Adam

Uploaded 04-13-1998
Keywords public opinion
selection bias
item non-response
social desirability
bivariate probit
Abstract In this paper I trace out the aggregate effects of the social forces in the survey interview that might influence the opinions which individuals express. First, I advance the "Mediated Communication" theory of the survey response, which builds on existing models of public opinion in the political science literature by accounting for effects related to the social context of the survey setting. I then discuss how the aggregation process could compound these individual-level effects to create an opinion signal which is a poor representation of the collective public's policy preferences. As an illustration of these effects and the resulting difficulties involved in measuring aggregate opinion on socially sensitive issues, I use National Elections Study (NES) data from 1990-1994 to show that public opinion polls overstate support for school integration. Specifically, individuals who harbor anti-integration sentiments are likely to hide their socially unacceptable opinions behind a mask of indifference. Finally, in order to confirm the validity of these findings, I show that the same methods which predict that opinion polls understate true opposition to government involvement in school integration also predict the results of the 1989 New York City mayoral election -- an election where the charged racial atmosphere made accurate polling difficult, if not impossible -- more accurately than the marginals of the pre-election polls taken in the weeks leading to the election. All told, these results suggest that survey questions on school integration -- and more generally questions on racial attitudes -- may provide an inaccurate picture of true public sentiment on such sensitive issues.

7
Paper
The "Miracle" Revisited: An Examination of The Micro-Foundations of Aggregate Public Opinion
Berinsky, Adam

Uploaded 08-18-1997
Keywords public opinion
heteroskedastic probit
ordered probit
selection bias
item non-response
Abstract One of the best-known findings in the public opinion literature is that individual responses to survey questions, by and large, both exhibit little constraint and are highly unstable over time. One response to this bleak finding has been to search for coherence and stability at the aggregate level. Scholars who adopt this approach -- most notably Page and Shapiro (1992) -- argue that though most individuals are poorly informed about politics and may have unstable attitudes, the "miracle TRUNCATED.

8
Paper
Legislative Entrepreneurship and Campaign Finance
Wawro, Gregory

Uploaded 07-21-1997
Keywords campaign finance
fixed effects
panel data
selection bias
Abstract Drawing on models of service-induced and investor PAC campaign contributions, I analyze the role that legislative entrepreneurship plays in PACs' contribution decisions. I explore the possibility that PACs use campaign contributions to invest in members of Congress with the expectation that members will reciprocate by engaging in entrepreneurial behavior to the benefit of PACs. To determine whether a relationship exists between legislative entrepreneurship and PAC contributions I compute measures of entrepreneurial behavior for individual members of the U.S. House using detailed data on bill sponsorship and congressional hearings from the 97th through the 101st Congress. In order to cleanly estimate the effects of legislative entrepreneurship, we need to account for unobservable member-specific factors that enter into the PAC contribution calculus. To account for such factors I employ panel data methods which require very few assumptions about the data and provide a way to test whether the manipulations of the data that are required for a panel analysis introduce bias.

9
Paper
Heterogeneity and Bias in Models of Vote Choice
Berinsky, Adam

Uploaded 04-21-1997
Keywords voting models
selection bias
heteroskedasticity
missing data
Abstract Voters in the United States do not behave in a homogeneous manner. Voting models typically account for such heterogeneity by seeking to decompose the process of vote choice into a number of distinct components. By examining voting choice data in this way, researchers are able to ascertain reasonable estimates of the average effect of various socio-economic and political variables on the candidate selection process. Models of this sort, while plausible, may not properly reflect the true heterogeneity of the American voter. At their core, simple models assume that voters use a common and uniform decision rule when deciding between candidates. But it is possible, if not likely, that different groups and classes of citizens use differently structured processes to determine their choice of candidates. Researchers have attempted to account for this heterogeneity in a variety of ways. Rivers (1988) and Jackson (1992), for example, have accounted for differences in the voting behavior of individuals by allowing the mean effect of theoretically important variables to vary across individuals. While these approaches are extremely promising, in this paper I will take a different approach and examine three more subtle forms of heterogeneity in the vote choice process: (1) heterogeneity induced by non-random selection from the full population of citizens into the vote choice model sample; (2) heterogeneity due to the interaction of selection bias and non-constant variance; and (3) heterogeneity in the patterns of missing data across groups of the respondents. While much of the discussion in the paper is focused on the first two forms of heterogeneity, it is the third form of heterogeneity - one not typically addressed in the political science literature - that is the most important determinant of the degree of bias in vote choice models. Thus, heterogeneity within the sample of respondents affects the vote choice model estimates, just not in the way I originally envisioned.
It is not just heterogeneity in the variance term, or in the selection into the vote choice process that poses a threat to accurate estimates of the power of the predictors in our vote choice models. Rather, it is the failure to preserve or account for the heterogeneity of the paths by which people answer survey questions that is the real bogeyman of vote choice models.

10
Paper
Death by Survey: Estimating Adult Mortality without Selection Bias
King, Gary
Gakidou, Emmanuela

Uploaded 07-14-2005
Keywords surveys
selection bias
mortality data
extrapolation
international relations
Abstract The widely used methods for estimating adult mortality rates from sample survey responses about the survival of siblings, parents, spouses, and others depend crucially on an assumption that we demonstrate does not hold in real data. We show that when this assumption is violated -- so that the mortality rate varies with sibship size -- mortality estimates can be massively biased. By using insights from work on the statistical analysis of selection bias, survey weighting, and extrapolation problems, we propose a new and relatively simple method of recovering the mortality rate with both greatly reduced potential for bias and increased clarity about the source of necessary assumptions.

11
Paper
Does Private Money Buy Public Policy? Campaign Contributions and Regulatory Outcomes in Telecommunications
de Figueiredo, Rui

Uploaded 10-06-2006
Keywords campaign contributions
regulation
selection bias
omitted variable bias
Abstract To what extent can market participants affect the outcomes of regulatory policy? In this paper, we study the effects of one potential source of influence – campaign contributions – from competing interests in the local telecommunications industry, on regulatory policy decisions of state public utility commissions. Using a unique new data set, we find, in contrast to much of the literature on campaign contributions, that there is a significant effect of private money on regulatory outcomes. This result is robust to numerous alternative model specifications. We also assess the extent of omitted variable bias that would have to exist to obviate the estimated result. We find that for our result to be spurious, omitted variables would have to explain more than five times the variation in the mix of private money as is explained by the variables included in our analysis. We consider this to be very unlikely.

12
Paper
Modeling Sample Selection for Durations with Time-Varying Covariates
Boehmke, Frederick

Uploaded 07-02-2008
Keywords selection
selection bias
duration
time-varying covariates
event history
exchange rates
Abstract We extend previous estimators for duration data that suffer from non-random sample selection to allow for time-varying covariates. Rather than a continuous-time duration model, we propose a discrete-time alternative that models the (constant) effects of sample selection at the time of selection across all years of the resulting spell. Properties of the estimator are compared to those of a naive discrete duration model through Monte Carlo analysis and indicate that our estimator outperforms the naive model when selection is non-trivial. We then apply this estimator to the question of the duration of monetary regimes.

13
Paper
Selection Bias and Continuous-Time Duration Models: Consequences and a Proposed Solution
Boehmke, Frederick
Morey, Daniel
Shannon, Megan

Uploaded 07-15-2003
Keywords duration
selection bias
exponential
monte carlo
Abstract In this paper we explore the consequences of non-random sample selection for continuous time duration analysis. While the consequences of selectivity are reasonably well-understood in linear regression and common discrete choice models, we have little or no understanding of how it affects duration models. In this paper we study this issue by conducting a series of Monte Carlo analyses that estimate common duration models on data that suffer from selectivity. Our findings indicate that the consequences are severe: both coefficients and standard errors may be biased in an unknown direction. In addition, we find that selection bias may create the appearance of (non-existent) duration dependence. Given these difficulties, we develop a solution for self-selectivity bias in duration models and present evidence that demonstrates its superiority to models that ignore the problem.

14
Paper
An Estimator for Some Binary-Outcome Selection Models without Exclusion Restrictions
Sartori, Anne E.

Uploaded 07-09-2001
Keywords selection bias
discrete choice
small-sample properties
Abstract This paper provides a new estimator for selection models with dichotomous dependent variables when identical factors affect the selection equation and the equation of interest. Such situations arise naturally in game-theoretic models where selection is typically nonrandom and identical explanatory variables influence all decisions under investigation. When its own identifying assumption is reasonable, the estimator allows the researcher to avoid the painful choice among identifying from functional form alone (using a Heckman-type estimator), adding a theoretically unjustified variable to the selection equation in a mistaken attempt to "boost" identification, or giving up on estimation entirely. The paper compares the small-sample properties of the estimator with those of the Heckman-type estimator and ordinary probit using Monte Carlo methods. A brief analysis of the causes of enduring rivalries and war, following Lemke and Reed (2001),

15
Paper
Using Auxiliary Data to Estimate Selection Bias Models
Boehmke, Frederick

Uploaded 07-06-2001
Keywords selection bias
two-stage estimation
survey design
initiative
interest groups
Abstract Recent work has made progress in estimating models involving selection bias of a particularly strong nature: all nonrespondents are unit nonresponders, meaning that no data is available for them. These models are reasonably successful in circumstances where the dependent variable of interest is continuous, but they are less practical empirically when it is latent and only discrete outcomes or choices are observed. I develop a method in this paper to estimate these models that is much more practical in terms of estimation. The model uses a small amount of auxiliary information to estimate the selection equation parameters which are then held fixed to estimate the equation of interest parameters in a maximum likelihood setting. After presenting Monte Carlo analysis to support the model, I apply the technique to a substantive problem: which interest groups are likely to be involved in support of potential initiatives to achieve their policy goals.

16
Paper
Two-Stage Estimation of Stochastic Truncation Models with Limited Dependent Variables
Boehmke, Frederick

Uploaded 04-13-2000
Keywords selection bias
stochastic truncation
maximum likelihood
simulation
monte carlo
initiative
interest groups
Abstract Recent work has made progress in estimating models involving selection bias of a particularly strong nature: all nonrespondents are unit nonresponders, meaning that no data is available for them. These models are reasonably successful in circumstances where the dependent variable of interest is continuous, but they are less practical empirically when it is latent and only discrete outcomes or choices are observed. I develop a method in this paper to estimate these models that is much more practical in terms of estimation. The model uses a small amount of auxiliary information to estimate the selection equation and these parameters are then used to estimate the equation of interest in a maximum likelihood setting. After presenting Monte Carlo analysis to support the model, I apply the technique to a substantive problem: which interest groups are likely to turn to the initiative process to achieve their policy goals.

