Working Papers
2005
39 records found
The difference between "significant" and "not significant" is not itself statistically significant
Gelman, Andrew, Stern, Hal
Submitted: 2005-12-23
Keywords: multilevel modeling, multiple comparisons, replication, statistical significance
Abstract: A common error in statistical analyses is to summarize comparisons by declarations of statistical significance or non-significance. There are a number of difficulties with this approach. First is the oft-cited dictum that statistical significance is not the same as practical significance. Another difficulty is that this dichotomization into significant and non-significant results encourages the dismissal of observed differences in favor of the usually less interesting null hypothesis of no difference. Here, we focus on a less commonly noted problem, namely that changes in statistical significance are not themselves significant. By this, we are not merely making the commonplace observation that any particular threshold is arbitrary---for example, only a small change is required to move an estimate from a 5.1% significance level to 4.9%, thus moving it into statistical significance. Rather, we are pointing out that even large changes in significance levels can correspond to small, non-significant changes in the underlying variables. We illustrate with a theoretical and an applied example.
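The point made in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the two estimates and their standard errors are hypothetical numbers chosen for the example, the helper names are made up, and the estimates are assumed to be independent so that their variances add.

```python
import math

def z_score(estimate, std_error):
    """Return the estimate divided by its standard error."""
    return estimate / std_error

def difference(est_a, se_a, est_b, se_b):
    """Difference of two independent estimates and its standard error."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se_diff

# Hypothetical inputs: two studies with the same standard error.
est_a, se_a = 25.0, 10.0
est_b, se_b = 10.0, 10.0

z_a = z_score(est_a, se_a)  # beyond the conventional 1.96 cutoff
z_b = z_score(est_b, se_b)  # short of the cutoff
diff, se_diff = difference(est_a, se_a, est_b, se_b)
z_diff = z_score(diff, se_diff)

print(f"study A: z = {z_a:.2f}")
print(f"study B: z = {z_b:.2f}")
print(f"difference: {diff:.1f} +/- {se_diff:.1f}, z = {z_diff:.2f}")
```

Under these assumed numbers, one study clears the 5% threshold and the other does not, yet the difference between them is itself well inside its own sampling noise.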
Rich state, poor state, red state, blue state: What's the matter with Connecticut?
Gelman, Andrew, Shor, Boris, Bafumi, Joseph, Park, David
Submitted: 2005-11-29
Keywords: availability heuristic, ecological fallacy, hierarchical model, income and voting, multilevel model, presidential elections, public opinion, secret weapon, varying-slope model
Abstract: We find that income matters more in "red America" than in "blue America." In poor states, rich people are much more likely than poor people to vote for the Republican presidential candidate, but in rich states (such as Connecticut), income has a very low correlation with vote preference. In addition to finding this pattern and studying its changes over time, we use the concepts of typicality and availability from cognitive psychology to explain how these patterns can be commonly misunderstood. Our results can be viewed either as a debunking of the journalistic image of rich "latte" Democrats and poor "Nascar" Republicans, or as support for the journalistic images of political and cultural differences between red and blue states---differences which are not explained by differences in individuals' incomes. For decades, the Democrats have been viewed as the party of the poor, with the Republicans representing the rich. Recent presidential elections, however, have shown a reverse pattern, with Democrats performing well in the richer "blue" states in the northeast and west coast, and Republicans dominating in the "red" states in the middle of the country. Through multilevel modeling of individual-level survey data and county- and state-level demographic and electoral data, we reconcile these patterns. Key methods used in this research are: (1) plots of repeated cross-sectional analyses, (2) varying-intercept, varying-slope multilevel models, and (3) a graph that simultaneously shows within-group and between-group patterns in a multilevel model. These statistical tools help us understand patterns of variation within and between states in a way that would not be possible from classical regressions or by looking at tables of coefficient estimates.
Covariate Functional Form in Cox Models
Keele, Luke
Submitted: 2005-10-25
Keywords: Cox model, event history, survival models, splines, semi-parametric, duration models
Abstract: In most event history models, the effect of a covariate on the hazard is assumed to have a log-linear functional form. For continuous covariates, this assumption is often violated as the effect is highly nonlinear. Assuming a log-linear functional form when the nonlinear form applies causes specification errors leading to erroneous statistical conclusions. Scholars can, instead of ignoring the presence of nonlinear effects, test for such nonlinearity and incorporate it into the model. I review methods to test for and model nonlinear functional forms for covariates in the Cox model. Testing for such nonlinear effects is important since such nonlinearity can appear as nonproportional hazards, but time varying terms will not correct the misspecification. I investigate the consequences of nonlinear functional forms using data on international conflicts from 1950-1985. I demonstrate that the conclusions drawn from these data depend on fitting the correct functional form for the covariates.
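For readers unfamiliar with the term, the log-linear assumption discussed in this abstract can be written out for a single covariate. The notation below (baseline hazard h_0, coefficient beta, smooth function f) is the standard textbook form and is an assumption of this note, not notation quoted from the paper.

```latex
% Standard Cox specification: the log hazard is linear in the covariate x
h(t \mid x) = h_0(t)\,\exp(\beta x)
\quad\Longleftrightarrow\quad
\log h(t \mid x) = \log h_0(t) + \beta x

% Flexible alternative: replace the linear term with a smooth function f,
% for example a spline in x
h(t \mid x) = h_0(t)\,\exp\{f(x)\}
```

Testing for nonlinearity then amounts to asking whether f(x) departs detectably from the straight line beta x.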
Struggles with survey weighting and regression modeling
Gelman, Andrew
Submitted: 2005-10-12
Keywords: multilevel modeling, poststratification, sampling weights, shrinkage
Abstract: The general principles of Bayesian data analysis imply that models for survey responses should be constructed conditional on all variables that affect the probability of inclusion and nonresponse, which are also the variables used in survey weighting and clustering. However, such models can quickly become very complicated, with potentially thousands of post-stratification cells. It is then a challenge to develop general families of multilevel probability models that yield reasonable Bayesian inferences. We discuss these issues in the context of several ongoing public health and social surveys. This work is currently open-ended, and we conclude with thoughts on how research could proceed to solve these problems.
How many people do you know in prison?: using overdispersion in count data to estimate social structure in networks
Zheng, Tian, Salganik, Matt, Gelman, Andrew
Submitted: 2005-10-12
Keywords: negative binomial distribution, overdispersion, sampling, social networks, social structure
Abstract: Networks--sets of objects connected by relationships--are important in a number of fields. The study of networks has long been central to sociology, where researchers have attempted to understand the causes and consequences of the structure of relationships in large groups of people. Using insight from previous network research, Killworth et al. (1998a,b) and McCarty et al. (2001) developed and evaluated a method for estimating the sizes of hard-to-count populations using network data collected from a simple random sample of Americans. In this paper we show how, using a multilevel overdispersed Poisson regression model, these data can also be used to estimate aspects of social structure in the population. Our work goes beyond most previous research on networks by using variation, as well as average responses, as a source of information. We apply our method to the McCarty et al. data and find that Americans vary greatly in their number of acquaintances. Further, Americans show great variation in propensity to form ties to people in some groups (e.g., males in prison, the homeless, and American Indians), but little variation for other groups (e.g., twins, people named Michael or Nicole). We also explore other features of these data and consider ways in which survey data can be used to estimate network structure.
Making Inferences from 2x2 Tables: The Inadequacy of the Fisher Exact Test for Observational Data and a Principled Bayesian Alternative
Sekhon, Jasjeet
Submitted: 2005-08-17
Keywords: Fisher exact test, randomization inference, permutation tests, Bayesian tests, difference of proportions, observational data
Abstract: The Fisher exact test is the dominant method of making inferences from 2x2 tables where the number of observations is small. Although the Fisher test and approximations to it are used in a large number of studies, these tests rest on a data generating process which is inappropriate for most applications for which they are used. The canonical Fisher test assumes that both of the margins in a 2x2 table are fixed by construction---i.e., both the treatment and outcome margins are fixed a priori. If the data were generated by an alternative process, such as binomial, negative binomial or Poisson binomial sampling, the Fisher exact test and approximations to it do not have correct coverage. A Bayesian method is offered which has correct coverage, is powerful, is consistent with a binomial process and can be extended easily to other distributions. A prominent 2x2 table which has been used in the literature by Geddes (1990) and Sekhon (2004) to explore the relationship between foreign threat and social revolution (Skocpol, 1979) is reanalyzed. The Bayesian method finds a significant relationship even though the Fisher and related tests do not. A Monte Carlo sampling experiment is provided which shows that the Bayesian method dominates the usual alternatives in terms of both test coverage and power when the data are generated by a binomial process.
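The fixed-margins assumption described in this abstract is what makes the Fisher test a hypergeometric calculation: once both margins are fixed, the whole table is determined by a single cell. The sketch below shows only that classical calculation, not the Bayesian alternative proposed in the paper. The function names and the example table are hypothetical, and the two-sided rule used (sum the probabilities of all tables no more likely than the observed one) is one common convention.

```python
from math import comb

def hypergeom_prob(a, row1, row2, col1):
    """P(top-left cell = a) when both margins of the 2x2 table are fixed."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    p_obs = hypergeom_prob(a, row1, row2, col1)
    probs = (hypergeom_prob(k, row1, row2, col1) for k in range(lo, hi + 1))
    # Sum the probabilities of all tables no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))

# Hypothetical table, not the data reanalyzed in the paper.
table = (3, 1, 1, 3)
p_value = fisher_exact_two_sided(*table)
print(f"two-sided Fisher exact p-value: {p_value:.4f}")
```

Nothing in this calculation lets the row or column totals vary, which is the restriction the abstract argues is inappropriate when the data arise from, say, binomial sampling.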
A simple scheme to improve the efficiency of referenda
Casella, Alessandra, Gelman, Andrew
Submitted: 2005-08-16
Keywords: storable votes, bonus votes, weighted voting, referendum
Abstract: This paper proposes a simple scheme designed to elicit and reward intensity of preferences in referenda: voters faced with a number of binary proposals are given one regular vote for each proposal plus an additional number of bonus votes to cast as desired. Decisions are taken according to the majority of votes cast. In our base case, where there is no systematic difference between proposals' supporters and opponents, there is always a positive number of bonus votes such that ex ante utility is increased by the scheme, relative to simple majority voting. When the distributions of valuations of supporters and opponents differ, the improvement in efficiency is guaranteed only if the distributions can be ranked according to first order stochastic dominance. If they are, however, the existence of welfare gains is independent of the exact number of bonus votes.
Diagnostics for multivariate imputation
Abayomi, Kobi, Gelman, Andrew, Levy, Marc
Submitted: 2005-08-16
Keywords: missing data, multiple imputation, regression diagnostics
Abstract: We consider three sorts of diagnostics for random imputations: (a) displays of the completed data, intended to reveal unusual patterns that might suggest problems with the imputations, (b) comparisons of the distributions of observed and imputed data values, and (c) checks of the fit of observed data to the model used to create the imputations. We formulate these methods in terms of sequential regression multivariate imputation [Van Buuren and Oudshoorn 2000, and Raghunathan, Van Hoewyk, and Solenberger 2001], an iterative procedure in which the missing values of each variable are randomly imputed conditional on all the other variables in the completed data matrix. We also consider a recalibration procedure for sequential regression imputations. We apply these methods to the 2002 Environmental Sustainability Index (ESI), a linear aggregation of 68 environmental variables on 142 countries, with 22% missing values.
Validation of software for Bayesian models using posterior quantiles
Cook, Samantha, Gelman, Andrew, Rubin, Donald
Submitted: 2005-08-16
Keywords: Bayesian inference, Markov chain Monte Carlo, simulation, computation, hierarchical models
Abstract: We present a simulation-based method designed to establish the computational correctness of software developed to fit a specific Bayesian model, capitalizing on properties of Bayesian posterior distributions. We illustrate the validation technique with two examples. The validation method is shown to find errors in software when they exist and, moreover, the validation output can be informative about the nature and location of such errors.
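The general idea behind this kind of validation can be sketched on a toy problem. If a parameter is drawn from the prior and data are then drawn from the model, the quantile of the true parameter within a correct posterior sample is uniformly distributed, so clearly non-uniform quantiles point to a software error. Everything in the sketch below is an assumption made for illustration: the conjugate normal model, the sampler names, the deliberately planted bug, and the crude mean-and-variance check of uniformity, which stands in for whatever test statistic the paper itself uses.

```python
import random
import statistics

def correct_sampler(y, n_draws, rng):
    """Posterior draws for theta ~ N(0, 1), y_i | theta ~ N(theta, 1)."""
    n = len(y)
    post_mean = sum(y) / (n + 1)
    post_sd = (1.0 / (n + 1)) ** 0.5
    return [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]

def buggy_sampler(y, n_draws, rng):
    """Planted bug: passes the posterior variance where the sd is needed."""
    n = len(y)
    post_mean = sum(y) / (n + 1)
    post_var = 1.0 / (n + 1)
    return [rng.gauss(post_mean, post_var) for _ in range(n_draws)]

def posterior_quantiles(sampler, n_reps=500, n_obs=5, n_draws=200, seed=1):
    """Quantile of the true theta within each simulated posterior sample."""
    rng = random.Random(seed)
    quantiles = []
    for _ in range(n_reps):
        theta_true = rng.gauss(0.0, 1.0)                        # draw from prior
        y = [rng.gauss(theta_true, 1.0) for _ in range(n_obs)]  # draw data
        draws = sampler(y, n_draws, rng)                        # software under test
        quantiles.append(sum(d < theta_true for d in draws) / n_draws)
    return quantiles

# A Uniform(0, 1) variable has mean 1/2 and variance 1/12 (about 0.083).
for name, sampler in [("correct", correct_sampler), ("buggy", buggy_sampler)]:
    q = posterior_quantiles(sampler)
    print(name, round(statistics.mean(q), 3), round(statistics.pvariance(q), 3))
```

In this toy setting the buggy sampler produces posteriors that are too narrow, so the true value lands in the tails too often and the quantiles pile up near 0 and 1, inflating their variance well above 1/12.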
Sampling people or people in places? The BES as an election study
Johnston, Ron, Harris, Rich, Jones, Kelvyn
Submitted: 2005-08-15
Keywords: British Election Study, representativeness, sampling
Abstract: UK general elections involve a number of separate, though complexly inter-linked, contests for support among the parties. Two of these are reflected in the main types of model of voting behaviour used by political scientists, whereas the third involves the separate contests that take place – in most cases among the main political parties – in the (now) 646 constituencies which send representatives to the House of Commons. Ideally, electoral surveys should take account of all three. In this note, we explore the extent to which that is the case with the 2005 British Election Study – with the coverage restricted to England and Wales only, for technical reasons – and explore the implications of our findings for future electoral studies.
Another geography of turnout? Respondents and non-respondents to the 2005 British Election Study
Johnston, Ron, Harris, Rich
Submitted: 2005-08-15
Keywords: British Election Study, response rates, ecological analyses
Abstract: An issue of growing concern in studies of voting patterns using survey data is the falling response rate achieved by face-to-face surveys. The 2005 British Election Study (BES) pre-campaign survey achieved interviews with 55.6 per cent of the 6450 individuals sampled – a drop of nearly 20 percentage points over the average for the surveys undertaken in the 1960s and some 15 points over those in the 1970s. Of the addresses selected, no contact could be made at 5.8 per cent, the individuals selected at a further 26.0 per cent refused to be interviewed, 4.4 per cent were otherwise unproductive and 8.0 per cent of the addresses were ‘out of scope (deadwood)’. To what extent does this failure to reach a very substantial minority of the addresses selected have any impact upon the results of the sample survey and the conclusions drawn therefrom?
Publication, Publication
King, Gary
Submitted: 2005-07-26
Keywords: replication, data sharing, class assignments
Abstract: I show herein how to write a publishable paper by beginning with the replication of a published article. This strategy seems to work well for class projects in producing papers that ultimately get published, helping to professionalize students into the discipline, and teaching them the scientific norms of the free exchange of academic information. I begin by briefly revisiting the prominent debate on replication our discipline had a decade ago and some of the progress made in data sharing since. (This paper is forthcoming in PS: Political Science and Politics. The current version is available at http://gking.harvard.edu.)
Understanding Interaction Models: Improving Empirical Analyses
Brambor, Thomas, Clark, William, Golder, Matt
Submitted: 2005-07-26
Keywords:
Abstract: Multiplicative interaction models are common in the quantitative political science literature. This is so for good reason. Institutional arguments frequently imply that the relationship between political inputs and outcomes varies depending on the institutional context. Models of strategic interaction typically produce conditional hypotheses as well. Although conditional hypotheses are ubiquitous in political science and multiplicative interaction models have been found to capture their intuition quite well, a survey of the top three political science journals from 1998 to 2002 suggests that the execution of these models is often flawed and inferential errors are common. We believe that considerable progress in our understanding of the political world can occur if scholars follow the simple check list of dos and don'ts for using multiplicative interaction models presented in this article. Only 10% of the articles in our survey followed the check list.
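As background for this abstract, a multiplicative interaction model with one conditioning variable can be written as follows. The symbols (outcome Y, input X, conditioning variable Z, coefficients beta) are generic notation assumed here for illustration; the abstract itself gives no formulas, and this note does not reproduce the article's check list.

```latex
\begin{align*}
Y &= \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ + \varepsilon \\[4pt]
\frac{\partial Y}{\partial X} &= \beta_1 + \beta_3 Z \\[4pt]
\widehat{\mathrm{se}}\!\left(\frac{\partial Y}{\partial X}\right)
  &= \sqrt{\widehat{\mathrm{var}}(\hat\beta_1)
     + Z^2\,\widehat{\mathrm{var}}(\hat\beta_3)
     + 2Z\,\widehat{\mathrm{cov}}(\hat\beta_1,\hat\beta_3)}
\end{align*}
```

The second line shows why a conditional hypothesis cannot be read off a single coefficient: the effect of X, and its standard error, both change with the value of Z.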
Genetic Matching for Estimating Causal Effects: A General Multivariate Matching Method for Achieving Balance in Observational Studies
Diamond, Alexis, Sekhon, Jasjeet
Submitted: 2005-07-19
Keywords: Matching, Propensity Score, Causal Inference, Genetic Algorithm, Evolutionary Programming, Optimization, Program Evaluation
Abstract: Genetic matching is a new method for performing multivariate matching which uses an evolutionary search algorithm to determine the weight each covariate is given. The method utilizes an evolutionary algorithm developed by Mebane and Sekhon (1998; Sekhon and Mebane 1998) that maximizes the balance of observed potential confounders across matched treated and control units. The method is nonparametric and does not depend on knowing or estimating the propensity score, but the method is greatly improved when a known or estimated propensity score is incorporated. Genetic matching reliably reduces both the bias and the mean square error of the estimated causal effect even when the property of equal percent bias reduction (EPBR) does not hold. When this property does not hold, matching methods---such as Mahalanobis distance and propensity score matching---often perform poorly. Even if the EPBR property does hold and the propensity score is correctly specified, in finite samples, estimates based on genetic matching have lower mean square error than those based on the usual matching methods. We present a reanalysis of the LaLonde (1986) job training dataset which demonstrates the benefits of genetic matching and which helps to resolve a longstanding debate between Dehejia and Wahba (1999, 2002); Dehejia (2005) and Smith and Todd (2001, 2005a,b) over the ability of matching to overcome LaLonde's critique of nonexperimental estimators. Monte Carlo experiments are also presented to demonstrate the properties of our method.
Identifying Intra-Party Voting Blocs in the UK House of Commons
Quinn, Kevin, Spirling, Arthur
Submitted: 2005-07-19
Keywords: roll-call analysis, UK House of Commons, Bayesian nonparametrics, Dirichlet process mixtures
Abstract: Legislative voting records are an important source of information about legislator preferences, intra-party cohesiveness, and the divisiveness of various policy issues. Standard methods of analyzing a legislative voting record tend to have serious drawbacks when applied to legislatures, such as the UK House of Commons, that feature highly disciplined parties, strategic voting, and large amounts of missing data. We present a method (based on a Dirichlet process mixture model) for analyzing such voting records that does not suffer from these same problems. We apply the method to the voting records of Labour and Conservative Party MPs in the 1997-2001 session of the UK House of Commons. Our method has a number of advantages over existing approaches. It is model-based and thus allows one to make probability statements about quantities of interest. It allows one to estimate the number of voting blocs within a party or any other group of MPs. It handles missing data in a principled fashion and does not rely on an ad hoc distance metric between voting profiles. Finally, it can be used as both a predictive model and an exploratory model. We illustrate these points in our analysis of the UK data.
Measuring District Level Preferences with Implications for the Analysis of U.S. Elections
Levendusky, Matthew, Pope, Jeremy, Jackman, Simon
Submitted: 2005-07-19
Keywords:
Abstract: Studies of American politics frequently rely on measures of a district's long-run partisan or ideological disposition. Numerous proxies have been proposed in the literature, but are hampered by significant practical drawbacks or have unknown measurement properties. We propose a statistical model that marries district level demographic data with electoral returns from 1952 to 2000, to produce a new measure of district level preferences. We also use our model to estimate incumbency advantage, the magnitude of national swings, home-state effects, inter alia, underscoring an important methodological point: good estimates of interesting structural parameters require good measures of unobserved yet structurally relevant phenomena, and vice-versa. Our model also provides a more reasonable way of dealing with uncontested seats. Over time, district-level presidential vote has become an increasingly reliable proxy for district preferences, and by the 1990s is an excellent proxy, which will be reassuring to the many researchers who have relied on presidential vote in this way.
Unemployment and Violence in Northern Ireland: a missing data model for ecological inference
Honaker, James
Submitted: 2005-07-19
Keywords: Multiple Imputation, Ecological Inference, Count Data, Political Violence
Abstract: Contrary to the body of literature in political violence, and the rhetoric of many parties to the conflict, time-series models of "the troubles" in Northern Ireland by White (1993) and Thompson (1989) have found no evidence that economic conditions affect the intensity, sources or direction of violence. I show that several methodological flaws exist in previous models. They fail to address the discrete, count nature of the data, the contagion present from aggregation over time, pooling issues from different types of violence, and the overdispersion of deaths. However, the key problem, acknowledged even by the authors themselves, is that all measures of unemployment aggregate Protestant and Catholic unemployment rates into one single measure. Using a model that combines methods of Multiple Imputation to recover missing data (King, Honaker, Joseph, and Scheve 2001) and the literature of models for Ecological Inference problems (especially King 1997), I estimate the disaggregated unemployment rates by religion from the available data. Unemployment is shown to be a leading cause of the violence by Republican factions in Northern Ireland.
A Method for Weighting Survey Samples of Low-Incidence Voters
Nagler, Jonathan, Alvarez, R. Michael
Submitted: 2005-07-19
Keywords:
Abstract: In this paper we describe a method for weighting surveys of a sub-sample of voters. We focus on the case of Latino voters and analyze data from three surveys: two opinion polls leading up to the 2004 presidential election, and the national exit poll from the 2004 election. We take advantage of the large amount of available data describing the demographics of Hispanic citizens, and we combine this with a model of turnout of those citizens to improve our estimate of the demographic characteristics of Hispanic voters. We show that alternate weighting schemes can substantively alter inferences about population parameters. [This is an incomplete version of the paper; it omits calculations of uncertainty, which are some of the fundamental quantities of interest of the paper.]
Bridging Institutions and Time: Creating Comparable Preference Estimates for Presidents, Senators, Representatives and Justices, 1950-2002
Bailey, Michael
Submitted: 2005-07-19
Keywords: ideal point estimation, Supreme Court, Congress
Abstract: Difficulty in comparing preferences across time and institutional contexts hinders the empirical testing of many important theories in political science. In this paper, I characterize these difficulties and provide a measurement approach that relies on inter-temporal and inter-institutional "bridge" observations and Bayesian Markov chain simulation methods. I generate preference estimates for Presidents, Senators, Representatives and Supreme Court Justices that are comparable across time and across institutions. Such preference estimates are indispensable in a variety of important research projects, including research on statutory interpretation, executive influence on the Supreme Court and Senate influence on court appointments.
Trade and Militarized Conflict: How Modeling Strategic Interactions Between States Makes a Difference
Rowan, Shawn E.
Submitted: 2005-07-19
Keywords: trade, conflict, interdependence, asymmetry, strategic
Abstract: The study of the interaction between war and foreign trade has occupied scholars from political science and economics for thousands of years. I contribute to the trade and conflict debate by accounting for the strategic interaction between states that most or all theories in international relations (IR) assume. I use a strategic statistical model (Signorino 1999, 2003b) that endogenizes the actions that lead states to militarized conflict and peace. The results of the strategic probit model reveal non-linear, asymmetric relationships between trade dependence and militarized conflict for each state in the dyad. Not only are these effects non-linear, but, in equilibrium, they also depend on the actions taken by the other state in the dyad. The trade dependence of one state on another can have either a pacifying or a positive effect on militarized conflict. Additionally, these effects are only realized for initial increases in trade dependence; once a threshold is reached, the effects of trade dependence are constant.
Heterogeneity in Supreme Court Decision-Making: How Case-Level Factors Alter Preference-Based Behavior
Bartels, Brandon
Submitted: 2005-07-19
Keywords: Supreme Court decision-making, multilevel modeling, heterogeneity
Abstract: Many theoretical perspectives of Supreme Court decision-making, most notably the attitudinal model, assume that justices’ policy preferences exhibit a uniform impact on their decisions across a wide variety of situations. I argue that there exists meaningful heterogeneity in the impact of policy preferences that can be explained theoretically and tested empirically. Adopting social psychological insights from theories of the attitude-behavior relationship, I develop a theoretical framework specifying the mechanisms (attitude strength and accountability) that explain variation in the preference-behavior relationship for justices. Case-level factors associated with each mechanism are hypothesized to moderate the impact of preferences. To test the hypotheses, I use a multilevel (hierarchical) modeling framework and conceive of Supreme Court voting data from the 1994-2002 terms as a two-level hierarchy: justices’ choices nested within cases. Estimates from a random coefficient model indicate that case-level variables associated with both attitude strength and accountability systematically explain variation in the preference-behavior relationship. Using an average partial effects post-estimation procedure, I present in-depth substantive interpretations of the results that highlight the compelling ways in which these case-level factors alter the nature of preference-based behavior. In addition to providing important substantive conclusions about Supreme Court decision-making, the paper also illustrates how a multilevel modeling framework is well-qualified to test heterogeneity-related hypotheses in social and behavioral processes.
A Markov Switching Model of Congressional Partisan Regimes
Jones, Bryan, Kim, Chang-Jin, Startz, Richard
Submitted: 2005-07-18
Keywords: Markov switching, electoral realignment, critical elections, partisan regimes, Congressional elections
Abstract: Studies of development and change in partisan fortunes in the US emphasize epochs of partisan stability, separated by critical events or turning points. Yet to date we have no estimates of legislative regimes as they relate to electoral realignments. In this paper we study partisan balances in the US Congress using the method of Markov switching. Our estimates for the House of Representatives are based on election changes from 1854, roughly the date of the establishment of the modern incarnation of the two-party system, to the present. For the Senate, we estimate partisan balance from 1914, the date of popular election of Senators. We use this method to estimate an underlying unobserved state parameter, ‘partisan regime’. Basically a partisan regime denotes a built-in congressional electoral advantage that persists through time, and that changes in a disjoint and episodic fashion. The method allows the direct estimation of critical transition points between Republican and Democratic partisan coalitions. Republican regimes characterized House elections during three periods: 1860 through 1872, 1894 through 1906, and 1918 through 1928. A three-state estimate for the House suggested the emergence of a third state in 1994. For the Senate, the two-state model does not fit adequately. We estimate a three-state model in which a Republican regime dominated from 1914 through 1928; a Democratic regime characterized the period 1930-1934, and a Democratic-leaning regime characterized the period 1938 to the present (1936 is a transition year). Combined with existing historical evidence, our analysis isolates four critical congressional elections: 1874; 1894; 1930; and 1994.
The Dangers of Extreme Counterfactuals
King, Gary, Zeng, Langche
Submitted: 2005-07-18
Keywords: propensity score, extrapolation, counterfactual, convex hull, distance, model dependence
Abstract: We address the problem that occurs when inferences about counterfactuals (predictions, "what if" questions, and causal effects) are attempted far from the available data. The danger of these extreme counterfactuals is that substantive conclusions drawn from statistical models that fit the data well turn out to be based largely on speculation hidden in convenient modeling assumptions that few would be willing to defend. Yet existing statistical strategies provide few reliable means of identifying extreme counterfactuals. We offer a proof that inferences farther from the data are more model-dependent, and then develop easy-to-apply methods to evaluate how model-dependent our answers would be to specified counterfactuals. These methods require neither sensitivity testing over specified classes of models nor evaluating any specific modeling assumptions. If an analysis fails the simple tests we offer, then we know that substantive results are sensitive to at least some modeling choices that are not based on empirical evidence. The most recent version of this paper and software that implements the methods described is available at http://gking.harvard.edu.
Higher-Dimension Markov Models
Epstein, David, O'Halloran, Sharyn
Submitted: 2005-07-18
Keywords: Markov models
Abstract: Markov transition models are becoming a popular tool for exploring the dynamics of systems that can take on a finite number of states. However, their application in political science has thus far been limited to the two-state case. This paper explains the techniques necessary to estimate and interpret higher-dimension Markov models. We then apply them to the study of democratic transitions, where we find that a three-state model including an intermediary "partial democracy" category out-performs the previous two-state model of Przeworski et al. (2000).
Attributing Effects to A Cluster Randomized Get-Out-The-Vote Campaign: An Application of Randomization Inference Using Full Matching
Bowers, Jake, Hansen, Ben
Submitted: 2005-07-18
Keywords: causal inference, randomization inference, attributable effects, full matching, instrumental variables, missing data, field experiments, clustering
Abstract: Statistical analysis requires a probability model: commonly, a model for the dependence of outcomes $Y$ on confounders $X$ and a potentially causal variable $Z$. When the goal of the analysis is to infer $Z$'s effects on $Y$, this requirement introduces an element of circularity: in order to decide how $Z$ affects $Y$, the analyst first determines, speculatively, the manner of $Y$'s dependence on $Z$ and other variables. This paper takes a statistical perspective that avoids such circles, permitting analysis of $Z$'s effects on $Y$ even as the statistician remains entirely agnostic about the conditional distribution of $Y$ given $X$ and $Z$, or perhaps even denies that such a distribution exists. Our assumptions instead pertain to the conditional distribution $Z \vert X$, and the role of speculation in settling them is reduced by the existence of random assignment of $Z$ in a field experiment as well as by poststratification, testing for overt bias before accepting a poststratification, and optimal full matching. Such beginnings pave the way for "randomization inference", an approach which, despite a long history in the analysis of designed experiments, is relatively new to political science and to other fields in which experimental data are rarely available. The approach applies to both experiments and observational studies. We illustrate this by applying it to analyze A. Gerber and D. Green's New Haven Vote 98 campaign. Conceived as both a get-out-the-vote campaign and a field experiment in political participation, the study assigned households to treatment and sought to estimate the effect of treatment on the individuals nested within the households. We estimate the number of voters who would not have voted had the campaign not prompted them to (that is, the total number of votes attributable to the interventions of the campaigners), while taking into account the non-independence of observations within households, non-random compliance, and missing responses. Both our statistical inferences about these attributable effects and the stratification and matching that precede them rely on quite recent developments from statistics; our matching, in particular, has novel features of potentially wide applicability. Our broad findings resemble those of the original analysis by Gerber and Green (2000).
Revisiting Dynamic Specification
De Boef, Suzanna, Keele, Luke
Submitted: 2005-07-18
Keywords: time series, dynamics, error correction, auto-distributed lag models
Abstract: Dramatic change in the world around us has stimulated a wealth of interest in research questions about the dynamics of political processes. At the same time we have seen increases in the number of time series data sets and the length of typical time series. Parallel advances have occurred in time series econometrics. These events have turned more political scientists into time series analysts and motivated more political methodologists to delve further into the annals of time series econometrics. But before taking the next advanced time series course, we recommend that time series analysts devote more time to issues of specification and interpretation. While advances in time series methods have helped us to change how we think about the process of political change in important ways, too often analysts have failed to recognize the wide range of general models available for stationary time series data, have estimated restricted models without testing the implied restrictions, and have done a poor job of drawing interpretations from their results. The consequences, at best, are poor connections between theory and tests and thus a limited cumulation of knowledge. More likely, the costs include biased results as well. We identify a number of general dynamic specifications, each a linear parameterization of the basic autoregressive distributed lag model and each highlighting different types of information. We then discuss the consequences of imposing restrictions on any of them. We recommend that analysts start with one or a combination of these general models and test for restrictions before adopting them. We illustrate this strategy with data on support for the Supreme Court and on presidential approval. Finally, we recommend that analysts make use of the wide array of information that can be gleaned from dynamic specifications. Such a practice will help us to better equate dynamic econometrics with dynamic theory.
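For orientation, the autoregressive distributed lag (ADL) model named in the abstract above is commonly written in the first-order form sketched below, together with its error-correction reparameterization. This is a standard textbook form; the symbols ($Y_t$, $X_t$, $\alpha_0$, $\alpha_1$, $\beta_0$, $\beta_1$, $\varepsilon_t$) are assumptions made for illustration and are not necessarily the notation used in the paper.

```latex
% ADL(1,1): one lag of the outcome, current and lagged values of the regressor
\[
  Y_t = \alpha_0 + \alpha_1 Y_{t-1} + \beta_0 X_t + \beta_1 X_{t-1} + \varepsilon_t
\]
% Subtract Y_{t-1} from both sides and write \beta_0 X_t = \beta_0 \Delta X_t + \beta_0 X_{t-1}
% to obtain the algebraically equivalent error-correction form:
\[
  \Delta Y_t = \alpha_0 + (\alpha_1 - 1)\, Y_{t-1} + \beta_0\, \Delta X_t
             + (\beta_0 + \beta_1)\, X_{t-1} + \varepsilon_t
\]
% Long-run multiplier of X on Y, assuming stationarity (|\alpha_1| < 1):
\[
  k = \frac{\beta_0 + \beta_1}{1 - \alpha_1}
\]
```

Because the two forms are linear reparameterizations of each other, they fit the data identically; restricted models, such as setting $\beta_1 = 0$, are special cases whose restrictions can be tested.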
Testing the Pooling Assumption with Cross-Sectional Time Series Data: A Proposal and an Assessment with Simulation Experiments
Stanig, Piero
Submitted: 2005-07-17
Keywords: Cross-Sectional Time Series Data, heterogeneity of coefficients
Abstract: I propose to use the loss of fit of the cross-validated predictions relative to the fit of the predictions from a pooled regression to test the assumption of constant betas across countries in a CSTS setting. The performance of this measure is a) evaluated in several simulation experiments that reproduce research situations common in comparative politics, and b) compared to the “cross-validated standard error of the regression” proposed by Franzese (2002). I show that the measure I propose depends much less on the stochastic component in the DGP, and is better able to detect the country-specificity of the betas. I calculate the critical values that can be used to test the pooling assumption in some typical comparative politics CSTS situations. Finally, to evaluate the behavior of the measure with an actual dataset, I replicate the results of Alvarez et al. (1991) as replicated in Beck et al. (1993), calculate the proposed measure, and show that the pooling assumption does not seem to be inappropriate for the model they estimate.
Death by Survey: Estimating Adult Mortality without Selection Bias
King, Gary, Gakidou, Emmanuela
Submitted: 2005-07-14
Keywords: surveys, selection bias, mortality data, extrapolation, international relations
Abstract: The widely used methods for estimating adult mortality rates from sample survey responses about the survival of siblings, parents, spouses, and others depend crucially on an assumption that we demonstrate does not hold in real data. We show that when this assumption is violated, so that the mortality rate varies with sibship size, mortality estimates can be massively biased. By using insights from work on the statistical analysis of selection bias, survey weighting, and extrapolation problems, we propose a new and relatively simple method of recovering the mortality rate with both greatly reduced potential for bias and increased clarity about the source of necessary assumptions.
The Authority of Supreme Court Precedent: A Network Analysis
Fowler, James, Jeon, Sangick
Submitted: 2005-07-07
Keywords:
Abstract: We construct the complete network of 30,288 majority opinions written by the U.S. Supreme Court and the cases they cite from 1754 to 2002. Data from this network demonstrate quantitatively the evolution of the norm of stare decisis in the 19th century and a significant deviation from this norm by the activist Warren Court. We further describe a method for creating authority scores using the network data to identify the most important Court precedents. This method yields rankings that conform closely to evaluations by legal experts, and even predicts which cases they will identify as important in the future. An analysis of these scores over time allows us to test several hypotheses about the rise and fall of precedent. We show that reversed cases tend to be much more important than other decisions, and the cases that overrule them quickly become and remain even more important as the reversed decisions decline. We also show that the Court is careful to ground overruling decisions in past precedent, and the care it exercises is increasing in the importance of the decision that is overruled. Finally, authority scores corroborate qualitative assessments of which issues and cases the Court prioritizes and how these change over time.
Designing and Analyzing Randomized Experiments
Horiuchi, Yusaku, Imai, Kosuke, Taniguchi, Naoko
Submitted: 2005-07-05
Keywords: Bayesian inference, causal inference, noncompliance, nonresponse, randomized block design
Abstract: In this paper, we demonstrate how to effectively design and analyze randomized experiments, which are becoming increasingly common in political science research. Randomized experiments provide researchers with an opportunity to obtain unbiased estimates of causal effects because the randomization of treatment guarantees that the treatment and control groups are on average equal in both observed and unobserved characteristics. Even in randomized experiments, however, complications can arise. In political science experiments, researchers often cannot force subjects to comply with treatment assignment or to provide the information necessary for the estimation of causal effects. Building on the recent statistical literature, we show how to make statistical adjustments for these noncompliance and nonresponse problems when analyzing randomized experiments. We also demonstrate how to design randomized experiments so that the potential impact of such complications is minimized.
Who is the Best Connected Legislator? A Study of Cosponsorship Networks
Fowler, James
Submitted: 2005-06-29
Keywords:
Abstract: Using large-scale network analysis, I map the cosponsorship networks of all 280,000 pieces of legislation proposed in the U.S. House and Senate from 1973 to 2004. In these networks a directional link can be drawn from each cosponsor of a piece of legislation to its sponsor. I use a number of statistics to describe these networks, such as the quantity of legislation sponsored and cosponsored by each legislator, the number of legislators cosponsoring each piece of legislation, the total number of legislators who have cosponsored bills written by a given legislator, and network measures of closeness, betweenness, and eigenvector centrality. I then introduce a new measure I call ‘connectedness’, which uses information about the frequency of cosponsorship and the number of cosponsors on each bill to make inferences about the social distance between legislators. Connectedness predicts which members will pass more amendments on the floor, a measure which is commonly used as a proxy for legislative influence. It also predicts roll call vote choice even after controlling for ideology and partisanship.
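The network construction described in the abstract above (a directed link from each cosponsor to the bill's sponsor, plus simple per-legislator counts) can be sketched as follows. This is a minimal illustration, not the author's code: the toy `bills` data, the legislator labels, and the function names `build_edges` and `describe` are all hypothetical, and the paper's 'connectedness' measure is not implemented here.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper): bill id -> (sponsor, list of cosponsors).
bills = {
    "bill_1": ("A", ["B", "C"]),
    "bill_2": ("A", ["B"]),
    "bill_3": ("B", ["C"]),
}


def build_edges(bills):
    """Return directed (cosponsor, sponsor) links, one per act of cosponsorship."""
    return [(co, sponsor) for sponsor, cos in bills.values() for co in cos]


def describe(bills):
    """Compute the simple descriptive counts listed in the abstract."""
    sponsored = defaultdict(int)    # number of bills each legislator sponsored
    cosponsored = defaultdict(int)  # number of bills each legislator cosponsored
    supporters = defaultdict(set)   # distinct legislators who cosponsored a sponsor's bills
    for sponsor, cos in bills.values():
        sponsored[sponsor] += 1
        for co in cos:
            cosponsored[co] += 1
            supporters[sponsor].add(co)
    # Number of cosponsors on each bill.
    per_bill = {bill: len(cos) for bill, (_, cos) in bills.items()}
    n_supporters = {leg: len(s) for leg, s in supporters.items()}
    return dict(sponsored), dict(cosponsored), n_supporters, per_bill


edges = build_edges(bills)
sponsored, cosponsored, n_supporters, per_bill = describe(bills)
```

The resulting edge list could then be passed to a graph library to compute closeness, betweenness, or eigenvector centrality; that step is left out so the sketch stays dependency-free.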
Dynamic Models for Dynamic Theories: The Ins and Outs of Lagged Dependent Variables
Keele, Luke, Kelly, Nathan
Submitted: 2005-06-28
Keywords: time series, lagged dependent variables, OLS
Abstract: A lagged dependent variable in an OLS regression is often used as a means of capturing dynamic effects in political processes and as a method for ridding the model of autocorrelation. But recent work contends that the lagged dependent variable specification is too problematic for use in most situations. More specifically, if residual autocorrelation is present, the lagged dependent variable causes the coefficients for explanatory variables to be biased downward. We use a Monte Carlo analysis to assess empirically how much bias is present when a lagged dependent variable is used under a wide variety of circumstances. In our analysis, we compare the performance of the lagged dependent variable model to several other time series models. We show that while the lagged dependent variable is inappropriate in some circumstances, it remains an appropriate model for the dynamic theories often tested by applied analysts. From the analysis, we develop several practical suggestions on when and how to use lagged dependent variables on the right hand side of a model.
Efficiency, Equity, and Timing in Voting Mechanisms
Battaglini, Marco, Palfrey, Thomas, Morton, Rebecca
Submitted: 2005-06-19
Keywords: sequential voting, simultaneous voting, costly voting, turnout
Abstract: In many voting situations some participants know the choices of earlier voters. We show that in such cases, when voting is costly, later voters' decisions are dependent on both the choices of previous voters and the cost of voting and are significantly different from the choices when voting is simultaneous. Using experiments we find support for our predictions. We also find that increasing the cost of voting decreases both informational and economic efficiency and subsidizing voting can increase efficiency. We find a tradeoff between efficiency and equity in sequential voting: Although sequential voting is generally more advantageous for all voters than simultaneous voting, there are significant additional advantages to later voters in sequential voting even when early voters are theoretically predicted to benefit.
Problems with and Solutions for Two-dimensional Models of Continuous Dependent Variables
Goodrich, Ben
Submitted: 2005-05-24
Keywords: TSCS, fixed effects, random effects, between estimator, pooled OLS
Abstract: This paper addresses hierarchical models with continuous dependent variables, such as time-series-cross-section models. Building on the argument in Zorn (2001), the main point of this paper is that the pooled OLS estimator is deeply flawed – especially for time-series-cross-section data – but for reasons that have not explicitly been raised in previous papers. The pooled OLS estimator, the within-estimator, the between-estimator, and the random effects estimator can be seen as special cases of the fractionally pooled estimator presented in Bartels (1996), which allows all of these estimators to be evaluated in a common framework. Taking bias and efficiency into account, using both the within-estimator and the between-estimator is likely to be the best estimation strategy for the vast majority of applications in political science.
Unions and Class Bias in the U.S. Electorate, 1964-2000
Leighley, Jan, Nagler, Jonathan
Submitted: 2005-05-20
Keywords: turnout, voting, elections, unions
Abstract: This paper examines the impact of unions on turnout and assesses the consequences of the dramatic decline in union strength since 1964 for the composition of the U.S. electorate. Our analysis relies on individual-level data from 1964 through 2000. We first estimate individual-level models to test for the distinct effects of union membership and union strength on individuals' probabilities of voting and then test whether the effect of individual union membership and overall union strength varies across income levels. We find that unions increase turnout by increasing turnout of union members as well as turnout of non-members. We also find that the effects of union mobilization are approximately equal for the bottom two-thirds of the income distribution, but are significantly less for the top third of the income distribution. By simulating what turnout would be were union membership at its 1964 level, we show that the decline in union membership since 1964 has led to a substantial increase in class bias in the electorate.
Treatment Spillover Effects Across Survey Experiments
Lee, Daniel, Transue, John, Aldrich, John
Submitted: 2005-04-05
Keywords: survey experiments, experiments, survey methods
Abstract: Embedding experiments within surveys has reinvigorated survey research in general and especially in political science. These designs use random assignment to create true experiments within (typically nationally) representative sample surveys. Thus, they combine the internal validity of experiments with the external validity of national surveys. We investigate whether experimental treatments spill over and affect later experiments in an unintended manner. Using the 1991 Race and Politics survey, we find evidence of experimental spillover. Specifically, we find that experiments at the beginning of a survey influence later experiments. We also find (much less) evidence of adjacent experiments affecting subsequent experiments. The paper concludes with a discussion of designs for future research that could aid our understanding of experimental spillover.
Strange Bedfellows or the Usual Suspects? Spatial Models of Ideology and Interest Group Coalitions
Almeida, Richard
Submitted: 2005-04-01
Keywords: Interest groups, coalitions, spatial theory, Poisson regression, ideology
Abstract: Entering into coalitions has become a standard tactic for interest groups trying to maximize success while minimizing cost. The strategic conditions underlying decisions to form or join coalitions are beginning to be explored in the political science literature, yet very little is known about the process and criteria through which interest groups select coalition partners. In this paper, I explore the partner selection process by applying spatial theories of ideology and coalition formation to interest group participation on amicus curiae briefs. Previous work demonstrates that the lobbying efforts of groups can be used to generate a general measure of ideology for any group. These captured ideology scores are used in statistical models of interest group coalition partner selection on amicus curiae briefs from 1954-1985. This research demonstrates that the ideology scores captured for each group are powerful predictors of interest group coalition partner selection, even when controls for resources, group type, and other potential predictors are included.
Methodology as ideology: mathematical modeling of trench warfare
Gelman, Andrew
Submitted: 2005-01-26
Keywords: cooperation, First World War, game theory, prisoner’s dilemma
Abstract: The Evolution of Cooperation, by Axelrod (1984), is a highly influential study that identifies the benefits of cooperative strategies in the iterated prisoner’s dilemma. We argue that the most extensive historical analysis in the book, a study of cooperative behavior in First World War trenches, is in error. Contrary to Axelrod’s claims, the soldiers on the Western Front were not generally in a prisoner’s dilemma (iterated or otherwise), and their cooperative behavior can be explained much more parsimoniously as immediately reducing their risks. We discuss the political implications of this misapplication of game theory.
Multilevel (hierarchical) modeling: what it can and can't do
Gelman, Andrew
Submitted: 2005-01-26
Keywords: Bayesian inference, hierarchical model, multilevel regression
Abstract: Multilevel (hierarchical) modeling is a generalization of linear and generalized linear modeling in which regression coefficients are themselves given a model, whose parameters are also estimated from data. We illustrate the strengths and limitations of multilevel modeling through an example of the prediction of home radon levels in U.S. counties. The multilevel model is highly effective for predictions at both levels of the model but could easily be misinterpreted for causal inference.
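To make "regression coefficients are themselves given a model" concrete, a simple varying-intercept model can be written as below. This is a generic sketch under assumed notation, not necessarily the specification used in the paper: $y_i$ is the radon measurement in house $i$, $x_i$ a house-level predictor, $j[i]$ the county containing house $i$, and $u_j$ a county-level predictor.

```latex
% Level 1 (houses): each county j has its own intercept \alpha_j
\[
  y_i \sim \mathrm{N}\!\left(\alpha_{j[i]} + \beta x_i,\; \sigma_y^2\right),
  \qquad i = 1, \dots, n
\]
% Level 2 (counties): the intercepts are themselves modeled,
% with hyperparameters \gamma_0, \gamma_1, \sigma_\alpha estimated from the data
\[
  \alpha_j \sim \mathrm{N}\!\left(\gamma_0 + \gamma_1 u_j,\; \sigma_\alpha^2\right),
  \qquad j = 1, \dots, J
\]
```

As $\sigma_\alpha \to 0$ the county intercepts are pulled to the county-level regression line (complete pooling), and as $\sigma_\alpha \to \infty$ each county is fit separately (no pooling); the multilevel estimate falls between the two.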
