Results below are based on the search criterion 'experiment'
Total number of records returned: 17
Sensitive Questions, Truthful Answers? Modeling the List Experiment Multivariately With LISTIT
Standard estimation procedures assume that empirical observations are accurate reflections of the true values of the dependent variable, but this assumption is dubious when modeling self-reported data on sensitive topics. List experiments can nullify incentives for respondents to lie to interviewers, but current data analysis techniques are limited to difference-in-means tests. I present a revised procedure and statistical estimator called listit to model the process multivariately. Monte Carlo simulations and a field test in Lebanon explore the behavior of this estimator.
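For context, the difference-in-means baseline that the abstract contrasts with listit can be sketched as follows. This is a minimal illustration, not the listit estimator; the function name and the toy response counts are hypothetical.

```python
from statistics import mean


def list_experiment_diff_in_means(control_counts, treatment_counts):
    """Estimate the share of respondents holding the sensitive item.

    Control respondents report how many of J non-sensitive items apply;
    treated respondents see the same J items plus the sensitive one.
    The difference in mean counts estimates the sensitive item's prevalence.
    """
    return mean(treatment_counts) - mean(control_counts)


# Hypothetical item counts, invented for illustration only.
control = [1, 2, 2, 3, 1, 2]
treatment = [2, 2, 3, 3, 2, 3]
estimate = list_experiment_diff_in_means(control, treatment)
print(f"Estimated prevalence of the sensitive item: {estimate:.3f}")
```

The estimate is a single population-level number, which is why this baseline cannot by itself relate the sensitive response to respondent covariates.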
Estimating Treatment Effects in the Presence of Noncompliance and Nonresponse: The Generalized Endogenous Treatment Model
Average Treatment Effects
Selection on Unobservables
Latent Variable Models
If ignored, non-compliance with a treatment and nonresponse on outcome measures can bias estimates of treatment effects in a randomized experiment. To identify treatment effects in the case where compliance and response are conditioned on unobservables, we propose the parametric generalized endogenous treatment (GET) model. As a multilevel random effect model, GET improves on current approaches to principal stratification by incorporating behavioral responses within an experiment to measure each subject's latent compliance type. We use Monte Carlo methods to show GET has a lower MSE for treatment effect estimates than existing approaches to principal stratification that impute, rather than measure, compliance type for subjects assigned to the control. In an application, we use data from a recent field experiment to assess whether exposure to a deliberative session with their member of Congress changes constituents' levels of internal and external efficacy. Since it conditions on subjects' latent compliance type, GET is able to test whether exposure to the treatment is ignorable after balancing on covariates via matching methods. We show that internally efficacious subjects disproportionately select into the deliberative sessions, and that matching apparently does not break the latent dependence between treatment compliance and outcome. The results suggest that exposure to the deliberative sessions improves external, but not internal, efficacy.
Survey Context Effects in Anchoring Vignettes
differential item functioning
Methodologists (King et al. 2004; King and Wand 2007) have recently proposed a novel approach to adjusting for bias in interpersonal and cross-cultural comparisons in survey research. The method centers on the use of anchoring vignettes to allow the statistical correction of differential usage of ordinal response scales at the individual or group level. Using data from a randomized survey experiment, I investigate whether analyses based on these vignettes may be vulnerable to the introduction of survey artifacts due to vignette ordering or the placement of the self-assessment item relative to the vignettes. I find several patterns of bias due to context effects. Researchers using anchoring vignettes should consider randomization or other methods to mitigate these problems.
Can October Surprise? A Natural Experiment Assessing Late Campaign Effects
Vote by mail
One consequence of the proliferation of vote-by-mail (VBM) in certain areas of the United States is the opportunity for voters to cast ballots weeks before Election Day. Understanding the ensuing effects of VBM on late campaign information loss has important implications for both the study of campaign dynamics and public policy debates on the expansion of convenience voting. Unfortunately, the self-selection of voters into VBM makes it difficult to causally identify the effect of VBM on election outcomes. We overcome this identification problem by exploiting a natural experiment, in which some precincts are assigned to be VBM-only based on an arbitrary threshold of the number of registered voters. We assess the effects of VBM on candidate performance in the 2008 California presidential primary via a regression discontinuity design. We show that VBM both increases the probability of selecting candidates who withdrew from the race in the interval after the distribution of ballots but before Election Day and affects the relative performance of candidates remaining in the race. Thus, we find evidence of late campaign information loss, pointing to the influence of campaign events and momentum in American politics, as well as the unintended consequences of convenience voting.
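The general logic of a sharp regression discontinuity design can be sketched as follows. This is a generic illustration, not the authors' specification: the cutoff, bandwidth, and data are hypothetical, and the sketch only measures the jump in the outcome at the threshold without assuming which side is treated.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_jump(running, outcome, cutoff, bandwidth):
    """Local linear estimate of the jump in the outcome at the cutoff.

    Fits one line on each side of the cutoff, using only observations
    within `bandwidth` of it, and returns the right-hand intercept minus
    the left-hand intercept (both evaluated at the cutoff).
    """
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Hypothetical noise-free data: a smooth trend in the running variable
# (registered voters per precinct) plus a jump of 0.5 at the cutoff.
CUTOFF = 250  # assumed value, for illustration only
voters = list(range(200, 300))
vote_share = [0.01 * v + (0.5 if v >= CUTOFF else 0.0) for v in voters]
jump = rd_jump(voters, vote_share, cutoff=CUTOFF, bandwidth=25)
print(f"Estimated discontinuity at the cutoff: {jump:.3f}")
```

Restricting the fit to a narrow window around the threshold is what lets the comparison rest on the arbitrariness of the cutoff rather than on voters' own choice of voting method.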
Competing Solutions to the Principal-Agent Model
quantal response equilibrium
strategic statistical model
Principal-Agent (PA) theory has been used for over three decades to model the relationship between an information-advantaged Agent and a Principal able to issue a contract ultimatum. For its common implementation as a game, the subgame-perfect Nash equilibrium is reasonably simple but generally wrong in predicting experimental or observational data. This paper implements PA theory theoretically and statistically as two kinds of strategic statistical model, then develops methods for testing competing behavioral hypotheses. I show that subgame-perfect Nash equilibrium, risk aversion/affinity, distributive justice/fairness theories, agent error, and random utility can be observationally distinct and how they might be distinguished statistically.
Measuring Political Support and Issue Ownership Using Endorsement Experiments, with Application to the Militant Groups in Pakistan
To measure the levels of support for political actors (e.g., candidates and parties) and the strength of their issue ownership, survey experiments are often conducted in which respondents are asked to express their opinion about a particular policy endorsed by a randomly selected political actor. These responses are contrasted with those from a control group that receives no endorsement. This survey methodology is particularly useful for studying sensitive political attitudes. We develop a Bayesian hierarchical measurement model for such endorsement experiments, demonstrate its statistical properties through simulations, and use it to measure support for Islamist militant groups in Pakistan. Our model uses item response theory to estimate support levels on the same scale as the ideal points of respondents. The model also estimates the strength of political actors' issue ownership for specific policies as well as the relationship between respondents' characteristics and support levels. Our analysis of a recent survey experiment in Pakistan reveals three key patterns. First, citizens' attitudes towards militant groups are geographically clustered. Second, once these regional differences are taken into account, respondents' characteristics have little predictive power for their support levels. Finally, militant groups tend to receive less support in the areas where they operate.
What Can We Learn with Statistical Truth Serum? Design and Analysis of the List Experiment
item count technique
Due to the inherent sensitivity of many survey questions, a number of researchers have adopted an indirect questioning technique known as the list experiment (or the item count technique) in order to minimize bias due to dishonest or evasive responses. However, standard practice with the list experiment requires a large sample size, is not readily adaptable to regression or multivariate modeling, and provides only limited diagnostics. This paper addresses all three of these issues. First, the paper presents design principles for the standard list experiment (and the double list experiment) to minimize bias and reduce variance as well as providing sample size formulas for the planning of studies. Additionally, this paper investigates the properties of a number of estimators and introduces an easy-to-use piecewise estimator that reduces necessary sample sizes in many cases. Second, this paper proves that standard-procedure list experiment data can be used to estimate the probability that an individual holds the socially undesirable opinion/behavior. This allows multivariate modeling. Third, this paper demonstrates that some violations of the behavioral assumptions implicit in the technique can be diagnosed with the list experiment data. The techniques in this paper are illustrated with examples from American politics.
The Perils of Failed Randomization: Investigating Regression Adjustment of Regionally Confounded Cross-National Data
Many important papers studying cross-national outcomes such as political regime type or economic development exploit treatment variables generated by either geological or pre-modern historical processes. A general and major problem with these treatments, however, derives from their heavy regional concentration. Despite not being caused by other variables that independently affect the dependent variable, due to geological or historical accidents, variables such as oil or settler mortality claimed to be exogenous are nonetheless highly correlated with potential confounders that impede drawing causal inferences. With the goal of eliminating bias by controlling for observables, many papers studying variables such as these use parametric procedures to control for regional dummies. While estimation techniques such as ordinary least squares (OLS) provide a seemingly straightforward methodological fix, OLS also obscures particular shortcomings of the data, and imposes strong assumptions to combine information across regions. The current paper takes a closer look at these assumptions and provides examples from top political science and economic journals to show how disaggregating the data can either help to support or to severely qualify existing results.
Does Encouragement Matter in Improving Gender Imbalances in Technical Fields? Evidence from a Randomized Controlled Trial
Education policy research looking at gender imbalances in technical fields often relies on observational data or small-N experimental studies. Taking a different approach, we present the results of one of the first and largest randomized controlled trials on the topic. Using the 2014 Political Methodology Annual Meeting as our context, half of a pool of 3,945 political science graduate students were randomly assigned to receive two personalized emails encouraging them to apply to the conference (n = 1,976), while the other half received nothing (n = 1,969). We find a robust, positive effect associated with this simple intervention and suggestive evidence that women responded more strongly than men. However, we find that women’s conference acceptance rates are higher within the control group than in the treated group. This is not the case for men. The reason appears to be that female applicants in the treated group solicited supporting letters at lower rates. The contributions from this research are twofold. First, our findings are among the first large-scale randomized controlled interventions in higher education. Second, and less optimistically, our findings suggest that such “low dose” interventions may promote diversity in STEM fields, but that they have the potential to expose underlying disparities when used alone or in a non-targeted way.
Shaken, Not Stirred: Evidence on Ballot Order Effects from the California Alphabet Lottery, 1978 - 2002
Ho, Daniel E.
We analyze a natural experiment to answer the longstanding question of whether the name order of candidates on ballots affects election outcomes. Since 1975, California law has mandated randomizing the ballot order with a lottery, where alphabet letters would be shaken vigorously and selected from a container. Previous studies, relying overwhelmingly on non-randomized data, have yielded conflicting results about whether ballot order effects even exist. Using improved statistical methods, our analysis of statewide elections from 1978 to 2002 reveals that in general elections ballot order has a significant impact only on minor party candidates and candidates for nonpartisan offices. In primaries, however, being listed first benefits everyone. In fact, ballot order might have changed the winner in roughly nine percent of all primary races examined. These results are largely consistent with a theory of partisan cuing. We propose that all electoral jurisdictions randomize ballot order to minimize ballot effects.
An Experimental Test of Proximity and Directional Voting
Lewis and King (2000) discuss difficulties in evaluating the proximity hypothesis about issue voting versus the directional hypothesis. In this paper, we propose a solution to this problem: asking individuals to evaluate candidates that are experimentally generated to represent certain issue positions. Such an approach controls candidates' positions while holding other features constant, presents these fictitious candidates to randomly assigned subjects, and examines whether the relationship between subjects' evaluations of these candidates and their ideological beliefs is consistent with proximity or directional theory. Our results provide slightly more support for proximity theory, but our data are not entirely conclusive on this point.
The Robustness of Normal-theory LISREL Models: Tests Using a New Optimizer, the Bootstrap, and Sampling Experiments, with Applications
Mebane, Walter R.
Wells, Martin T.
linear structural relations
Asymptotic results from theoretical statistics show that the linear structural relations (LISREL) covariance structure model is robust to many kinds of departures from multivariate normality in the observed data. But close examination of the statistical theory suggests that the kinds of hypotheses about alternative models that are most often of interest in political science research are not covered by the nice robustness results. The typical size of political science data samples also raises questions about the applicability of the asymptotic normal theory. We present results from a Monte Carlo sampling experiment and from analysis of two real data sets both to illustrate the robustness results and to demonstrate how it is unwise to rely on them in substantive political science research. We propose new methods using the bootstrap to assess more accurately the distributions of parameter estimates and test statistics for the LISREL model. To implement the bootstrap we use optimization software two of us have developed, incorporating the quasi-Newton BFGS method in an evolutionary programming algorithm. We describe methods for drawing inferences about LISREL models that are much more reliable than the asymptotic normal-theory techniques. The methods we propose are implemented using the new software we have developed. Our bootstrap and optimization methods allow model assessment and model selection to use well understood statistical principles such as classical hypothesis testing.
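The core resampling idea behind the bootstrap can be sketched as follows. This is a generic percentile bootstrap applied to a simple statistic, not the authors' LISREL procedure or their optimization software; the function name, the default settings, and the data are hypothetical.

```python
import random
from statistics import mean


def bootstrap_percentile_ci(data, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `statistic`.

    Resamples the data with replacement, recomputes the statistic on each
    resample, and reads the interval off the empirical quantiles of those
    estimates instead of relying on an asymptotic normal approximation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical small sample, invented for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 6.1, 3.9, 4.4]
ci_low, ci_high = bootstrap_percentile_ci(sample, mean)
print(f"95% percentile bootstrap interval for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

For a structural model, `statistic` would be replaced by a function that refits the model on each resample and returns the parameter or test statistic of interest, which is where a fast, reliable optimizer becomes important.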
Do you feel Angry? Are you sure? Testing the Reliability of Overt Emotional Cues and the Effects of Semantic Self Reports in Experimental Research
In the experimental study of political affect two assumptions are implicit. First, scholars assume that the affective state intended by the treatment is actually invoked. Second, scholars assume that semantic prompts such as, “Has (Barack Obama/John McCain) -- because of the kind of person he is, or because of something he has done -- ever made you feel: (insert word from feeling scale),” provide an accurate reliability check on the former assumption. However, work in psychology demonstrates that the use of semantic self reports is unreliable because participants do poorly at accurately reporting experienced emotion (Breckler 1984; Schachter and Singer 1962; Weber et al. 2007). If the presence of the prompt introduces error into the model or participants do not reliably recall their affective state then the use of semantic affective prompts is problematic. I ask: Q1: Is the semantic affective prompt an effective check on the reliability of an emotional cue? Additionally, I examine the use of overt anger cues versus subliminal anger cues in eliciting anger. Though most scholars use semantic self-reports as a direct test that the emotion of interest was elicited, others use subliminal primes to elicit emotional states outside of awareness (Bargh 1997). I ask: Q2: Is there a significant difference between models that invoke emotion overtly versus subliminally? I utilize a unique research design to tease out the effects of interest. To do so I set up treatment conditions which vary in the way the affective state is invoked (overtly/subliminally) and in the presence or absence of a semantic affective prompt. I find that challenges to the use of the semantic affective prompt are warranted: there is a mean difference in the responses of participants assigned to the semantic affective prompt condition and participants assigned to the no affective prompt condition.
Not in my Backyard: Estimating the Impact of Immigrant Race and Proximity on Immigration Policy Opinion
Anastasopoulos, L. Jason
Opinion polls taken over the past two years suggest that a majority of Americans prefer granting illegal immigrants amnesty. At the same time, however, restrictive state and local immigration laws which aim to identify, arrest and ultimately deport illegal immigrants also receive majority support. In this paper, I develop a survey experiment which employs Internet Protocol (IP) address geo-location technology in an attempt to understand this public opinion divide and to reassess the relevance of racial and cultural threat theories as explanations for immigration policy preferences. To accomplish this, skin tone and perceived proximity of a fictional illegal Mexican immigrant are manipulated using an embedded image and the location of the respondent, respectively. Within the stimulus, the respondent’s city and state are first determined using their IP address and are subsequently displayed to them. Three major findings emerge: (1) close immigrant proximity causes attitude polarization on immigration policy; (2) close immigrant proximity increases pro-immigration sentiment but only when the immigrant has a light complexion; (3) immigrant race affects opinions on immigration policy in opposite directions depending upon where the immigrant is located: when no immigrant location information is provided, white respondents express greater pro-immigrant sentiment towards the darker immigrant. When respondents believe that the immigrant resides in their city and state, however, support for restrictive state immigration laws increases when the darker immigrant is shown.
The Big Sort(s): Diversity, White Flight and Polarization in Neighborhoods and Cities.
Anastasopoulos, L. Jason
agent based model
Recent scholarship suggests that ideological polarization in Congress can ultimately be explained by changes in residential choice decisions. According to Bishop (2008), Americans have become more likely to live near others ideologically similar to themselves. These trends, in turn, have contributed to a widening partisan and ideological gap between urban and suburban areas across American cities (Bishop 2008; Hui 2011; Nall 2012). In this paper, I develop the Migration-Flight-Polarization (MFP) theory to explain how changes in diversity brought about by internal migration and immigration hold the key to understanding the connection between residential choice decisions and geographic polarization along partisan and ideological lines. Using an original agent-based modeling simulation and data collected from Houston, Texas during Hurricane Katrina migration, I demonstrate that changes in diversity and "white flight" responses to these changes are responsible for the growing partisan divide in Houston neighborhoods and the City of Houston as a whole.
Direct Sports Competition and Interstate Conflict: Investigating a Randomized Natural Experiment
Past studies have found qualitative and quantitative evidence suggesting that international sports can increase tensions between countries and lead to military conflict. We provide a new test of this theory by taking advantage of the random assignment of countries to compete against each other in the group stage of the World Cup from 1930-2010. We first briefly explain how countries were randomly assigned to World Cup groups over this period. We then use randomization inference to show that the number of military disputes between pairs of countries that were assigned to the same group was much larger than would be expected if World Cup competition had no impact on international conflict. The results hold under many important robustness checks, including a placebo test on the previous outcome. To further illustrate this effect, we use Twitter data to show that competition on the playing field increased expressions of nationalism during the 2014 World Cup.
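The logic of randomization inference can be sketched as follows. This is a simplified illustration that assumes complete randomization of a binary treatment across units; the actual World Cup draw procedure, which the paper's analysis would need to mirror, is not reproduced here. The function name and the data are hypothetical.

```python
import random
from statistics import mean


def randomization_p_value(outcomes, treated, n_draws=5000, seed=0):
    """One-sided randomization-inference p-value for a difference in means.

    Under the sharp null of no effect for any unit, outcomes are fixed and
    only the assignment varies. Re-drawing the assignment many times shows
    how often a difference at least as large as the observed one would
    arise by chance alone.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def diff_in_means(assignment):
        treated_y = [y for y, a in zip(outcomes, assignment) if a]
        control_y = [y for y, a in zip(outcomes, assignment) if not a]
        return mean(treated_y) - mean(control_y)

    observed = diff_in_means(treated)
    assignment = list(treated)
    at_least_as_large = 0
    for _ in range(n_draws):
        rng.shuffle(assignment)  # keeps the number of treated units fixed
        if diff_in_means(assignment) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_draws


# Hypothetical dispute counts for eight country pairs, invented for
# illustration; 1 marks pairs assigned to the same group.
disputes = [3, 4, 5, 4, 0, 1, 0, 1]
same_group = [1, 1, 1, 1, 0, 0, 0, 0]
observed_diff, p_value = randomization_p_value(disputes, same_group)
print(f"Observed difference: {observed_diff:.2f}, p-value: {p_value:.3f}")
```

Because the reference distribution comes from the assignment mechanism itself, the test does not depend on distributional assumptions about the outcome, which is useful for sparse count data such as military disputes.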
Tweetment Effects on the Tweeted: An Experiment to Reduce Twitter Harassment
I conduct an experiment which examines the impact of group norm promotion and social sanctioning on harassment and other manifestations of prejudice. I collect a sample of Twitter users who have harassed other users and use accounts I control to sanction the harassing behavior. By varying the identity of the bots, I test theories about social norm promotion and the effects of network diversity. Novel contributions to the literature on prejudice reduction include real-world measures of prejudice and the ability to continuously measure treatment effects over an indefinite time horizon.