
1 
Paper

Generalized Substantively Reweighted Least Squares Regression
Gill, Jeff

Uploaded 
01291997

Keywords 
Linear Models Robust Procedures Data Analysis Outlier Identification

Abstract 
Linear modeling often employs robust and resistant
techniques to compensate for undesirable properties in
the data. Conversely, Substantive Weighted Least
Squares differs from these techniques since it
seeks to analyze what makes the outliers
distinguishable in their use of resources. SWLS does
not see outliers as becoming potentially unbounded or
even that they are necessarily undesirable elements
of the data. SWLS runs consecutive weighted OLS models
downweighting each case whose jackknifed residual is
less than a specific threshold. Final iteration
significant variables are identified as those which
have a greater effect on higher performing cases and
therefore provide prescriptive recommendations.
GSRLS generalizes the SWLS technique by using
transformations relating the jackknifed residuals to
a common tabular distribution. This allows alpha-level
positive outlier identification. Here, GSRLS is first
placed in a theoretical context and further explored
through Monte Carlo simulation. In general, GSRLS
can be seen as a data-analytic tool that exploits
certain characteristics of the linear model to find
variable influence on successful cases. 
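
The reweighting loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the author's procedure: the single-predictor model, the threshold of 0.7, the weight step of 0.1, the choice to compute jackknifed residuals once from the unweighted fit, and all function names are hypothetical.

```python
import math

def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xb = sum(wi * xi for wi, xi in zip(w, x)) / sw
    yb = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xb) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xb) * (yi - yb) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return yb - b * xb, b

def jackknifed_residuals(x, y):
    """Externally studentized residuals from an unweighted OLS line."""
    n = len(x)
    a, b = wls_line(x, y, [1.0] * n)
    xb = sum(x) / n
    sxx = sum((xi - xb) ** 2 for xi in x)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    lever = [1.0 / n + (xi - xb) ** 2 / sxx for xi in x]
    sse = sum(e ** 2 for e in resid)
    out = []
    for e, h in zip(resid, lever):
        # Residual variance estimated with this case deleted (two coefficients).
        s2_deleted = (sse - e ** 2 / (1.0 - h)) / (n - 3)
        out.append(e / math.sqrt(s2_deleted * (1.0 - h)))
    return out

def swls_slopes(x, y, threshold=0.7, steps=9):
    """Slopes from successive fits that downweight cases below the threshold."""
    high = [t > threshold for t in jackknifed_residuals(x, y)]
    slopes = []
    for k in range(steps + 1):
        w_low = 1.0 - 0.1 * k  # assumed schedule: 1.0, 0.9, ..., 0.1
        w = [1.0 if flag else w_low for flag in high]
        slopes.append(wls_line(x, y, w)[1])
    return slopes

# Toy data: the five cases with the largest x sit 5 units above the line y = x.
x = [float(v) for v in range(20)]
y = [v + (5.0 if v >= 15 else 0.0) for v in x]
slopes = swls_slopes(x, y)
```

The first entry of `slopes` is the ordinary least squares slope; later entries show how the estimate moves as cases below the threshold lose weight.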

2 
Paper

Presidential Approval: the case of George W. Bush
Beck, Nathaniel
Jackman, Simon
Rosenthal, Howard

Uploaded 
07192006

Keywords 
presidential approval public opinion polls house effects dynamic linear model Bayesian statistics Markov chain Monte Carlo state space pages of killer graphs

Abstract 
We use a Bayesian dynamic linear model to track approval for George W. Bush over time. Our analysis deals with several issues that have usually been addressed separately in the extant literature. First, our analysis uses polling data collected at a higher frequency than is typical, using over 1,100 published national polls, and data on macroeconomic conditions collected at the weekly level. By combining this much poll information, we are much better poised to examine the public's reactions to events over shorter time scales than can the typical analysis of approval that utilizes monthly or quarterly approval. Second, our statistical modeling explicitly deals with the sampling error of these polls, as well as the possibility of bias in the polls due to house effects. Indeed, quite aside from the question of ``what drives approval?'', there is considerable interest in the extent to which polling organizations systematically diverge from one another in assessing approval for the president. These bias parameters are not only necessary parts of any realistic model of approval that utilizes data from multiple polling organizations, but are also easily estimated via the Bayesian dynamic linear model. 

3 
Paper

Splitting a predictor at the upper quarter or third and the lower quarter or third
Gelman, Andrew
Park, David

Uploaded 
07062007

Keywords 
discretization linear regression statistical communication trichotomizing

Abstract 
A linear regression of $y$ on $x$ can be approximated by a simple difference: the average values of $y$ corresponding to the highest quarter or third of $x$, minus the average values of $y$ corresponding to the lowest quarter or third of $x$. A simple theoretical analysis shows this comparison performs reasonably well, with 80%-90% efficiency compared to the linear regression if the predictor is uniformly or normally distributed. Discretizing $x$ into three categories claws back about half the efficiency lost by the commonly used strategy of dichotomizing the predictor.
We illustrate with the example that motivated this research: an analysis of income and voting which we had originally performed for a scholarly journal but then wanted to communicate to a general audience.
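
The comparison described in this abstract can be sketched in a few lines. The simulated data, the function names, and the step of dividing by the corresponding difference in mean $x$ (to put the comparison on the scale of a regression slope) are assumptions made for illustration, not details taken from the paper.

```python
import random
import statistics

def thirds_comparison(x, y, frac=1/3):
    """Differences in mean y and mean x between the top and bottom `frac` of x."""
    pairs = sorted(zip(x, y))              # order cases by the predictor
    k = max(1, int(len(pairs) * frac))     # number of cases in each tail
    low, high = pairs[:k], pairs[-k:]
    dy = statistics.fmean([p[1] for p in high]) - statistics.fmean([p[1] for p in low])
    dx = statistics.fmean([p[0] for p in high]) - statistics.fmean([p[0] for p in low])
    return dy, dx

def ols_slope(x, y):
    """Least squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(3000)]
y = [2.0 * a + rng.gauss(0, 1) for a in x]   # true slope is 2

dy, dx = thirds_comparison(x, y)
slope_from_thirds = dy / dx
slope_ols = ols_slope(x, y)
```

With a normally distributed predictor, both `slope_from_thirds` and `slope_ols` should land close to the true slope of 2, with the thirds-based value somewhat noisier.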


4 
Paper

Degeneracy and Inference for Social Networks
Handcock, Mark S.

Uploaded 
07152002

Keywords 
Random graph models loglinear network model Markov fields Markov Chain Monte Carlo

Abstract 
Networks are a form of "relational data". Relational
data arise in many social science fields and graph
models are a natural approach to representing the
structure of these relations. This framework has many
applications including, for example, the structure
of social networks, the behavior of epidemics, the
interconnectedness of the WWW, and long-distance
telephone calling patterns.
We review stochastic models for such graphs, with
particular focus on sexual and drug use networks.
Commonly used Markov models were introduced by Frank
and Strauss (1986) and were derived from developments
in spatial statistics (Besag 1974). These models
recognize the complex dependencies within relational
data structures.
To date, the use of graph models for networks
has been limited by three interrelated factors:
the complexity of realistic models, lack of use of
simulation studies, and a poor understanding of the
properties of inferential methods. In this talk we
discuss these factors and the degeneracy of commonly
promoted models. We also review the role of Markov
Chain Monte Carlo (MCMC) algorithms for simulation
and likelihood-based inference.
These ideas are applied to a sexual relations
network from Colorado Springs with the objective of
understanding the social determinants of HIV spread.
In this talk we focus on stochastic models for such
graphs that can be used to represent the structural
characteristics of the networks. In our applications,
the nodes usually represent people, and the edges
represent a specified relationship between the people. 

5 
Paper

Models of Path Dependence with an Empirical Application
Jackson, John
Kollman, Ken

Uploaded 
07172007

Keywords 
Path dependence partisanship nonlinear least squares

Abstract 
It is now commonplace in the social sciences to describe an outcome or process as path dependent. By path dependence, researchers generally mean that the sequence of events prior to the observation of the outcome has explanatory power. The paper develops models that have both path dependent and non-path dependent properties, depending upon the value of a particular parameter. The paper then uses nonlinear least squares and a Monte Carlo simulation to explore how well this parameter can be estimated, meaning how well scholars can discriminate between the two processes. The methodology is applied to the evolution of attitudes on aid to minorities and partisanship between 1956 and 2000. The results are consistent with the path dependent model. 

7 
Paper

A default prior distribution for logistic and other regression models
Gelman, Andrew
Jakulin, Aleks
Pittau, Maria Grazia
Su, Yu-Sung

Uploaded 
08032007

Keywords 
Bayesian inference generalized linear model least squares hierarchical model linear regression logistic regression multilevel model noninformative prior distribution

Abstract 
We propose a new prior distribution for classical (nonhierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-$t$ prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. We implement a procedure to fit generalized linear models in R with this prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several examples, including a series of logistic regressions predicting voting preferences, an imputation model for a public health data set, and a hierarchical logistic regression in epidemiology.
We recommend this default prior distribution for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small) and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation. 
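
The two ingredients named in this abstract, the rescaling of inputs and the Cauchy prior, can be sketched as below. This is not the authors' R implementation: the function names are hypothetical, binary inputs are simply left unchanged here (an assumption, since the abstract only describes the nonbinary case), and the fitting algorithm itself is not shown.

```python
import math
import statistics

def rescale(column):
    """Rescale a nonbinary input to mean 0 and standard deviation 0.5."""
    if set(column) <= {0, 1}:
        return list(column)            # assumption: binary inputs left as-is
    m = statistics.fmean(column)
    s = statistics.pstdev(column)
    return [0.5 * (v - m) / s for v in column]

def cauchy_logpdf(beta, center=0.0, scale=2.5):
    """Log density of a Cauchy prior on one coefficient."""
    z = (beta - center) / scale
    return -math.log(math.pi * scale * (1.0 + z * z))

def log_prior(coefficients):
    """Independent Cauchy(0, 2.5) priors: the log densities add."""
    return sum(cauchy_logpdf(b) for b in coefficients)

scaled = rescale([1, 2, 3, 4, 5])
binary = rescale([0, 1, 1, 0])
```

Adding `log_prior` to a logistic log-likelihood gives a penalized objective; the heavy Cauchy tails penalize large coefficients only mildly while still keeping estimates finite under separation.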

9 
Paper

Nonparametric Priors For Ordinal Bayesian Social Science Models: Specification and Estimation
Gill, Jeff
Casella, George

Uploaded 
08212008

Keywords 
generalized linear mixed model ordered probit Bayesian approaches nonparametric priors Dirichlet process mixture models nonparametric Bayesian inference

Abstract 
A generalized linear mixed model, ordered probit, is used to estimate levels of stress in presidential political appointees as a means of understanding their surprisingly short tenures. A Bayesian approach is developed, where the random effects are modeled with a Dirichlet process mixture prior, allowing for useful incorporation of prior information, but retaining some vagueness in the form of the prior. Applications of Bayesian models in the social sciences are typically done with ``noninformative'' priors, although some use of informed versions exists. There has been disagreement over this, and our approach may be a step in the direction of satisfying both camps. We give a detailed description of the data, show how to implement the model, and describe some interesting conclusions. The model utilizing a nonparametric prior fits better and reveals more information in the data than standard approaches. 

10 
Paper

Bayesian Learning about Ideal Points of U.S. Supreme Court Justices, 1953-1999
Martin, Andrew D.
Quinn, Kevin M.

Uploaded 
07092001

Keywords 
item response models dynamic linear models Markov chain Monte Carlo

Abstract 
At the heart of attitudinal and strategic explanations of judicial
behavior is the assumption that justices have policy preferences. These
preferences have been measured in a handful of ways, including using
factor analysis and multidimensional scaling techniques (Schubert, 1965,
1974), looking at past votes in a single policy area (Epstein et al.,
1989), content-analyzing newspaper editorials at the time of appointment
to the Court (Segal and Cover, 1989), and recording the background
characteristics of the justices (Tate and Handberg, 1991). In this
manuscript we employ Markov chain Monte Carlo (MCMC) methods to fit
Bayesian measurement models of judicial preferences for all justices
serving on the U.S. Supreme Court from 1953 to 1999. We are particularly
interested in considering to what extent ideal points of justices change
throughout their tenure on the Court, and how the proposals over which
they are voting also change across time. To do so, we fit four longitudinal
item response models that include dynamic specifications for the ideal
points and the case-specific parameters. Our results suggest that justices
do not have constant ideal points, even after controlling for the types of
cases that come before the Court. 

15 
Paper

Spike and Slab Prior Distributions for Simultaneous Bayesian Hypothesis Testing, Model Selection, and Prediction, of Nonlinear Outcomes
Pang, Xun
Gill, Jeff

Uploaded 
07132009

Keywords 
Spike and Slab Prior Hypothesis Testing Bayesian Model Selection Bayesian Model Averaging Adaptive Rejection Sampling Generalized Linear Model

Abstract 
A small body of literature has used the spike and slab prior specification for model selection with strictly linear outcomes. In this setup a two-component mixture distribution is stipulated for coefficients of interest with one part centered at zero with very high precision (the spike) and the other as a distribution diffusely centered at the research hypothesis (the slab). With the selective shrinkage, this setup incorporates the zero coefficient contingency directly into the modeling process to produce posterior probabilities for hypothesized outcomes. We extend the model to qualitative responses by designing a hierarchy of forms over both the parameter and model spaces to achieve variable selection, model averaging, and individual coefficient hypothesis testing. To overcome the technical challenges in estimating the marginal posterior distributions possibly with a dramatic ratio of density heights of the spike to the slab, we develop a hybrid Gibbs sampling algorithm using an adaptive rejection approach for various discrete outcome models, including dichotomous, polychotomous, and count responses. The performance of the models and methods is assessed with both Monte Carlo experiments and empirical applications in political science. 

16 
Paper

Modeling Direction and Intensity in Ordinal Scales with Midpoints
Jones, Bradford S.
Sobel, Michael E.

Uploaded 
07211998

Keywords 
adjacent category logit loglinear models public opinion Congress

Abstract 
Political opinion analysts frequently work with
semantically balanced ordinal scales. Such
survey items are frequently used to measure candidate
evaluations, public spending preferences,
positions on social issues, and candidate
and party placement. Because of the special nature of
these survey items (semantically balanced about a
midpoint), researchers may be interested in understanding
how both the response direction and response intensity
vary over time and/or across covariate classes. That
is, trends may be found in the tendency for respondents
to choose categories above vs. below the midpoint
(the response direction) and trends may be found in
the tendency for respondents to choose between or among
category labels above or below the midpoint. And
while political analysts are commonly interested in
response intensity and direction, traditional
methods used to model distributions on semantically
balanced ordinal scales are problematic. In this
paper, we discuss a class of models originally
developed by Sobel (1995, 1997, 1998) that allows
researchers to simultaneously model direction and
intensity in ordinal scales with midpoints.
Specifically, we parameterize the model as an
adjacent category logit model. Numerous
parsimonious models may be arrived at that describe
trends in the response direction and response intensity.
Because the adjacent category logit model is linear in the logits,
we estimate the model using loglinear models. We
present an application of the models to data
on approval ratings of House incumbents. We find that
the trends in response directions (the
tendency for respondents to evaluate the
incumbent favorably or not favorably) increase through
the 1980s, peaking in the late Eighties, and are
now declining over the 1990s. With regard to
response intensity, (that is, the tendency to
respond in the extreme categories vs. the moderate
categories), we find that intensity increases during
most presidential election cycles and vanishes
during midterm election years. We argue this finding
is related to the different levels of political
information citizens are exposed to in presidential
vs. midterm election cycles. 

17 
Paper

Identification, Inference, and Sensitivity Analysis for Causal Mediation Effects
Imai, Kosuke
Keele, Luke
Yamamoto, Teppei

Uploaded 
07202009

Keywords 
causal inference causal mediation analysis direct and indirect effects linear structural equation models sequential ignorability unmeasured confounders

Abstract 
Causal mediation analysis is routinely conducted by applied researchers in a variety of disciplines including epidemiology, political science, psychology, and sociology. The goal of such an analysis is to investigate alternative causal mechanisms by examining the roles of intermediate variables that lie in the causal path between the treatment and outcome variables. In this paper, we first prove that under a particular version of sequential ignorability assumption, the average causal mediation effect (ACME) is nonparametrically identified. We compare our identifying assumption with those proposed in the literature. Some practical implications of our identification result are also discussed. In particular, the popular estimator based on the linear structural equation model (LSEM) can be interpreted as an ACME estimator if the linearity and no-interaction assumptions are satisfied in addition to the proposed assumption. We show that this assumption can easily be relaxed within the framework of LSEM. Second, we consider a simple nonparametric estimator of the ACME in order to relax distributional and functional form assumptions. We also discuss a more general nonparametric approach. Third, we propose a new sensitivity analysis that can be easily implemented by applied researchers within the standard LSEM framework. Like the existing identifying assumptions, the proposed assumption may be too strong in many applied settings. Thus, sensitivity analysis is essential in order to examine the robustness of empirical findings to the possible existence of an unmeasured confounder. Finally, we apply the proposed methods to a randomized experiment from political psychology. 

18 
Paper

Direction and Intensity of Russian Macroeconomic Evaluations
Jones, Bradford S.
Willerton, John P.
Sobel, Michael E.

Uploaded 
08301998

Keywords 
Russia public opinion log linear models

Abstract 
The Russian macroeconomy has exhibited volatility since the transformation from
the Soviet Union to the Russian Federation. Much is known about the Russian
public opinion climate during the end of the Soviet era and the beginning of
the Russian Federation era; however, less well understood is the nature of
Russians' macroeconomic evaluations during this ongoing transformation. In
this paper, we analyze Russians' assessments of the macroeconomy using Russian
public opinion data asking respondents to assess the Russian national economy.
We establish four testable hypotheses. First, we hypothesize that the
direction of Russian opinion will be asymmetrically more negative than positive
across all periods in the study. Second, we hypothesize that economic
assessments will vary by residential region. Specifically, we contend the
response distribution for respondents from Moscow and St. Petersburg (MSP) will
differ from respondents from other residential regions. Third (and related to
the second), we hypothesize that the response distributions for MSP respondents
will be temporally heterogenous while the response distribution for respondents
outside MSP will be temporally homogenous. Fourth, we hypothesize that despite
the poor performance of the economy during the Russian Federation transition,
Russian public opinion will not exhibit extreme negativity in macroeconomic
evaluations. Using published survey data collected from the bimonthly
Russian Public Opinion Monitor conducted by the Russian Center
for Public Opinion Research (VCIOM), for the period January 1994 to July 1996,
we examine both the direction and intensity of Russian opinion toward the
state of the national economy by estimating the distribution on the response
variable using an adjacent category logit model (Jones and Sobel 1998, Sobel
1995, 1997, 1998). From our analysis, we find first that the direction of
Russians' evaluation of the macroeconomy is consistently negative rather than
positive, a finding that corroborates extant research; however, the
directional nature of economic assessments displays significant residential
variation between MSP and the rest of the country. Second, we find significant
residential variation in economic assessments. Specifically, the response
distribution for MSP respondents can be distinguished from the response
distribution from respondents in other residential regions, and also, the
response distribution for MSP respondents displays considerable temporal
heterogeneity. We argue this variability tends to follow changes in the
macroeconomic and political environments. Third, we do not find support for the
hypothesis of temporal homogeneity in the response distribution for respondents
outside of MSP. Nevertheless, residents in other cities and in rural regions
seem not to be as responsive to macroeconomic changes over the period, thus
eliciting milder temporal variability than MSP respondents. Fourth, we find
that in terms of the response distribution, the intensity of Russian pessimism
(or optimism) is not extreme. 

21 
Paper

How can soccer improve statistical learning?
Filho, Dalson
Rocha, Enivaldo
Paranhos, Ranulfo
Júnior, José

Uploaded 
03192014

Keywords 
quantitative methods linear regression soccer

Abstract 
This paper presents an active classroom exercise focusing on the interpretation of ordinary least squares regression coefficients. Methodologically, students analyze data from Brazilian soccer matches and formulate and test classical hypotheses regarding home team advantage. Technically, our framework is easily adapted to other sports and has no implementation cost. In addition, the exercise is easily conducted by the instructor and highly enjoyable for the students. The intuitive approach also facilitates the understanding of the practical application of linear regression. 

23 
Paper

Aggregate Economic Conditions and Individual Forecasts: A Multilevel Model of Economic Expectations
Jones, Bradford S.
Haller, H. Brandon

Uploaded 
00000000

Keywords 
random coefficient modeling multilevel analysis hierarchical linear models

Abstract 
To what extent are individual economic expectations
related to actual economic conditions? This is the central
question examined in this paper. Surprisingly, little
research exists examining how economic expectations
are formed. Moreover, even less research has been
done examining the interaction between the state
of the national economy and individual forecasts.
Most research addressing expectation formation has
resided at the aggregate level. In this paper,
we utilize the methodology of random coefficient
models to explore the linkage between individuals
and the macroeconomic environment. We conceptualize
individuals as being "nested" within time periods.
Individual forecasts are treated as contextually conditioned
by the state of the economy. We find evidence that
aggregate economic indicators do influence the parameters
predicting economic expectations. Furthermore, the relationship
between the macroeconomy and individual expectations provides
strong support for Katona's (1972, 1975)
notion of "psychological economics." We
find that individual forecasts of the future are
"brighter" when aggregate economic conditions
are "darkest." Additionally, we find that
individuals tend to rely less on retrospective
evaluations of the economy when the economy is faring poorly. 

24 
Paper

Getting the Mean Right: Generalized Additive Models
Beck, Nathaniel
Jackman, Simon

Uploaded 
00000000

Keywords 
nonparametric regression smoothing loess nonlinear regression Monte Carlo analysis interaction effects incumbency cabinet duration violence

Abstract 
We examine the utility of the generalized additive model as an
alternative to the common linear model. Generalized additive models
are flexible in that they allow the effect of each independent
variable to be modelled nonparametrically while requiring that the
effect of all the independent variables is additive. GAMs are
common in the statistics literature but are conspicuously absent in
political science.
The paper presents the basic features of the generalized additive
model. Through Monte Carlo experimentation we show that there is
little danger of the generalized additive model finding spurious
structures. We use GAMs to reanalyze several political science data
sets. These applications show that generalized additive models can
be used to improve standard analyses by guiding researchers as to
the parametric shape of response functions. The technique also
provides interesting insights about data, particularly in terms of
modelling interactions. 

25 
Paper

Forecasting Time Series
Hinich, Melvin J.

Uploaded 
07081997

Keywords 
forecast autoregressive vector AR state space linear

Abstract 
The limits of forecasting a linear time series system are discussed.
A stable autoregressive linear system can only be accurately predicted
for a few steps ahead of the last observation. If the time series is a
deterministic trend plus random fluctuations then the trend can be
predicted as long as it is stable. 
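
A first-order autoregression gives a simple picture of why a stable autoregressive system is predictable only a few steps ahead. The AR(1) model x_t = phi * x_{t-1} + e_t, the parameter values, and the function name are assumptions chosen for illustration; the paper itself is not limited to this case.

```python
def ar1_forecast(last_value, phi, sigma2, horizon):
    """Point forecasts and forecast-error variances for a stable AR(1), |phi| < 1."""
    rows = []
    for h in range(1, horizon + 1):
        point = phi ** h * last_value                          # decays toward the mean (0)
        var = sigma2 * (1 - phi ** (2 * h)) / (1 - phi ** 2)   # grows toward sigma2 / (1 - phi^2)
        rows.append((h, point, var))
    return rows

rows = ar1_forecast(last_value=2.0, phi=0.5, sigma2=1.0, horizon=10)
for h, point, var in rows:
    print(f"h={h:2d}  forecast={point:.4f}  error variance={var:.4f}")
```

After a handful of steps the forecast is essentially the unconditional mean and the error variance is essentially the unconditional variance, so the last observation no longer carries usable information.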

26 
Poster

The Promise of Empirically-Grounded Agent-Based Models: An Application to Immigration Politics in the United States
Velez, Yamil

Uploaded 
07212014

Keywords 
agent-based models nonlinear models immigration spatial modeling

Abstract 
Recent work on the topic of immigration has found that local changes in immigrant composition shape immigration preferences (Hopkins 2010; Newman 2013; Enos 2014). A growing literature in sociology has found that these changes also provoke flight among native-born residents (Crowder, Hall, and Tolnay 2011; Rathelot and Safi 2013). Whereas the literature in political science has largely tried to sidestep the issue of self-selection, I construct an empirically-grounded agent-based model that treats both mobility and political action as potential strategies native-born residents invoke in the presence of ethnic change. I tie the behavior of native-born residents and immigrants to empirical spatial data and generate location-specific predictions of where we ought to see political action with respect to immigration. I calibrate the model using a large data set of online anti-immigration petitions and find that indicators of tolerance are vital in predicting political action whereas the effects of other variables are more complex. Empirically-grounded agent-based models hold the promise of allowing scholars to model dynamic relationships between agents and their environments. However, summarizing the inherently nonlinear outcomes these models produce requires further work. 

