Search Results
Results below are based on the criteria 'event data'
Total number of records returned: 13
1
Paper
Analyzing the Dynamics of International Mediation Processes in the Middle East and the former Yugoslavia
Gerner, Deborah J.
Schrodt, Philip A.
Uploaded
06-28-2001
Keywords
mediation
event data
cross-correlation
conflict
Middle East
Abstract
This paper discusses a new National Science Foundation-funded project that will examine the dynamics of third-party international mediation using statistical time-series analyses of political event data. Third-party mediation was attempted in over half of the conflicts in the post-WWII period and it is likely that the use of mediation has increased following the end of the Cold War. Surprisingly, there have been few systematic studies on mediation. Those that do exist have generally focused on relatively static contextual factors, such as the conflict's attributes and the prior relationship between the mediator and protagonists, rather than on dynamic factors, both contextual and process, that may contribute to the success or failure of mediation activities. In contrast, the extensive qualitative literature provides numerous hypotheses about dynamic aspects of mediation. This literature, however, consists primarily of case studies, often by mediation practitioners, that exhibit little cumulation and, when taken as a whole, are rife with contradictory assertions. The project will formally test a number of the hypotheses embedded in the theoretical and qualitative literatures on mediation, using automated coding of event data from news-wire sources and employing time-series and event-history methods. A system of specialized event codes that are sensitive to mediation activities will be developed, then events will be coded from news reports using the TABARI machine coding program. The research will look at the factors that influence (1) whether mediation is accepted by the parties in a conflict, (2) whether formal agreements are reached, and (3) whether the agreements actually reduce the level of conflict. The project will initially focus on conflicts in the Middle East, a region where the principal investigators have substantial field experience.
After refining the statistical tests on the Middle East case, the analysis will be extended to event data on conflicts in the former Yugoslavia and West Africa. The paper presents the results of an empirical "plausibility probe" based on existing WEIS-coded event data for the Levant and the former Yugoslavia. It employs a simple measure of third-party mediation efforts as the independent variable and Goldstein-scaled cooperation as the dependent variable. In the Levant, we find a weak but consistent pattern of mediation correlating with past conflictual activity, and resulting in later increases in cooperation. In the former Yugoslavia, the analysis shows strikingly different results for the mediation efforts of the UN, European states, and the US. All three respond to increased conflict, but the UN efforts correlate with greater conflict, the US efforts with greater cooperation, and the European efforts have no effect. These results are consistent with many of the qualitative assessments of these efforts, and suggest that the event data approach will produce credible results.
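The lagged cross-correlation idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the two monthly series are made-up numbers constructed so that "cooperation" simply repeats "mediation" two months later.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cross_correlation(x, y, max_lag):
    """Return {lag: r}, where r correlates x[t] with y[t + lag].

    A positive lag asks whether x leads y; a negative lag asks
    whether y leads x.
    """
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[:len(x) - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[:len(y) + lag]
        result[lag] = pearson(xs, ys)
    return result


# Illustrative (invented) monthly series: cooperation echoes mediation
# with a two-month delay.
mediation = [0, 1, 0, 0, 2, 0, 0, 1, 0, 0, 3, 0]
cooperation = [0, 0, 0, 1, 0, 0, 2, 0, 0, 1, 0, 0]

cc = cross_correlation(mediation, cooperation, max_lag=3)
best_lag = max(cc, key=cc.get)
print(best_lag, round(cc[best_lag], 3))
```

With real event data the peak would be far weaker than in this constructed example, and the sign and lag of the peak are what carry the substantive interpretation.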
2
Paper
Forecasting Conflict in the Balkans using Hidden Markov Models
Schrodt, Philip A.
Uploaded
08-24-2000
Keywords
forecasting
event data
hidden Markov models
conflict
Balkans
Yugoslavia
Abstract
This study uses hidden Markov models (HMM) to forecast conflict in the former Yugoslavia for the period January 1991 through January 1999. The political and military events reported in the lead sentences of Reuters news service stories were coded into the World Events Interaction Survey (WEIS) event data scheme. The forecasting scheme involved randomly selecting eight 100-event "templates" taken at a 1-, 3- or 6-month forecasting lag for high-conflict and low-conflict weeks. A separate HMM is developed for the high-conflict-week sequences and the low-conflict-week sequences. Forecasting is done by determining whether a sequence of observed events fits the high-conflict or low-conflict model with higher probability. Models were selected to maximize the difference between correct and incorrect predictions, evaluated by week. Three weighting schemes were used: unweighted (U), penalize false positives (P) and penalize false negatives (N). There is a relatively high level of convergence in the estimates: the best and worst models of a given type vary in accuracy by only about 15% to 20%. In full-sample tests, the U and P models produce an overall accuracy of around 80%. However, these models correctly forecast only about 25% of the high-conflict weeks, although about 60% of the cases where a high-conflict week has been forecast turn out to have high conflict. In contrast, the N model has an overall accuracy of only about 50% in full-sample tests, but it correctly forecasts high-conflict weeks with 85% accuracy in the 3- and 6-month horizon and 92% accuracy in the 1-month horizon. However, this is achieved by excessive predictions of high-conflict weeks: only about 30% of the cases where a high-conflict week has been forecast are high-conflict. Models that use templates from only the previous year usually do about as well as models based on the entire sample.
The models are remarkably insensitive to the length of the forecasting horizon: the drop-off in accuracy at longer forecasting horizons is very small, typically around 2%-4%. There is also no clear difference in the estimated coefficients for the 1-month and 6-month models. An extensive analysis was done of the coefficient estimates in the full-sample model to determine what the model was "looking at" in order to make predictions. While a number of statistically significant differences exist between the high and low conflict models, these do not fall into any neat patterns. This is probably due to a combination of the large number of parameters being estimated, the multiple local maxima in the estimation surface, and the complications introduced by the presence of a number of very low probability event categories. Some experiments with simplified models indicate that it is possible to use models with substantially fewer parameters without markedly decreasing the accuracy of the predictions; in fact predictions of the high conflict periods actually increase in accuracy quite substantially.
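The classification step described above (assign a sequence to whichever model gives it the higher probability) can be sketched with the standard scaled forward algorithm. This is only an illustration under stated assumptions: the two toy models below have hand-set parameters and three invented symbols (0 = cooperative, 1 = neutral, 2 = conflictual), whereas the paper estimates its models from templates of WEIS-coded events.

```python
from math import log


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][k]  : probability of emitting symbol k in state s
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    total = sum(alpha)
    loglik = log(total)
    alpha = [a / total for a in alpha]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
        total = sum(alpha)
        loglik += log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return loglik


def classify(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical hand-set models; real ones would be estimated (e.g. by Baum-Welch).
shared_start = [0.5, 0.5]
shared_trans = [[0.8, 0.2], [0.2, 0.8]]
models = {
    "high_conflict": (shared_start, shared_trans, [[0.1, 0.2, 0.7], [0.2, 0.3, 0.5]]),
    "low_conflict": (shared_start, shared_trans, [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]),
}

print(classify([2, 2, 1, 2, 2, 0, 2], models))
print(classify([0, 0, 1, 0, 2, 0, 0], models))
```

Working in log space with per-step rescaling matters for sequences as long as the 100-event templates mentioned in the abstract, where raw probabilities would underflow.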
3
Paper
Detecting United States Mediation Styles in the Middle East, 1979-1998
Schrodt, Philip A.
Uploaded
03-04-1999
Keywords
event data
mediation
Middle East
time series
hidden Markov models
Abstract
This research is part of the "Multiple Paths to Knowledge Project" sponsored by the James A. Baker III Institute for Public Policy, Rice University, and the Program in Foreign Policy Decision Making, Texas A&M University. The paper deals with the problem of determining whether the mediation styles used by four U.S. Secretaries of State -- George Shultz, James Baker, Warren Christopher and Madeleine Albright -- are sufficiently distinct that they can be detected in event data. The mediation domain is the Israel-Palestinian conflict from April 1979 to December 1998, the event data are coded from the Reuters news service reports using the WEIS event coding scheme, and the classification technique is hidden Markov models. The models are estimated for each of the four Secretaries based on 16 randomly chosen 32-event sequences of USA>ISR and USA>PAL events during the term of the Secretary. Each month in the data set is then assigned to one of the four Secretarial styles based on the best-fitting model. The models differentiate the mediation styles quite distinctly and this method of detecting styles yields quite different results when applied to ISR-PAL data or random data. The "Baker" and "Albright" styles are most distinctive; the "Shultz" style is least; both results are consistent with many qualitative characterizations of these periods. A series of t-tests is then done on Goldstein-scaled scores to determine whether the mediation styles translate into statistically distinct interactions in the ISR>USA, ISR>PAL, PAL>USA and PAL>ISR dyads. While there are a number of statistically-significant differences when the full sample is used, these may be due simply to the overall changes in Israel-Palestinian relations over the course of the time series.
When tests are done on months that are out-of-term -- in other words, where the style of one Secretary is being employed during the term of another -- few statistically-significant differences are found, though there is some indication of a lag of a month or so between the change in style and the behavioral response. It appears that the effects of the differing styles are not captured by changes in aggregated data, possibly because these scales force behavior into a single conflict-cooperation dimension. Consistent with other papers in the "Multiple Paths to Knowledge" project, the paper contains commentary on how the research project was actually done, as well as the conventional presentation of results. The file includes the papers in Postscript and PDF formats, the event data (Levant, April 1979 to December 1998) used in the analysis, and the C source code for estimating the hidden Markov models. This paper was presented at the International Studies Association meetings, Washington, 16-21 February 1999.
4
Paper
An Event Data Set for the Arabian/Persian Gulf Region 1979-1997
Schrodt, Philip A.
Gerner, Deborah J.
Uploaded
04-12-1999
Keywords
event data
Middle East
Persian Gulf
automated coding
Abstract
This paper discusses a WEIS-coded event data set covering the Arabian/Persian Gulf region (Iran, Iraq, Kuwait, Oman, Saudi Arabia, Yemen, and the smaller Gulf states) for the period 15 April 1979 to 10 June 1997. The coded events cover international interactions among these states, as well as interactions with any other states or major international organizations. The data set is generated from Reuters news reports downloaded from the NEXIS data service and coded using the Kansas Event Data System (KEDS) machine-coding program. The paper begins with a review of the process of generating a machine-coded data set, including a discussion of software we have developed to partially automate the development of dictionaries to code new geographical regions. The Gulf data are coded using a standard set of verb phrases (rather than phrases specifically adapted to the Gulf) and an actors dictionary that has been augmented only with the actors identified by a utility program that examines the source texts for actors not already found in the KEDS dictionary. The Reuters reports generate 264,421 events when full stories are coded and 48,721 events when only lead sentences are coded. An examination of the time series that are generated when the events are aggregated by month using the Goldstein scale shows that they capture the major features of the behavior that we know to have occurred in the region. There is generally a high correlation (r > 0.75) between the series generated from lead-sentences and from full stories when the major actors of the region (Iran, Iraq, Saudi Arabia and USA) are studied. An exception to this pattern is found in interactions involving a relatively minor actor, the United Arab Emirates. Here the full-story coding provides far more events than the lead-sentence coding and shows greater variance even for interactions between major actors. 
We expect this will also be the case for other small Gulf states, suggesting that full-story coding may be necessary for a complete analysis of these actors. The paper was presented at the International Studies Association, Minneapolis, 18-22 March 1998. The file includes the papers in Postscript and PDF formats. The data set has been updated through March 1999 and is available at the KEDS project web site, http://www.ukans.edu/~keds.
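The monthly aggregation mentioned in the abstract above (events weighted by a conflict-cooperation scale and summed by month for each directed dyad) might look roughly like this. It is a sketch only: the event codes, the weights, and the records are all placeholders invented for illustration, not the published Goldstein values or any KEDS output.

```python
from collections import defaultdict

# Placeholder weights keyed by hypothetical event codes.
# These are NOT the published Goldstein scale values.
WEIGHTS = {"CONSULT": 1.0, "AID": 7.0, "THREATEN": -6.0, "FORCE": -10.0}


def monthly_scores(events, weights):
    """Sum weighted event scores per (year, month, source, target).

    Each event is a tuple (iso_date, source, target, code), where
    iso_date is a 'YYYY-MM-DD' string.
    """
    totals = defaultdict(float)
    for iso_date, source, target, code in events:
        year, month = iso_date[:4], iso_date[5:7]
        totals[(year, month, source, target)] += weights[code]
    return dict(totals)


# Invented records for two generic actors, A and B.
events = [
    ("2000-01-05", "A", "B", "FORCE"),
    ("2000-01-17", "A", "B", "THREATEN"),
    ("2000-01-20", "B", "A", "CONSULT"),
    ("2000-02-01", "A", "B", "AID"),
]

scores = monthly_scores(events, WEIGHTS)
for key in sorted(scores):
    print(key, scores[key])
```

Keeping the dyad directed (A toward B separate from B toward A) preserves the distinction between source and target that event data schemes record.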
5
Paper
Using Cluster Analysis to Derive Early Warning Indicators for Political Change in the Middle East, 1979-1996
Schrodt, Philip A.
Gerner, Deborah J.
Uploaded
08-22-1996
Keywords
event data
conflict
early warning
Middle East
cluster analysis
genetic algorithms
Abstract
This paper uses event data to develop an early warning model of major political changes in the Levant for the period April 1979 to July 1996. Following a general review of statistical early warning research, the analysis focuses on the behavior of eight Middle Eastern actors (Egypt, Israel, Jordan, Lebanon, the Palestinians, Syria, the United States and USSR/Russia) using WEIS-coded event data generated from Reuters news service lead sentences with the KEDS machine-coding system. The analysis extends earlier work (Schrodt and Gerner 1995) demonstrating that clusters of behavior identified by conventional statistical methods correspond well with changes in political behavior identified a priori. We employ a new clustering algorithm that uses the correlation between the dyadic behaviors at two points in time as a measure of distance, and identifies cluster breaks as those time points that are closer to later points than to preceding points. We also demonstrate that these data clusters begin to "stretch" prior to breaking apart; this characteristic is used as an early-warning indicator. A Monte Carlo analysis shows that the clustering and early warning measures perform very differently in simulated data sets having the same mean, variance, and autocorrelation as the observed data (but no cross-correlation), which reduces the likelihood that the clustering patterns are due to chance. The initial analysis uses Goldstein's (1992) weighting system to aggregate the WEIS-coded data. In an attempt to improve on the Goldstein scale, we use a genetic algorithm to optimize the weighting of the WEIS event categories for the purpose of clustering. This does not prove very successful and only differentiates clusters in the first half of the data set, a result similar to one we obtained using the cross-sectional K-Means clustering procedure.
Correlating the frequency of events in the twenty-two 2-digit WEIS categories, on the other hand, gives clustering and early warning results similar to those produced by the Goldstein scale. The paper concludes with some general remarks on the role of quantitative early warning and directions for further research. This paper was presented at the American Political Science Association, San Francisco, 28 August - 1 September 1996.
6
Paper
Early Warning of Conflict in Southern Lebanon using Hidden Markov Models
Schrodt, Philip A.
Uploaded
08-24-1997
Keywords
hidden Markov models
event data
early warning
international crisis
sequence analysis
Middle East
WEIS
BCOW
Abstract
This paper extends earlier work on the application of hidden Markov models (HMMs) to the problem of forecasting international conflict. HMMs are a sequence comparison method widely used in computerized speech recognition as a computationally efficient method of generalizing a set of sequences observed in a noisy environment. The technique is easily adapted to work with sequences of international event data. The paper provides a theoretical "micro-foundation" for the use of sequence comparison in conflict early-warning based on coadaptation of organizational standard operating procedures. The left-right (LR) HMM used in speech recognition is first extended to a left-right-left (LRL) model that allows a crisis to escalate and de-escalate. This model is tested for its ability to correctly discriminate between BCOW crises that involve and do not involve war. The LRL model provides slightly more accurate classification than the LR model. The interpretation of the hidden states in the LRL models, however, is more ambiguous than in the LR model. The HMM is then applied to the problem of forecasting the outbreak of armed violence between Israel and Arab forces in south Lebanon during the period 1979 to 1997 (excluding 1982-1985). An HMM is estimated using six cases of "tit-for-tat" escalation, then fitted to the entire time period. The model identifies about half of the TFT conflicts (including all of the training cases) that occur in the full sequence, with only one false positive. This result suggests that HMMs could be used in an event-based monitoring system. However, the fit of the model is very sensitive to the number of days in a sequence when no events occurred, and consequently the fit measure is ineffective as an early warning indicator. Nonetheless, in a subset of models, the maximum likelihood estimate of the sequence of hidden Markov states provides a robust early warning indicator with a three- to six-month lead.
These models are valid in a split-sample test, and the patterns of cross-correlation of the individual states of the model are consistent with the theoretical expectations. While this approach clearly needs further validation, it appears promising. The paper concludes with observations on the extent to which the HMM approach can be generalized to other categories of conflict, some suggestions on how the method of estimation can be improved, and the implications that sequence-based forecasting techniques have for theories of the causes of conflict.
7
Paper
Pattern Recognition of International Crises using Hidden Markov Models
Schrodt, Philip A.
Uploaded
06-30-1997
Keywords
hidden Markov models
event data
early warning
international crisis
sequence analysis
Middle East
WEIS
BCOW
Abstract
Event data are one of the most widely used indicators in quantitative international relations research. To date, most of the models using event data have constructed numerical indicators based on the characteristics of the events measured in isolation and then aggregated. An alternative approach is to use quantitative pattern recognition techniques to compare an existing sequence of behaviors to a set of similar historical cases. This has much in common with human reasoning by historical analogy while providing the advantages of systematic and replicable analysis possible using machine-coded event data and statistical models. This chapter uses "hidden Markov models", a recently developed sequence-comparison technique widely used in computational speech recognition, to measure similarities among international crises. The models are first estimated using the Behavioral Correlates of War data set of historical crises, then applied to an event data set covering political behavior in the contemporary Middle East for the period April 1979 through February 1997. A split-sample test of the hidden Markov models perfectly differentiates crises involving war from those not involving war in the cases used to estimate the models. The models also provide a high level of discrimination in a set of test cases not used in the estimation, and most of the erroneously-classified cases have plausible distinguishing features. The difference between the war and nonwar models also correlates significantly with a scaled measure of conflict in the contemporary Middle East. This suggests that hidden Markov models could be used to develop conflict measures based on event similarities to historical conflicts rather than on aggregated event scores.
8
Paper
Inductive Event Data Scaling using Item Response Theory
Schrodt, Philip A.
Uploaded
07-17-2007
Keywords
event data
IRT
latent trait
scaling
Rasch model
Goldstein scale
WEIS
CAMEO
Abstract
Political event data are frequently converted to an interval-level measurement by assigning a numerical scaled value to each event. All of the existing scaling systems rely on non-replicable expert assessments to determine these numerical scores. This paper uses item response theory (IRT) to derive scales inductively, using event data on Israeli interactions with Lebanon and the Palestinians for 1991-2007. Monthly scores on a latent trait are calculated using three IRT models: the single-parameter Rasch model, and two-parameter models that add discrimination and guessing parameters. The three formulations produce generally comparable scores (correlations of 0.90 or higher). The Rasch scales are less successful than the expert-derived Goldstein scale in reconciling the somewhat divergent sets of events derived from the Agence France Presse and Reuters news services. This is in all likelihood due largely to a low weighting given to uses of force by the IRT because such events are common in these two dyads. A factor analysis of the event counts shows that a single cooperation-conflict dimension generally accounts for about two-thirds of the variance in these dyads, but a second case-specific dimension explains another 20%. Finally, moving averages of the derived scores generally correlate well with the Goldstein values, suggesting that IRT may provide a route towards deriving purely inductive, and hence replicable, scales.
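For readers unfamiliar with IRT, the textbook item response function behind the models named above can be written as follows. The notation is an assumption made here for illustration and may differ from the paper's: \theta_j is the latent trait score of unit j (plausibly a month), and b_i, a_i and c_i are the difficulty, discrimination and guessing parameters of item i (plausibly an event category).

```latex
% General logistic item response function (textbook form; notation assumed)
P(X_{ij} = 1 \mid \theta_j)
  = c_i + (1 - c_i)\,
    \frac{\exp\bigl(a_i(\theta_j - b_i)\bigr)}
         {1 + \exp\bigl(a_i(\theta_j - b_i)\bigr)}

% Rasch model: the special case a_i = 1 and c_i = 0
P(X_{ij} = 1 \mid \theta_j)
  = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}
```

Under the Rasch form, an item that occurs in most months gets a low difficulty b_i, so observing it moves the estimated trait only slightly. This is in line with the abstract's remark that frequent uses of force receive a low weighting.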
9
Paper
Automated Production of High-Volume, Near-Real-Time Political Event Data
Schrodt, Philip
Uploaded
08-30-2010
Keywords
event data
ICEWS
DARPA
natural language processing
open source
forecasting
prediction
conflict
Abstract
This paper summarizes the current state-of-the-art for generating high-volume, near-real-time event data using automated coding methods, based on recent efforts for the DARPA Integrated Crisis Early Warning System (ICEWS) and NSF-funded research. The ICEWS work expanded previous automated coding efforts by more than two orders of magnitude, coding about 26 million sentences generated from 8 million stories condensed from around 30 gigabytes of text. The actual coding took six minutes. The paper is largely a general "how-to" guide to the pragmatic challenges and solutions to various elements of the process of generating event data using automated techniques. It also discusses a number of ways that this could be augmented with existing open-source natural language processing software to generate a third-generation event data coding system.
10
Paper
Conflict and Mediation Event Observations (CAMEO): A New Event Data Framework for the Analysis of Foreign Policy Interactions
Schrodt, Philip A.
Gerner, Deborah J.
Abu-Jabr, Rajaa
Yilmaz, Omur
Uploaded
04-01-2002
Keywords
event data
mediation
WEIS
Middle East
Balkans
West Africa
Abstract
The Conflict and Mediation Event Observations (CAMEO) framework is a new event data coding scheme optimized for the study of third-party mediation in international disputes. We have developed and implemented this system using the TABARI automated coding program, and have generated data sets for the Balkans (1989-2002; N=69,620), Levant (1979-2002; N=146,283), and West Africa (1989-2002; N=17,468) from Reuters and Agence France Presse reports. We describe why we decided to develop a new coding system, rather than continuing to use the World Events Interaction Survey (WEIS) framework that we have used in earlier work. Our decision involved both known weaknesses in the WEIS system, and some additional problems that we have found occur when WEIS is coded using automated methods. We have addressed these problems in constructing CAMEO, as well as producing much more complete documentation than has been available for WEIS. In this paper, we make several statistical comparisons of CAMEO-coded and WEIS-coded data in the three geographical regions. When the data are aggregated to a general behavioral level (verbal cooperation, material cooperation, verbal conflict and material conflict), most of the data sets show a high correlation (r>0.90) in the number of WEIS and CAMEO events coded per month. However, as we expected, CAMEO consistently picks up a greater number of events involving material cooperation. CAMEO and WEIS show similar irregularities in the distribution of events by category. Finally, there is a very significant correlation (r>0.57) between the count of CAMEO events specifically dealing with mediation and negotiation, and a pattern-based measure of mediation we developed earlier from WEIS data. Appendices in the paper show the CAMEO coding framework and examples from the codebook.
11
Paper
Analyzing the dynamics of international mediation processes
Schrodt, Philip A.
Gerner, Deborah J.
Uploaded
07-16-2001
Keywords
event data
cross-correlation
mediation
Cox proportional hazard
pattern recognition
Abstract
This paper presents initial results from a project that will formally test a number of the hypotheses embedded in the theoretical and qualitative literatures on mediation, using automated coding of event data from news-wire sources. In contrast to most of the existing quantitative literature, which emphasizes the structural aspects of mediation, we will focus on the dynamics. The initial part of the paper focuses on two issues of design. First, we discuss the advantages of generating data using fully automated methods, which increases the transparency and replicability of the research. This transparency is extended to the development of more complex variables that cannot be captured as single events: these are defined as patterns of the underlying event data. We also suggest that these can be usefully studied using conventional inferential statistics rather than computational pattern recognition. Second, we justify the "statistical case study" approach which focuses on a small number of cases that are limited in geographical and temporal scope. While the risk of this approach is that one will find patterns of behavior that apply only in those circumstances, we point out that the more conventional large-N time-series cross-sectional studies also carry inferential risks. The statistical tests reported in this paper look at three different issues using data on the Israel-Lebanon and Israel-Palestinian conflicts in the Levant (1979-1999), and the Serbia-Croatia and Serbia-Bosnia conflicts in the Balkans (1991-1999). First, cross-correlation is used to look at the effects of mediation on the level of violence over time. Second, we test the "sticks-or-carrots" hypothesis on whether mediation is more effective in reducing violence if accompanied by cooperative or conflictual behavior by the mediator.
Finally, we estimate Cox proportional hazard models to assess the factors that influence (1) whether mediation is accepted by the parties in a conflict, (2) whether formal agreements are reached, and (3) whether the agreements reduce the level of conflict. Future work in the project involves development of a new event coding scheme specifically designed for the study of mediation, and expansion of the list of cases to include other mediated conflicts in the Middle East and West Africa.
12
Paper
Automated Coding of International Event Data Using Sparse Parsing Techniques
Schrodt, Philip A.
Uploaded
06-28-2001
Keywords
event data
natural language processing
conflict
content analysis
open source
Abstract
"Event data" record the interactions of political actors reported in sources such as newspapers and news services; this type of data is widely used in research in international relations. Over the past ten years, there has been a shift from coding event data by humans -- typically university students -- to using computerized coding. The automated methods are dramatically faster, enabling data sets to be coded in real time, and provide far greater transparency and consistency than human coding. This paper reviews the experience of the Kansas Event Data System (KEDS) project in developing automated coding using "sparse parsing" machine coding methods, discusses a number of design decisions that were made in creating the program, and assesses features that would improve the effectiveness of these programs.
13
Paper
Monitoring conflict using automated coding of newswire reports
Schrodt, Philip A.
Gerner, Deborah J.
Simpson Gerner, Erin M.
Uploaded
06-28-2001
Keywords
event data
natural language processing
conflict
content
Abstract
This paper discusses the experience of the Kansas Event Data System (KEDS) project in developing event data sets for monitoring conflict levels in five geographical areas: the Levant (Arab-Israeli conflict), Persian Gulf, former Yugoslavia, Central Asia (Afghanistan, Armenia-Azerbaijan, former Soviet republics), and West Africa (Liberia, Sierra Leone). These data sets were coded from commercial news sources using the KEDS and TABARI automated coding systems. The paper discusses our experience in developing the dictionaries required for this coding, the problems with the number of reported events in the various areas, and provides examples of the statistical summaries that can be produced from event data. We also compare the coverage of the Reuters and Agence France Presse news services for selected years in the Levant and former Yugoslavia. We conclude with suggestions for four topics where additional efforts could be usefully undertaken by multiple research projects.