The eeesr from zmjones

Code for "An Empirical Evaluation of Explanations for State Repression," by Daniel Hill and Zachary Jones. Published in the American Political Science Review 108:3 (pp. 661-687).

The empirical literature that examines cross-national patterns of state repression seeks to discover a set of political, economic, and social conditions that are consistently associated with government violations of human rights. Null hypothesis significance testing is the most common way of examining the relationship between repression and concepts of interest, but we argue that it is inadequate for this goal, and has produced potentially misleading results. To remedy this deficiency in the literature we use cross-validation and random forests to determine the predictive power of measures of concepts the literature identifies as important causes of repression. We find that few of these measures are able to substantially improve the predictive power of statistical models of repression. Further, the most studied concept in the literature, democratic political institutions, predicts certain kinds of repression much more accurately than others. We argue that this is due to conceptual and operational overlap between democracy and certain kinds of state repression. Finally, we argue that the impressive performance of certain features of domestic legal systems, as well as some economic and demographic factors, justifies a stronger focus on these concepts in future studies of repression.

See Google Scholar's citation count.

You can also see the anonymous referee reviews and our responses to them (rounds 1 and 2), as well as our online appendix. There is also a short post I wrote that summarizes the paper.

@article{hill2014empirical,
  title={An Empirical Evaluation of Explanations for State Repression},
  author={Hill Jr., Daniel W. and Jones, Zachary M.},
  journal={American Political Science Review},
  year={2014},
  volume={108},
  issue={3},
  pages={661-687}
}

Open an issue or send me an email if you have any problems or suggestions. Even though this paper is published, I intend to make sure it remains replicable.

This repository contains the complete history of the manuscript and code since we started the project. You can look at the commits to see how the paper and code changed over time.

Getting the Code and Data

You can clone this repository using git or download it as a .zip archive. You can download the data necessary to run this code here. The code expects the data to be in a subdirectory labeled data. If you have git, wget, and unzip available, the following code will automate the procedure.

git clone https://github.com/zmjones/eeesr.git && cd eeesr
wget http://zmjones.com/static/data/eeesr_data.zip && unzip eeesr_data
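Before running the two commands above, it can help to confirm that the required tools are installed and, afterwards, that the data ended up where the code expects it. The following is a minimal sketch, not part of the repository: the function names missing_tools and has_data_dir are hypothetical helpers introduced here, and the check assumes only what is stated above (the tools git, wget, and unzip, and a subdirectory named data).

```shell
#!/bin/sh
# Hypothetical helpers (not part of the eeesr repository).

# Print the name of each listed tool that is not on PATH, one per line.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Succeed only if a "data" subdirectory exists under the given root
# (defaults to the current directory).
has_data_dir() {
    [ -d "${1:-.}/data" ]
}

# Example usage, run from the repository root:
#   missing_tools git wget unzip || echo "install the tools listed above first"
#   has_data_dir . || echo "no data/ subdirectory found; unpack the archive here"
```

If has_data_dir fails after unzipping, the archive may have unpacked under a different name or location, in which case the files would need to be moved into a data subdirectory by hand.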

Running the Code

This build process has only been tested on OS X. It was originally run on Amazon's EC2 using an Ubuntu server, but the most recent revisions have been run on Penn State's Lion cluster. Some parts of the code are runnable on a laptop, but the cross-validation and permutation importance scripts are very computationally intensive, so it is probably a good idea to run these on a high-performance computing system.

The makefile allows you to build everything with one command, or to build only subsets of the project. You can build everything (make all), the paper only (make paper), the analysis only (make analysis), or the data only (make data). If you don't want to use the makefile, be sure to run the scripts in the order specified in the makefile (also shown below).

To rebuild the replication data you'll need to make get_un.sh executable: chmod +x get_un.sh. This also requires git (it clones another repository). The relevant data files scraped by untreaties are already in the data archive though. You'll also need the package dependencies that are listed at the top of each script.

The approximate runtime of each script varies widely as a function of the number of cross-validation iterations, bootstrap iterations, number of imputations performed, and the number of cores the computation is distributed across (all of these variables are set in setup.R).

get_un.sh fetches the untreaties utility, grabs the appropriate treaties, and transforms them
data.R joins and cleans up the various data sets we use for this analysis
setup.R sets global variables (e.g. folds, iterations, etc.), defines model specifications and labels
mi.R performs multiple imputation
imp.R calculates bootstrapped variable importance based on random forest models
all.R estimates models on all the (imputed) data
cv_setup.R sets up functions and variables for cross-validation procedure
cv.R cross-validates all models and combines results
plot.R creates descriptive and model plots
tree.R creates decision tree plot for random forest explanation section
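If you are not using the makefile, the script order listed above can be sketched as a small shell driver. This is an illustration only, under several assumptions: that the listed order matches the makefile, that Rscript is on your PATH, and that get_un.sh has already been made executable. The names pipeline_scripts, command_for, run_pipeline, and the DRY_RUN variable are hypothetical and do not exist in the repository.

```shell
#!/bin/sh
# Hypothetical driver (not part of the eeesr repository).

# Scripts in the order this README lists them (assumed to match the makefile).
pipeline_scripts() {
    echo "get_un.sh data.R setup.R mi.R imp.R all.R cv_setup.R cv.R plot.R tree.R"
}

# Print the command used to run one script:
# shell scripts are executed directly, R scripts through Rscript.
command_for() {
    case "$1" in
        *.sh) echo "./$1" ;;
        *.R)  echo "Rscript $1" ;;
        *)    echo "unknown script type: $1" >&2; return 1 ;;
    esac
}

# Run every script in order and stop at the first failure.
# With DRY_RUN=1 the commands are printed instead of executed.
run_pipeline() {
    for script in $(pipeline_scripts); do
        cmd=$(command_for "$script") || return 1
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            $cmd || { echo "failed: $cmd" >&2; return 1; }
        fi
    done
}

# Example usage, run from the repository root:
#   DRY_RUN=1 run_pipeline   # print the commands without running them
#   run_pipeline             # run everything (can take a very long time)
```

Each R script runs in a fresh Rscript process here, which is a design choice of this sketch rather than something the README specifies. If a script relies on objects created by an earlier one in the same R session, this will not work as written and the scripts would need to be sourced in a single session instead.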

Vector length in log-INGOs model

I might be doing things wrong, but I get this issue with the code:

> source("data.R")
> source("imp.R")
> source("mi.R")
> source("setup.R")
> source("all.R")
Error in `$<-.data.frame`(`*tmp*`, "spec", value = c("log INGOs", "Polity",  : 
  replacement has 31 rows, data has 30
> traceback()
12: stop(sprintf(ngettext(N, "replacement has %d row, data has %d", 
        "replacement has %d rows, data has %d"), N, nrows), domain = NA)
11: `$<-.data.frame`(`*tmp*`, "spec", value = c("log INGOs", "Polity", 
    "Executive Compet.", "Executive Open.", "Executive Const.", "Participation Compet.", 
    "Judicial Indep.", "log Oil Rents", "Military Regime", "Left Executive", 
    "log Trade/GDP", "FDI", "Public Trial", "Fair Trial", "Court Decision Final", 
    "Legislative Approval", "WB/IMF Structural Adj.", "IMF Structural Adj.", 
    "WB Structural Adj.", "British Colony", "Common Law", "PTA w/ HR Clause", 
    "CAT Ratifier", "CPR Ratifier", "Youth Bulge", "Civil War", "International War", 
    "AI Press (lag)", "AI Background (lag)", "Western Media (lag)", 
    "HRO Shaming (lag)")) at all.R#20
10: `$<-`(`*tmp*`, "spec", value = c("log INGOs", "Polity", "Executive Compet.", 
    "Executive Open.", "Executive Const.", "Participation Compet.", 
    "Judicial Indep.", "log Oil Rents", "Military Regime", "Left Executive", 
    "log Trade/GDP", "FDI", "Public Trial", "Fair Trial", "Court Decision Final", 
    "Legislative Approval", "WB/IMF Structural Adj.", "IMF Structural Adj.", 
    "WB Structural Adj.", "British Colony", "Common Law", "PTA w/ HR Clause", 
    "CAT Ratifier", "CPR Ratifier", "Youth Bulge", "Civil War", "International War", 
    "AI Press (lag)", "AI Background (lag)", "Western Media (lag)", 
    "HRO Shaming (lag)")) at all.R#20
9: CleanAll(y, ivar.labels) at all.R#34
8: FUN(X[[1L]], ...)
7: lapply(x, function(y) CleanAll(y, ivar.labels)) at all.R#34
6: FUN(X[[1L]], ...)
5: lapply(all, function(x) lapply(x, function(y) CleanAll(y, ivar.labels))) at all.R#34
4: eval(expr, envir, enclos)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("all.R")

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] stats4    grid      splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] mice_2.18         nnet_7.3-6        lattice_0.20-15   multicore_0.1-7   party_1.0-8      
 [6] vcd_1.2-13        colorspace_1.2-2  MASS_7.3-26       strucchange_1.4-7 sandwich_2.2-10  
[11] zoo_1.7-10        coin_1.0-22       mvtnorm_0.9-9995  modeltools_0.2-19 stringr_0.6.2    
[16] rms_4.0-0         SparseM_1.03      Hmisc_3.12-2      Formula_1.1-1     survival_2.37-4  
[21] foreign_0.8-54    countrycode_0.16  plyr_1.8         

loaded via a namespace (and not attached):
[1] cluster_1.14.4 rpart_4.1-1    tools_3.0.1
