eeesr's Introduction

Code for "An Empirical Evaluation of Explanations for State Repression," by Daniel Hill and Zachary Jones. Published in the American Political Science Review 108:3 (pp. 661-687).

The empirical literature that examines cross-national patterns of state repression seeks to discover a set of political, economic, and social conditions that are consistently associated with government violations of human rights. Null hypothesis significance testing is the most common way of examining the relationship between repression and concepts of interest, but we argue that it is inadequate for this goal, and has produced potentially misleading results. To remedy this deficiency in the literature we use cross-validation and random forests to determine the predictive power of measures of concepts the literature identifies as important causes of repression. We find that few of these measures are able to substantially improve the predictive power of statistical models of repression. Further, the most studied concept in the literature, democratic political institutions, predicts certain kinds of repression much more accurately than others. We argue that this is due to conceptual and operational overlap between democracy and certain kinds of state repression. Finally, we argue that the impressive performance of certain features of domestic legal systems, as well as some economic and demographic factors, justifies a stronger focus on these concepts in future studies of repression.

See Google Scholar's citation count.

You can also see the anonymous referee reviews and our responses to them (rounds 1 and 2), as well as our online appendix. There is also a short post I wrote that summarizes the paper.

@article{hill2014empirical,
  title={An Empirical Evaluation of Explanations for State Repression},
  author={Hill Jr., Daniel W. and Jones, Zachary M.},
  journal={American Political Science Review},
  year={2014},
  volume={108},
  issue={3},
  pages={661-687}
}

Open an issue or send me an email if you have any problems or suggestions. Even though this paper is published, I intend to make sure it remains replicable.

This repository contains the complete history of the manuscript and code since we started the project. You can look at the commits to see how the paper and code changed over time.

Getting the Code and Data

You can clone this repository using git or download it as a .zip archive. You can download the data necessary to run this code here. The code expects the data to be in a subdirectory labeled data. If you have git, wget, and unzip available, the following code will automate the procedure.

git clone https://github.com/zmjones/eeesr.git && cd eeesr
wget http://zmjones.com/static/data/eeesr_data.zip && unzip eeesr_data
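
Since the code expects the data in a subdirectory named data, it can help to confirm that the directory exists before running anything. The following is a minimal sketch, not part of the repository: the function name check_data_dir is hypothetical, and it only checks that the directory is present and non-empty, not that every required file is there.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): confirm that the
# data subdirectory the scripts expect exists and is not empty.
check_data_dir() {
    repo_root="${1:-.}"
    data_dir="$repo_root/data"
    if [ ! -d "$data_dir" ]; then
        echo "missing: $data_dir" >&2
        return 1
    fi
    # ls -A prints nothing for an empty directory.
    if [ -z "$(ls -A "$data_dir")" ]; then
        echo "empty: $data_dir" >&2
        return 1
    fi
    echo "ok: $data_dir"
}
```

Run from the repository root as check_data_dir . after unzipping the archive.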

Running the Code

This build process has only been tested on OS X. It was originally run on Amazon's EC2 using an Ubuntu server, but the most recent revisions were run on Penn State's Lion cluster. Some parts of the code can run on a laptop, but the cross-validation and permutation importance scripts are computationally intensive and are best run on a high-performance computing system.

The makefile allows you to build everything with one command, or to build only subsets of the project. You can build everything (make all), the paper only (make paper), the analysis only (make analysis), or the data only (make data). If you don't want to use the makefile, be sure to run the scripts in the order specified in the makefile (also shown below).

To rebuild the replication data you'll need to make get_un.sh executable: chmod +x get_un.sh. This also requires git (it clones another repository). The relevant data files scraped by untreaties are already in the data archive though. You'll also need the package dependencies that are listed at the top of each script.

The approximate runtime of each script varies widely as a function of the number of cross-validation iterations, bootstrap iterations, number of imputations performed, and the number of cores the computation is distributed across (all of these variables are set in setup.R).

  • get_un.sh fetches the untreaties utility, grabs the appropriate treaties, and transforms them
  • data.R joins and cleans up the various data sets we use for this analysis
  • setup.R sets global variables (e.g. folds, iterations, etc.), defines model specifications and labels
  • mi.R performs multiple imputation
  • imp.R calculates bootstrapped variable importance based on random forest models
  • all.R estimates models on all the (imputed) data
  • cv_setup.R sets up functions and variables for cross-validation procedure
  • cv.R cross-validates all models and combines results
  • plot.R creates descriptive and model plots
  • tree.R creates decision tree plot for random forest explanation section
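
If you are not using the makefile, the order above can be turned into a single R invocation. This is a rough sketch under stated assumptions: r_source_expr and R_SCRIPTS are hypothetical names, and it assumes the scripts can be sourced one after another in a single R session from the repository root. The makefile remains the authoritative reference for how the scripts are actually run. get_un.sh is left out because it is only needed when rebuilding the replication data.

```shell
#!/bin/sh
# R scripts in the order listed above. Edit this list to run a subset,
# for example to skip the expensive cross-validation scripts.
R_SCRIPTS="data.R setup.R mi.R imp.R all.R cv_setup.R cv.R plot.R tree.R"

# Hypothetical helper: print one R expression that sources each
# script in order, e.g. source("data.R"); source("setup.R"); ...
r_source_expr() {
    joined=""
    for script in $R_SCRIPTS; do
        joined="${joined}source(\"$script\"); "
    done
    # Trim the trailing space left by the loop.
    printf '%s\n' "${joined% }"
}

# Usage (from the repository root, with the data in ./data):
#   Rscript -e "$(r_source_expr)"
```

Printing the expression first (r_source_expr on its own) lets you check the order before committing to a long run.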

eeesr's Issues

Vector length in log-INGOs model

I might be doing things wrong, but I get this issue with the code:

> source("data.R")
> source("imp.R")
> source("mi.R")
> source("setup.R")
> source("all.R")
Error in `$<-.data.frame`(`*tmp*`, "spec", value = c("log INGOs", "Polity",  : 
  replacement has 31 rows, data has 30
> traceback()
12: stop(sprintf(ngettext(N, "replacement has %d row, data has %d", 
        "replacement has %d rows, data has %d"), N, nrows), domain = NA)
11: `$<-.data.frame`(`*tmp*`, "spec", value = c("log INGOs", "Polity", 
    "Executive Compet.", "Executive Open.", "Executive Const.", "Participation Compet.", 
    "Judicial Indep.", "log Oil Rents", "Military Regime", "Left Executive", 
    "log Trade/GDP", "FDI", "Public Trial", "Fair Trial", "Court Decision Final", 
    "Legislative Approval", "WB/IMF Structural Adj.", "IMF Structural Adj.", 
    "WB Structural Adj.", "British Colony", "Common Law", "PTA w/ HR Clause", 
    "CAT Ratifier", "CPR Ratifier", "Youth Bulge", "Civil War", "International War", 
    "AI Press (lag)", "AI Background (lag)", "Western Media (lag)", 
    "HRO Shaming (lag)")) at all.R#20
10: `$<-`(`*tmp*`, "spec", value = c("log INGOs", "Polity", "Executive Compet.", 
    "Executive Open.", "Executive Const.", "Participation Compet.", 
    "Judicial Indep.", "log Oil Rents", "Military Regime", "Left Executive", 
    "log Trade/GDP", "FDI", "Public Trial", "Fair Trial", "Court Decision Final", 
    "Legislative Approval", "WB/IMF Structural Adj.", "IMF Structural Adj.", 
    "WB Structural Adj.", "British Colony", "Common Law", "PTA w/ HR Clause", 
    "CAT Ratifier", "CPR Ratifier", "Youth Bulge", "Civil War", "International War", 
    "AI Press (lag)", "AI Background (lag)", "Western Media (lag)", 
    "HRO Shaming (lag)")) at all.R#20
9: CleanAll(y, ivar.labels) at all.R#34
8: FUN(X[[1L]], ...)
7: lapply(x, function(y) CleanAll(y, ivar.labels)) at all.R#34
6: FUN(X[[1L]], ...)
5: lapply(all, function(x) lapply(x, function(y) CleanAll(y, ivar.labels))) at all.R#34
4: eval(expr, envir, enclos)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("all.R")

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] stats4    grid      splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] mice_2.18         nnet_7.3-6        lattice_0.20-15   multicore_0.1-7   party_1.0-8      
 [6] vcd_1.2-13        colorspace_1.2-2  MASS_7.3-26       strucchange_1.4-7 sandwich_2.2-10  
[11] zoo_1.7-10        coin_1.0-22       mvtnorm_0.9-9995  modeltools_0.2-19 stringr_0.6.2    
[16] rms_4.0-0         SparseM_1.03      Hmisc_3.12-2      Formula_1.1-1     survival_2.37-4  
[21] foreign_0.8-54    countrycode_0.16  plyr_1.8         

loaded via a namespace (and not attached):
[1] cluster_1.14.4 rpart_4.1-1    tools_3.0.1  
