Duck of Minerva did a public service by hosting a debate between Matt Kroenig, Todd Sechser and Matthew Fuhrmann (see here, here, and here). The topic was nuclear superiority and crisis bargaining. Those are abstract words, but they bear on how we think about the United States’ ability to compel, say, North Korea in a crisis. And I think the debate should interest anyone who cares about applied methods in IR.
For me, Sechser and Fuhrmann’s arguments are the more compelling. In particular, they critique how Kroenig squeezes the appearance of more data out of a very small number of events:
Kroenig confronts a basic challenge in his empirical analysis: nuclear crises are rare. Specifically, he has only 20 nuclear crises in his dataset (drawn from the ICB dataset). Yet he winds up with 52 observations, enough to generate a statistically significant correlation. How does he obtain such a large dataset from such a small set of crises? The answer is that Kroenig simply duplicates each observation in the dataset, so as to double its size. A single observation for the Cuban Missile Crisis, for example, now becomes two independent events in his dataset: a victory for the United States, and a defeat for the Soviet Union. This is inappropriate: the two observations are measuring the same event. Kroenig is not actually observing more data here; he is simply reporting the same event twice. This is equivalent to an exit poll that lists each respondent twice in the sample – once voting for candidate X, and once voting against candidate Y – and then claims to have twice the sample size.
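The mechanical effect of this kind of duplication is easy to demonstrate. Here is a minimal sketch with made-up toy numbers (not Kroenig’s actual ICB-derived data): 20 hypothetical crises, where states with nuclear superiority win 8 of 10 and states without win 4 of 10. Duplicating every row leaves the proportions untouched but halves the variance of the estimate, so the standard error shrinks by a factor of √2 with zero new information:

```python
import math

# Hypothetical toy data -- NOT the actual ICB-based dataset:
# 20 crises, coded 1 for nuclear superiority and 1 for victory.
superiority = [1] * 10 + [0] * 10
victory = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6  # 8/10 wins vs. 4/10 wins

def two_prop_z(x, y):
    """z-statistic for the difference in victory rates between the
    superiority (x == 1) and non-superiority (x == 0) groups."""
    g1 = [v for s, v in zip(x, y) if s == 1]
    g0 = [v for s, v in zip(x, y) if s == 0]
    p1, p0 = sum(g1) / len(g1), sum(g0) / len(g0)
    p = (sum(g1) + sum(g0)) / (len(g1) + len(g0))
    se = math.sqrt(p * (1 - p) * (1 / len(g1) + 1 / len(g0)))
    return (p1 - p0) / se

z_small = two_prop_z(superiority, victory)

# Count every crisis twice. Proportions are unchanged, but n doubles,
# so the standard error mechanically shrinks by sqrt(2).
z_doubled = two_prop_z(superiority * 2, victory * 2)

print(round(z_small, 2))    # 1.83 -- not significant at the 5% level (|z| < 1.96)
print(round(z_doubled, 2))  # 2.58 -- now "significant", purely from duplication
```

In this toy example the duplication alone carries an insignificant result across the conventional 5% threshold, which is exactly the worry raised in the critique above.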
As quantitative methods have pushed into new areas, including areas with very few observations, their practitioners have claimed greater confidence in their findings than the data can support. In fact, at some point, my guess is that the whole enterprise of treating dyad-years as meaningfully independent observations will come crashing down around our heads. It’s been a while since I looked at it, but Erikson, Pinto, and Rader have a paper concluding that “typical statistical tests for significance are severely overconfident in dyadic data.”
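Why dyadic tests are overconfident can be shown with a toy Monte Carlo (illustrative parameters of my own choosing, not the paper’s design). Two dyadic variables are built from unrelated country-level traits, so the true relationship between them is zero, yet dyads sharing a country are correlated. A naive test that treats every dyad as independent rejects the (true) null far more often than its nominal 5% rate:

```python
import math
import random

random.seed(1)

C = 20                                                 # countries
DYADS = [(i, j) for i in range(C) for j in range(i + 1, C)]  # 190 dyads
SIMS = 300

def naive_t(x, y):
    """OLS slope t-statistic, treating every dyad as independent."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    rss = sum((yi - my - b * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return b / se

rejections = 0
for _ in range(SIMS):
    # Independent country-level traits drive each dyadic variable,
    # so x and y are truly unrelated -- but dyads sharing a country
    # are correlated with one another.
    u = [random.gauss(0, 1) for _ in range(C)]
    w = [random.gauss(0, 1) for _ in range(C)]
    x = [u[i] + u[j] for i, j in DYADS]
    y = [w[i] + w[j] + random.gauss(0, 0.1) for i, j in DYADS]
    if abs(naive_t(x, y)) > 1.96:
        rejections += 1

# A correctly sized 5% test should reject about 5% of the time;
# the naive dyadic test rejects far more often.
print(rejections / SIMS)
```

The false-positive rate here is many times the nominal level, which is the “severe overconfidence” the quoted conclusion warns about: the 190 dyads behave much more like 20 independent units.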
Political science’s great challenge is knowing what we know. Quantitative methods are not a panacea for this problem.