Forecasting and Ethics, Ctd.

A few additional thoughts (original thoughts here) inspired by posts on Duck of Minerva and Dart-Throwing Chimp.

First, I just want to agree wholeheartedly with Jay Ulfelder’s conclusion on Dart-Throwing Chimp:

Look, these decisions are going to be made whether or not we produce statistical forecasts, and when they are made, they will be informed by many things, of which forecasts—statistical or otherwise—will be only one. That doesn’t relieve the forecaster of ethical responsibility for the potential consequences of his or her work. It just means that the forecaster doesn’t have a unique obligation in this regard. In fact, if anything, I would think we have an ethical obligation to help make those forecasts as accurate as we can in order to reduce as much as we can the uncertainty about this one small piece of the decision process. It’s a policymaker’s job to confront these kinds of decisions, and their choices are going to be informed by expectations about the probability of various alternative futures. Given that fact, wouldn’t we rather those expectations be as well informed as possible?

And I just want to underscore something that Daniel Nexon on the Duck referred to as the performativity problem, referencing some interesting work in economics. I do agree that there is an interesting intellectual question here: if political scientists ever get good at forecasting–a big if–and those forecasts do generate policy interventions, then the forecasts might become self-defeating or self-fulfilling. The goal is for them to become self-defeating, but one could imagine the opposite: military intervention or aid could destabilize a perilous society, or warnings of risk could lead to better data collection, which could in turn lead a problem to be categorized as more serious there than in an area that received no warning. Both self-defeating and self-fulfilling predictions present a host of empirical problems. The topic deserves a longer discussion, but I think it’s worth at least starting one. My hunch is political science is shifting methodologically to a fight between the forecasters (probably bolstered by big data) and the causal inferencers. The issues associated with causal inference have been fleshed out in some detail, but the issues associated with forecasting are still relatively unexplored. Let’s start exploring.


Forecasting and Ethics

Idean Salehyan has a provocative post over at the Monkey Cage arguing that forecasting is not a value-neutral enterprise. If we forecast some outcome, we should expect policymakers to take some steps as a result, and those steps may or may not be acceptable ethically. Okay. But not forecasting may or may not be acceptable ethically either. I’m a bit of an ethics novice, so I assume people are coming at morals from either some consequentialist or deontological frame. It’s difficult for me to see how forecasting is directly a rights or rules violation, so we’re automatically in some realm of consequentialism when we are worried about forecasting. And once we are down that path, we have to explicitly consider the counterfactual of not forecasting. And not forecasting when one has some ability to do so might lead to bad moral outcomes.

We probably aren’t in the business of chaining Nate Silver, Andrew Gelman, and Gary King to their desks to forecast genocide even if it would achieve salutary consequences (probably for good deontological reasons about letting people make choices about their lives). But I think Idean Salehyan’s point ends up being a banal one because more or less everything we do is not value-neutral.

Nothing is special about forecasting, I would stress. Observational studies—the democratic peace, for instance—might lead one to conclude that democracy should be spread, by force if need be.

We are inhabitants of the world. We happen not to be particularly influential inhabitants so the chain of causation from our studies to positive and negative consequences is lengthy. We shouldn’t be paralyzed or fascinated by our presence in the moral universe.

Is Syria’s Civil War A Result of Too Little ‘Bandwidth’ in Washington?

A commonly heard refrain about why the Obama administration has not done X or Y in [insert global troublespot name here] is that there is just not enough bandwidth. Richard Haass, discussing how Iraq distracted us from other more pressing priorities with Diane Rehm, said, “Presidents only have so much bandwidth.” 

The administration’s mouthpieces are also fond of the web 2.0 metaphors in discussing U.S. relations with the world. Benjamin Rhodes on Africa: “[There’s bandwidth in] the relationship for a lot of cooperation, even when we have difference, and even within the Syria issue, there’s that bandwidth. And that’s the message that the leaders wanted to send.” Even Obama himself has employed the notion of freeing up “national-security bandwidth.”

Huh? I understand that there are only so many hours in the day. Bandwidth is treated by its users as a finite and depletable resource, like political capital or canola oil, that should be used prudently. But part of me feels this “bandwidth” metaphor is a cop-out. When are presidents’ in-boxes ever empty? Juggling the breakup of the USSR, the Tiananmen Square massacre, South Africa overturning apartheid, an invasion of Iraq, a follow-up no-fly zone in that country’s north, and an economy crumbling around him, George Bush Sr. still found time to send U.S. forces into Somalia to save lives in a place barely anybody at the time had heard of and which was of zero strategic interest.

All of which is to ask: Have scholars ever tried to code “bandwidth” in any systematic fashion? In other words, is it possible to count the other pressing issues (e.g. immigration reform, healthcare, a SARS outbreak, etc.) an administration is juggling at the same time? If there are more than, say, a dozen, that might cause the system to short-circuit and lead to paralysis. Do we intervene less overseas or lean more isolationist when bandwidth is low? Discuss.

What We’re Reading


  • Substitute political words for the medical words in this excerpt: “The current regime was built during a time of pervasive ignorance when the best we could do was throw a drug and a placebo against a randomized population and then count noses. Randomized controlled trials are critical, of course, but in a world of limited resources they fail when confronted by the curse of dimensionality. Patients are heterogeneous and so are diseases. Each patient is a unique, dynamic system and at the molecular level diseases are heterogeneous even when symptoms are not.”
  • A military and strategic assessment of the situation in Syria from Yezid Sayigh at Carnegie. Brutally honest about the existence of good, clean options for the rebels and their potential allies–there aren’t any–it’s a nice companion/update to Dexter Filkins’s excellent overview of Obama’s options in Syria from the New Yorker a few weeks ago. The crux is that Assad’s position is slowly stabilizing, making a prolonged stalemate an increasingly likely outcome of the conflict.
  • From the Monkey Cage: Akis Georgakellos and Harris Mylonas with a great overview of the structural realignments in the Greek political system. Many Greeks are still in denial about the very real–and immense–changes to Greek political life. At the core, Greece has been transformed from a two-party electoral system with one-party governance into a fragmented electoral system with multiparty governance.

Naming and Faming: Forbes’ Power Women

Forbes’ list of Most Powerful Women for this year is out and Angela Merkel has come out on top again, for the 7th time in the last 8 years. I was struck by the number of American women on the list simply by glancing at it. When I went through the entire list of 100 women, I realized that an astounding 58 were American. Does this mean that other countries are not producing ‘powerful women’?

Well, let’s look at how Forbes decides who is a powerful woman. According to the magazine, 250 candidates are picked each year, out of which 100 are picked across 7 fields – billionaires, business, lifestyle (entertainment and fashion), media, nonprofits and NGOs, politics and technology. Three variables are then used to decide one’s overall rank as well as the rank within the category to which she belongs: money, media presence and impact. While money and media presence are calculated using fairly standard metrics (see here), ‘impact’ is measured by the “extent of their reach across industries, cultures and countries, numbers of spheres of influence and people they affect, and how actively they wield their power.” The extremely poor operationalization (if one can even call it that) of this third crucial variable ‘impact’ might be the reason why the list is so US-centric. Or maybe I have some idealistic conception of what ‘impact’ means in that it should affect the lives of people around the world, especially women, positively. Otherwise, I’d be hard pressed to believe that someone like Anna Wintour, editor-in-chief of Vogue, or Jenna Lyons, Creative Director of J. Crew, is actually influencing a wide spectrum of society in any manner, let alone a positive one.

Cell Phones and Conflict

Jan H. Pierskalla and Florian M. Hollenbach have a new paper (via the Monkey Cage) arguing that cell phone coverage makes collective action easier, and that includes making political violence easier. Good for them. A post-doc and a PhD candidate in the APSR looking at an interesting problem, relevant to the “Did Twitter cause the Arab Spring” question, and using novel data (particularly novel on the independent variable side and using the up-and-coming UCDP Georeferenced Event Dataset on the dependent variable side). Their study uses data from Africa, but its larger implications seem apparent.

The question you have to ask yourself is whether cell phone coverage makes it more likely that “an event” will be recorded in the dataset, a dataset derived from “print, radio, and television news reports from regional newswires, major and local newspapers, secondary sources, and expert knowledge….” If so, then data is missing in a biased way. In that case, cell phones would not be increasing violence through collective action but rather increasing the reporting of violence that was happening irrespective of cell phones. Depending on the model specification, cell phones might be associated with a 50% increase in reports of a violent event (involving at least one death), from a baseline of about 1% to 1.5%. (Other models report larger effects.) That bump seems plausible to me from cell phone reporting alone, without any collective action effect. I do not know what the situation is like in Africa, but in India and Pakistan it is routine for political violence in the countryside to be under-reported. I assume the effect is multiplied when the violence occurs in the countryside, without cell phone coverage, and one has to walk 4 miles to make a phone call about it.
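A back-of-the-envelope sketch of the reporting-bias worry: hold the true rate of violence fixed and vary only the chance that an event makes it into the dataset. The true rate and both recording probabilities below are illustrative assumptions, not figures from the paper; they are chosen only so that the recorded rates land on the roughly 1% and 1.5% mentioned above.

```python
# Hypothetical numbers, picked only to reproduce the 1% -> 1.5% gap discussed above.
TRUE_EVENT_RATE = 0.02          # assumed: identical underlying rate of violence everywhere
P_RECORDED_NO_COVERAGE = 0.50   # assumed chance an event is recorded without cell coverage
P_RECORDED_COVERAGE = 0.75      # assumed chance an event is recorded with cell coverage


def recorded_rate(true_rate: float, p_recorded: float) -> float:
    """Probability that an event both occurs and ends up in the dataset."""
    return true_rate * p_recorded


no_cov = recorded_rate(TRUE_EVENT_RATE, P_RECORDED_NO_COVERAGE)
cov = recorded_rate(TRUE_EVENT_RATE, P_RECORDED_COVERAGE)

# Relative increase in recorded events, with actual violence held constant.
relative_increase = cov / no_cov - 1

print(f"recorded rate without coverage: {no_cov:.3%}")
print(f"recorded rate with coverage:    {cov:.3%}")
print(f"apparent increase:              {relative_increase:.0%}")
```

Under these assumptions, recorded events rise by half with no change at all in actual violence, which is why so much rides on how well the reporting problem is controlled for.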

These researchers are aware of this problem and they try to control for it. You can read their discussion on page 6 (particularly footnote 13) and see if you think they resolve it. They are also aware that cell phones cover areas with more people, and that more people are likely to be associated with more violence. I’m a little more comfortable with their strategy for controlling for population (see page 8), though who knows if population’s effect on violence is linear? It worries me that when they run a robustness check using a logarithmic transformation of population, their findings for the effect of cell phone coverage weaken “somewhat” (p. 8, footnote 18).

Terrorism Data Remains a Mess

I’ve recently been trying to get a handle on terrorism trends in Pakistan, and in that process have been reminded of the problems in terrorism datasets. Based on extant data, I could tell you two stories about Pakistan: terrorism in Pakistan is either getting progressively worse or has gotten considerably better since 2009.

Here is the basic trend for terrorism in Pakistan using the National Consortium for the Study of Terrorism and Responses to Terrorism (START) Global Terrorism Database (GTD).


Compare that with the trend for all terrorism using the National Counterterrorism Center’s World Incident Tracking System. (Through some process opaque to me, the U.S. government decided to stop producing this data series. Or they likely continue to produce it, but just don’t provide it to the outside world. And ignore the red line, which is just the mean across the years shown.)


If you believe GTD, you should be really worried about Pakistan. If you believe NCTC, we may have turned a corner. Get your “Mission Accomplished” banner ready.

The evidence is a little more consistent when looking at just suicide terrorism. Here is the data from the University of Chicago’s Project on Security and Terrorism (CPOST). Sadly, this data hasn’t been updated since October 2011, which suggests they ran out of funding and also means the 2011 data below is incomplete. But the data suggest a strong peak in 2009, followed by sizeable decreases, even allowing for the incomplete 2011 data.


Finally, compare that with the GTD data for just suicide attacks. Strong peak in 2007, and then a more modest decline since then.


I don’t have much in the way of conclusions. But no matter how many asterisks are found in coefficient charts, we should be skeptical of terrorism findings that are not robust across datasets, and given the discrepancies across datasets, such robustness may be unlikely. The long twilight struggle for cumulative knowledge continues.