Patrick Ball, Director of Research, Human Rights Data Analysis Group
Data about mass violence can seem to offer insights into patterns: is violence getting better, or worse, over time? Is violence directed more against men or women? But in human rights data collection, we (usually) don’t know what we don’t know, and worse, what we don’t know is likely to be systematically different from what we do know.
This talk will examine the assumption that nearly every project using data must make: that the data are representative of reality in the world. We will explore how, contrary to this standard assumption, statistical patterns in raw data tend to be quite different from patterns in the world. Statistical patterns in data tend to reflect how the data were collected rather than changes in the real-world phenomena the data purport to represent.
Using analyses of killings during Peru's civil war, homicides committed by police in the US, killings in the conflict in Syria, and homicides in Colombia, we will contrast patterns in raw data with estimates of total patterns of violence, where the estimates correct for heterogeneous underreporting. The talk will show how biases in raw data can be addressed through estimation, and explain why it matters.