HuffPo
Jeff Jonas is one of the country's leading practitioners of the dark art of data analysis. Casino chiefs and government spooks alike have used his CIA-funded "Non-Obvious Relationship Awareness" software to scour databases for hidden connections.
So you'd think that Jonas would be all into the idea of using these data-mining systems to predict who the next terrorist attacker might be.
Think again. "Though data mining has many valuable uses, it is not well suited to the terrorist discovery problem," he writes in a new study, co-authored with the Cato Institute's Jim Harper. "This use of data mining would waste taxpayer dollars, needlessly infringe on privacy and civil liberties, and misdirect the valuable time and energy of the men and women in the national security community."
Which means that the government's massive electronic surveillance programs aren't just illegal. They're also burning up terrorist-hunters' money and time - endangering the country, in the process. Are you listening, NSA?
Jonas doesn't have a problem cobbling together information on suspects from various databases. It's using these databases to forecast a terrorist's behavior -- think market research, but for Al-Qaeda -- that Jonas hates. "The possible benefits of predictive data mining for finding planning or preparation for terrorism are minimal. The financial costs, wasted effort, and threats to privacy and civil liberties are potentially vast," he writes.
QUOTE
One of the fundamental underpinnings of predictive data mining in the commercial sector is the use of training patterns. Corporations that study consumer behavior have millions of patterns that they can draw upon to profile their typical or ideal consumer. Even when data mining is used to seek out instances of identity and credit card fraud, this relies on models constructed using many thousands of known examples of fraud per year.
Terrorism has no similar indicia. With a relatively small number of attempts every year and only one or two major terrorist incidents every few years--each one distinct in terms of planning and execution--there are no meaningful patterns that show what behavior indicates planning or preparation for terrorism. Unlike consumers' shopping habits and financial fraud, terrorism does not occur with enough frequency to enable the creation of valid predictive models. Predictive data mining for the purpose of turning up terrorist planning using all available demographic and transactional data points will produce no better results than the highly sophisticated commercial data mining done today [with results in the low single-digits - ed.]. The one thing predictable about predictive data mining for terrorism is that it would be consistently wrong.
Without patterns to use, one fallback for terrorism data mining is the idea that any anomaly may provide the basis for investigation of terrorism planning. Given a "typical" American pattern of Internet use, phone calling, doctor visits, purchases, travel, reading, and so on, perhaps all outliers merit some level of investigation. This theory is offensive to traditional American freedom, because in the United States everyone can and should be an "outlier" in some sense. More concretely, though, using data mining in this way could be worse than searching at random; terrorists could defeat it by acting as normally as possible.
Treating "anomalous" behavior as suspicious may appear scientific, but, without patterns to look for, the design of a search algorithm based on anomaly is no more likely to turn up terrorists than twisting the end of a kaleidoscope is likely to draw an image of the Mona Lisa.
Terrorism has no similar indicia. With a relatively small number of attempts every year and only one or two major terrorist incidents every few years--each one distinct in terms of planning and execution--there are no meaningful patterns that show what behavior indicates planning or preparation for terrorism. Unlike consumers' shopping habits and financial fraud, terrorism does not occur with enough frequency to enable the creation of valid predictive models. Predictive data mining for the purpose of turning up terrorist planning using all available demographic and transactional data points will produce no better results than the highly sophisticated commercial data mining done today [with results in the low single-digits - ed.]. The one thing predictable about predictive data mining for terrorism is that it would be consistently wrong.
Without patterns to use, one fallback for terrorism data mining is the idea that any anomaly may provide the basis for investigation of terrorism planning. Given a "typical" American pattern of Internet use, phone calling, doctor visits, purchases, travel, reading, and so on, perhaps all outliers merit some level of investigation. This theory is offensive to traditional American freedom, because in the United States everyone can and should be an "outlier" in some sense. More concretely, though, using data mining in this way could be worse than searching at random; terrorists could defeat it by acting as normally as possible.
Treating "anomalous" behavior as suspicious may appear scientific, but, without patterns to look for, the design of a search algorithm based on anomaly is no more likely to turn up terrorists than twisting the end of a kaleidoscope is likely to draw an image of the Mona Lisa.
Civil libertarians and bloggers have talked 'til they're blue in the face about how lame this kind of terror-predicting is. But I don't think I've ever heard a giant of the field, like Jonas, come out against the practice -- at least not on-the-record. Let's hope that this is one conversation that the feds are monitoring.