Unknown Knowns

An interesting piece from Ars Technica last month described the divergence between the plant species that botanists have collected and the plant species that have actually been cataloged as known species.

This kind of story complicates the idea, so often invoked by science journalists, of what is "known" to science. The complication is a familiar one to historians. Many, many things are written down in an archive somewhere, and perhaps known to someone. No one, however, has made sense of the vast majority of them within some historical framework, and thereby processed them into the body of knowledge of the/a historical community.

Of more specific concern to historians of computing, this kind of story adds quite a large wrinkle to Bruno Latour's model of science as a process of immutable mobiles being vacuumed smoothly into centers of calculation. This is a wrinkle elucidated by Paul Edwards' recent book, A Vast Machine, in terms of "friction," generated both by the mass of data itself and by the act of computing on it. There is many a slip, as Edwards points out, between the cup of data collection and the lip of a comprehensive understanding of that data.

The plant article suggests that Edwards has opened up a very fruitful (cough) line of analysis, and that we will find friction everywhere we look in the history of science, especially the globalized/globalizing science of the nineteenth and twentieth centuries.

The effort to classify all the species of the world, for example, is an ancient one, but it achieved a new intensity in Europe with the imperial projects of the modern era. In this context, Linnaeus' nomenclature was an early effort to overcome the data friction created by the huge new flood of species data flowing into European centers from around the world.

Modern botany, of course, operates in a very different political context and on a very different intellectual basis (that of neo-Darwinian evolutionary biology), but it faces, as we see, a similar kind of problem: plants can be collected much more quickly than experts can analyze them and allocate them a place in the scheme of botanical knowledge. Significant amounts of expert labor are expended simply maintaining the data (in the form of plant samples), rather than analyzing it.

Perhaps the artificial intelligence experts will step in to offer their own grease to ease this particular friction.