Why online transparency matters

Aaron Swartz has written an intriguing but, in my opinion, fundamentally flawed argument for why efforts to increase government transparency with digital tools undertaken by organizations like the Sunlight Foundation are extremely overrated and contribute very little to actually ridding us of corruption. Speaking more broadly, Swartz takes a rather skeptical view of most nascent efforts to aggregate large chunks of data and visualize them in new and exciting ways; as someone who has built such an online project himself (watchdog.net or "the good government site with teeth", as it still proudly boasts to its visitors), he brings an extremely well-informed perspective to the table.

Identifying corruption, says Swartz, requires more than a mere re-arrangement of data rows; data is just a veil – and usually a useless one at that – over more sinister activities, such as lunches with lobbyists and voting under emergency provisions, which almost never show up in the official voting records that are being digitized and scraped by Sunlight and its grantees (this is what Swartz means by “reality doesn't live in databases”).

Diligent, long-term work by journalists, on the other hand, may help to uncover many more irregularities that are invisible to those who focus only on aggregating data points. However, since the practice of investigative journalism appears to be dying, we would be better off putting the online transparency efforts on hold and focusing our attention on saving journalism (which is, more or less, what Swartz has done by halting his work on watchdog.net).

Leaving aside the great work done by investigative journalists (and its very uncertain future), I think that there are more than a few reasons to be skeptical about Swartz's skepticism.

On a very basic level, he overlooks the fact that most campaigns to bring about more data transparency via the internet have increasing returns to scale; scraping government data and arranging it into databases may be very expensive and labor-intensive, but once it's done, all future projects built around it usually cost very little to build. To put it differently, the effort required to bring about Government 2.0 with open and easy-to-mash-up data streams may be gigantic, but once it's completed, all further efforts to take advantage of these data streams are almost free.

Government 2.0 is a process as much as an objective; while there is always room for improvement, it's reasonable to assume that at some point in the next few years, we'll feel relatively comfortable about the kinds of government data available on the web (at least in the United States). Thus, the choice that Swartz is trying to set up – that between supporting investigative journalism and database-driven transparency work – is not as stark as he paints it; if all we need is one giant push to automate the gathering and auto-publishing of government data in perpetuity, let's make the push – it surely wouldn't hurt. While it's hard to predict what would happen with this data once it's published, all recent trends suggest that it would be used by enthusiastic, talented solo developers like Swartz; I see no harm in this – not everyone can be an investigative journalist, after all.

Second – and I think this is where most of Swartz's conceptual errors begin – publishing government data is useful for a whole number of reasons, not all of them having to do with empowering investigations of the Watergate type. I like to think of it in terms of the signalling theory from economics (for the economists out there: yes, I know that the matching is not perfect, but I think it points in the right direction); in a political environment with tremendous information asymmetry, whereby one party (the elected representatives) has the power both to cheat and to devise legalistic cover-ups for its misbehavior, it helps that voters have the power to send powerful signals about their ability to keep their elected representatives in check, potentially curtailing the latter's propensity to misbehave.

By this logic, the more organizations like Sunlight we have, the more powerful the signalling becomes (think of this as a “they know that we know that they know that we are watching them” kind of construction). Of course, there are many activities that are not yet disclosed and are thus invisible – meetings with lobbyists and private luncheons with companies – but this doesn't mean that we should stop paying close attention to the ones which ARE observable, especially given that the costs of observation are falling dramatically.

Swartz's major conceptual mistake is thinking that transparency sites answer a “who?” question; answering questions is not what they are really good for, even though they may occasionally be helpful in this regard. Think about the famous “tree falling in a forest” dilemma and try to apply it to the world of transparency: are these online platforms and databases helping to fight corruption even if there is nobody to query them? I'd venture to say that, yes, they are, simply because there is a distant possibility that one day somebody will actually step up and use those sites. The real point here is not the frequency of use – although it would be nice to see many more people use these databases – but the fact that both parties are aware that such sites and databases exist (which, in a way, brings us back to the importance of signalling).

Swartz's juxtaposition of databases and investigative journalism seems inept at best. If I understand his view correctly, he believes that politicians would be less willing to engage in corrupt practices if they knew that there was a contingent of well-trained investigative reporters out there. But, by the same logic, one could argue that the reason politicians have to resort to secretive means of advancing third-party agendas in the first place is that the more transparent mechanisms – like voting – are subject to the kind of verification mechanisms that Swartz believes to be useless. Go ahead and remove those mechanisms, and what we would get is an even more corrupt political process.

Once again, the logic here is very simple: assuming that congressmen are rational players who usually follow the path of least resistance, there is no reason to expect that they would bother with more covert dealings if they felt that selling their vote in the open would go unnoticed. Of course, the world we get is not perfect, but at least it makes corruption harder – and isn't that a realistic objective to pursue? Assessing the relative value added by investigative journalists or by other systems of public checks and balances (like databases) is a very hard task; for the reasons outlined above, I think that forcing the public to make a choice between them is misleading, simply because online databases are, to a large extent, already there and do not require significant amounts of money to prop them up.