RECAP'ing the PACER: any lessons for circumventing censorship?

Here is a smart (and subversive!) technology that makes government data more accessible, despite the government's best efforts to thwart the process. Ars Technica has more:

Federal court documents are currently made available to the public through a crufty system called PACER. For eight cents per page, users can download filings and other relevant documents associated with individual cases. PACER is intended to open case law and court activity to broad public scrutiny, but the system's obfuscated design and its paywall significantly undermine its efficacy.

The content hosted on PACER can be freely redistributed by third parties because copyright is not applicable to court documents, but the access fees make it costly and difficult for data archivers to assemble their own comprehensive mirrors that would offload the hosting burden and make the content more easily accessible to the general public. Princeton's Center for Information Technology Policy (CITP) is launching a new project to tackle this problem.

A team led by CITP director Ed Felten has devised a novel means of boosting the availability of PACER documents outside of the paywall. They have created a new Firefox extension called RECAP that seamlessly replicates PACER content and uploads it to a mirror hosted by the Internet Archive. When RECAP users browse the PACER site, the content that they pay to view will be uploaded to the mirror by the Firefox extension. Users will get free access to the documents that are already hosted by the mirror.

Over time, free PACER content will accumulate at the Internet Archive's mirror, making it unnecessary for additional users to pay PACER for access to those files. The unrestricted availability of the mirrored legal documents will empower legal researchers and members of the public who can't simply pass the access costs along to clients as most lawyers do.


This is definitely very useful; I particularly like the post-modernist touch, whereby corporations are, in essence, generating information fodder for freelancers and others who cannot afford to pay for access to PACER.

But what about some other implications of such distributed technologies? For example, it may be useful to think of ways in which similar models could be devised to circumvent censorship. Firefox add-ons have been a lifesaver when it comes to circumvention; just think of the famous Access Flickr! add-on, which lets users reach Flickr in countries where it's banned. What I am wondering, however, is whether it might be possible to create a product built on RECAP's principle (perhaps relying on a cache?) that would allow users in authoritarian countries to access content that has just been accessed by their peers in countries that do not restrict access to the Internet. Now, this would be a very powerful technology, especially if you could find ways to make it learn from real data generated by projects like Herdict.
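
To make the idea a little more concrete, here is a rough Python sketch of the "check the shared mirror first, otherwise fetch and share" principle described above. It is only an illustration under stated assumptions, not RECAP's actual code: the names SharedMirror and fetch_via_mirror are hypothetical, and a plain in-memory dictionary stands in for a real mirror such as the one hosted by the Internet Archive.

```python
from typing import Callable, Dict, Optional, Tuple


class SharedMirror:
    """Hypothetical stand-in for a public mirror; a dict replaces real storage."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def get(self, url: str) -> Optional[bytes]:
        # Return the mirrored copy, or None if nobody has contributed it yet.
        return self._store.get(url)

    def put(self, url: str, content: bytes) -> None:
        # Contribute a copy so that later users can skip the origin entirely.
        self._store[url] = content


def fetch_via_mirror(
    url: str,
    mirror: SharedMirror,
    fetch_origin: Optional[Callable[[str], bytes]] = None,
) -> Tuple[bytes, str]:
    """Return (content, source), where source is "mirror" or "origin".

    fetch_origin is an assumed callback for users who can reach the original
    site (a paying PACER user, or a peer in a country without filtering).
    Users who cannot reach the origin pass None and rely on the mirror alone.
    """
    cached = mirror.get(url)
    if cached is not None:
        return cached, "mirror"
    if fetch_origin is None:
        raise LookupError(f"{url} is not mirrored yet and the origin is unreachable")
    content = fetch_origin(url)
    mirror.put(url, content)  # share the copy for everyone who comes later
    return content, "origin"
```

In this sketch, a user with unrestricted access calls fetch_via_mirror with a working fetch_origin and populates the mirror as a side effect; a user behind a firewall calls it without one and receives whatever peers have already contributed. A real system would also have to worry about stale copies, tampering, and whether the mirror itself gets blocked, none of which is modeled here.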