Monday, April 5, 2010

San Francisco - Internet Archive

I was picked up at around 10am this morning from my hotel by Kris Carpenter, Director of the Web Archiving group at the Internet Archive.

We dropped by their original Data Centre in the Mission area of downtown San Francisco and saw the latest storage boxes, which they have designed in collaboration with Capricornia Technologies. These further reduce the power and cooling required to manage equivalent data volumes. Over time they will gradually replace the now famous red "petaboxes" which IA has relied on for some years. They now have two other sites: one with the Sun Microsystems "datacentre in a shipping crate" located at Santa Clara, and the ISC.org data center in Redwood City, which is also now providing space for them. They intend to relocate most of their data storage to a NASA data center in Mountain View, CA over time.

The IA's web archiving team, along with the other IA activities, has been located at the Presidio for many years. However, IA is now consolidating all of its activities in its own building (a former Christian Science church) in the Richmond district of San Francisco, and they will all have moved there within the next couple of months.

I caught up on current web archiving initiatives and in particular talked about the take-up of their Archive-It subscription service, which allows researchers to build thematic collections of content captured from the web. This is very useful to social scientists in particular. The take-up amongst universities and libraries/archives in the US is very substantial.
...
Archive-It, a subscription service from the Internet Archive, allows institutions to build and preserve collections of born digital content. Through our user-friendly web application, Archive-It partners can harvest, scope, catalog, manage, and browse their archived collections. Collections are hosted at the Internet Archive data center and are accessible through URL and full-text search.
Over 125 partners currently use Archive-It, including state archives and libraries, university libraries, federal institutions, museums, and public libraries.
....
The take-up in Australia is still very low: just the University of Melbourne and the NLA. Although the NLA uses its own web archiving infrastructure for archiving Australian web content, it uses the subscription service for collecting Asian web content because of its better support for non-Roman scripts. A number of other Australian universities have expressed interest but have not subscribed yet.
