What’s on the Net Stays on the Net: Thoughts on the Wayback Machine
Steve Vladeck (law, Miami), visiting at PrawfsBlawg, tells an interesting anecdote about the Internet Archive, otherwise known as the “Wayback Machine.” Steve writes about a student who discovered his childhood pictures:
Well, apparently that cute idea I had for a webpage when I was a freshman in college, including the fun pictures page, didn’t die quite the fiery death I had hoped for it upon graduating (or, to be more honest, one month after last updating it in the fall of my sophomore year).
So, new law prawfs, beware!! If there’s a cute, funny webpage all about you from somewhere out there in the Internet ether, your students will find it… what they do with it, well, I’m just glad I kept some of the college photos off the page.
Sobering thoughts for any blogger before clicking on the “publish” button.
According to the Wayback Machine’s FAQ:
The Internet Archive Wayback Machine contains approximately 1 petabyte of data and is currently growing at a rate of 20 terabytes per month. This eclipses the amount of text contained in the world’s largest libraries, including the Library of Congress. If you tried to place the entire contents of the archive onto floppy disks (we don’t recommend this!) and laid them end to end, it would stretch from New York, past Los Angeles, and halfway to Hawaii.
A few other facts about the Wayback Machine:
Sites are usually crawled within 24 hours and no more than 48. Right now there is a 6-12 month lag between the date a site is crawled and the date it appears in the Wayback Machine. . . .
If a site owner properly requests removal of a Web site through http://www.archive.org/about/exclude.php, we will exclude that site from the Wayback Machine.