I have been in the middle of a major rethink of search engines’ efforts to digitize books. When these projects started, I enthusiastically celebrated their potential to tame information overload. But leading research librarians are now questioning the search engines’ practices:
Several major research libraries have rebuffed offers from Google and Microsoft to scan their books into computer databases, saying they are put off by restrictions these companies want to place on the new digital collections. The research libraries, including a large consortium in the Boston area, are instead signing on with the Open Content Alliance [OCA], a nonprofit effort aimed at making their materials broadly available.
As the article notes, “many in the academic and nonprofit world are intent on pursuing a vision of the Web as a global repository of knowledge that is free of business interests or restrictions.”
As noble as I think this project is, I doubt it can ultimately compete with the monetary brawn of a Google. And why should delicate old books get scanned three or four times over by the duplicative efforts of Google, Microsoft, the OCA, and who knows what other private competitors? I also worry that a fragmented archiving system might create a Library of Babel: vast holdings scattered across incompatible collections, with no single way to search them all. So what is to be done?