If the internet is, at its core, a system of record, then it is failing at that mission. Sometime in 2014, the internet surpassed a billion websites; while the count has since fallen back a bit, it is quite obviously an enormous repository. When a website disappears, all of its content is gone as though it never existed, and that can have a much bigger impact than you might imagine on researchers, scholars, or any Joe or Josephine Schmo simply trying to follow a link.
Granted, some decent percentage of those pages probably aren’t worth preserving (and some are simply bad information), but that really shouldn’t be our call. It should all be automatically archived, a digital Library of Congress to preserve and protect all of the content on the internet.
As my experience shows, you can’t rely on publishers to keep a record. When it no longer serves a website owner’s commercial purposes, the content can disappear forever. The trouble with that approach is it leaves massive holes in the online record.
Cheryl McKinnon, an analyst with Forrester Research who covers content management, has been following this issue for many years. She says the implications of lost content on the internet could be quite profound.
READ MORE: The Internet Is Failing The Website Preservation Test | TechCrunch
Harvard’s flagship library, Widener, is an imposing granite cube built quite literally as a shrine to the book. A central alcove cuts through the stacks to show off a prized relic: an original Gutenberg bible. But this is not the heart of Harvard’s libraries. No, that would be its cold storage site, an anonymous concrete building few students or even faculty know about.
The Harvard Depository, some 30 miles from the Cambridge campus, better resembles an Amazon warehouse than a library. The 200,000-square-foot facility houses the vast majority of Harvard Library’s collection—some 9 million books, films, LPs, magnetic tapes, and pamphlets sorted not by the Dewey decimal system but by size.
A fascinating new interactive documentary, Cold Storage, glimpses inside this little-known world.
READ MORE: A Glimpse Inside the Hidden Vault Where Harvard Keeps Millions of Books | Gizmodo
Reddit users have created a machine-readable dataset of over 200,000 Jeopardy questions. The data, which the dataset’s creators scraped from the fan-created question repository J!-Archive, contains each question’s answer, along with its category, dollar value, air date, and other metadata.
READ MORE: Fans Create Database of Over 200,000 Jeopardy Questions | Center for Data Innovation
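Because the dataset is machine-readable, working with it takes only a few lines of code. The sketch below is a minimal example of filtering such a collection of clues in Python; the field names (`category`, `value`, `air_date`, `answer`) are assumptions based on the article’s description of the data, not confirmed from the dataset itself, and the two sample clues are invented for illustration.

```python
import json

# Assumed record layout, inferred from the article's description of the
# dataset; the two clues below are made-up examples, not real J!-Archive data.
sample = '''[
  {"category": "HISTORY", "value": "$200", "air_date": "2004-12-31",
   "question": "He made the first of his voyages in 1492", "answer": "Columbus"},
  {"category": "SCIENCE", "value": "$400", "air_date": "2004-12-31",
   "question": "It makes up about 21% of the air we breathe", "answer": "oxygen"}
]'''

clues = json.loads(sample)

def clues_in_category(clues, category):
    """Return all clues whose category matches, case-insensitively."""
    return [c for c in clues if c["category"].lower() == category.lower()]

history_clues = clues_in_category(clues, "history")
```

With the full 200,000-question file loaded the same way, the same one-line filter would pull out every clue in a given category, dollar value, or air date.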
If you’ve been thinking about getting started on the rocket project that’s been on your mind for ages, now is a good time to get serious. Next week, NASA will release a massive software catalog with over 1,000 projects. It’s not the first time the space agency’s released code, but it is the first time they’ve made it so easy.
The breadth and variety of the software projects that NASA’s about to give away are difficult to express. It’s not just a bunch of algorithms and star-finding software, though stuff like that is in there. The crazy geniuses that land rovers on Mars are actually releasing code for ultra high-tech NASA stuff like rocket guidance systems and robotics control software. There’s even some artificial intelligence.
And did I mention it’s all free?
READ MORE: NASA’s About To Release a Mother Lode of Free Software | Gizmodo
See also: NASA Technology Transfer Portal