Some information sticks around when it shouldn’t, while other information vanishes when it should remain

Tuesday, July 6th, 2021

The Internet is rotting, Jonathan Zittrain notes:

The first study, with Kendra Albert and Larry Lessig, focused on documents meant to endure indefinitely: links within scholarly papers, as found in the Harvard Law Review, and judicial opinions of the Supreme Court. We found that 50 percent of the links embedded in Court opinions since 1996, when the first hyperlink was used, no longer worked. And 75 percent of the links in the Harvard Law Review no longer worked.

People tend to overlook the decay of the modern web, when in fact these numbers are extraordinary — they represent a comprehensive breakdown in the chain of custody for facts. Libraries exist, and they still have books in them, but they aren’t stewarding a huge percentage of the information that people are linking to, including within formal, legal documents. No one is. The flexibility of the web — the very feature that makes it work, that had it eclipse CompuServe and other centrally organized networks — diffuses responsibility for this core societal function.

The problem isn’t just for academic articles and judicial opinions. With John Bowers and Clare Stanton, and the kind cooperation of The New York Times, I was able to analyze approximately 2 million externally facing links found in articles at nytimes.com since its inception in 1996. We found that 25 percent of deep links have rotted. (Deep links are links to specific content — think theatlantic.com/article, as opposed to just theatlantic.com.) The older the article, the less likely it is that the links work. If you go back to 1998, 72 percent of the links are dead. Overall, more than half of all articles in The New York Times that contain deep links have at least one rotted link.

[…]

Of course, there’s a keenly related problem of permanency for much of what’s online. People communicate in ways that feel ephemeral and let their guard down commensurately, only to find that a Facebook comment can stick around forever. The upshot is the worst of both worlds: Some information sticks around when it shouldn’t, while other information vanishes when it should remain.

Comments

  1. Bob Sykes says:

    How many people remember recordings on magnetic wires, or punch cards, or 5 1/4 in floppies or…?

    The history of the computer is the destruction and replacement of record making and keeping systems by newer ones that are not backward compatible. Computers destroy history. Future historians (if any survive, 2525) will label our current era as a dark age, because there will be no records of what happened.

  2. Faze says:

    Whenever Isegoria posts a long list of past links (as was done for Independence Day) I make a note to myself to check them all out when I have more time. Alas, when that day comes, I wonder if they’ll be there.

  3. Isegoria says:

    I don’t know if this blog is exactly Lindy, but I’ve done my best to maintain it for almost two decades!

  4. VXXC says:

    This is a very interesting take.

    “Future historians (if any survive, 2525) will label our current era as a dark age, because there will be no records of what happened.”

  5. David Foster says:

    A big part of the problem is that media sites keep reorganizing themselves and make no attempt to use permalinks. Less of a problem for blogs; in that case, the main problem is that the whole blog disappears. Very often can be found on archive.com, though.

  6. David Foster, the domain at archive.com appears to be for sale, LOL https://sell.sawbrokers.com/domain/archive.com/

  7. Mike-SMO says:

    It isn’t obvious but persistent old links mean that someone is paying for server space. If something matters to you, save it locally so you will have it. Don’t expect someone else to do it and pay for it for your convenience.

    Consider buying a printer and a used file cabinet. Yeah, I know. Indexing is a real pain.
