Tough times for the Internet

Many pages have been removed or abandoned: a quarter of the webpages that existed at some point between 2013 and 2023 are no longer accessible, a phenomenon now being called "digital erosion".

According to an analysis by the Pew Research Center, a quarter of the webpages that existed at some point between 2013 and 2023 are no longer accessible. Their disappearance is attributed to several factors, including the deliberate removal of content, site migrations that break links, and the abandonment of entire websites. Together these amount to a genuine case of digital erosion.

The research found that approximately 23% of news pages contain at least one broken link, as do 21% of pages on government websites, while 54% of Wikipedia pages have at least one reference pointing to content that no longer exists. Web erosion, in other words, does not spare even the platforms people rely on most when looking for information.

The phenomenon is particularly evident in older content: about 38% of the web pages that existed in 2013 are no longer available today. Recently created pages are not immune either: 8% of the pages that existed in 2023 have already disappeared. When a page becomes inaccessible, users are typically greeted with the familiar error message “404 Not Found,” which means the server could not find the requested content.

A phenomenon impacting social media too

Digital decay is not limited to websites; it also affects social media. About one-fifth of tweets are no longer publicly visible within a few months of posting. The share is higher for some languages: over 40% of tweets in Turkish or Arabic are no longer visible on the site within three months of posting. Tweets from accounts with default profile settings are also more likely to disappear.

For this analysis, the Pew Research Center collected a sample of pages from the web repository Common Crawl for each year from 2013 to 2023. It then verified the availability of these pages and the integrity of the links. It also examined the frequency with which social media posts are deleted or removed.
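The kind of availability check described above can be illustrated with a short Python sketch. It is not the Pew Research Center's actual procedure: the function names (`fetch_status`, `is_accessible`, `broken_share`) are hypothetical, and the rule that only responses in the 200-399 range count as "accessible" is an assumption made for illustration.

```python
from typing import Callable, Iterable, Optional
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error such as 404 Not Found.
        return err.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def is_accessible(status: Optional[int]) -> bool:
    """Assumed rule: a page counts as accessible on a 2xx or 3xx response."""
    return status is not None and 200 <= status < 400


def broken_share(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[int]] = fetch_status,
) -> float:
    """Fraction of the given URLs that are not accessible (0.0 if none given)."""
    url_list = list(urls)
    if not url_list:
        return 0.0
    broken = sum(1 for url in url_list if not is_accessible(fetch(url)))
    return broken / len(url_list)
```

Passing a different `fetch` function makes it possible to try the calculation on recorded status codes without touching the network. A real survey would also need to handle cases this sketch ignores, such as servers that reject `HEAD` requests or pages that return 200 but show an error message.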

The conclusion is that web erosion poses a serious problem for the preservation of online knowledge. The ongoing disappearance of web content could compromise our ability to access both historical and current information, diminishing the richness of our collective digital archive.

Source: Pew Research Center
