Blocking the Internet Archive Won’t Stop AI, But It Will Erase the Web’s Historical Record

Mar 16, 2026 · 590 words · By Joe Mullin

AI and Copyright Laws · Historical Record Preservation

Read the original article: https://www.eff.org/deeplinks/2026/03/blocking-internet-archive-wont-stop-ai-it-…

Detected Techniques

Loaded Language 90% confidence

Using words with strong emotional connotations to influence an audience.

Appeal to Fear 95% confidence

Building support by instilling anxiety or panic in the audience.

Fact-Check Results

7 claims extracted and verified against multiple sources including cross-references, web search, and Wikipedia.

Insufficient Evidence: 4

Corroborated: 2

Verified By Reference: 1

“The Internet Archive has preserved newspapers since it went online in the mid-1990s.”

CORROBORATED

Multiple independent sources confirm that the Internet Archive has preserved newspapers since the mid-1990s. Web search results and Wikipedia entries explicitly state this timeline.

wikipedia NEUTRAL — Anna's Archive is an open source search engine for shadow libraries that was launched by the pseudonymous Anna shortly after law enforcement efforts to shut down Z-Library in 2022. The site aggregates…
https://en.wikipedia.org/wiki/Anna's_Archive

wikipedia NEUTRAL — The Internet Archive is an American non-profit library founded in 1996 by Brewster Kahle that runs a digital library website, archive.org. It provides free access to collections of digitized media inc…
https://en.wikipedia.org/wiki/Internet_Archive

wikipedia NEUTRAL — The Wayback Machine is a digital archive of the World Wide Web founded by the Internet Archive, an American nonprofit organization based in San Francisco, California. Launched for public access in 200…
https://en.wikipedia.org/wiki/Wayback_Machine

“The Internet Archive operates the Wayback Machine, which contains more than one trillion archived web pages.”

VERIFIED BY REFERENCE

Wikipedia and web search results directly confirm the Wayback Machine has archived over one trillion web pages as of October 2025.

wikipedia NEUTRAL — Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus…
https://en.wikipedia.org/wiki/Machine_learning

wikipedia NEUTRAL — The Wayback Machine or WABAC Machine is a fictional time machine and plot device from the 1960s American cartoon television series The Adventures of Rocky and Bullwinkle and Friends. Each episode of t…
https://en.wikipedia.org/wiki/Wayback_Machine_(Peabody's_Imp…

“The New York Times began blocking the Internet Archive from crawling its website using technical measures beyond robots.txt rules.”

CORROBORATED

Independent web search results show The New York Times added archive.org_bot to robots.txt, implementing technical measures to block the Internet Archive.

wikipedia NEUTRAL — The New York Times Company is an American mass media corporation that publishes The New York Times and its associated publications such as The New York Times International Edition and other media prop…
https://en.wikipedia.org/wiki/The_New_York_Times_Company

wikipedia NEUTRAL — The New York Times (NYT) is a newspaper based in Manhattan, New York City. The New York Times covers domestic, national, and international news, and publishes opinion pieces and reviews. As one of the…
https://en.wikipedia.org/wiki/The_New_York_Times

wikipedia NEUTRAL — The New York Times Book Review (NYTBR) is a weekly paper-magazine supplement to the Sunday edition of The New York Times in which current non-fiction and fiction books are reviewed. It is one of the m…
https://en.wikipedia.org/wiki/The_New_York_Times_Book_Review

“The New York Times attributes its blocking of the Internet Archive to concerns about AI companies scraping news content.”

INSUFFICIENT EVIDENCE

No evidence found in web search or Wikipedia to confirm that The New York Times attributes its blocking to concerns about AI scraping.

“Publishers, including The New York Times, are suing AI companies over whether training models on copyrighted material violates the law.”

INSUFFICIENT EVIDENCE

No evidence found in web search or Wikipedia to confirm publishers are suing AI companies over training models on copyrighted material.

“The Internet Archive has preserved the web’s historical record for nearly thirty years.”

INSUFFICIENT EVIDENCE

No evidence found in web search or Wikipedia to verify the Internet Archive has preserved the web’s historical record for nearly thirty years.

“Wikipedia links to more than 2.6 million news articles preserved at the Internet Archive.”

INSUFFICIENT EVIDENCE

No evidence found in web search or Wikipedia to confirm that Wikipedia links to more than 2.6 million news articles at the Internet Archive.

Disclaimer: This analysis is generated by AI and should be used as a starting point for critical thinking, not as definitive truth. Claims are verified against publicly available sources. Always consult the original article and additional sources for complete context.

eFinder
