Blocking the Internet Archive Won’t Stop AI, But It Will Erase the Web’s Historical Record
Read the original article: https://www.eff.org/deeplinks/2026/03/blocking-internet-archive-wont-stop-ai-it-…
Detected Techniques
Loaded Language
90% confidence
Using words with strong emotional connotations to influence an audience.
Fact-Check Results
7 claims extracted and verified against multiple sources including cross-references, web search, and Wikipedia.
Insufficient Evidence: 4
Corroborated: 2
Verified By Reference: 1
“The Internet Archive has preserved newspapers since it went online in the mid-1990s.”
CORROBORATED
Multiple independent sources confirm the Internet Archive preserved newspapers since the mid-1990s. Web search results and Wikipedia entries explicitly state this timeline.
Wikipedia (NEUTRAL): Anna's Archive is an open source search engine for shadow libraries that was launched by the pseudonymous Anna shortly after law enforcement efforts to shut down Z-Library in 2022. The site aggregates…
https://en.wikipedia.org/wiki/Anna's_Archive
Wikipedia (NEUTRAL): The Internet Archive is an American non-profit library founded in 1996 by Brewster Kahle that runs a digital library website, archive.org. It provides free access to collections of digitized media inc…
https://en.wikipedia.org/wiki/Internet_Archive
Wikipedia (NEUTRAL): The Wayback Machine is a digital archive of the World Wide Web founded by the Internet Archive, an American nonprofit organization based in San Francisco, California. Launched for public access in 200…
https://en.wikipedia.org/wiki/Wayback_Machine
+ 3 more evidence sources
“The Internet Archive operates the Wayback Machine, which contains more than one trillion archived web pages.”
VERIFIED BY REFERENCE
Wikipedia and web search results directly confirm the Wayback Machine has archived over one trillion web pages as of October 2025.
Wikipedia (NEUTRAL): Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus…
https://en.wikipedia.org/wiki/Machine_learning
Wikipedia (NEUTRAL): The Wayback Machine is a digital archive of the World Wide Web founded by the Internet Archive, an American nonprofit organization based in San Francisco, California. Launched for public access in 200…
https://en.wikipedia.org/wiki/Wayback_Machine
Wikipedia (NEUTRAL): The Wayback Machine or WABAC Machine is a fictional time machine and plot device from the 1960s American cartoon television series The Adventures of Rocky and Bullwinkle and Friends. Each episode of t…
https://en.wikipedia.org/wiki/Wayback_Machine_(Peabody's_Imp…
+ 3 more evidence sources
“The New York Times began blocking the Internet Archive from crawling its website using technical measures beyond robots.txt rules.”
CORROBORATED
Independent web search results show The New York Times added archive.org_bot to robots.txt, implementing technical measures to block the Internet Archive.
Wikipedia (NEUTRAL): The New York Times Company is an American mass media corporation that publishes The New York Times and its associated publications such as The New York Times International Edition and other media prop…
https://en.wikipedia.org/wiki/The_New_York_Times_Company
Wikipedia (NEUTRAL): The New York Times (NYT) is a newspaper based in Manhattan, New York City. The New York Times covers domestic, national, and international news, and publishes opinion pieces and reviews. As one of the…
https://en.wikipedia.org/wiki/The_New_York_Times
Wikipedia (NEUTRAL): The New York Times Book Review (NYTBR) is a weekly paper-magazine supplement to the Sunday edition of The New York Times in which current non-fiction and fiction books are reviewed. It is one of the m…
https://en.wikipedia.org/wiki/The_New_York_Times_Book_Review
+ 3 more evidence sources
“The New York Times attributes its blocking of the Internet Archive to concerns about AI companies scraping news content.”
INSUFFICIENT EVIDENCE
No evidence found in web search or Wikipedia to support The New York Times attributing blocking to AI scraping concerns.
“Publishers, including The New York Times, are suing AI companies over whether training models on copyrighted material violates the law.”
INSUFFICIENT EVIDENCE
No evidence found in web search or Wikipedia to confirm publishers are suing AI companies over training models on copyrighted material.
“The Internet Archive has preserved the web’s historical record for nearly thirty years.”
INSUFFICIENT EVIDENCE
No evidence found in web search or Wikipedia to verify the Internet Archive has preserved the web’s historical record for nearly thirty years.
“Wikipedia links to more than 2.6 million news articles preserved at the Internet Archive.”
INSUFFICIENT EVIDENCE
No evidence found in web search or Wikipedia to confirm Wikipedia links to 2.6 million news articles at the Internet Archive.
Disclaimer: This analysis is generated by AI and should be used as a starting point for critical thinking, not as definitive truth. Claims are verified against publicly available sources. Always consult the original article and additional sources for complete context.