World's largest collection of Olympiad-level math problems now available to everyone
Researchers at MIT, KAUST, and HUMAIN created MathNet, a large, open-source dataset containing over 30,000 expert-authored math problems and solutions from 47 countries and 17 languages. This dataset is designed to provide a comprehensive resource for AI researchers and students, addressing the historical imbalance of data sources that previously focused heavily on US and Chinese competitions. The article details the dataset's structure, its utility for benchmarking AI performance, and its potential to advance global mathematical education.
Read the original article: https://phys.org/news/2026-04-world-largest-olympiad-math-problems.html
Analysis
Propaganda Score: 10% (confidence: 95%)
Low risk. This article shows minimal use of propaganda techniques.
Detected Techniques
Glittering Generalities
60% confidence
Using vague, emotionally appealing phrases ('freedom', 'justice') without specifics.
Fact-Check Results
17 claims extracted and verified against multiple sources including cross-references, web search, and Wikipedia.
Pending: 7
Single Source: 4
Corroborated: 4
Insufficient Evidence: 2
“Researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), King Abdullah University of Science and Technology (KAUST), and HUMAIN have now done exactly that [collecting, cleaning, and making available Olympiad-level math problems].”
SINGLE SOURCE
The provided web search results do not mention MIT CSAIL, KAUST, and HUMAIN collaborating on collecting and cleaning Olympiad-level math problems. The evidence is entirely irrelevant (Facebook links). Therefore, the claim cannot be corroborated or verified with the evidence provided.
web search (NEUTRAL): Get started on Facebook Create an account to connect with friends, family and communities of people who share your interests.
https://l.facebook.com/r.php/?entry_point=login
web search (NEUTRAL): Sign up Log in Messenger Facebook Lite Video Meta Pay Meta Store Meta Quest Ray-Ban Meta Meta AI Instagram Threads Privacy Policy Privacy Centre About Create ad Create Page Developers Careers Cookies …
https://www.facebook.com/logoin/
web search (NEUTRAL): Facebook helps you connect with friends, family and communities of people who share your interests. Connecting with your friends and family as well as discovering new ones is easy with features like G…
https://www.meta.com/facebook-app/
“MathNet is the largest high-quality dataset of proof-based math problems ever created, and it is not closed.”
CORROBORATED
Multiple web search results state that MathNet is the largest high-quality dataset of proof-based math problems ever created and that it is open. Source 1 and Source 3 explicitly state MathNet is the 'world’s largest repository' or 'largest high-quality dataset.'
web search (NEUTRAL): MathNet is the largest high-quality dataset of proof-based math problems ever created. It's comprised of more than 30,000 expert-authored problems and solutions spanning 47 countries, 17 languages, an…
https://phys.org/news/2026-04-world-largest-olympiad-math-pr…
web search (NEUTRAL): arXiv Dataset Dataset Explorer. MathNet overview: large-scale multilingual data, high-quality solutions, diverse topics, and three evaluation tasks.
https://mathnet.csail.mit.edu/
web search (NEUTRAL): Alshammari and her colleagues have finally changed that with MathNet, the world’s largest repository for proof-based math problems. With over 30,000 questions and their solutions from 47 countries, Ma…
https://www.popsci.com/technology/math-problem-database/
“Comprising more than 30,000 expert-authored problems and solutions spanning 47 countries, 17 languages, and 143 competitions, it is five times larger than the next biggest dataset of its kind.”
CORROBORATED
Two distinct web search results corroborate the core metrics: 'over 30,000 expert-authored problems and solutions spanning 47 countries, 17 languages, and 143 competitions' and that it is five times larger than the next biggest dataset. The Wikipedia results are irrelevant to this specific claim.
wikipedia (NEUTRAL): Anton Yurevich Alekseev (Антон Юрьевич Алексеев, born 9 August 1967) is a Russian mathematician. Alekseev was a student of Ludvig Faddeev. Alekseev worked at the Steklov Institute in Saint Petersburg …
https://en.wikipedia.org/wiki/Anton_Alekseev_(mathematician)
wikipedia (NEUTRAL): George Walker Bush (born July 6, 1946) is an American politician, businessman, and former United States Air Force officer who was the 43rd president of the United States, serving from 2001 to 2009. Th…
https://en.wikipedia.org/wiki/George_W._Bush
wikipedia (NEUTRAL): Vladimir Nikolaevich Burkov (Russian: Владимир Николаевич Бурков; 17 November 1939 – 24 April 2025) was a Russian control theorist and the author of more than four hundred publications on control prob…
https://en.wikipedia.org/wiki/Vladimir_Burkov
+ 3 more evidence sources
“The work will be presented at the International Conference on Learning Representations (ICLR 2026) in Brazil later this month.”
SINGLE SOURCE
Two web search results confirm that ICLR 2026 is being held in Brazil (Rio de Janeiro). However, none of the evidence links the MathNet work to a presentation at ICLR 2026 or supports the 'later this month' timing, so the claim cannot be fully corroborated.
wikipedia (NEUTRAL): Vladlen Koltun (Hebrew: ולדלן קולטן; born 1980) is an Israeli-American computer scientist and intelligent systems researcher. He currently serves as distinguished scientist at Apple Inc. His main area…
https://en.wikipedia.org/wiki/Vladlen_Koltun
wikipedia (NEUTRAL): These datasets are used in machine learning (ML) research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this…
https://en.wikipedia.org/wiki/List_of_datasets_for_machine-l…
web search (NEUTRAL): ICLR 2026. The Fourteenth International Conference on Learning Representations.ICLR 2026 conference is located at the Riocentro Convention and Event Center in Rio de Janeiro, Brazil
https://iclr.cc/
+ 2 more evidence sources
“MathNet spans dozens of countries across six continents, covers 17 languages, includes both text and image-based problems and solutions, and spans four decades of competition mathematics.”
CORROBORATED
The web search results confirm that MathNet spans dozens of countries/continents, covers 17 languages, includes text and image-based problems, and covers decades of competition mathematics. This is supported by multiple independent web search results.
wikipedia (NEUTRAL): Math rock is a style of alternative and indie rock with roots in bands such as King Crimson and Rush. It is characterized by complex, atypical rhythmic structures (including irregular stopping and sta…
https://en.wikipedia.org/wiki/Math_rock
wikipedia (NEUTRAL): Mathnet is a segment on the children's television show Square One Television that follows the adventures of pairs of police mathematicians. It is a pastiche of Dragnet.
https://en.wikipedia.org/wiki/Mathnet
wikipedia (NEUTRAL): Square One Television (sometimes referred to as Square One or Square One TV) is an American children's television program produced by the Children's Television Workshop (now known as Sesame Workshop),…
https://en.wikipedia.org/wiki/Square_One_Television
+ 3 more evidence sources
“Building MathNet required tracking down 1,595 PDF volumes totaling more than 25,000 pages, spanning digital documents and decades-old scans in more than a dozen languages.”
SINGLE SOURCE
The web search results mention the process of digitizing books and the existence of PDF documents, but none of the provided results quantify the effort as '1,595 PDF volumes totaling more than 25,000 pages' specifically for MathNet. This level of detail appears to be unique to the original article's context.
wikipedia (NEUTRAL): AlexNet is a convolutional neural network architecture developed for image classification tasks, notably achieving prominence through its performance in the ImageNet Large Scale Visual Recognition Cha…
https://en.wikipedia.org/wiki/AlexNet
wikipedia (NEUTRAL): This is a list of links to articles on software used to manage Portable Document Format (PDF) documents. The distinction between the various functions is not entirely clear-cut; for example, some view…
https://en.wikipedia.org/wiki/List_of_PDF_software
wikipedia (NEUTRAL): MathSciNet is a searchable online bibliographic database created by the American Mathematical Society in 1996. It contains all of the contents of the journal Mathematical Reviews (MR) since 1940 along…
https://en.wikipedia.org/wiki/MathSciNet
+ 3 more evidence sources
“A significant portion of that archive came from an unlikely source: Navid Safaei, a longtime IMO community figure and co-author who had been collecting and scanning those booklets by hand since 2006.”
SINGLE SOURCE
The web search results confirm the existence and work of Navid Safaei and his connection to mathematics competitions. However, none of the provided evidence sources confirm the specific details: that he collected and scanned the source booklets *by hand* or that this started *since 2006*. This specific narrative detail is not independently verifiable.
web search (NEUTRAL): Navid Safaei currently works at the The Research Institute for Science, Technology and Industry Policy, Sharif University of Technology. Navid does research in Evolutionary Economics, Development ...
https://www.researchgate.net/profile/Navid-Safaei
web search (NEUTRAL): Sharif University of Technlogy, Research Institute of Science, Technology, and Industrial Policy - Cited by 15 - Science and Technolofy studies - Evolutionary theory for social science - Mathematics C…
https://scholar.google.com/citations?user=JVw_6iwAAAAJ&hl=en
web search (NEUTRAL): What I did was, or basically it should be a quadratic residue (Q.R), but so either both are Q.R or both are non-quadratic residues (N.Q.R). But notice that this equation is symmetric, so we can do thi…
https://artofproblemsolving.com/community/c6h3338507p3092334…
“Where most existing math datasets pull problems from community forums like Art of Problem Solving (AoPS), MathNet draws exclusively from official national competition booklets.”
CORROBORATED
Multiple web search results explicitly contrast MathNet's source material with community forums like AoPS, stating that MathNet draws from official national competition booklets. This is reported by at least two independent sources.
web search (NEUTRAL): Where most existing math datasets pull problems from community forums like Art of Problem Solving (AoPS), MathNet draws exclusively from official national competition booklets.
https://phys.org/news/2026-04-world-largest-olympiad-math-pr…
web search (NEUTRAL): Existing Olympiad-level datasets are typically drawn from community platforms such as AoPS and are predominantly English-only (see Table 1), restricting both linguistic and cultural coverage. To addre…
https://openreview.net/pdf?id=rQQZiSFcNU
web search (NEUTRAL): Dataset: A curated corpus of high-quality Olympiad problems and solutions from official national sources (not community platforms like AoPS), ensuring expert-level authenticity and diversity.
https://openreview.net/forum?id=zPvdG1Va5Q
“The solutions in those booklets are expert-written and peer-reviewed, and they often run to multiple pages, with authors walking through several approaches to the same problem.”
INSUFFICIENT EVIDENCE
No evidence was found in the provided search results or Wikipedia entries to support the claim regarding the expert-written, peer-reviewed nature of the solutions or the inclusion of multiple approaches.
“Sultan Albarakati, a co-author, currently serves on the IMO board, and the researchers are working to share the dataset with the IMO foundation directly.”
INSUFFICIENT EVIDENCE
No evidence was found in the provided search results or Wikipedia entries regarding Sultan Albarakati's role on the IMO board or the plan to share the dataset with the IMO foundation.
“To validate the dataset, they assembled a grading group of more than 30 human evaluators from countries including Armenia, Russia, Ukraine, Vietnam, and Poland, who coordinated together to verify thousands of solutions.”
PENDING
“While other archives of Olympiad problems do exist (notably, the Contest Collections forums on AoPS), these resources lack a standardized formatting system, verified solutions, and important problem metadata that topics and theory require.”
PENDING
“Even GPT-5, the top-performing model tested, averaged around 69.3% on MathNet's main benchmark of 6,400 problems, failing nearly one in three Olympiad-level problems.”
PENDING
“And when problems include figures, performance drops significantly across the board, exposing visual reasoning as a consistent weak point for even the most capable models.”
PENDING
“Several open-source models scored 0% on Mongolian-language problems, highlighting another dimension where current AI systems fall short despite their overall strength.”
PENDING
“Testing eight state-of-the-art embedding models, the researchers found that even the strongest identified the correct match only about 5% of the time on the first try, with models frequently ranking structurally unrelated problems as more similar than equivalent ones.”
PENDING
“DeepSeek-V3.2-Speciale gained up to 12 percentage points with well-matched retrieval, while irrelevant retrieval degraded performance in roughly 22% of cases.”
PENDING
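For readers weighing the pending retrieval claim ("identified the correct match only about 5% of the time on the first try"), the quantity described is commonly called top-1 retrieval accuracy. The sketch below shows one way such a score could be computed. It is illustrative only: the function names, the toy vectors, and the use of cosine similarity are assumptions, not details taken from the article or the MathNet paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top1_accuracy(queries, candidates, gold):
    """Fraction of queries whose most similar candidate is the labeled match.

    queries, candidates: lists of embedding vectors (hypothetical inputs).
    gold[i]: index of the candidate that is the correct match for query i.
    """
    hits = 0
    for qi, q in enumerate(queries):
        # Pick the candidate with the highest similarity to this query.
        best = max(range(len(candidates)), key=lambda ci: cosine(q, candidates[ci]))
        hits += int(best == gold[qi])
    return hits / len(queries)

# Toy data (made up for illustration): three queries, three candidates.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
candidates = [[0.9, 0.1], [0.1, 0.9], [1.0, -1.0]]
gold = [0, 1, 2]

score = top1_accuracy(queries, candidates, gold)
print(f"top-1 accuracy: {score:.2f}")
```

In this toy case the third query is ranked closest to a candidate other than its labeled match, which is the kind of miss the claim describes at much larger scale.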
Disclaimer: This analysis is generated by AI and should be used as a starting point for critical thinking, not as definitive truth. Claims are verified against publicly available sources. Always consult the original article and additional sources for complete context.