> AI_IQ: UNDERSTANDING_THE_PERCENTILES

IQ to Percentile Conversion

Here's how IQ scores translate to percentiles, giving you a sense of where an AI system stands relative to human performance:

IQ Score	Percentile	Description
70	2nd	Significantly below average
85	16th	Below average
100	50th	Average
115	84th	Above average
130	98th	Highly gifted
145	99.9th	Exceptionally gifted
160	99.99th	Profoundly gifted

Remember: when an AI achieves "130 IQ," it means it outscored about 98% of the humans the test was normed on, on that specific test. The same AI might score 70 on a different type of reasoning task.
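The table can be reproduced with a few lines of code. This is a minimal sketch, assuming the usual deviation-IQ convention of a normal distribution with mean 100 and standard deviation 15; some tests use a different standard deviation, which shifts the percentiles. The function names are illustrative and do not come from any tracker mentioned here.

```python
from statistics import NormalDist

# Assumption: deviation IQ normed to mean 100, standard deviation 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def iq_to_percentile(iq: float) -> float:
    """Percentage of the norming population expected to score below `iq`."""
    return 100 * IQ_SCALE.cdf(iq)


def percentile_to_iq(percentile: float) -> float:
    """Inverse mapping: the IQ score sitting at a given percentile (0-100, exclusive)."""
    return IQ_SCALE.inv_cdf(percentile / 100)


if __name__ == "__main__":
    # Print the unrounded percentile for each row of the table.
    for iq in (70, 85, 100, 115, 130, 145, 160):
        print(f"IQ {iq}: {iq_to_percentile(iq):.3f}th percentile")
```

Under these assumptions the unrounded values agree with the table up to rounding: 130 comes out at about the 97.7th percentile and 145 at about the 99.87th.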

Frontier AI IQ — June 2026

aiiq.org maps 12 hard benchmarks (FrontierMath, ARC-AGI-2, GPQA Diamond, SWE-bench, Humanity's Last Exam...) onto the human IQ scale. The frontier as of June 10, 2026:

Model	Est. IQ	Human percentile
GPT-5.5	~130–136	98th–99th
Claude Opus 4.8	~130	98th
Gemini 3.1 Pro	~127	96th
Claude Opus 4.7	~126	96th
Kimi K2.6	~122	93rd
GLM-5.1	~121	92nd
Qwen 3.7 Max	~118	88th
Grok 4.3	~117	87th

• One year ago the frontier was o3 at ~112. That is a measured gain of ≈ +1.3 IQ points per month.
• On novel, offline tests, scores drop 20–40 points, which suggests that public-test results partly reflect memorization.
• The jagged truth: the same models solving Erdős problems still fumble tasks a child finds trivial. IQ is one lens, not the territory.
• Live trackers: aiiq.org, trackingai.org, metr.org
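The monthly rate can be sanity-checked from the figures above. This is a rough sketch, assuming "one year" means exactly 12 months and taking the two ends of the top estimate in the table (~130 and ~136) as endpoints; the helper name is hypothetical.

```python
def points_per_month(start_iq: float, end_iq: float, months: float) -> float:
    """Average IQ-point gain per month between two estimates."""
    return (end_iq - start_iq) / months


BASELINE_IQ = 112  # o3 one year earlier, per the text
MONTHS = 12        # assumption: "one year" taken as exactly 12 months

if __name__ == "__main__":
    # Low and high ends of the top estimate in the table.
    for label, end_iq in (("low end", 130), ("high end", 136)):
        rate = points_per_month(BASELINE_IQ, end_iq, MONTHS)
        print(f"{label}: {rate:+.2f} IQ points per month")
```

With these endpoints the slope comes out between 1.5 and 2.0 points per month, somewhat above the ≈ +1.3 quoted, so the quoted figure presumably rests on a different endpoint or averaging window. The result is sensitive to which two estimates are compared.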

The "Highest Human IQ" Question

• Real IQ tests stop measuring around 160 (99.99th percentile) — beyond that there aren't enough humans to norm the test against.
• Terence Tao's famous "220–230 IQ" is a ratio-IQ extrapolation from scoring 760 on the math SAT at age 8 — not a measured deviation IQ. Tao himself calls the figure noise and has claimed only "greater than 175."
• Marilyn vos Savant's record 228 was a childhood ratio score; her adult deviation-test result was ≈186. Guinness retired the "Highest IQ" category in 1990 as statistically unreliable.
• So when honest AI scores reach ~160, "AI IQ" stops being a number at all. That's why the IntExp chart marks everything above 160 as off the human scale — from there on, you need different rulers: task horizons, percentile-of-experts, things no human can do at any speed.
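The difference between the two kinds of score can be made concrete. The sketch below assumes the classic ratio definition (mental age divided by chronological age, times 100) and a deviation scale with mean 100 and standard deviation 15; the function names and the example child are hypothetical and are not taken from any of the cases above.

```python
from statistics import NormalDist


def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Classic ratio IQ: mental age over chronological age, times 100."""
    return 100 * mental_age / chronological_age


def one_in_n(iq: float, mean: float = 100, sd: float = 15) -> float:
    """Deviation-IQ rarity: roughly 1 person in N scores at or above `iq`."""
    # By symmetry, the upper tail above `iq` equals the lower tail below its mirror image.
    tail = NormalDist(mean, sd).cdf(2 * mean - iq)
    return 1 / tail


if __name__ == "__main__":
    # Hypothetical child: age 8, performing like a typical 16-year-old.
    print(f"ratio IQ: {ratio_iq(16, 8):.0f}")
    # How rare a deviation score is at increasing levels.
    for iq in (130, 145, 160):
        print(f"deviation IQ {iq}: about 1 in {one_in_n(iq):,.0f}")
```

A ratio score grows with how far ahead of age a child performs and has no built-in ceiling, whereas a deviation score is tied to rarity in a norming sample. On this scale a score of 160 corresponds to roughly 1 person in 31,000 to 32,000, which illustrates why norming samples thin out near that level.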

AI IQ Benchmarks & Progress

GPQA Diamond Benchmark Progress

AI performance on graduate-level scientific questions (July 2023 to May 2025)

PhD-Level Scientific Reasoning

Benchmarking AI on doctoral-level scientific problems

AI IQ on Codeforces

Competitive programming performance showing AI reaching exceptional problem-solving levels

AI IQ Progress Tracking (11 Months)

MaximumTruth's tracking of AI IQ improvements over time (offline testing, not Mensa certified)