In 2024, the Artificial Analysis Intelligence Index ranked AI models based on MMLU-Pro, a benchmark for reasoning and knowledge capabilities. DeepSeek R1, o1, and Claude 3.7 Sonnet Thinking led with the highest scores at 84 percent, demonstrating strong analytical and comprehension skills.
In 2024, the Artificial Analysis Math Index ranked AI models based on their mathematical reasoning using benchmarks like AIME 2024 and Math-500. o1, QwQ-32B, and DeepSeek R1 led the rankings, showing the highest proficiency in mathematical problem solving.
In 2024, the Artificial Analysis Intelligence Index evaluated AI models across reasoning, knowledge, math, and coding. Grok 3 Reasoning Beta, o1, and DeepSeek R1 led the rankings, showing high overall intelligence.
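As the coding and math chart titles below note, the index combines per-benchmark averages. A minimal sketch of that kind of aggregation, assuming a plain unweighted average of scores on a 0-100 scale; the actual weighting and normalization used by Artificial Analysis are not specified here, and the example values are illustrative, not real leaderboard numbers:

# Hypothetical sketch: combining per-benchmark scores into one composite index.
# Assumption: a plain unweighted average of scores on a 0-100 scale.
def composite_index(scores: dict[str, float]) -> float:
    """Average a set of benchmark scores, each on a 0-100 scale."""
    return sum(scores.values()) / len(scores)

# Illustrative, made-up values (not real leaderboard numbers); only the
# benchmarks named in this section are listed, not all 7 evaluations.
example = {
    "MMLU-Pro": 84.0,       # reasoning & knowledge
    "AIME 2024": 70.0,      # math
    "Math-500": 95.0,       # math
    "LiveCodeBench": 60.0,  # coding
    "SciCode": 35.0,        # coding
}
print(f"Composite index: {composite_index(example):.1f}")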
Comparison of the average of the coding benchmarks in the Artificial Analysis Intelligence Index (LiveCodeBench & SciCode) by Model
Comparison of Context Window: Token Limit; Higher is better by Model
Comparison of Artificial Analysis Intelligence Index vs. Context Window (Tokens) by Model
Comparison of Artificial Analysis Intelligence Index vs. End-to-End Seconds to Output 100 Tokens by Model
Comparison of the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500) by Model
Comparison of the Artificial Analysis Intelligence Index (incorporating 7 evaluations spanning reasoning, knowledge, math & coding) by Model
Comparison of Latency (Time to First Token) vs. Output Speed (Output Tokens per Second) by Model
Comparison of Image Input Price: USD per 1k images at 1MP (1024x1024) by Model
Comparison of Artificial Analysis Intelligence Index vs. Output Speed (Output Tokens per Second) by Model
Comparison of Artificial Analysis Intelligence Index vs. Price (USD per M Tokens) by Model
Comparison of Output Speed (Output Tokens per Second) vs. Price (USD per M Tokens) by Model
Comparison of Output Speed: Output Tokens per Second by Provider
Comparison of Seconds to First Token Received; Lower is better by Model
Comparison of Price: USD per 1M Tokens; Lower is better by Provider
Comparison of Output Tokens per Second; Higher is better by Model
Comparison of Price: USD per 1M Tokens by Model
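The latency, output-speed, and end-to-end comparisons above are linked by a simple approximation: the end-to-end time to receive N output tokens is roughly the time to first token plus N divided by the steady output speed. A minimal sketch under that assumption (queuing and network variance are ignored, and the numbers are illustrative, not measured provider values):

# Hypothetical sketch relating time-to-first-token, output speed, and
# end-to-end time to output N tokens. Assumption: generation proceeds at a
# constant tokens-per-second rate after the first token arrives.
def end_to_end_seconds(ttft_s: float, tokens_per_second: float, n_tokens: int = 100) -> float:
    """Estimate seconds to receive n_tokens given latency and output speed."""
    return ttft_s + n_tokens / tokens_per_second

# Illustrative, made-up inputs (not measured provider values):
print(end_to_end_seconds(ttft_s=0.5, tokens_per_second=80.0))  # -> 1.75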