xAI Unveils Grok 4 and Grok 4 Heavy With Record-Breaking AI Benchmarks

Elon Musk’s AI company, xAI, is turning heads again with the release of its next-gen Grok 4 models, which post record scores on several reasoning benchmarks. The two new models, Grok 4 and Grok 4 Heavy, both focus on advanced reasoning and problem-solving.

Alongside the models, xAI dropped a premium subscription tier called SuperGrok Heavy, priced at $300 per month. This new plan gives users full access to Grok 4 Heavy, the company’s most powerful model to date.

Grok 4 (Thinking) achieves new SOTA on ARC-AGI-2 with 15.9%

This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA
— ARC Prize (@arcprize) July 10, 2025



Let’s talk numbers, because on these tests Grok 4 isn’t just keeping up with the competition, it’s ahead of it:

🧠 GPQA benchmark: Grok 4 scored 87.5%, while Grok 4 Heavy hit 88.9%
🧮 AIME 2025 exam: Grok 4 Heavy nailed it with a perfect 100%
💀 Humanity’s Last Exam (with tools): Grok 4 Heavy got 44.4%, and Grok 4 scored 38.6% — both significantly ahead of Gemini 2.5 Pro (26.9%) and OpenAI’s o3 (24.9%)
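
To put "significantly ahead" into concrete terms, here is a minimal sketch that works out the gaps from the Humanity’s Last Exam figures quoted above. The dictionary and the `lead_over` helper are illustrative names made up for this example, not anything published by xAI.

```python
# Humanity's Last Exam (with tools) scores quoted in this article, in percent.
hle_scores = {
    "Grok 4 Heavy": 44.4,
    "Grok 4": 38.6,
    "Gemini 2.5 Pro": 26.9,
    "OpenAI o3": 24.9,
}


def lead_over(model, rival, scores=hle_scores):
    """Return (percentage-point gap, score ratio) of model over rival.

    Hypothetical helper for illustration only.
    """
    gap = round(scores[model] - scores[rival], 1)
    ratio = round(scores[model] / scores[rival], 2)
    return gap, ratio


# Compare both Grok 4 variants against each rival named in the article.
for rival in ("Gemini 2.5 Pro", "OpenAI o3"):
    for model in ("Grok 4 Heavy", "Grok 4"):
        gap, ratio = lead_over(model, rival)
        print(f"{model} vs {rival}: +{gap} points ({ratio}x)")
```

On these numbers, Grok 4 Heavy leads Gemini 2.5 Pro by 17.5 percentage points, or about 1.65 times its score, and leads o3 by 19.5 points.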

But the real headline is the ARC-AGI-2 benchmark, a new test designed to challenge the reasoning abilities of top-tier models. Grok 4 scored 15.9%, a new state of the art that, according to ARC Prize, nearly doubles the previous best commercial result and puts it well ahead of Claude Opus 4 and OpenAI’s o3. On the older ARC-AGI-1 benchmark, Grok 4 still led the pack with 66.7%.

xAI says Grok 4 Heavy is its largest and most powerful model yet, capable of parallel problem-solving using multiple agents. Basically, it’s not just smart; it’s built to think like a team.

Looking ahead, the roadmap is packed. Musk announced:

An AI coding model coming in August
A multi-modal agent dropping in September
And possibly a video generation model by October

All this solidifies xAI as a serious contender in the AI world, right up there with OpenAI, Google, and Anthropic — and maybe even a step ahead. With Grok 4’s performance, xAI just planted its flag as a leading force in building foundational AI models.
