xAI has introduced Grok 4 Fast, a new reasoning model designed for efficiency and lower cost. The model reduces average thinking tokens by 40% compared with Grok 4, which brings an estimated 98% decrease in cost for equivalent benchmark performance. It maintains a 2-million token context window and a unified architecture that supports both reasoning and non-reasoning use cases. The model also integrates tool-use capabilities such as web browsing and X search.
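A 40% reduction in thinking tokens alone does not explain a 98% cost drop; the remainder comes from a lower per-token price. The arithmetic can be sketched as follows, with the per-token price ratio being an assumed figure chosen purely to illustrate how the two factors compound (it is not xAI's published pricing):

```python
# Overall cost ratio = (per-token price ratio) * (thinking-token ratio).
# All numbers here are illustrative, not xAI's published rates.

token_ratio = 0.60        # 40% fewer thinking tokens than Grok 4
price_ratio = 1 / 30      # hypothetical: per-token price ~30x lower

cost_ratio = price_ratio * token_ratio
print(f"estimated cost reduction: {1 - cost_ratio:.0%}")  # → 98%
```

Under these assumptions, spending 0.60 of the tokens at roughly one-thirtieth of the price yields a cost ratio of about 0.02, i.e. the quoted ~98% reduction.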
In benchmark tests, Grok 4 Fast scores close to Grok 4 on GPQA, AIME, and HMMT, while outperforming Grok 3 Mini. On the LMArena Search Arena, its search variant ranked first with an Elo of 1163, and its text variant placed near the top of its category.
Compared with similar models, Grok 4 Fast delivers higher efficiency than OpenAI's GPT-4 Turbo and Anthropic's Claude 3 Opus on cost-per-benchmark-point evaluations, while showing slightly lower raw accuracy on some high-end reasoning tasks. Independent analysis from Artificial Analysis highlighted its cost-to-intelligence ratio as more favorable than most models in the same weight class. In agentic browsing tasks, Grok 4 Fast also surpasses both Claude 3 Haiku and Mistral Large.
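Cost-per-benchmark-point is a simple efficiency metric: the price of a model divided by its score on a benchmark, so lower is better. A minimal sketch of the comparison, using entirely hypothetical model names, prices, and scores:

```python
def cost_per_point(price_per_million_tokens: float, benchmark_score: float) -> float:
    """Dollars (per 1M tokens) spent per benchmark point; lower is better."""
    return price_per_million_tokens / benchmark_score

# (price per 1M tokens, benchmark score) -- placeholder values, not real data
models = {
    "cheap_model": (0.50, 85.0),
    "premium_model": (10.0, 88.0),
}

# Rank by efficiency: a slightly lower-scoring but far cheaper model wins.
ranked = sorted(models, key=lambda name: cost_per_point(*models[name]))
print(ranked)  # → ['cheap_model', 'premium_model']
```

This illustrates how a model with marginally lower raw accuracy can still dominate on a cost-normalized metric.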
Community response has focused on the balance between cost and performance. AI scientist Rudi Ranck commented:
I can’t remember the last time I was so impressed with a model. Grok 4 fast achieving Gemini 2.5 Pro level intelligence at a ~25X cheaper cost.
Meanwhile, developer Axel Pond noted:
Genius to call it Grok 4 Fast instead of Grok 4 mini. Associate the product with its pros, not its cons.
Grok 4 Fast is now available on grok.com through the Fast and Auto modes, and accessible via the xAI API under the model names grok-4-fast-reasoning and grok-4-fast-non-reasoning. It is temporarily free to try on OpenRouter and Vercel AI Gateway. xAI has stated that further updates will expand its multimodal and agentic features.
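A request to either model variant can be sketched as below. The endpoint URL and the OpenAI-compatible chat-completions payload shape are assumptions to be checked against xAI's API documentation; the snippet only builds the request body and headers rather than sending them, and reads the key from a `XAI_API_KEY` environment variable:

```python
import json
import os

# Assumed OpenAI-compatible chat-completions endpoint -- verify in xAI's docs.
API_URL = "https://api.x.ai/v1/chat/completions"

payload = {
    "model": "grok-4-fast-reasoning",  # or "grok-4-fast-non-reasoning"
    "messages": [
        {"role": "user", "content": "Summarize today's AI news in two sentences."}
    ],
}

headers = {
    "Content-Type": "application/json",
    # Bearer token auth; the key itself must come from your xAI account.
    "Authorization": f"Bearer {os.environ.get('XAI_API_KEY', '')}",
}

body = json.dumps(payload)  # ready to POST to API_URL with any HTTP client
```

Switching between the reasoning and non-reasoning variants is then just a change of the `model` string, reflecting the unified architecture described above.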