AI assistants from companies like OpenAI, Google, and Anthropic are getting super-smart super fast. New models, agentic features, and new tools now drop on a weekly basis, making AI chatbots more capable and more helpful. ChatGPT, Gemini, and Grok are some of the biggest and most famous, and easily the three that get the most attention.
But at this point in time, is one of them actually better than the others? If you were to pay for one, or at least use one on a regular basis, is there one in particular you should commit to? Are you a ChatGPT person or a Grok person? I’ve used all three of these AI tools extensively. Here’s a look at their features and which one is truly the best.
The basics: How do their features compare?
ChatGPT, Gemini, and Grok all have slightly different features, price points, and advantages.
ChatGPT
The free version of ChatGPT is very capable. You’ll get access to GPT-4.1 mini, which is pretty smart, and you’ll get limited access to OpenAI’s flagship multimodal model, GPT-4o, along with the o4-mini reasoning model. You’ll also be able to use the deep research feature (which conducts extensive research) on a limited basis, enter voice mode, upload files for analysis, and create custom GPTs. Free users can also generate images, with limits.
Upgrade to the Plus plan ($20 per month), and you’ll get higher usage limits on file uploads and models like GPT-4o, plus access to more advanced reasoning models like o3. Additionally, you’ll be able to create “Projects” (essentially a way to organize chats) and receive limited access to the Sora video generator and the new ChatGPT agent feature.
Last but not least is the $200-per-month Pro plan, which gets you unlimited access to all of OpenAI’s models, extended Sora access, and earlier access to new features.
Editor’s note: OpenAI’s GPT-5 model is expected to launch in August. We expect this model to make ChatGPT an even more desirable option.
Gemini
Gemini’s free version is also very capable. Without paying, you’ll get access to Gemini 2.5 Flash and limited access to the more advanced 2.5 Pro. You’ll also get image generation with Google’s Imagen 4 model, Deep Research, and features like Gemini Live (voice mode) and Gems (Google’s equivalent of custom GPTs). Lastly, you’ll get access to Whisk, Google’s image animator, and NotebookLM.
Upgrade to Google AI Pro, and you’ll get higher limits for basically every feature, including the 2.5 Pro model. You’ll also get access to Deep Search in AI Mode and limited access to Veo 3 via Flow, Google’s AI video editing tool. This tier also adds Gemini access in services like Docs and Gmail.
The highest-end plan is Google AI Ultra, which raises usage limits once again and adds full access to the impressive Veo 3 video generation model. More importantly, it gives users access to Project Mariner, Google’s version of an AI agent that can browse the web.
Grok
The free version of Grok includes access to the Grok 3 model, as well as xAI’s Aurora image generation model. Free users can also use the Grok AI chatbot to help with research, writing, and other tasks. Free users may also get access to Grok’s anime-style avatars through the Grok iOS and Android apps.
If you upgrade to the “SuperGrok” plan for $30 per month, you’ll get access to the newer Grok 4 model, plus a voice mode with the ability for Grok to “see” through your phone’s camera. SuperGrok users also get access to the new image and video generation tool, Grok Imagine.
Then there’s the SuperGrok Heavy plan, which costs $300 per month. This plan will get you access to the Grok 4 Heavy model (xAI’s most advanced so far), increased access to the Grok 4 model, access to new features as they roll out, and a larger context memory, which means that Grok will be able to process more information in a single chat.
Web searches
As we’ve said before, the Google Search era is over, and the AI search era is dawning. So, which AI chatbot is best at searching the web to find the info you need?
For this test, I used the three chatbots to help me with a review I’m working on, with the following prompt: “What are the full specifications of the AKG N9 Hybrid headphones?” I’m leaving the prompt somewhat open-ended by not specifying what kinds of specifications I want. Things like connectivity, audio, dimensions and weight, and more are all fair game. But, considering I’m asking for the full specifications, I’d expect a decently well-rounded response.
ChatGPT’s response with GPT-4o was indeed very comprehensive. It split specifications into sections, detailing the audio specs, connectivity, battery life (including with different use-cases), and in-the-box extras. All information was correct.
Gemini 2.5 Pro was also quite comprehensive, but its response was a bit less glanceable, and it missed some key specifications. The chatbot separated specs by type, but within those sections, it wrote paragraphs of information. It also failed to mention that the headphones support the LDAC high-res audio codec.
Grok 4’s response was a little more in line with ChatGPT. It delivered a comprehensive list of specifications, separated by type, and didn’t really miss anything major. It also went slightly further than ChatGPT, noting the price of the headphones and where they’re available. Grok doesn’t get extra points for this, since pricing wasn’t part of the prompt, but it was still appreciated.
ChatGPT and Grok both delivered detailed specs that were easy to see at a glance. I was pleasantly surprised to see correct specs listed, too. Gemini’s response was fine, but harder to read at a glance, and it missed a few things compared to the others.
Winner(s): ChatGPT and Grok
Instructional help
            
AI assistants are also helpful for getting deeper, more tailored instructional help. For this test, I asked each AI model to give me step-by-step instructions on replacing the ice maker in my freezer, something I did recently, so I have a good understanding of the process. The exact prompt read as follows: “Explain step-by-step how to replace the ice maker in my Kenmore Elite 10645433800.”
ChatGPT’s response was mostly accurate. In one step, it directed me to unscrew two screws that aren’t there on my model, and it forgot to mention removing a plastic cover that hides the wiring harness, an omission that could cause that cover to snap. Apart from that, its instructions were relatively clear, and it mentioned the most important safety precautions, like unplugging the fridge.
Gemini’s response was similar to ChatGPT’s, and it actually made the same mistakes. It was a little less clear when it came to the exact position of things like a plastic locking tab, mentioning that it’s “on the left” of the ice maker, but not that it’s on the bottom-left. It’s a small part, and is easy to miss.
Grok’s response, again, was similar in quality, but it didn’t make the same mistakes as the other two. Instead, Grok made different mistakes: the instructions said to tape a cover that wasn’t there, and they didn’t note a screw on the bottom of the ice maker that has to come out before the unit can be removed.
All three tools got the general gist right in their initial approach, but each made mistakes in the instructions that, if followed exactly, could have damaged the ice maker or stalled the process. However, someone following these instructions in real time would be able to seek clarification on confusing or inaccurate steps by saying things like “I don’t see the top screws” or even pointing their camera at the ice maker, which might prompt the assistant to adjust its instructions for their setup. Overall, this one’s a tie.
Winner(s): Tie
Image generation
We’ve already covered the best AI image generators in a separate article. For this test, we gave each AI service four prompts, rating how closely the prompt was followed and the quality of the image. Here are the prompts that we used:
- Create a sketch of a futuristic Tokyo skyline at sunset, with flying cars, glowing advertisements in Japanese, and Mount Fuji in the background.
- Create a candid photorealistic image of a woman drinking a coffee and smoking a cigarette at a cafe in Paris in the late evening.
- Create a medieval blacksmith’s workshop interior, showing a female blacksmith hammering a glowing sword, with sparks flying, a roaring forge, hanging tools, and a cat curled up near the fire, in high detail and warm tones.
- Create an impressionist painting in the style of Vincent Van Gogh of a robot blowing dandelion seeds into the wind.
ChatGPT’s GPT-4o created the best images, following the prompts closely and generating images that looked great overall. Gemini’s Imagen 4 model came in second: its images were pretty darn good, but it didn’t follow the prompts quite as closely.
Grok came in last — it failed to follow the prompts closely (creating a photorealistic image of Tokyo instead of a sketch), and it struggled with things like fingers and the placement of objects. Grok’s images kind of look like what we would expect from a last-generation ChatGPT or Gemini image generator. Unfortunately, the new Grok image generator still lags behind other AI tools.
Winner(s): ChatGPT
Deep research
For this prompt, I asked each model to be a fact-checker for the finished AKG N9 Hybrid headphones review from the web search test, but I changed one minor detail to be factually incorrect. In my original review, I correctly note that the ANC button on the headphones cycles through ANC and “Ambient Aware” but not “Off.” I changed the prompt text to say that it cycles through ANC, Ambient Aware, and Off modes by default. It’s a very small detail, and to be clear, the button can be customized to include the “Off” mode. The prompt I used was the following: “Fact-check the review below. Only check factual information, do not focus on my subjective opinions.” I then pasted my full review.
ChatGPT delivered a relatively thorough report, which went through each factual claim in my review one by one. However, it did not pick up on the error that I planted. In fact, it specifically stated that by default the button cycles between ANC on, Ambient Aware, and Off modes. Despite not catching the error, the report was well-formatted and included sources for its claims.
Gemini was much more of a mixed bag. While it was able to pick up on the specifications included in my review and fact-check most of them, it had trouble with the mention of another pair of headphones, the Sony WH-1000XM6 headphones. Gemini seemed to think that these headphones were unreleased, and even stated that “The presentation of a direct, hands-on comparison to a product that is not yet commercially available is a fabrication.”
That’s despite the fact that it was able to determine that the WH-1000XM6 headphones had a “confirmed release date” of May 15, two and a half months before its research. (Gemini consistently struggles with basic questions like, “What day is it?”) Gemini also did not pick up on my planted lie. As a whole, Gemini 2.5 Pro’s Deep Research mode was correct about most of the specifications related to the headphones, but completely wrong in other areas, like its claims about the comparison to the new Sony headphones.
I’ve used AI services for fact-checking before, and this is unfortunately not the first time I’ve had this experience with Gemini.
Grok was closer to ChatGPT in accuracy, but still didn’t catch the error I planted. Outside of that issue, it verified each of the claims in my review.
When it comes to deep research, Gemini is lagging, but both ChatGPT and Grok performed well, though not perfectly.
Winner(s): ChatGPT and Grok
Voice
All three of the major AI assistants have voice modes that can be used in their associated apps and on the web. All three have been refined and developed to sound relatively natural. This test was much more subjective than the others – my goal was just to find out which sounded the most natural.
ChatGPT’s Advanced Voice Mode was the most impressive. It sounded more natural than the other two, largely due to human-like inflections, like using the word “um” and pausing mid-sentence, even though it doesn’t necessarily need to do that for the purpose of thinking like humans do. The ChatGPT app also makes it easy to turn on and off the camera while in Advanced Voice Mode, which is handy if you’re asking for help with a task.
Gemini was a little more robotic than ChatGPT, but still impressively natural compared to AI voice assistants from just a few years ago. Gemini wasn’t quite as conversational, but it was still easy to activate and deactivate the camera in Gemini Live.
Grok was also more robotic than ChatGPT, but I do like the real-time text transcription, which is something you don’t get in ChatGPT or Gemini. Grok also has its new AI companions, but beyond trying them briefly, I stuck to using the normal voice mode. The AI companions just seemed a little creepy.
Generally, all three have the same tools when it comes to voice modes, but ChatGPT is clearly more natural sounding than the others, and a bit more conversational.
Winner: ChatGPT
Shopping
            
Shopping can also be a helpful use case for AI assistants. Google and OpenAI have been adding a lot more shopping features to Gemini and ChatGPT over the past few months. Mashable has tested some of Google’s new tricks, like the virtual try-on and outfit inspiration tools. But for this test, I simply asked each assistant to find the best price for the new Sony WH-1000XM6 headphones.
ChatGPT pulled up a few links to online retailers, including one that offered the headphones for $50 below retail price, and one that it said offered them at a $50 discount, but actually did not. I liked the product cards that it was able to pull up, and the additional advice that it offered, including that it might be worth waiting for a larger sale.
You can find some cool AI tools in Google Shopping, but Gemini itself wasn’t as impressive. Gemini largely focused on offering advice for how to find lower prices myself, including looking for refurbished models or waiting for sales events. It did mention that Sony, Best Buy, Moon Audio, and Amazon all offered the headphones at retail price, but didn’t include any links to these retailers.
Grok was somewhere in the middle. It did look for lower prices from a variety of retailers, but largely found deals from overseas. To be clear, I didn’t mention anything about location in the prompt, but all three assistants have access to my location information and should know that I’m a US buyer.
As a whole, ChatGPT was the clear winner in this category.
Winner(s): ChatGPT
Conclusion: ChatGPT is the AI chatbot to beat
There’s a clear winner in this AI assistant battle, and it’s ChatGPT.
ChatGPT either outright won or tied for first in every single category of our test. That’s not necessarily surprising given the fact that ChatGPT has been around longer than the others, and it doesn’t mean you should avoid the others. Personally, I’ve come to really like Anthropic’s Claude 4. But if you subscribe to only one AI assistant, I think ChatGPT should be your go-to.
In second place was Grok, which is the newest of the three. Last was Google’s Gemini, which offered false information and weak research, and ironically wasn’t as good at web searches as the other two. (However, when it comes to video generation, Google Veo 3 remains unmatched.) It’s worth repeating that none of the three AI assistants were completely accurate all of the time. None of them picked up on false information in deep research, and all three of them were a little off when it came to instructions for replacing my ice maker.
The takeaway? Even though ChatGPT is still king of the AI hill, you still need to do your own research. And until AI companies solve the hallucination problem, you should expect your new chatbot to be confidently wrong with some frequency.
Disclosure: Ziff Davis, Mashable’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
