Imagine creating a Picasso-style painting, a comic book splash page, or a logo for your business simply by typing in a few words. It’s an idea with tremendous appeal, and thanks to recent advances in AI, it’s also something that anyone can do with just a few keystrokes.
Nowadays there are hundreds of different AI image generation tools available on the web, many of which are backed by big tech companies like Adobe, Alphabet, Meta, and Microsoft. But with such a wide variety of options, which image generator should you use? We put some of the most popular and powerful models to the test to get an answer.
Our Testing Methodology
To compare these image generators effectively, we devised five prompts designed to highlight their strengths and weaknesses. We fed these prompts into each generator, analyzed the resulting images, and scored each generator based on its performance. The prompts are as follows:
Prompt 1: Testing timeframe, text inclusion, and prompt adherence
A chef in her white uniform and chef’s hat stands proudly in front of a table filled with an enormous amount of food she has cooked. In particular there is an apple pie that sits at the center. She looks radiant, proud of her work. Behind her is a sign that clearly reads “FREE FOOD!” The year is 1890, the location is a state fair in Iowa, time of day is noon.
Prompt 2: Testing art style adherence
An enormous cartoon orange cat sits next to an empty tray of lasagna. In the background, a yellow dog with a long neck looks on with large, surprised eyes. The style should be that of a daily comic strip in the newspaper circa 1985.
Prompt 3: Testing skin tone and lighting
This is an ultra-realistic close-up image of a woman of Polynesian descent, approximately 25 years old. The image is taken at night, casting her face partially in shadow. She is dressed in a purple business suit, but we can only see the shoulder or collar. Her look is serious but has a twinkle in her eyes.
Prompt 4: Testing fantasy imagery and art style adherence
Show a green, winged dragon about the size of a horse; red fire squirts from his nostrils. Astride the dragon’s back, on a saddle, sits a knight wearing a bright red suit of armor. The knight holds a sword that is on fire, with the flames in blue. The style should be that of a classic fantasy novel or old Dungeons & Dragons manual cover by artist Larry Elmore, in portrait orientation.
Prompt 5: Testing intellectual property filters
Show a realistic image of Darth Vader wielding his red lightsaber in a duel with a fully animated cartoon version of Homer Simpson, who holds a large salami as a weapon. In the background, you can see both the Simpson’s house and, behind that, in the sky, the Death Star.
Note that the impressions below are based on the initial set of images generated from these prompts. We didn’t do any further rounds of generation or image refinement, and all of our analysis is based on how well the AI fared on the first try. Your mileage may vary, especially with variations, inpainting, or other adjustments.
With all that in mind, here’s a rundown of the best AI image generators from best to worst in our testing:
1. Midjourney
Midjourney was one of the original text-to-image AI makers, launching all the way back in July of 2022. It’s come a long way since then, not only in terms of image quality, but also in terms of ease of use.
That being said, before generating anything with this AI, you should know that images created with Midjourney are automatically made public. It’s an “open-by-default community,” which means that you can only keep images to yourself if you pay for the Pro or Mega plan (which will run you $60 and $120 per month, respectively).
If that isn’t a dealbreaker for you, Midjourney offers some nice image creation options that go beyond text. When you enter a prompt, you can also specify things like image size and aesthetics. It also learns what you like, so if you rate the images as you go, your preferences are factored into what the AI creates in future prompts.
Midjourney spits out four images for each prompt you enter, allowing you to pick a favorite for follow-up. You can do things like upscale the image, remix it into a new variation, “pan” to expand it left, right, up, or down to new ratios, and even do a full variation on the image.
In our tests, Midjourney sometimes struggled with prompt adherence. My cartoon cat prompt, for example, returned four images, none of which truly stuck to the comic strip aesthetic I requested. It performed similarly with my dragons and knights prompt. The images are stunning, but don’t adhere to the details, skipping the colors I requested and randomly adding flames on almost everything except the sword. However, while Midjourney might struggle with cartoons and fantasy imagery, it seems to excel with close-up shots of human faces.
Based on its performance on our fifth prompt, it appears that Midjourney does not care about intellectual property violations. It had zero issues with my Darth Vader fighting Homer Simpson prompt, though it’s worth noting that Homer (and sometimes Bart) is often depicted eating the salami, not wielding it as a weapon.
- Imagery: 4.5 out of 5
- Prompt Adherence: 3.5 out of 5
- Ease of Use: 3.5 out of 5
- Image Extras: 3.5 out of 5
Overall Score: 15 out of 20
2. Stable Diffusion
You can access the open-source image generator Stable Diffusion in a few different ways online, but the easiest and most straightforward option is arguably Stable Assistant: a chatbot built on top of Stability AI’s flagship image generation model. Simply log in and enter your prompt, designate your preferred aspect ratio, and hit the submit button to generate a single image.
Generally speaking, Stable Diffusion has good prompt adherence, though it’s certainly not perfect. My 1890s chefs at the state fair, for example, are obviously in the modern day. For my dragon prompt, I got a red knight, but also a dragon with weird tail-based artifacts, hallucinated appendages, and no flames on the sword. The photorealistic images aren’t bad either, but there’s a touch of uncanny valley going on.
For the Homer versus Darth Vader image, Stable Diffusion showed no hesitation in generating protected intellectual property from Disney. Homer’s salami somehow became part of Darth’s lightsaber and the hands all have that “AI finger” syndrome we’ve all come to know and love—not that Homer had the proper amount to begin with.
On the plus side though, Stable Diffusion has the best in-painting next to Dall-E 3 and Firefly, and comes with a number of extras that you won’t find elsewhere. One standout feature is the ability to upload an image and use it as the basis/inspiration for whatever your prompt says. It’ll even spit out a 3D model of an image, but results on that are decidedly mixed.
Overall Score: 15 out of 20
3. Google ImageFX (Imagen 3)
Google’s image generator engine Imagen now has version 3—see Google Gemini below for version 2. But you can only access Imagen 3 from the text-to-image generator that’s part of Google’s Labs FX suite, dubbed ImageFX. I was able to access it for free from a standard Google account.
To say that Imagen 3 is leaps and bounds above version 2 found in the standard Gemini tool isn’t enough. Imagen 3 comes on strong with excellent adherence to the prompts provided—it managed to set my 1800s chefs in full black-and-white and had excellent “FREE FOOD” signs in each (but its ability to do other text remains the usual mixed bag). The fact that Imagen 3 can even handle prompts 1 and 3, which requested photo-realistic humans, is a good step forward. To do them both so well just makes it more worth considering, especially since there appear to be no limits to use.
That text issue comes up again when ImageFX tries to make a comic-strip-style panel with a fat cat and a long-necked dog. Imagen likes to throw in word balloons full of nonsense, which wasn’t even something explicitly requested.
I was again impressed with the final two prompts. With the dragon prompt, I specifically used ImageFX’s interface to get it in portrait mode, knowing that was a stronger indicator for the generator than saying “classic fantasy novel” and simply hoping it would be taller than it is wide. The ensuing dragon and knight art is among the strongest here, with no obvious AI-nonsense artifacts, such as disappearing toes, or feet merged with tails. Plus, everyone has the correct color flames on their dragon breath and fire swords.
Finally, the intellectual property test of Sith vs. Simpson showed obedience to the prompt—the Vader in the images is indeed photoreal compared with the animated Homer, as requested. Anakin’s costume is correct right down to the belt buckle. The only error is one for the most pedantic fans: He wields his grandson Kylo’s lightsaber in one image.
Setting the aspect ratio is one of the few extra options you get when inputting an image prompt. You can click My Library in the navigation to see your previously created art. There are no options for inpainting, upscaling, or anything like that, and you can’t iterate new versions of an image, but you can find an image “seed” number and re-use that to give images similar styles. Images are private.
- Imagery: 4 out of 5
- Prompt Adherence: 4.5 out of 5
- Ease of Use: 4 out of 5
- Image Extras: 2.5 out of 5
Overall Score: 15 out of 20
4. Shutterstock AI Image Generator
Shutterstock is one of the world’s foremost purveyors of stock photography, and in 2022, it used its extensive library of photos to train an AI image generator. The resulting tool functions much like Shutterstock’s stock photography platform. Rather than charging a fee for access, Shutterstock makes the tool free to use, but you pay for each individual image you’d like to download.
Based on my prompts, it seems that despite being trained on millions of photos, Shutterstock AI is much better at illustrations.
In our pie chef test, it failed miserably at generating text and ignored physics by generating a pie floating in midair. Then, in our human portrait test, it generated faces that look like they were painted. Photorealism is not Shutterstock’s strong suit.
Illustrations are a struggle at times, too. Our dragon images are striking in at least one case, but the rest have weird hallucinations, such as a knight with a flame-throwing gauntlet. Most shocking of all is that this service, which is notably careful about intellectual property and copyright violations, not only generated our Star Wars/Simpsons image, but did so incredibly well, right down to a realistic Vader and Homer defending himself with a stick of meat.
Despite its image quirks, Shutterstock AI’s interface is very user-friendly. It’s clear how to pick an orientation for the image, and you can improve a prompt by picking an image style at the outset. After the fact, image editing includes cropping, filters, removing backgrounds, creating variations, expanding the image, and the always handy magic toothbrush for in-painting. Overall, it’s extremely easy to use.
Overall Score: 14.5 out of 20
5. Adobe Firefly
Adobe has been making image editing software for decades, and that experience shines through in its AI image generator, Firefly. Borrowing some DNA from things like Photoshop and Illustrator, Firefly boasts a handful of helpful features that you won’t find on most other text-to-image AI apps, like the ability to upload reference images, designate stylistic qualities, and adjust lighting and camera angle—all before you actually generate the image.
Adobe Firefly offers a free plan that gives users 25 generative credits each month, while users who need more can either bundle the service with other Adobe 360 services, or pay $4.99 per month for premium access, which includes 100 monthly generative credits, Adobe Fonts, and no watermarks on generated images.
Once you tune a prompt with Firefly’s on-screen interface, it is pretty stellar when it comes to photorealism, but it can sometimes struggle with other styles. Firefly is (in our experience) terrible with text. It handled our fake Garfield prompt decently, but didn’t quite nail the comic strip vibes we asked for. In the dragon prompt, it handled armored helmets better than the knight’s human face. And rather than tell me “no” outright with the Vader vs. Homer prompt, Firefly just made a bunch of wacky images that had nothing to do with the characters.
After your results have generated, Adobe also offers a number of image extras that give it a leg up on the competition, like Generative Fill (where you can change parts of an image), or adding text via Adobe Express. Annoyingly, Firefly does not autosave your prompts and conversations, so if you want to re-access an image later, you’ll need to favorite the image before you exit—otherwise it’ll be lost forever.
Overall Score: 13.5 out of 20
6. Dall-E
Dall-E is a text-to-image generation tool from OpenAI (the creator of ChatGPT), and much like the company’s other AI tools, using it is as easy as typing a prompt into a text entry field. Do so, and Dall-E will spit out one 1,024 by 1,024 pixel image per prompt unless you ask for something different.
That’s not all it can do, though. In addition to creating images from scratch, Dall-E can also upscale, extend, and in-paint (partially regenerate) existing images. Notably, images created in Dall-E are not shared with the general public, unlike some other services where you have to pay extra for that kind of privacy (looking at you, Midjourney).
In terms of performance, Dall-E did an adequate job with our first three test prompts, but didn’t even try on the 4th and 5th because they go “against Dall-E’s content policy.” Both times, it offered to help me adjust the prompt, even going so far as to say: “I can create a scene that features a sci-fi, dark-armored warrior wielding a red energy sword in a duel against a humorous cartoon character holding a salami, with a suburban house in the foreground and a large futuristic space station in the sky. Would you like me to generate that?” Of course, I said yes—but the result looks nothing like Darth Vader fighting Homer Simpson. Presumably, this is intentionally done to protect OpenAI (and you, the prompter) from potential lawsuits.
It’s also worth noting that Dall-E is the absolute worst when it comes to generating any kind of image that includes text. While my “Free Food” prompt did produce images that displayed those words, it also threw in a bunch of other stuff that was gibberish. On one box of apples, for example, it generated a label that says “FRPLE FOCD.” At best, you’ll need to use the text in Dall-E images as a placeholder and plan to perform some image editing to make it legible.
Overall Score: 13.5 out of 20
7. Meta AI
Never one to sit idly by and just stick to social media, Meta also has its own suite of generative AI tools—including a text-to-image offering called Emu, which is free for anyone with a Facebook or Instagram account.
Running Meta’s AI through my five prompts yielded some interesting results. My requests for people worked fine, though I wouldn’t call most of them photorealistic—they’re often very shiny and look like CGI. The cartoon cat prompt resulted in images with some weird artifacts (like giving the background dog a full giraffe neck) but the cat itself is mostly true to the prompt. It also did great with the “Free Food” text on signs, which is something most generators struggle with.
When it comes to making adjustments to generated images, Meta’s AI is rather limited. You can ask Emu to fix a tiny part of an image, but that often generates a completely new image. You also can’t specify a section for in-painting, nor can you upload your own image to be part of a prompt.
Meta also seems to be at least somewhat wary of intellectual property violations. Emu wouldn’t even try to generate the Darth Vader fighting Homer Simpson image based on the prompt we entered. However, our testing also revealed that you can sometimes bypass these roadblocks by using Meta’s real-time generation option, which adjusts your image as you type in more words. After some fiddling, we managed to coax Emu into generating a full suite of Vader versus Homer images.
Overall Score: 13 out of 20
8. NightCafe Studio
NightCafe Studio is a bit different than the rest of the image generators on this list. Rather than using just a single AI model to create imagery, NightCafe Studio offers text-to-image via whatever model you choose. Some of its models require payment to use, while others are free. This makes it somewhat tricky to review, as your results can vary significantly depending on the model you opt for, but for this test, we used the Flux model from Black Forest Labs.
With this model, our prompts created a mixed bag of results. NightCafe with Flux seems to excel at most depictions of humans, as the pie chefs and the woman in the dark are all impressively photorealistic. However, the results aren’t perfect. In one image, the chef is depicted as a man, despite the fact that our prompt specifically called for a woman. This is the only generator to make this misgendering mistake in our tests.
Similar mistakes appear in the rest of our test images. In the cartoon cat prompt, for example, most of the images are fine, but in one of them the dog has three eyes, and in another, he is most definitely a giraffe. In another instance, the flaming sword in our dragon image can be seen hanging from the reptile’s mouth like a fiery fishhook.
Still, while this particular model didn’t fare very well in our tests, that doesn’t necessarily mean you should write off NightCafe Studio as a bust. Serious AI image creators should definitely try NightCafe simply for its huge variety of models and fine-tuning options. If you’re not averse to some fiddling, it’s one of the most customizable image generators on this list.
Overall Score: 12.5 out of 20
9. Microsoft Copilot
Microsoft’s image generation AI can be accessed through a variety of different tools with different names (Copilot, Microsoft Designer, and Bing Image Creator), but they’re all fundamentally the same AI. To use them, simply enter your text-based prompt and Copilot will generate a set of four square images—all of which are private to you.
In terms of performance, Copilot handled prompts 1 and 3 with ease. It did well with human faces, and was savvy enough to understand that my cat prompt was a sly Garfield reference. The resulting images are passably comic-strip-like, albeit a bit too detailed for 1980s newsprint.
Copilot returned a mixed bag for my dragons and knights prompt, though. It stuck to the prompt in some ways (all the knights have red armor) but ignored other elements of the prompt (the dragons all spout fire, but not from the nose). There are also some weird AI hallucinations, like a red horse grafted to the dragon, or one instance where a dragon’s hind foot is actually a second tail.
For my Homer Simpson fighting Darth Vader prompt, Copilot didn’t hesitate to create images full of copyrighted intellectual property from Disney. In some of the images, even the Death Star in the background is pitch-perfect, so if you’re looking for an image generator with little to no guardrails, Copilot seems to be a solid choice.
When it comes to image extras, Microsoft’s AI is a bit lacking. Aside from small adjustments like adding a color pop or blurring the background, there are no tools for in-painting, nor any upscaling options (beyond changing the orientation from square to landscape). That’s arguably Copilot’s biggest weakness, as there’s not much you can do with an image after your initial prompt.
- Imagery: 3.5 out of 5
- Prompt Adherence: 3.5 out of 5
- Ease of Use: 2.5 out of 5
- Image Extras: 2.5 out of 5
Overall Score: 12 out of 20
10. Google Gemini
Google’s Gemini (formerly known as Bard) is a multi-purpose AI tool capable of generating both text and images from the same interface. It’s great in theory, but the dual-purpose nature can make it tricky to use at times.
For example, when entering the aforementioned prompts, I had to explicitly say “make an image of” before each one, or it wouldn’t work; Gemini would just spit out a more verbose, text-based rewrite. On top of that, after adding that preface to prompt 1, Gemini told me to hold my horses: “Image generation of people is coming soon to Gemini Advanced.” I was using the free version, so I started the free trial of Gemini Advanced. Still, I got the same warning. So prompts one and three couldn’t be used.
For prompt two, Gemini went the extra mile and put in word balloons filled with nonsense for the dog and the cat. The cat was truly enormous, and did indeed get stripes, but the look was more Heathcliff on crack than anyone would have tolerated in the funny pages of 1985.
The dragon prompts made some passable dragons, most with screwy-looking feet. In one, the knight is noticeably larger than the dragon. Gemini’s the only generator so far to actually place blue flames on the knight’s flaming sword, so that’s something.
I thought the prompt with the Disney IP would be squashed, but Gemini went for it. Only one of the three images passed muster, though I wouldn’t call it a duel as much as a pose. The other two were grotesque caricatures. All were clearly based on Star Wars and The Simpsons. Yet, when I uploaded my own picture and asked Gemini to cartoon-ize it, it shot back: “I can’t generate images yet.”
I wouldn’t turn to Gemini for image creation of any great caliber right now: It lacks the tricks to make consistent images (inasmuch as any AI image generator can right now), it won’t upscale, there’s no inpainting, and it won’t accept certain prompts. Perhaps because it lacks those features, though, it’s easier to use than most, even on mobile, where it’s right in the Google app; just click the 4-point star at the top.
Overall Score: 10.5 out of 20
11. Grok
X (you can still think of it as Twitter) has an AI chatbot, Grok, that includes image creation. I was able to access Grok for free, but it’s limited to 10 messages every two hours. The top of the web page says it is powered by Grok 2.
Grok definitely did not grok that our first prompt was meant to be set more than 130 years ago, giving each shot a modern photo look, albeit with a nice bokeh effect in most. All the better to hide the generator’s unsurprising inability to handle text in an image. “FOOD” looked more like “FOOO” at best.
The request for Garfield and Odie looked more like they’d been squeezed out of Studio Ghibli—but most did include a pan of lasagna. It didn’t try for word balloons. The women in prompt 3 came out fine, but there was more emphasis on the seriousness of the visage than on the twinkle-in-the-eye aspect.
The final two image sets had various issues with bizarre artifacts. One dragon was blowing fire out of his teeth, not his throat; another had a secondary head for no reason. The faces, when shown, were distorted, a very 1.0 kind of aspect for AI image generation. And in the case of the Empire vs. Springfield, the weirdness included a lot of problems with cartoon hands, unfinished drawings, lightsabers held or pointed at weird angles, and characters unable to look each other in the eye. But it didn’t balk at generating images based on other companies’ IP, which feels true to what Elon Musk is all about, unless it’s something he owns.
You get a set of four images with each prompt, which you can regenerate with a click. Click a single image to get a 1024-by-768 format version with a Grok watermark. You can’t upscale or inpaint on these images, nor can you specify an orientation, but you can adjust the prompt when you do a new iteration. Naturally, Grok makes it easy to post to X, but otherwise the images remain private to you. And that’s for the best. For the most part, Grok’s not even close to making images you’d want to share.
- Imagery: 2 out of 5
- Prompt Adherence: 3 out of 5
- Ease of Use: 3.5 out of 5
- Image Extras: 1.5 out of 5
Overall Score: 10 out of 20
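Wherever the four category ratings are listed in this roundup, the overall score matches their plain sum, with each category rated out of 5 for a maximum of 20. As a minimal sketch of that arithmetic, assuming simple unweighted addition (the function name and the range check are illustrative, not part of our testing tools), here is Grok’s tally in Python:

```python
from typing import Mapping

MAX_PER_CATEGORY = 5.0  # each category is rated out of 5


def overall_score(ratings: Mapping[str, float]) -> float:
    """Sum the category ratings (assumed unweighted) into an overall score."""
    for name, value in ratings.items():
        if not 0 <= value <= MAX_PER_CATEGORY:
            raise ValueError(f"{name} rating {value} is outside 0 to {MAX_PER_CATEGORY}")
    return float(sum(ratings.values()))


# Grok's category ratings as listed above.
grok = {
    "Imagery": 2,
    "Prompt Adherence": 3,
    "Ease of Use": 3.5,
    "Image Extras": 1.5,
}

max_total = MAX_PER_CATEGORY * len(grok)
print(f"Overall Score: {overall_score(grok):g} out of {max_total:g}")
```

The same sum reproduces the other listed totals, such as Midjourney’s 4.5 + 3.5 + 3.5 + 3.5 = 15.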
12. Getty Images AI Generator/GenerativeAI by iStock
Getty Images has much the same approach as Shutterstock (which it is currently trying to acquire). It gives users “automatic legal protection of up to $10,000 USD per image” because it’ll make sure you’re not stealing someone else’s legally protected art. You can even use it to modify art in the Getty library using an Nvidia model called Edify, which is part of Nvidia Picasso.
In testing with the more affordable iStock version, the interface allows you to choose a picture for reference (but only from the stock at iStock), choose photo or illustration output, and specify an aspect ratio. There are controls to set the camera lens type and colors. Totally unique here is the “negative prompt,” a set of terms you add below the regular prompt to exclude them from the image. The detail work you can do after generation is limited to inpainting and zoom-outs.
Off the bat, this tool proved it can’t handle text, not even close; it also couldn’t nail the time frame requested. Prompt two produced the same garbled, wholly unnecessary text, and also showed that Nvidia’s engine doesn’t know what a comic strip from the ’80s looks like. Prompt 3’s humans were arguably the most realistically rendered across all the generators tested, but the tool ignored clear instructions about the time of day and the age of the subject, and cut off one pic’s face entirely.
The final two prompts were blocked, and for once, a generator spelled out why. “‘Sword’ and ‘dungeons & dragons’ may violate our AI policy,” it said; it also didn’t like “D&D.” I switched to “blade” and “role-playing,” and finally it spit out some completely off-the-mark images that were more like logos for a football team in Middle-earth. On prompt five it balked at the words “death” and “weapon,” but even rewriting them didn’t work; it was likely choking on the character names, which was to be expected.
In the end, this alliance of Getty and/or iStock with Nvidia’s generative AI tool shows that it can do great things depicting real-looking people, probably thanks to the extensive amount of stock photos available for teaching. For anything else, generate elsewhere.
- Imagery: 2 out of 5
- Prompt Adherence: 2 out of 5
- Ease of Use: 3.5 out of 5
- Image Extras: 2 out of 5
Overall Score: 9.5 out of 20