AI has already dominated humans in tasks like chess and mathematics, and is increasingly gaining ground in fields like programming, advertising, and even therapy. But researchers think there is one frontier AI hasn’t been able to cross yet: being authentically toxic on the internet.
A recent paper from the University of Zurich, University of Amsterdam, Duke University, and New York University finds that social media posts generated by a variety of LLMs are “readily distinguishable” from those written by humans: classifiers picked out the AI-generated posts with 70–80% accuracy, “well above chance.”
Researchers tested nine open-weight LLMs from six model families (Apertus, DeepSeek, Gemma, Llama, Mistral, and Qwen), plus a large-scale Llama model, across Bluesky, Reddit, and X. The study, first reported by Ars Technica, found that a post’s “toxicity score” consistently emerged as one of the strongest signals separating AI replies from human ones.
Translation: If someone responds to your post with a particularly hilarious or cutting zinger, it was probably written by a human.
“These results suggest that while LLMs can reproduce the form of online dialogue, they struggle to capture its feeling: the spontaneous, affect-laden expression characteristic of human interaction,” said the researchers.
The study suggests LLMs are better at imitating the technical aspects of social media posts, such as sentence length and word count, than at expressing emotion. Across all three platforms, average toxicity scores were lower in the AI-generated replies than in the authentic human replies.
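To make the idea concrete: the paper’s detection setup is more involved, but a classifier that separates AI from human posts using simple stylistic features plus a toxicity score might look like the sketch below. The feature set, toy data, and scores here are illustrative assumptions, not the study’s actual pipeline.

```python
# Illustrative sketch only: separate human from AI replies using a few
# simple features. The numbers below are made-up toy data; the study's
# real classifier and feature set differ.
from sklearn.linear_model import LogisticRegression

# Each row: [character count, word count, toxicity score in 0-1]
features = [
    [62, 12, 0.81],   # human reply: short, punchy, fairly toxic
    [55, 10, 0.74],   # human reply
    [141, 28, 0.08],  # AI reply: longer, polite, low toxicity
    [133, 25, 0.12],  # AI reply
]
labels = [0, 0, 1, 1]  # 0 = human, 1 = AI

clf = LogisticRegression().fit(features, labels)

# Score a new, unseen post against the fitted model.
new_post = [[70, 13, 0.05]]
print(clf.predict_proba(new_post))  # probabilities for [human, AI]
```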
The findings come after AI users complained earlier this year that ChatGPT’s GPT-4o model had become too sycophantic. GPT-5 then overcorrected on that front, prompting OpenAI to bring back the friendlier GPT-4o after people lost their minds over GPT-5’s curt responses.
Researchers also found that models that had not been instruction-tuned to follow human directions, such as Llama-3.1-8B, Mistral-7B, and Apertus-8B, were better at passing for human than their instruction-tuned counterparts. They suggest this may indicate that alignment training “introduces stylistic regularities that make text more, rather than less, machine-like.”
The models struggled most in certain contexts, such as expressing positive emotion on Elon Musk’s X or on Bluesky, or discussing politics on Reddit. On the whole, all of the models tested imitated X posts more convincingly than Bluesky posts, with Reddit proving the hardest of the three to fake, since “conversational norms are more diverse” on the site, according to the researchers.
