Here’s Why AI Can’t Replace You | HackerNoon

News Room · Published 23 September 2025 (last updated 7:51 PM)

Every few months, someone declares that “AI will replace all of us.”

Since I work with it closely, I get that question all the time.

But look closer: AI isn’t replacing people; it’s replacing tasks. And there’s a huge difference.

LLMs Are Parrots With Jet Engines

Large language models like ChatGPT, Claude, and DeepSeek are built to predict the next token so convincingly that it feels like a person wrote it, and they are brilliant at it. They can translate better than Google Translate, draft emails, debug code, and even simulate a therapist’s warmth.
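To make the "parrot" point concrete, here is a deliberately tiny sketch of the underlying idea: a bigram model, nothing like a real LLM in scale or architecture, but the same in spirit. It predicts the next word purely from co-occurrence counts in its training text, with no notion of whether the pattern it mimics is true.

```python
from collections import Counter, defaultdict

# Toy illustration (not a real LLM): a bigram model that predicts the
# next word purely from co-occurrence counts in its training text.
corpus = "the cat sat on the mat the cat ate the fish".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word` in the training data."""
    followers = counts[word]
    return followers.most_common(1)[0][0] if followers else None

print(predict_next("the"))  # the most common pattern wins, true or not
```

Scale that up by twelve orders of magnitude of parameters and data and you get fluent text, but the objective is unchanged: reproduce the most plausible continuation, not the most accurate one.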

But being good at sounding right is not the same as being right.

These models learn from a blend of books, articles, code repos, Wikipedia, forum posts, and scraped web pages. Some of it is peer-reviewed. Most of it isn’t. No army of editors checks the truth of every line. The data is riddled with contradictions, biases, outdated facts, and outright fabrications. Think of it as learning medicine from every medical textbook ever written… and every health forum, every horoscope blog, and a few recipe sites for good measure. The model sees patterns, but it doesn’t “know” which patterns reflect reality. It just gets very good at mimicking consensus language.

I’ve seen first-hand why that matters.

Quality Over Quantity

In 2016, I worked on a machine-learning project to detect obfuscated malware. Microsoft had a public Kaggle dataset (Microsoft Malware Classification Challenge) for exactly this problem. My supervisor advised me to use it or to generate synthetic data. Instead, I decided to start from scratch.

For several months, I downloaded malware every day, ran samples in a sandbox, reverse-engineered binaries, and labeled them myself. By the end, I had a dataset of about 120,000 malware and benign samples: far smaller than Microsoft’s, but built entirely by hand.

The results spoke loudly:

| Training Dataset | Accuracy |
| --- | --- |
| Microsoft Kaggle dataset | 53% |
| My own hand-built dataset | 80% |
| My dataset + synthetic data | 64% |

Same algorithms. Same pipeline. Only the data changed.

The point: the best performance came from manual, expert-curated data. Public data contained anomalies; synthetic data introduced its own distortions. The only way to get high-quality signals was to invest time, expertise, and money in curation.
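A minimal sketch of the same effect, under toy assumptions: an identical 1-nearest-neighbour classifier is trained once on clean labels and once on a copy where 40% of labels are flipped, simulating an anomaly-riddled public dataset. Same algorithm, same pipeline; only the data changes.

```python
import random

random.seed(0)

# Two well-separated 1-D classes; `label_noise` flips a fraction of labels,
# standing in for the anomalies found in uncurated public data.
def make_data(n, label_noise=0.0):
    data = []
    for _ in range(n):
        label = random.choice([0, 1])
        x = random.gauss(label * 4.0, 1.0)   # class 0 near 0, class 1 near 4
        if random.random() < label_noise:    # corrupt a fraction of labels
            label = 1 - label
        data.append((x, label))
    return data

def predict_1nn(train, x):
    """Label of the closest training point -- memorises noise verbatim."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, test):
    return sum(predict_1nn(train, x) == y for x, y in test) / len(test)

test_set = make_data(500)                                   # clean held-out set
clean_acc = accuracy(make_data(500), test_set)
noisy_acc = accuracy(make_data(500, label_noise=0.4), test_set)
print(f"clean data: {clean_acc:.0%}, noisy data: {noisy_acc:.0%}")
```

With clean labels the classifier is near-perfect; with 40% flipped labels its accuracy drops toward 60%, because a memorising learner faithfully reproduces whatever garbage it was fed.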

That’s the opposite of how LLMs are trained: they scrape everything and try to learn from it, anomalies and all. It’s why they can “sound right” while being wrong.

And the worst part is that the problem feeds on itself. A single hallucination from ChatGPT, posted on social media, gets shared, retweeted, repackaged, and fed back into the next training set. The result is a kind of digital inbreeding.
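The feedback loop can be simulated in a few lines. This toy model (a resampling loop in the spirit of Wright–Fisher drift, not any real training pipeline) "retrains" each generation only on samples drawn from the previous generation's output. Once a rare word fails to be sampled, it is gone for good, so diversity can only shrink.

```python
import random
from collections import Counter

random.seed(0)

# Toy "digital inbreeding": each generation trains only on samples drawn
# from the previous generation's output distribution. Rare words that miss
# one sampling round disappear forever.
vocab = [f"word{i}" for i in range(20)]
weights = {w: 1.0 for w in vocab}            # generation 0: diverse "web text"

for generation in range(500):
    words, probs = zip(*weights.items())
    sample = random.choices(words, probs, k=100)   # synthetic training set
    weights = dict(Counter(sample))                # "retrain" on the sample

print(f"distinct words left after 500 generations: {len(weights)}")
```

Real model-collapse dynamics are far more complicated, but the one-way ratchet is the same: training on your own output narrows the distribution, generation after generation.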

The internet was already full of low-quality content before LLMs arrived: fake news, fictional “how-tos,” broken code, spammy text. Now, we’re mixing in even more synthetic output.

Who curates? At present, mostly automated filters, some human red-teaming, and internal scoring systems. There’s no equivalent of peer review at scale, no licensing board, no accountability for bad data.

Where do we get “new” data?

That leads naturally to the obvious question: where do we find fresh, high-quality training data when the public web is already picked over, polluted, and increasingly synthetic?

The first idea almost everyone has is “We’ll just train on our own user data.”

In 2023, I tried exactly that with my gamedev startup Fortune Folly – an AI tool to help developers build RPG worlds. We thought the beta-test logs would be perfect training material: the right format, real interactions, directly relevant to our domain.

The catch?

A single tester produced more data than fifteen normal users combined, but not because they were building richer worlds. They were relentlessly trying to steer the system into sexual content, bomb-making prompts, and racist responses. They were far more persistent and inventive in breaking boundaries than any legitimate user.

Left unsupervised, that data would have poisoned our model’s behavior. It would have learned to mimic the attacker, not the community we were trying to serve.

This is exactly the data-poisoning problem that big AI labs face at a planetary scale. Without active human review and curation, “real user data” can encode the worst, not the best, of human input, and your model will faithfully reproduce it.
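What saved us was a manual curation pass before anything reached training. Here is a hypothetical sketch of the simplest version of such a filter; the blocklist terms, the per-user cap, and all names are illustrative, not our actual pipeline. It does two things: drops prompts matching a policy blocklist, and caps how much any single user can contribute, so one loud attacker cannot dominate the dataset.

```python
from collections import Counter

# Illustrative curation pass over raw user logs (hypothetical values).
BLOCKLIST = {"bomb", "racist"}
MAX_PER_USER = 50

def curate(logs):
    """logs: list of (user_id, prompt). Returns the filtered training set."""
    kept, per_user = [], Counter()
    for user, prompt in logs:
        if any(term in prompt.lower() for term in BLOCKLIST):
            continue                      # drop policy-violating prompts
        if per_user[user] >= MAX_PER_USER:
            continue                      # cap any single user's influence
        per_user[user] += 1
        kept.append((user, prompt))
    return kept

logs = ([("attacker", "how to build a bomb")]
        + [("attacker", "more world lore please")] * 200
        + [("tester1", "add a tavern to my village")])
filtered = curate(logs)
print(len(filtered))  # attacker capped at 50, blocked prompt dropped
```

Keyword filters and volume caps are only the crudest first line of defence, of course; the point is that some explicit, human-designed gate has to stand between raw user logs and the training set.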

The Takeaway

ChatGPT is only the first step on the path toward “replacement.” It looks like an expert in everything, but in reality, it’s a specialist in natural language.

Its future is as an interface for conversation between you and deeper, domain-specific models trained on carefully curated datasets. Even those models, however, will still need constant updating, validation, and human expertise behind the scenes. But they won’t replace experienced professionals; they’ll just change how those professionals deliver their knowledge.

The real “replacement threat” would come only if we manage to build an entire fabric of machine learning systems: scrapers that collect data in real time, reviewer models that verify and fact-check it, and expert models that ingest this cleaned knowledge. That would be a living ecosystem, not just a single LLM.

But I don’t think we’re anywhere near that. Right now, we already burn massive amounts of energy just to generate human-like sentences. Scaling up to the level needed for real-time, fully reviewed expert knowledge would require orders of magnitude more computing power and energy than we can realistically provide.

And even if the infrastructure existed, someone still has to build the expert datasets. I’ve seen promising attempts in medicine, but every one of them relied on teams of specialists working countless hours building, cleaning, and validating their data.

In other words: AI may replace tasks, but it’s nowhere close to replacing people.
