When ChatGPT 5 was launched earlier this year, everyone expected a huge step forward in its capabilities. As it turned out, the improvement in what it can accomplish was marginal.
There’s a simple reason for this slowdown in progress: we’re running out of training data, which has prompted a scramble, and billions of pounds of investment, to create synthetic data.
However, there are two problems with tackling the data shortage with artificial data.
Firstly, synthetic data has been shown to introduce mistakes. It can perpetuate biases, skew decisions and lead to leaks of sensitive data. Secondly, synthetic data fundamentally cannot create any new intelligence.
When ‘new data’ is created from old, it’s simply an echo chamber. Gartner has estimated that by 2030 synthetic data will be more prevalent than real data, which suggests we are approaching a point where the development of AI stagnates…
