Linus Torvalds today shared some of his classic, straight-to-the-point wisdom on file-systems that support case-folding, i.e. case-insensitive file and folder names.
The post stems from Bcachefs: its case-folding implementation was found to be broken, and a fix was submitted this week for Linux 6.15. In response, Linus wrote a lengthy message on the Linux kernel mailing list outlining his views on case-folding functionality.
This Bcachefs case-folding issue also isn’t the first time Linux file-systems have run into problems with case-folding. Earlier issues have included case-folding behavior with emojis and other special Unicode characters.
Linus Torvalds laid out on the LKML his feelings about file-system case folding for case-insensitive files/folders:
“The only lesson to be learned is that filesystem people never learn.
Case-insensitive names are horribly wrong, and you shouldn’t have done them at all. The problem wasn’t the lack of testing, the problem was implementing it in the first place.
The problem is then compounded by “trying to do it right”, and in the process doing it horrible wrong indeed, because “right” doesn’t exist, but trying to will make random bytes have very magical meaning.
And btw, the tests are all completely broken anyway. Last I saw, they didn’t actually test for all the really interesting cases – the ones that cause security issues in user land.
Security issues like “user space checked that the filename didn’t match some security-sensitive pattern”. And then the shit-for-brains filesystem ends up matching that pattern *anyway*, because the people who do case insensitivity *INVARIABLY* do things like ignore non-printing characters, so now “case insensitive” also means “insensitive to other things too”.
For examples of this, see commits
5c26d2f1d3f5 (“unicode: Don’t special case ignorable code points”)
and
231825b2e1ff (“Revert “unicode: Don’t special case ignorable code points””)
and cry.
Hint: ❤ and ❤️ are two unicode characters that differ only in ignorable code points. And guess what? The cray-cray incompetent people who want those two to compare the same will then also have other random – and perhaps security-sensitive – files compare the same, just because they have ignorable code points in them.
So now every single user mode program that checks that they don’t touch special paths is basically open to being fooled into doing things they explicitly checked they shouldn’t be doing. And no, that isn’t something unusual or odd. *Lots* of programs do exactly that.
Dammit. Case sensitivity is a BUG. The fact that filesystem people *still* think it’s a feature, I cannot understand. It’s like they revere the old FAT filesystem _so_ much that they have to recreate it – badly.”
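To make the ignorable-code-point concern concrete, here is a minimal user-space sketch in Python. It is not the kernel's Unicode code: the `IGNORABLE` set is a small hand-picked subset, and the function names and the ".git" guard are hypothetical, chosen only for illustration. It shows a name passing a naive user-space check and then comparing equal to the protected name once a folding routine drops ignorable code points.

```python
import unicodedata

# Assumption: a tiny hand-picked subset of Unicode "default ignorable"
# code points (variation selector-16, zero-width space/joiners, soft hyphen).
IGNORABLE = {0xFE0F, 0x200B, 0x200C, 0x200D, 0x00AD}


def fold_strict(name: str) -> str:
    """NFD-normalize and case-fold, keeping every code point."""
    return unicodedata.normalize("NFD", name).casefold()


def fold_ignoring(name: str) -> str:
    """Like fold_strict, but also drop the ignorable code points above."""
    return "".join(ch for ch in fold_strict(name) if ord(ch) not in IGNORABLE)


def userspace_check(name: str) -> bool:
    """Hypothetical guard: True if the name is considered safe to touch,
    i.e. it is not '.git' under a plain case-insensitive comparison."""
    return name.casefold() != ".git"


heart_plain = "\u2764"        # heart
heart_emoji = "\u2764\ufe0f"  # heart followed by variation selector-16
sneaky = ".g\u200bit"         # ".git" with a zero-width space inside

# The two hearts differ only by an ignorable code point.
print(fold_strict(heart_plain) == fold_strict(heart_emoji))
print(fold_ignoring(heart_plain) == fold_ignoring(heart_emoji))

# The guard accepts the sneaky name, yet ignorable-dropping folding
# makes it collide with the protected one.
print(userspace_check(sneaky))
print(fold_ignoring(sneaky) == fold_ignoring(".git"))
```

Under these assumptions the strict comparison keeps the two hearts distinct, while the ignorable-dropping comparison treats them as equal, and the same rule makes the zero-width-space variant collide with ".git" after the user-space guard has already let it through.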
Classic Linus: sharing his technical views bluntly and straight to the point.