Bluesky is working on giving users a little more control over their privacy. The company published a proposal on GitHub outlining the changes it is considering to that end.
“This draft describes how atproto accounts (eg, Bluesky users) could declare ‘intents’ (aka, preferences) about certain categories of reuse of their public content. The mechanism and expectations are similar to robots.txt files on the web: a machine-readable format, which good actors are expected to abide, and does carry ethical weight, but is not legally enforceable,” the proposal reads.
That is quite a bit of technical language, but the meaning is pretty clear. Robots.txt is a file most websites have—including this one—that tells robots that scrape the Internet what they can and cannot do with the data they find along the way. Bluesky would implement settings that would allow users to tell those same bots what they can and cannot do with their Bluesky data.
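For a sense of how a well-behaved bot consults robots.txt before fetching a page, here is a minimal sketch using Python's standard-library parser. It illustrates ordinary robots.txt only, not Bluesky's proposed format, and the bot names, rules, and URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler entirely,
# and keep every other bot out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks before it fetches; nothing forces it to.
ai_allowed = parser.can_fetch("ExampleAIBot", "https://example.com/posts/1")
search_allowed = parser.can_fetch("ExampleSearchBot", "https://example.com/posts/1")
private_allowed = parser.can_fetch("ExampleSearchBot", "https://example.com/private/x")

print(ai_allowed, search_allowed, private_allowed)
```

The check happens entirely on the crawler's side, which is why the file works as a request rather than a lock.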
It gets a little sticky because robots.txt is a suggestion rather than a hard rule. However, as it stands right now, Bluesky is a public website, and as such, generative AI platforms and other data scrapers, like Google Search, have free rein over what they find there.
Bluesky head honcho Jay Graber talked briefly about this at South by Southwest last week, but the discussion got more attention when Graber posted about it on Bluesky on Friday. Per TechCrunch, some users were initially alarmed until Graber explained the situation more succinctly.
“Gen AI companies are already scraping public data from across the web, and everything on Bluesky is public like a website is public,” Graber said. “But in the history of the open web, standards like robots.txt emerged that most search engines came to respect. This is a proposal to create a new, similar standard.”
Scraping the web to train generative AI has been controversial for as long as the technology has existed, and many digital denizens have been trying to prevent AI from learning from their content. Some companies, like Meta, have been accused of using untoward methods of training AI models, up to and including piracy.
It’s a reality that Graber has been staunchly against. Last week at SXSW, Graber wore a T-shirt that read Mundus sine Caesaribus (“A world without Caesars” in Latin), taking a dig at a similar T-shirt Mark Zuckerberg wore that read Aut Zuck aut nihil (“Zuck or nothing”).
Bluesky sold Graber’s Mundus T-shirt on its website, where it sold out in minutes.