Is Really Simple Licensing the Answer for AI Crawler Compensation?
See how the RSL protocol could enable AI companies to pay royalties for using your content.
New RSL Protocol
Some of you have been blogging long enough to remember when RSS feeds were a big thing. (RSS stands for Really Simple Syndication.)
Well, the co-founder of RSS has spun up a new protocol for AI Data Licensing called Really Simple Licensing (RSL), as reported in this TechCrunch article. (I STRONGLY suggest you read this whole article.)
The terms and conditions for use of your content will be detailed in your robots.txt file.
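To give you a feel for how that might look, here's a rough sketch of a robots.txt pointing crawlers at licensing terms, with a tiny licensing file alongside it. To be clear, the directive and element names below are my own illustrative assumptions, not copied from the official spec, so check rslstandard.org for the real syntax before setting anything up.

```txt
# robots.txt — sketch of an RSL-style setup (names are illustrative)
User-agent: *
# Hypothetical directive pointing bots at the machine-readable license terms
License: https://example.com/license.xml
Allow: /
```

```xml
<!-- license.xml — hypothetical RSL-style terms; element names are assumptions -->
<rsl xmlns="https://rslstandard.org/rsl">
  <content url="/">
    <license>
      <!-- e.g. permit AI training only if a royalty is paid -->
      <permits type="usage">train-ai</permits>
      <payment type="royalty"/>
    </license>
  </content>
</rsl>
```

The general idea is that a bot reads your robots.txt, follows the link to the license terms, and either agrees to pay or stays out.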
Collection of royalties will be handled by the new RSL Collective, a nonprofit organization.
The article lists some of the supporters who have already signed onto this:
“A host of web publishers have already joined the collective, including Yahoo, Reddit, Medium, O’Reilly Media, Ziff Davis (owner of Mashable and Cnet), Internet Brands (owner of WebMD), People Inc., and The Daily Beast. Others, like Fastly, Quora, and Adweek, are supporting the standard without joining the collective.”
How it Would Work
All of this is very much like the royalty system used for music. But while it's super easy to determine when a specific song has been played, it's not so easy to determine where a piece of information came from.
For instance, there are a gazillion apple pie recipes on the internet, in books, on videos, etc. If AIs crawl all of them, filter for what’s common among them, and then present that as a prompt response, which content creator should get paid for the use of their info?
How Lawsuits Are Pushing this Forward
AI companies have been training on all the data they can get their bots on for nearly 5 years now, that we know of. And they have never offered compensation to anyone for it, which is what has everyone up in arms to the point of bringing a multitude of lawsuits against the biggest AI companies.
You may have recently read about the suit against Anthropic, where they agreed to pay authors $1.5 billion for training on their book data.
There are hundreds of similar suits making their way through the courts as we speak, and this settlement has the AI companies shaking in their boots.
You may also recall me reporting earlier this year about Cloudflare's new Pay Per Crawl security rules that allow publishers to block AI bots that are unwilling to pay for access to their content. Cloudflare recently caught Perplexity trying to use stealth bots to get around those AI bot restrictions. And that also has these AI companies super concerned.
So, all of this combined makes for a ripe atmosphere to bring a coordinated compensation system into play.
New WordPress Plugin
Developer James LePage jumped on this news immediately and vibe coded a WP plugin to support RSL. I say vibe coded because he listed Claude as one of the contributors!!
It’s only available via GitHub at the moment, and it is not something we will be installing anytime soon.
Will RSL Come to Other Platforms?
Maybe.
Google is already paying an estimated $60 million a year to crawl Reddit. That irked many subreddit moderators so much that they put their pages behind a paywall.
The point is, RSL doesn’t exclude AI firms from striking private deals with content creators or platforms where they blog. In fact, those platforms may adopt RSL as a centralized way to collect the royalties.
FYI, if you publish on Substack, there is a setting to block AI training on your freely available content, but it is turned off by default. And even with it on, it’s only a suggestion to crawler bots and is not enforceable. You would have to put it behind a paywall to stop all unauthorized bot access.
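That "only a suggestion" point applies to robots.txt blocking in general. For anyone who wants to at least ask the major AI training bots to stay out, here's what that looks like. These user-agent tokens (GPTBot for OpenAI, Google-Extended for Google's AI training, CCBot for Common Crawl) are the published ones as of this writing, but well-behaved bots are the only ones that honor this; it's not enforcement.

```txt
# robots.txt — politely ask known AI training crawlers to stay out
# (advisory only; it does not technically block anything)
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```

Actually stopping non-compliant bots requires something server-side or at the edge, like Cloudflare's bot rules or a paywall.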
Will RSL Be Adopted by AI Companies?
I think every content publisher wants to jump on board with RSL today!!
But there is no indication that any AI company wants to participate in this yet.
That may change quickly if any of the other pending AI content lawsuits favor creators and force AI companies to pay billions.
The Other Cost of AI Crawling
In recent BlogAid Tips Tuesday posts, you have heard me report frequently about the splat type crawling AI does on our sites now. It’s leading to big hosting resource usage for us and too much useless info for them. This is why Google has a crawl budget. AI companies have not figured this out yet and it’s costing them billions annually.
But, if they will have to pay to gain access to the content, and only be allowed into certain areas, that may fix this whole problem.
The AI companies might actually save money by paying to crawl only useful content.
At that point, something like the llms.txt standard may actually become useful too, as it would format the content so that it could be quickly digested by AI bots.
(FYI, I do not endorse using llms.txt at this time, as there is no standard and it may actually do more harm than good. Read the Aug 12 Tips Tuesday post for more details on whether using llms.txt is a good idea.)
We’ll See
I will be keeping a close eye on this news as it progresses.
But I don’t expect any real movement before the end of the year, at least not at our blogging level.
This is going to impact news media and product sites first.
Plus, all platforms that host content creators en masse, like Substack, are likely going to push for it as it lowers their hosting costs. And that will take time to implement.
I’ll keep you posted.