That’s not what I mean. When you contribute content to Stack Exchange, it is licensed CC BY-SA. There are websites that scrape this content and rehost it, or at least there used to be. I’ve had a problem before where all the search results were unanswered Stack Overflow posts or copies of those posts on different sites. Maybe, similar to Reddit, they restricted access to the data so they could sell it to AI companies.
Nobody smart scrapes them; they provide full dumps for you to download: https://data.stackexchange.com/