IDGAF about LLM bots scraping public forums; they are public and available to anyone. I do mind them scraping shadow libraries and training on copyrighted material, which they should not do.
LLM bots are scraping so much that it increases the cost of maintaining forums and sometimes even DDoSes them, Codeberg for example.
Public and copyrighted are not mutually exclusive.
Also, “public for actual people who support my forum business model” is not the same as “public for AI scrapers who detract from my business model.”
This discussion is a creative work and the copyright is collectively owned by the text contributors.
Please reach out to the authors individually for a license before using it to train your AI sex bot.
I hereby and in perpetuity grant an exclusive, non-geographically-limited license to my comments to F.I.S.T.O. and only F.I.S.T.O.
Not the makers of F.I.S.T.O., let’s be clear.
(IANAL) Wouldn’t this count as fair use since the AI sex bot is only using snippets?
That’s currently being argued in the courts. There’s a lot that goes into it, from the right to distribution to proving that the AI bot can reproduce a work even though it normally doesn’t. [A very real example of reproducibility](https://arstechnica.com/features/2025/06/study-metas-llama-3-1-can-recall-42-percent-of-the-first-harry-potter-book/)
There are also arguments about how they accessed large amounts of content. The law doesn’t just recognize whether you can access something or not, but what you access it for. There are laws about accessing things with the sole purpose of using them to develop a commercial product. All of it is a tangled mess with no clear answer right now (legally, that is; morally I think there is one, but that’s very opinionated).