A lot of the forums I’m seeing talked about were the more technical or objective kinds. Like in a car forum there’d be repair manuals or parts lists, and fountain pen forums would have loads of images comparing inks side by side for different shades and hues. Those are the sorts of knowledge centers being discussed and reminisced about a lot here.
That’s currently being argued in the courts. There’s a lot that goes into it, from the right of distribution to proving that the AI bot can reproduce a work, even though it normally doesn’t and can’t reproduce everything. [A very real example of reproducibility](https://arstechnica.com/features/2025/06/study-metas-llama-3-1-can-recall-42-percent-of-the-first-harry-potter-book/)
There are also arguments about how they accessed large amounts of content. The law doesn’t just recognize whether you can access something or not, but what you access it for. There are laws about accessing things with the sole purpose of using them to develop a commercial product. All of it is a tangled mess with no clear answer right now (legally, that is; morally I think there is one, but that’s very opinionated).
I think there’s a lot of solid arguments against letting AI steal everything, but with the scraping there’s an even more immediate problem. They don’t rate limit or do it in an intelligent way. It becomes a full-blown DDoS that has taken down entire sites and slowed many more to the point of near uselessness.
They’re in a very literal sense crashing large chunks of the Internet and causing havoc which costs very real money to fix, either by upping server resources or installing AI scraping mitigation tools so that everyone still has access to the free information you mention.
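For what it’s worth, rate limiting on the scraper side isn’t hard. Here’s a minimal sketch of a per-host delay in Python. The class name `PerHostRateLimiter` and the `min_interval` parameter are made up for illustration, and this isn’t how any particular crawler actually works; it only shows the basic idea of spacing out requests to the same site.

```python
import time
from urllib.parse import urlparse


class PerHostRateLimiter:
    """Hypothetical helper: enforce a minimum delay between requests to one host."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between hits on the same host
        self._clock = clock               # injectable so the logic can be tested
        self._sleep = sleep
        self._next_allowed = {}           # host -> earliest time of the next request

    def wait(self, url):
        """Block until the URL's host may be hit again; return the seconds waited."""
        host = urlparse(url).netloc
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the next slot relative to when this request actually goes out.
        self._next_allowed[host] = max(now, ready_at) + self.min_interval
        return delay
```

You’d call `limiter.wait(url)` right before each fetch. Requests to different hosts don’t block each other, but no single site gets hammered.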
Doesn’t really solve the AI scraping or the silo problem, and as Codeberg found out recently, solving the AI scraping DDoS is never-ending.
Honestly this. Their CPUs melting down over the past couple of years and their refusal to even acknowledge it hurt their image more than any potential backdoor could.
It’s absolutely possible if you’re abusing it recreationally, especially if you start mixing it with who knows what else. The biggest risk is your respiratory system just shutting off.
That’s just an OD though. Long-term abuse, which many suspect Musk of, has a whole host of nasty symptoms that you can look up yourself.
I feel like we could find ways and tools to help in that situation without stealing the entirety of human knowledge, boiling our planet, and spending a small nation’s GDP. Like better code library discovery or a better mentor environment amongst coders.
I’ve also seen plenty of people get pointed in the exact wrong way to do things by leaning on generative AI and then have to spend even more time getting back on track.
Feasible? Only time will tell. Possible? Caltech did it two years ago. Look up MAPLE. Wireless energy transfer to/from space was achieved.
Classic Torment Nexus moment over and over again really
Seriously, copyright doesn’t just go away because it’s online. The concept of “right of reproduction” is a vast and well-defined area of law.
You can argue copyright law is garbage and archaic and needs to be overhauled, sure, but right now “if it’s on the Web it’s free” only counts if you’re Meta and can pay off a judge or something.
Time to build a small Dyson Sphere.
Codeberg was running Anubis. Apparently several bots have started just solving Anubis and scraping away again.