I can’t prove it, but I think that for a long time now most articles have been copy-pasted or some kind of summary, which might have happened automatically. Someone might write a real article on a topic and then hundreds of sites copy it without adding anything.
Yeah I get the main theory you’re going with and it’s been rampant even without AI bullshit. When sites became traffic-driven to an extreme they started trying to grab whatever was the most rage-inducing, verified or not. I forget the exact example but there was a guy that backtraced a rage bait article and it was like 10 articles deep, and at the bottom it was a misquoted tweet that was complete misinformation.
There’s no way an AI generated article costs less than a cent.
So humans writing articles cost more than the billions they’re pumping in to AI? I very much doubt it.
How is it a crisis? I’m expecting/hoping LLMs will just get increasingly worse as they are fed on their own slop, until they collapse into unusability and the world finally returns to sanity.
The crisis is that if the internet is bad enough that they can’t train LLMs on it, it’s also useless for us humans. And if the LLMs leave because there is no information and we come back, eventually we will generate enough content that it will be worth it again.
This is said as though it isn’t an immensely expensive endeavour to run these things and the only reason they’re this prevalent right now is the overspeculation and starved growth of US tech companies.
The crisis is that companies will have to pay fair prices for human labor again, which will lead to less profits for the precious shareholders
Not just precious shareholders, it’s a massive bubble that will impact the economy and people’s retirement when it bursts. They keep dumping more money into AI even though there are no returns.
On top of this, the scrapers that feed the AIs are creating more and more traffic, and therefore load, on sites that did not have it before.
So at this point we’re gonna have to go back to reading books…
Books made before 2019. Amazon is absolutely filled with AI generated books nowadays.
In fact, this whole “consume media only from 2010 and earlier” idea is getting more appealing by the day. I’d rather watch an anime from the 80’s where each frame was drawn by a human hand and somebody spent a week encoding it to extract all the details from the original analogue source, and the subtitles were made by a person who considered each nuance carefully as if their life depended on it, rather than watch a 2025 sequel to a prequel to a reboot of an existing IP where half the assets are AI, the subtitles are AI, the script is AI, and it’s just the most generic mass appealing thing ever made.
It’s pretty sad, creepy and hopeless, but this is exactly how I have been feeling lately with YouTube videos. If it’s made post GPT, I am not inclined to watch it unless it’s a channel I know has a stated anti-AI position, because at least then I know I’m getting human input. They are building up Plato’s cave around us stone by stone, we can’t even move out of the way, we are getting imprisoned, and the few of us who see it happening and shout are being drowned out by the noise of the billions around us who happily hum along.
The system itself needs to come down, with violence, or we won’t make it, none of us, and honestly even if it does, I am not sure we are going to survive. I feel like I’m playing the fiddle on the Titanic.
Since a lot of channels have started stating they use no AI, I have unsubscribed from quite a few that don’t explicitly say so.
Human forums. Throw in a tar pit for any scrapers
If you are a USian then you probably have access to a public library.
If you go to the website of said library, usually a city.gov or countystate.gov, you can create an account.
After you create said account, you can download apps (they will tell you which, but normally Libby or Hoopla; Hoopla hasn’t been working for me, but that might be a me issue).
Use the ID from the website on the apps and you can check out books to read. You also have access to comics and audiobooks. The Invincible comic is good, you should check it out.
Weird that third party apps, made by corporate entities, are needed for this. They’re public libraries funded with public money, it should be one unified backend with libre applications.
Then the libraries would have to pay for hosting, so they’d have to be the ones selling user data to advertisers and stuff. Hence the extra degree of separation / “plausible deniability”
If the hivemind cared about actual important priorities (instead of just the desire for low oil prices), then the Internet Archive would face more pressure to improve its infrastructure and the authorities would face more pressure to leave the Internet Archive tf alone.
Nostr can fix this someday
Then the libraries would have to pay for hosting, so they’d have to be the ones selling user data to advertisers and stuff. Hence the extra degree of separation / “plausible deniability”
What? Libraries don’t sell data to advertisers to acquire, maintain and lend books. Why would they do that to provide ebooks? You unitedstatians got used to this bizarre mix of private corporations and public services and ended up accepting the premise that it’s somehow mandatory.
I’m not accepting the premise, that’s why I use nostr on the internet and actual library buildings off the internet.
But it is mandatory, whether that’s acceptable or not. Authorities in the US aren’t gonna suddenly change their mind when it’s your local librarian instead of the Internet Archive trying to run things differently from the corporations. The librarian needs better infrastructure to stand up to the authorities and break away from the corporate way
You likely need to physically go to the library and prove your residency but yes. Then you can do the above.
Most new books are AI generated now.
If AI replaces humans then why would they need to continue training them? You don’t believe in the AGI hype do you? You don’t think our system is about anything other than efficiently racing to the bottom, do you?
Besides, humans will always figure out a better way, a new way, a fresh take. And there’s your training data.
I don’t get why it says “training AI on AI content makes it dumb” when people literally use synthesized data to carefully tune models. Today. I’m sure they’ll improve training by the time the internet is fully replaced.
This all assumes that it won’t take a human-like AI to replace humans. Whatever replaces us will be pitiful, but it will be supported by our institutions, trillionaires, and enough personal data to create fun headlines like “Elon knows so much about you that if each datum was a grain of sand it would be bigger than Saturn!”
The incumbents have their training data that can also be used for the next generations of AI. This is just them helping others to pull up the ladder to avoid more competitors.
Mirrors face, fingers point.
In part, this is what Microsoft Recall is about: scraping end users’ data at will to sort and feed to its LLMs, without the user ever seeing what is being scraped or having any real, lasting ability to shut Recall off and keep it shut down.
While I am aware that MS insists none of that is true, it is fact that 1) the snapshotted and OCR’d Recall data is now stored in an encrypted database that takes higher than average user skill to get into; 2) even users who turned off Recall saw it turned on again at the next Windows Update; and 3) even after MS said they were backing off Recall, MS continued to partner with hardware makers to create computers bundled with Windows 11 on top of the extra GPU necessary for processing all these Recall snapshots without making that sluggish Windows bloat even more sluggishly bloated than it already was.
So why all that money and effort, even as they claimed to be backing away from it, just to help a hypothetically forgetful user here and there? Data harvesting was always part of the payoff, why they were and are very willing to piss off a huge part of their own consumer base around the world by ending Windows 10 unnecessarily, and why even now they keep ramming Recall shit down the pipe when literally NO ONE wants it.
They get your data. At will. And as much of it as they like, without you ever having the opportunity to oversee what they’re getting, much less curate it. And after feeding it to their LLMs, they get to aggregate and broker it to their “partners” as well. Never forget what MS did in Palestine and the partners they can and will gladly work with, all based on massive collections of quietly gathered user data that either should not legally exist, or is not known outside of MS and its partners to exist at all.
In part, this is what Microsoft Recall is about: scraping end users’ data at will to sort and feed to its LLMs
That’s also what Google has always done. Want a large data set of emails? Look at our new free service, Gmail! Need a lot of images to train machine learning vision models? Check out our newest free backup tool, Google Photos! and so on. When they want one particular data type, they launch a free service that just so happens to collect this exact data type from millions of users.
You know Microsoft isn’t about the user experience the moment they removed free games from their distro
Calling Windows a distro, while technically true, feels offensive
Explains why my personal blog, wiki, and git repo keep getting hammered by hordes of AI company scrapers. If AI was intelligent, they’d download a single snapshot every month or so and share. But no, eight different scrapers using thousands of different IP addresses (to evade my fail2ban measures) each have to follow every single blame and diff link when a simple git clone operation would get them the hundreds of megabytes of content in one go.
They are getting better, though. More hits are to RecentChanges on my wiki, so there seem to be some optimizations going on. But I refuse to increase my operating costs beyond a few USD/month to serve AI bots when I know barely anyone human visits.
I love love love model collapse. It’s the absolute perfect conclusion to our Internet and this AI bubble.
They make AI. They realize that we’re not ready for it. But short term profits override any and all ethical concerns. So they business bro and release anyway, unleashing a flood of slop and garbage content, all of which is impossible to tell apart from human-made content. But the only way to keep their business going is to train new models, which needs more human-created content, which they have already destroyed and made indistinguishable from their slop.
It’s honestly just pure poetic justice. The ones who are hurting the next generation of AI models are AI companies.
The open internet will become divided into verified websites, and the rest will be left for bots to fight on forever.
It will be used as an excuse by our governments to force an ID verification system tied to your real life person. Refuse it and fight it in every possible way.
Meanwhile we need enclaves of not corporate bullshit (like fedi but also) including bringing back webrings and old school chats. And usenet. And irl word of mouth.
Because fuck all that shit.
I mean, verification doesn’t really help because in the end it’s still mostly humans posting the AI slop (I think).
That said, we’ll probably need some sort of reputation system. Something like a revamped GPG or Web Of Trust, where you a) can tag users/websites you find trustworthy and b) can see what other people you trust think about someone/something.
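To make the a) and b) above concrete, here is a rough sketch of what such a reputation system could look like. It is not how GPG or any existing Web of Trust works; the class name `TrustGraph`, its methods, and the +1/-1 scoring are all made up for illustration, and it only looks one hop out (people you trust directly).

```python
class TrustGraph:
    """Toy reputation store: each user tags others as trusted (+1) or not (-1)."""

    def __init__(self):
        # rater -> {target: +1 or -1}
        self.ratings = {}

    def tag(self, rater, target, trusted):
        """a) Tag a user or website as trustworthy (True) or untrustworthy (False)."""
        self.ratings.setdefault(rater, {})[target] = 1 if trusted else -1

    def opinions(self, me, target):
        """b) What the people I directly trust think about target."""
        result = {}
        for friend, score in self.ratings.get(me, {}).items():
            their_ratings = self.ratings.get(friend, {})
            # Only listen to people I tagged as trusted, and only if they have an opinion.
            if score > 0 and target in their_ratings:
                result[friend] = their_ratings[target]
        return result

    def score(self, me, target):
        """My own tag wins; otherwise sum the opinions of people I trust."""
        mine = self.ratings.get(me, {})
        if target in mine:
            return mine[target]
        return sum(self.opinions(me, target).values())


graph = TrustGraph()
graph.tag("alice", "bob", True)
graph.tag("alice", "carol", True)
graph.tag("bob", "some-blog", True)
graph.tag("carol", "some-blog", True)
graph.tag("mallory", "some-blog", False)   # alice never tagged mallory, so this is ignored
print(graph.opinions("alice", "some-blog"))  # {'bob': 1, 'carol': 1}
print(graph.score("alice", "some-blog"))     # 2
```

The point of the design is that a stranger (or a bot farm) cannot move your score for anything, because only tags from accounts you personally marked as trusted are counted. A real system would also need signatures, trust decay over multiple hops, and some way to revoke tags, none of which is shown here.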
The internet I loved died like 15 years ago anyway. It got replaced by ads and misinformation bots and hatred.
Hate to disagree but like this is the Internet and it’s pretty great and there are no ads and most of us are not bots and there’s a lot of love here
Lemmy is an exception to the rule.
It won’t stay that way for long.
Nah, if it remains unpopular enough it’ll be fine.
Beep boop
I’m deeply jealous of people who experienced the early internet. For me it has always been like this.
People say that early internet was like the wild west, and I agree. And I feel like a weathered old cowboy who saw my homestead on the frontier get surrounded by buildings, cars, industries, smog, noise, and the never ending growth of the population.
Capitalism fucked us all, and it keeps fucking us, and it will kill us.
In the early days, there were a lot of porn ads before ad blockers came along. There’s a nice sweet spot somewhere between that invention and the internet of today.
I was there when people were sending around Goatse pics and all our info came from Alta Vista.
It was pretty sweet, apart from the prolapse.
My dude, YouTube used to not have ads
The only time in my life I have ever seen ads on youtube is when I’ve watched videos on someone else’s computer or through “smart” TVs.
I am genuinely shocked that anyone puts up with it. Then again, I get it, not everybody knows how to get around it, and, more importantly, they are working very hard and will succeed at eliminating the possibility to circumvent it.
I learned if you report an ad for being offensive it cancels all the ads after as well as ending the current ad immediately. And it just takes an extra 2 seconds and 3 clicks. Works on smart tvs, phones, and my Xbox.
I don’t even feel like I’m lying because ads inherently are offensive to me.
It’s even worse - you can set it up for them, and one minor inconvenience later they’re back to how they used it before, ads and all. A lot of people truly don’t seem to care.
And Facebook used to just literally have your friends statuses show up on it and that was it. No ads or sponsored content.
But I’m also old enough to remember when getting somewhere involved using a map and asking for directions at gas stations.
Facebook was only glorious when it was made for college kids. The restricting of boomers and children was what made it great. It was the perfect app to boost/record the college experience.
But I’m also old enough to remember when getting somewhere involved using a map and asking for directions at gas stations.
Man, for all the advancements in technology, I fucking did this last week lmao.
My GPS was telling me to “continue to the intersection of Grand and 12th” then “take the third exit in the roundabout” when Grand and 12th was a stoplight and there was no roundabout in sight.
Well at least you’re well versed in the old ways.
I experienced a weird middle period because my parents had kids really late in their lives compared to most people. I experienced a lot of old tech because they liked to hold on to it. I remember my parents using maps to get around on a road trip and remember our VHS player but the first console I used was an xbox (though they never let me own a console 😔) and, because my parents were concerned about internet access, I didn’t get online until quite late in life compared to my peers. So I am very familiar with the old tech native to the millennial and gen X generations but have only experienced the internet post centralization. Bless my parents for keeping me offline as long as they did, I think this was a net good, but damn it I wish I could have experienced the wild west age of the internet. I wish being online felt like an exploration.
Wtf? How did they make money?
Oh my sweet young man. The internet in the early days was very open source and about proving connectivity, at least with the little people like us. Monetization drove people away like crazy and we segregated to places that didn’t have crazy pop up porn ads every time we clicked on something.
In my opinion Bill Gates is the most responsible for ruining the internet. He used his money to go around to universities and governments and private entities around the world and convince them to start monetizing what used to be all open source hobby work that was bettering mankind.
They didn’t
“You don’t really know what it is you have until it’s gone. Gone…gone.”
I’m still baffled about the expected outcome of replacing all content creation with AI for new events or ideas, like news articles.
If there are no humans to write the story first, where are the bots going to get the new information to plagiarize?
Wait, maybe that’s the sustainable endgame. If the problem is that there is no new human content for AI to learn from, then the AI companies can hire some humans to create literature, art, music, memes, etc. to carry on a stream of data for the AI to consume. They said that once AI had taken all the shit jobs, humans would be able to spend their time on art - maybe this is how it all ties together!
Haaaaaa…
The expected outcome is internet collapse. The internet is a mass communication tool that allows bothersome peasants to organize and educate themselves if they are inclined to do so. Which means it must be destroyed.
The poor, disenfranchised, and uneducated are significantly easier to manipulate. So the goal of the 1% is to break society, eliminate the ability to educate, eliminate the ability to organize, and then use their near total control over mass media to assure everyone that they are better off under the thumb of the technocratic state.
Apart from being hit with what seemed like random ban hammers, that was one of the reasons I left Reddit.
It just seems so “the same”. Every comment chain is filled with the same predictable comments and they’re all predictably commenting on the same predictable content.
I hear lots of Lemmy newbies saying they stay on Reddit because of the user base in their niche communities. I hope those same people post or comment here a little bit in their niche instances before buggering off back to the black hole of bots.
I love that I can spot individual users again without tagging them.
I love the guy who uses that ye olde symbol instead of typing the letters “th”
This place is awesome because of the limited user base IMO. Very rarely have I been met with negativity here.
I joined Reddit in 2008. I saw it devolve year by year. When it got big enough, it attracted the attention of the capital. Then it quickly became a propaganda mouthpiece of the money. I had big subs on there, now I’m aggressively permabanned, every single account I’ve ever had, easily a hundred of them. If you don’t play along, you don’t belong, it’s become just another controlled media outlet. It will happen here too, as soon as it becomes big enough to become a problem, they will find a way. It’s extremely scary.
Who could have imagined the Illuminati were a bunch of pimplefaced billionaire nerds acting all out in the open.
It’s wild that most people don’t understand this is happening. And it’s being manifested in a Mafia type way.
I honestly have no hope for society. People really are ignorant and arrogant to the point that they are basically like farm animals.
Yep. That about sums it up. It’s a bit worse than that though, because they can also actively poison the AI and make it express more positive right wing sentiments too. For now that’s still a bit tricky, because the AI goes mecha-Hitler, but don’t worry, they will fix that glitch in the coming months.
It doesn’t even matter if it does a poor job. I see people scrolling endlessly, 24/7, they’re already zombified. The people who can filter the bullshit or think critically about what they are fed online are an extreme minority. It doesn’t matter to the ruling class if some single digit percent of the population doesn’t buy it, fooling most people most of the time is absolutely enough to rule absolutely.
Anyone else inhabiting a very dark mindspace lately? We should start a club.
This is not a conspiracy to destroy the internet. This is a tragedy of the commons. It doesn’t take many bad agents to ruin it for everyone. AI slop is so cheap and has the potential to make profit, so it’s a no brainer if you don’t have a better way to make money.
The only reason the AI companies are crying is because now they need humans in the loop somehow. They’ll need to source data from only known and verified real-person sources (like reputable news sites, magazines, personal blogs, and podcasts), manually verify data is not slop, and/or pay people to generate data for them.
Capitalism will kill civilization. We can’t escape it. If we tear it down, we go down with it. If we don’t, it will collapse under its own weight, and soon, and it will crush us all under the flaming rubble.
I know I sound all doomsday-y. Sorry.
That’s a good point. The internet collapse also helps this new age of “information capitalism” to spread even further: any information that you may seek will be behind a paywall or mediated by an AI that you have to pay a subscription to access. In fact, that’s the best case scenario for all these AI companies: to kill free and reliable information.
He who controls the past controls the future. He who controls the present controls the past.
Most likely the only goal is getting short term massive investment like everything else. Just the capitalism playbook of find resource, hype it up and exploit it, draining all value before discarding it. Pump and dump.
Once AI becomes Super Aware AI^TM then we will have no need for original stories…We just need a trillion more dollars of funding, all your power, fresh water, and for you to rewrite all intellectual property laws. Trust me bro, we are sooo close.
I swear tech bro billionaires forgot they were the ones who wrote the checks to the marketing department that dreamed this stuff up and have just been getting high on their own supply. The number of people in Silicon Valley who actually believe AI is going to evolve into some version of the 40k Omni Messiah is too damn high.
I think it’s the greatest sunk cost fallacy the world has yet seen.
“We’ve poured trillions of dollars into this, and convinced a bunch of people it’s going to fundamentally transform humanity, so it MUST be a good idea!”
None of them want to be the first to admit that it was a bad bet.
A bad bet? It is earning them billions. Why would they stop?
Unchecked profit motive is what is going to destroy civilization- it already is. It already did. We are governed by a bizarre set of rules where aggressive competition is supposed to somehow bring about the best outcomes for humanity, which is just an absurd proposition, it’s all a racket, a game that has superseded rule of democracy and law, it is what controls every nation and every decision on the planet- the economy, profit, money.
It is about to great filter us.
Yeah, it just doesn’t make any sense. There hasn’t even been a legitimate scheme to make AI profitable in the long run. The only thing these idiots can dream about is cutting down labour cost at any conceivable level.
Like okay even if that were possible, we now have infinite cheap production but with no consumer demand?
Yeah…the whole thing is incredibly short-sighted, and that’s being charitable.
The capitalists have such a raging hard-on at the thought of not having to pay workers, they don’t want to listen to anyone who might give them a dose of reality.
The means of production are cannibalizing themselves, like the Lemongrabs in Adventure Time
That’s the thing that I wonder about. Like I know people who think all coding is going to be ai in the next couple of years. But where do they get data? Come up with better programs? Llama are, like you said, machines that just steal what’s already there. So if they are doing all the coding, we’re stuck right now? Nothing new or better?
Why do you think they store and analyze everything you do on your computer, like when you use Windows? “Would you like a permanent spell checker that you can’t turn off make sure anything you write is spelled correctly? No? Too bad, go use Linux. Oh shit sorry, it’s in the hardware now, you can opt out if you want, but not really, we are going to read everything you write on your computer and your phone and analyze every conversation you have on your phone and it’s all going to be processed and then eventually sold for scraps after we’ve extracted what we want from it”.
That’s where, and it’s been happening for way over a decade already.
Llama being the LLM corrected to is amazing
There are legitimately AI researchers who are investigating “how can we make models that can be trained on AI generated data”. These researchers have known about model collapse (LLMs recursively trained on generated data will degenerate after a few iterations) for over a year now.
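For anyone wondering why training on your own output degenerates, here is a toy sketch. It is nothing like a real LLM: the "model" is just a word-frequency table, and the function names, vocabulary, and sizes are invented for illustration. Each generation is trained only on text sampled from the previous generation, and it tracks how many distinct words survive.

```python
import random
from collections import Counter


def train(corpus):
    """'Train' a toy model: estimate word frequencies from the corpus."""
    counts = Counter(corpus)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def generate(model, n, rng):
    """Sample n words from the model's estimated distribution."""
    words = list(model)
    weights = [model[w] for w in words]
    return rng.choices(words, weights=weights, k=n)


def simulate(generations=30, corpus_size=200, vocab_size=50, seed=0):
    """Return the number of distinct words each generation's model knows."""
    rng = random.Random(seed)
    # Generation 0: "human" text with a long tail of rare words (Zipf-like weights).
    vocab = [f"w{i}" for i in range(vocab_size)]
    weights = [1 / (i + 1) for i in range(vocab_size)]
    corpus = rng.choices(vocab, weights=weights, k=corpus_size)

    support = []
    for _ in range(generations + 1):
        model = train(corpus)
        support.append(len(model))
        # The next generation only ever sees the previous model's output.
        corpus = generate(model, corpus_size, rng)
    return support


if __name__ == "__main__":
    print(simulate())
```

The list it prints can only stay flat or go down: a rare word that happens not to be sampled in one generation gets probability zero in the next model and can never come back. That loss of the tail is the same basic mechanism the researchers describe, just in a heavily simplified form.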
… I hope I was joking
The Hapsburg chin of the digital age