This feels like a “WATER IS WET!” type situation with the exception that a large part of the population would end up being shocked and appalled by the idea of water being wet.
You can tell whether someone will be shocked by this news by whether they unironically call it AI or call it an LLM. It’s not a foolproof method, but it sifts out the masses.
I just wish LLM felt better to say than AI. Like AI just rolls off the tongue but LLM feels like a mouthful. That doesn’t help :(
I’m not surprised by those results at all but I think it’s a very good thing to see some actual research and numbers.
Wow, what sort of advanced techniques of investigative journalism did they deploy? Use the thing for five minutes and count?
I’m not even a big hater of LLMs and I could have told you that for free.
Well, yeah, I could’ve told you that too but neither of us would have any proof. It’s one thing to try it out and decide that it sucks for your use case and another thing to measure and quantify it somehow.
Why such a negative reaction if you apparently agree with the outcome?
Well, for one thing, it’s part of a wider trend of misreporting about AI. For another, the more interesting, meaningful angle here would be why the BBC’s (frankly very simplistic) research is at odds with the supposedly more rigorous benchmarks used for LLM quality testing and reported with new releases.
In fact, are they more rigorous? What do they actually measure? Should people learn about them and understand them before engaging? Probably, yeah, right? But the BBC is saying their findings have “far reaching implications” without engaging with any of those issues, which are not particularly obscure or unknown in the field.
The gap between what’s being done in LLM development, what is being reported about it, and how the public at large understands it is bizarre and hard to quantify. I believe once the smoke clears people will have some guilt to process about it, regardless of how the hype cycle ends up.
Yeah, I intentionally left out the word “groundbreaking” from the title when posting, because that’s a ridiculous thing to say about this research. Obviously, it could be much better.
But I would say that any attempt at a rational look at LLMs in mainstream media is a step in the right direction.
I’m not surprised. And I strongly recommend that people ask these assistants questions about a topic they actually know well; they’ll notice how much crap the bots output. Now consider that the bot is also bullshitting about the things you don’t know.