• Farid@startrek.website · 2 days ago

    I get the meme aspect of this. But just to be clear, it was never fair to judge LLMs on specifically this. The LLM doesn’t even see the letters in the words: every word is broken down into tokens, which are numbers. I suppose with a big enough corpus of data it might eventually infer which letters a word contains from texts describing that word, but normally that shouldn’t be expected.
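
    To make that concrete, here’s a minimal sketch (assuming OpenAI’s tiktoken library and the cl100k_base encoding; any BPE tokenizer would show the same effect) of how a word reaches the model as a few integer IDs rather than as letters:

    ```python
    # Sketch: inspect how a BPE tokenizer splits a word into subword tokens.
    # Assumes `pip install tiktoken`; cl100k_base is one common GPT encoding.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("strawberry")
    print(tokens)  # a short list of integer IDs, not ten per-letter IDs
    for t in tokens:
        # each ID maps back to a multi-letter chunk of the word
        print(t, enc.decode_single_token_bytes(t))
    ```

    The model only ever sees those integers, so “how many r’s are in strawberry” asks about letters it never directly observes.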

    • cyrano@lemmy.dbzer0.com (OP) · 2 days ago

      True, and I agree with you, yet we’re being told all jobs are going to disappear, AGI is coming tomorrow, etc. As usual, the truth is more balanced.

    • Zacryon@feddit.org · 2 days ago

      I know that words are tokenized in the vanilla transformer. But do GPT and similar LLMs still do that as well? I assumed they also tokenize at the character/symbol level, possibly mixed with additional abstraction further down the chain.

    • kayzeekayzee@lemmy.blahaj.zone · 2 days ago

      I’ve actually messed with this a bit. The problem is more that it can’t count to begin with. If you ask it to spell out each letter individually (i.e., so each letter gets its own token), it still gets the count wrong.
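
      To illustrate the letter-by-letter setup (a sketch, again assuming tiktoken; exact splits vary by tokenizer), spacing the letters out makes the tokenizer emit roughly one token per letter:

      ```python
      # Sketch: spaced-out letters tokenize to (roughly) one token each,
      # yet models can still miscount them, pointing at counting, not tokenization.
      import tiktoken

      enc = tiktoken.get_encoding("cl100k_base")
      spelled = "s t r a w b e r r y"
      tokens = enc.encode(spelled)
      print(len(tokens), [enc.decode_single_token_bytes(t) for t in tokens])
      ```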

      • Farid@startrek.website · 2 days ago

        In my experience, reasoning models can count, but not very consistently. I’ve tried random assortments of letters, and they sometimes count them correctly. They seem to have a much harder time when the same letter repeats many times, perhaps because those runs are tokenized irregularly.

    • Farid@startrek.website · 2 days ago

      I don’t know what part of what I said prompted all those downvotes, but of course all the reasonable people understood that the “AGI in 2 years” talk was a stock-price pump.