• Farid@startrek.website · 2 days ago

    I get the meme aspect of this. But just to be clear, it was never fair to judge LLMs on specifically this. The LLM doesn’t even see the letters in the words: every word is broken down into tokens, which are numbers. I suppose with a big enough corpus of data it might eventually infer which letters a word contains from texts describing that word, but normally that shouldn’t be expected.
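
    To make that concrete, here’s a minimal sketch (assuming OpenAI’s tiktoken library and the cl100k_base encoding; any BPE tokenizer would show the same effect) of how a word reaches the model as a few integer IDs rather than as letters:

    ```python
    # Sketch: inspect how a BPE tokenizer splits a word into subword tokens.
    # Assumes `pip install tiktoken`; cl100k_base is one common GPT encoding.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("strawberry")
    print(tokens)  # a short list of integer IDs, not ten per-letter IDs
    for t in tokens:
        # each ID maps back to a multi-letter chunk of the word
        print(t, enc.decode_single_token_bytes(t))
    ```

    The model only ever sees those integers, so “how many r’s are in strawberry” asks about letters it never directly observes.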

    • cyrano@lemmy.dbzer0.com (OP) · 2 days ago

      True, and I agree with you, yet we’re being told all jobs are going to disappear, AGI is coming tomorrow, etc. As usual, the truth is more balanced.

    • Zacryon@feddit.org · 2 days ago

      I know that words are tokenized in the vanilla transformer. But do GPT and similar LLMs still do that as well? I assumed they also tokenize at the character/symbol level, possibly mixed with additional abstraction further down the chain.

    • kayzeekayzee@lemmy.blahaj.zone · 2 days ago

      I’ve actually messed with this a bit. The problem is more that it can’t count to begin with. If you ask it to spell out each letter individually (i.e., so each letter gets its own token), it still gets the count wrong.
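
      To illustrate the letter-by-letter setup (a sketch, again assuming tiktoken; exact splits vary by tokenizer), spacing the letters out makes the tokenizer emit roughly one token per letter:

      ```python
      # Sketch: spaced-out letters tokenize to (roughly) one token each,
      # yet models can still miscount them, pointing at counting, not tokenization.
      import tiktoken

      enc = tiktoken.get_encoding("cl100k_base")
      spelled = "s t r a w b e r r y"
      tokens = enc.encode(spelled)
      print(len(tokens), [enc.decode_single_token_bytes(t) for t in tokens])
      ```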

      • Farid@startrek.website · 2 days ago

        In my experience, reasoning models can count, but not very consistently. I’ve tried random assortments of letters, and they sometimes count them correctly. They seem to have a much harder time when the same letter repeats many times, perhaps because those runs are tokenized irregularly.

    • Farid@startrek.website · 2 days ago

      I don’t know what part of what I said prompted all those downvotes, but of course all the reasonable people understood that the “AGI in 2 years” talk was a stock-price pump.