• i_love_FFT@jlai.lu

    The main breakthrough of LLMs happened when they figured out how to tokenize words… The transformer architecture had already been tested on various data types and struggled compared to similarly advanced CNNs.
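
    (For context, "figuring out how to tokenize words" means splitting text into subword pieces and mapping them to the integer IDs a transformer actually consumes. Below is a minimal toy sketch of the idea, using greedy longest-match against a hand-made vocabulary rather than the learned BPE/WordPiece merges real LLM tokenizers use:)

    ```python
    # Toy subword tokenizer: greedy longest-match against a tiny hand-made vocabulary.
    # Real LLM tokenizers (BPE, WordPiece) learn their vocabularies from data, but the
    # output is the same kind of thing: a list of integer token IDs fed to the model.

    vocab = {"un": 0, "believ": 1, "able": 2, "token": 3, "ize": 4}

    def tokenize(word: str, vocab: dict[str, int]) -> list[int]:
        ids = []
        i = 0
        while i < len(word):
            # take the longest vocabulary piece that matches at position i
            for j in range(len(word), i, -1):
                if word[i:j] in vocab:
                    ids.append(vocab[word[i:j]])
                    i = j
                    break
            else:
                raise ValueError(f"no vocabulary piece matches {word[i:]!r}")
        return ids

    print(tokenize("unbelievable", vocab))  # [0, 1, 2]
    print(tokenize("tokenize", vocab))      # [3, 4]
    ```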

    When they figured out word encoding, it created a buzz because transformers worked well with words. They never worked quite as well on images; for those, Stable Diffusion (a CNN-based approach) has always been better.

    It’s only because of the buzz around LLMs that they tried applying them to other data types, mostly because that’s how they could get funding. By throwing in a disproportionate amount of resources, it works… but it would have been so much more efficient to use different architectures.