“misrepresent” is a vague term. Actual graph from the study
The main issue is the usual one: sources. AI is bad at sourcing without a proper pipeline. They note that Gemini is the worst, at 72%.
Note that they’re not testing models with their own pipeline; they’re testing other people’s products. This says more about the product design than about the actual models.