That’s true, but they’re also pretty good at verifying things as an independent task.
You can give them a “fact”, ask “is this true, misleading, or false?”, and they’ll do a good job of judging it. GPT-4 in particular is excellent at this.
Basically, whenever I use it to generate anything factual, I then paste the output into a separate chat instance and ask it to verify each sentence (I ask it to wrap each sentence in <span> tags so the misleading and false ones can be coloured orange and red).
It’s a two-pass solution, but it makes it a lot more reliable.
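Roughly, the two passes look like this (a minimal sketch in Python against the OpenAI chat API; the model name and the prompt wording are just placeholders, not my exact setup):

```python
# Two-pass sketch: generate in one chat, verify sentence-by-sentence in a
# fresh chat. Assumes the `openai` Python package (v1+) with an API key in
# the environment; model name and prompts are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
MODEL = "gpt-4o"   # placeholder; any capable chat model


def generate(question: str) -> str:
    """Pass 1: ask the model to produce the factual answer."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content


def verify(text: str) -> str:
    """Pass 2: a separate chat instance checks the answer sentence by sentence."""
    prompt = (
        "For each sentence in the text below, judge whether it is true, "
        "misleading, or false. Return the text unchanged except that every "
        'sentence is wrapped in <span class="true">, <span class="misleading">, '
        'or <span class="false"> accordingly.\n\n' + text
    )
    resp = client.chat.completions.create(
        model=MODEL,
        # No shared history with pass 1: the verifier sees only the text.
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    draft = generate("Summarise the causes of the 1929 Wall Street crash.")
    print(verify(draft))  # CSS elsewhere colours .misleading orange and .false red
```

The point is just that the second call starts with no shared history, so it judges the text cold instead of defending its own output.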
So your technique to “make it a lot more reliable” is to ask an LLM a question, then run the LLM’s answer through an equally unreliable LLM to “verify” the answer?
We’re so doomed.
Give it a try.
The key is in the different prompts. I don’t think I should really have to explain this, but different prompts produce different results.
Ask it to create something, it creates something.
Ask it to check something, it checks something.
Is it flawless? No. But it’s pretty reliable.
It’s literally free to try it now, using ChatGPT.
Hey, maybe you do.
But I’m not arguing anything contentious here. Everything I’ve said is easily testable and verifiable.