Does anyone know of any OSS LLMs that can search the web the way ChatGPT can?
It’s not the LLM that does the web searching, but the software stack around it. On its own, an LLM is just a text completer. What you’d need is a frontend like Open WebUI or Perplexica that asks the LLM for, say, five internet search queries likely to return useful information for the prompt, throws those queries into SearXNG, and then pipes the results into the LLM’s context for it to use.
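To make that loop concrete, here is a rough sketch of the three steps in Python. It is not how Open WebUI or Perplexica actually implement it. `complete` is a hypothetical function you supply that sends a prompt to your model and returns its text, and the SearXNG address (`http://localhost:8080`) is an assumption about a local instance with the JSON output format enabled in its settings.

```python
import json
import urllib.parse
import urllib.request


def searxng_search(query, base_url="http://localhost:8080", limit=3):
    """Query a SearXNG instance's JSON API and return up to `limit` results.

    Assumes a local instance at `base_url` with the `json` format enabled.
    """
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    with urllib.request.urlopen(f"{base_url}/search?{params}", timeout=10) as resp:
        data = json.load(resp)
    return data.get("results", [])[:limit]


def answer_with_search(prompt, complete, search=searxng_search, n_queries=5):
    """`complete` is a hypothetical callable: prompt string in, model text out."""
    # Step 1: ask the model for search queries, one per line.
    raw = complete(
        f"Write {n_queries} web search queries, one per line, that would "
        f"help answer this prompt:\n{prompt}"
    )
    queries = [line.strip() for line in raw.splitlines() if line.strip()]
    queries = queries[:n_queries]

    # Step 2: run each query through the search engine and collect snippets.
    snippets = []
    for query in queries:
        for result in search(query):
            snippets.append(
                f"{result.get('title', '')} ({result.get('url', '')}): "
                f"{result.get('content', '')}"
            )

    # Step 3: put the results into the model's context and ask again.
    context = "\n".join(snippets)
    return complete(
        f"Search results:\n{context}\n\nUsing these results, answer:\n{prompt}"
    )
```

Both the model call and the search function are passed in, so you can point `complete` at whatever local server you run and swap `search` for another backend without touching the loop itself.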
As for the models themselves, any decently sized one that was released fairly recently would work. If you’re looking specifically for open-source rather than open-weight models (meaning that the training data and methodologies were also released rather than just the model weights), the OLMo models are the recent standout there. If open weights are enough, GPT-OSS 20B/120B and the Qwen3 series are pretty good. (There are other good models out there; this is just what I remember off the top of my head.)
Thank you
Depends. Does ChatGPT ignore robots.txt too?