I haven't tried running any myself, so this is just from glancing at a few discussions in AI communities when it's come up, but I think Mistral 7B might be the current best, or a fine-tune of it such as Mistral 7B OpenOrca or Mistral 7B OpenHermes.
I think the most advanced open-source LLM right now is considered to be Mistral 7B OpenOrca. You can serve it via the Oobabooga GUI (which lets you try other LLM models as well). If you don't have a GPU for inference it will still run, but much slower; nothing like the ChatGPT experience.
Mistral OpenOrca is a good one. I pull about 10 to 11 tokens/sec, which is very impressive. For some reason, though, I cannot get GPT4All to use my 2080 Ti even though it is selected in the settings.
Can someone explain what the benefit of running all of these models locally is? Are they better than the freely available ChatGPT? Any good reading on how to learn/get started with all this?
I'd recommend koboldcpp for your backend and SillyTavern for your frontend, and I've been a fan of dolphin-2.1-mistral-7B. I've been using the Q4_K_S quant, but you could probably run a 13B model just fine.
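Once koboldcpp is up, SillyTavern just points at its API, but you can also sanity-check the backend directly. Here's a minimal sketch, assuming koboldcpp is serving its KoboldAI-style API on the default localhost:5001 (the port and payload fields are assumptions from memory, so verify them against what your koboldcpp console actually prints):

```python
# Minimal smoke test against a local koboldcpp server.
# Assumes the KoboldAI-compatible API on localhost:5001; adjust the
# port to whatever your koboldcpp console reports on startup.
import requests

payload = {
    "prompt": "Write a one-line greeting from a helpful assistant:",
    "max_length": 80,     # number of tokens to generate
    "temperature": 0.7,
}

resp = requests.post(
    "http://localhost:5001/api/v1/generate",
    json=payload,
    timeout=120,
)
resp.raise_for_status()

# The KoboldAI-style API nests the generation under results[0].text
print(resp.json()["results"][0]["text"])
```

If that returns text, the backend is fine and any remaining issues are on the frontend side.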
I've heard good things about the Nous-Hermes models (I was a big fan of their Llama 2 model). I'd stick to Mistral variants, personally; their dataset/training has far surpassed base Llama 2 stuff in my opinion.
It allows you to manage and download models from Hugging Face, and suggests models compatible with your machine. Additionally, it can start a local HTTP server that functions similarly to OpenAI's API.
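Because the local server mimics OpenAI's API, anything that can POST JSON can talk to it. A minimal sketch with plain requests, where localhost:1234 and the model id are placeholders (use whatever address and model name the tool actually shows you):

```python
# Query a local OpenAI-compatible endpoint.
# localhost:1234 and "mistral-7b-openorca" are placeholders;
# substitute the address and model id your local server reports.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "mistral-7b-openorca",  # hypothetical local model id
        "messages": [
            {"role": "user", "content": "Summarize why local LLMs are useful."}
        ],
        "temperature": 0.7,
    },
    timeout=120,
)
resp.raise_for_status()

# OpenAI-style responses nest the text under choices[0].message.content
print(resp.json()["choices"][0]["message"]["content"])
```

The upside of this design is that existing OpenAI client code usually works against the local server by just changing the base URL.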