Ollama is still the easiest way to start local LLMs, but it's the worst way to keep running them
…If you want even more performance, the ik_llama.cpp fork pushes CPU and multi-GPU speed further, with three- to fourfold speedups in some multi-GPU configurations…