I switched my local LLM setup to Ollama's new MLX engine, and my Mac suddenly feels twice as fast
…The engine also improves GPU-backed sampling, so tokens are generated faster than before. Ollama claims the updated engine can deliver roughly 20% higher output speed than the previous Q4_K…
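
If you want to check a speed claim like this on your own machine, one rough way is to compare tokens per second before and after the switch. The sketch below assumes you take the `eval_count` and `eval_duration` fields that Ollama's generate API includes in its final response (the duration is reported in nanoseconds). The helper names and the sample numbers are hypothetical and are not measurements from this article.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Output speed from response metadata; the duration is in nanoseconds."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def percent_speedup(old_tps: float, new_tps: float) -> float:
    """Relative change in output speed, expressed as a percentage."""
    if old_tps <= 0:
        raise ValueError("old_tps must be positive")
    return (new_tps - old_tps) / old_tps * 100.0


# Hypothetical runs of the same prompt on the old and new engine.
old_tps = tokens_per_second(eval_count=500, eval_duration_ns=10_000_000_000)
new_tps = tokens_per_second(eval_count=600, eval_duration_ns=10_000_000_000)
speedup = percent_speedup(old_tps, new_tps)
print(f"old: {old_tps:.1f} tok/s, new: {new_tps:.1f} tok/s, change: {speedup:.1f}%")
```

Run the same prompt several times on each engine and average the results, since a single generation can vary quite a bit with prompt length and whatever else the Mac is doing.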