After a year of self-hosting LLMs, I realized the real bottleneck isn’t the GPU
… A MacBook with 16GB or 32GB of unified memory can load and run LLMs at speeds that rival or exceed those of systems with discrete GPUs, making Apple laptops surprisingly competitive for local AI. …