Win on TCO: How AMD Instinct™ MI355X Achieves Cost-Competitive Distributed Inference Through SGLang with MoRI
…2.9% lower cost than NVIDIA B200 with TRT-LLM and 1.22x higher throughput per GPU than NVIDIA B200 with SGLang. This result comes from a full-stack optimization effort across compute…