NVIDIA Dynamo
…It supports open-source inference engines, including SGLang, TensorRT™ LLM, and vLLM, and simplifies distributed serving by disaggregating the various phases of inference across different GPUs, intelligently routing requests…