Post-Training Quantization of LLMs with NVIDIA NeMo and NVIDIA TensorRT Model Optimizer | NVIDIA Technical Blog
…NeMo and TensorRT Model Optimizer offer a broad range of models suitable for quantization, including the following families: GPT, Llama, Gemma, StarCoder, and Nemotron (including the recently announced Nemotron4-340b). PTQ support also…
