NVIDIA/Model-Optimizer/ptq
Model Optimizer
NVIDIA
This skill should be used when the user asks to "quantize a model", "run PTQ", "post-training quantization", "NVFP4 quantization", "FP8 quantization", "INT8 quantization", "INT4 AW...
How to get this skill
Agent Skill by NVIDIA. Download or clone it, then install it in your agent.
Setup & Installation
- Clone the repository: git clone https://github.com/NVIDIA/skills.git
- Copy the skill folder (which contains SKILL.md) into your agent skills folder, e.g. .claude/skills/.
- Restart or reload the agent to auto-discover the skill.
- Check SKILL.md for any special instructions or requirements.
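The copy step above can be sketched as a small shell helper. This is a minimal sketch, not an official installer: the function name install_skill and both of its arguments are hypothetical, and the location of the skill folder inside the cloned repository is not stated in this listing, so the caller has to supply it.

```shell
set -eu

# install_skill SRC_DIR DEST_DIR   (hypothetical helper, not part of the repository)
# Copies a skill folder into an agent skills folder, refusing to copy
# a folder that does not contain SKILL.md.
install_skill() {
  # Strip a trailing slash so cp copies the folder itself, not its contents.
  src="${1%/}"
  dest="${2%/}"
  if [ ! -f "$src/SKILL.md" ]; then
    echo "error: $src does not contain SKILL.md" >&2
    return 1
  fi
  mkdir -p "$dest"
  cp -R "$src" "$dest/"
}
```

After running the git clone command from the first step, call install_skill with the path to the skill folder in the clone as the first argument and the agent skills folder (for example .claude/skills) as the second, then restart or reload the agent.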
Related skills
Model Optimizer
NVIDIA/Model-Optimizer/accessing-mlflow
Query and browse evaluation results stored in MLflow.
NVIDIA
Model Optimizer
NVIDIA/Model-Optimizer/debug
Run commands inside a remote Docker container via the file-based command relay (tools/debugger).
NVIDIA
Model Optimizer
NVIDIA/Model-Optimizer/deployment
Serve a quantized or unquantized LLM checkpoint as an OpenAI-compatible API endpoint using vLLM, SGLang, or TRT-LLM.
NVIDIA
Model Optimizer
NVIDIA/Model-Optimizer/evaluation
Evaluate the accuracy of quantized or unquantized LLMs using NeMo Evaluator Launcher (NEL).
NVIDIA