NVIDIA/Megatron-Bridge/resiliency

This skill covers the resiliency features in Megatron Bridge: fault tolerance, straggler detection, in-process restart, preemption, and the re-run state machine.

How to get this skill

Agent Skill by NVIDIA. Download or clone it, then install it in your agent.

Setup & Installation

  1. Clone the repository: git clone https://github.com/NVIDIA/skills.git
  2. Copy the skill folder (which contains SKILL.md) into your agent skills folder, e.g. .claude/skills/.
  3. Restart or reload the agent to auto-discover the skill.
  4. Check SKILL.md for any special instructions or requirements.
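
The copy in step 2 can be sketched as a small shell helper. This is a minimal sketch, not an official installer: `install_skill` is a hypothetical function name, and the location of the skill folder inside the cloned repository is not stated here, so it is left as a placeholder to fill in after inspecting the clone.

```shell
#!/usr/bin/env bash
# Sketch: copy one skill folder into an agent's skills directory.
# install_skill is a hypothetical helper, not part of the NVIDIA repository.

install_skill() {
  local src="${1%/}"   # skill folder (must contain SKILL.md); trailing slash stripped
  local dest="$2"      # agent skills folder, e.g. .claude/skills

  # Refuse to copy a folder that does not look like a skill.
  if [ ! -f "$src/SKILL.md" ]; then
    echo "error: no SKILL.md in $src" >&2
    return 1
  fi

  # Create the skills folder if needed, then copy the skill folder into it.
  mkdir -p "$dest"
  cp -R "$src" "$dest/"
  echo "installed $(basename "$src") into $dest"
}

# Example (run manually; the skill's path inside the clone is an assumption):
#   git clone https://github.com/NVIDIA/skills.git
#   install_skill "skills/<path-to-skill-folder>" ".claude/skills"
```

Checking for SKILL.md before copying guards against pointing the helper at the wrong directory in the clone. The agent still needs to be restarted or reloaded afterwards, as in step 3.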