Towards speed-of-light text generation with Nemotron-Labs diffusion language models
NVIDIA's Nemotron-Labs Diffusion family adds diffusion and self-speculation modes to 3B to 14B LLMs, delivering up to 6.4× faster token generation while keeping accuracy competitive.