NVIDIA Gated DeltaNet-2: Decouples Erase & Write in AI

May 24 · Source: MarkTechPost

Summary

NVIDIA has released Gated DeltaNet-2, a new linear attention layer that splits the active memory edit into two channel-wise gates. The goal is to let a model edit its compressed memory without scrambling information already stored there. Earlier delta-rule models used a single scalar gate to control both forgetting old content and committing new content. Gated DeltaNet-2 separates those two decisions with a channel-wise erase gate and a channel-wise write gate, which allows more precise control over memory updates. At 1.3 billion parameters, trained on 100 billion FineWeb-Edu tokens, it outperforms several previous models, including Mamba-2, Gated DeltaNet, KDA, and Mamba-3, across various benchmarks. The approach could lead to more efficient and effective memory in AI models.
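To make the difference between one shared gate and two separate gates concrete, here is a minimal toy sketch in plain Python. It is not the update rule from the Gated DeltaNet-2 paper, which this summary does not give. It assumes a textbook delta-rule memory (a small matrix `S` edited along a key `k`), assumes the gates act per value channel, and leaves out any decay term. The function names `coupled_update` and `decoupled_update` and the parameters `beta`, `erase`, and `write` are hypothetical and used only for illustration.

```python
# Illustrative sketch only: NOT the update rule from the Gated DeltaNet-2 paper.
# The memory S is a d_v x d_k matrix stored as a list of rows.
# k is a unit-norm key and v is the new value to associate with it.

def read(S, k):
    """Return the value currently stored under key k, i.e. S @ k."""
    return [sum(s * kj for s, kj in zip(row, k)) for row in S]

def coupled_update(S, k, v, beta):
    """Delta-rule edit with one scalar gate: beta scales both erase and write."""
    old = read(S, k)
    return [[s - beta * o * kj + beta * vi * kj for s, kj in zip(row, k)]
            for row, o, vi in zip(S, old, v)]

def decoupled_update(S, k, v, erase, write):
    """Hypothetical edit with separate per-channel erase and write gates."""
    old = read(S, k)
    return [[s - e * o * kj + w * vi * kj for s, kj in zip(row, k)]
            for row, o, vi, e, w in zip(S, old, v, erase, write)]

# Two value channels, two key dimensions; the key currently stores [1.0, 1.0].
S = [[1.0, 0.0],
     [1.0, 0.0]]
k = [1.0, 0.0]
v = [5.0, 7.0]

# Scalar gate: every channel is half-erased and half-written.
coupled = coupled_update(S, k, v, beta=0.5)

# Channel-wise gates: channel 0 is fully overwritten, channel 1 is left alone.
decoupled = decoupled_update(S, k, v, erase=[1.0, 0.0], write=[1.0, 0.0])

print(read(coupled, k))
print(read(decoupled, k))
```

In this toy setting the scalar gate has to apply the same trade-off to every channel, while the two gate vectors can replace one channel and preserve another in a single step. Setting `erase` and `write` to the same constant in every channel gives the same result as the scalar-gate version.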

This is an AI-generated audio summary. Always check the original source for complete reporting.
