Nvidia is launching the HGX H200 GPU, an advanced chip for AI work

Nvidia has unveiled the HGX H200 GPU, its latest high-end graphics processing unit (GPU) designed for training and deploying artificial intelligence models. The H200 carries 141GB of next-generation HBM3e memory, giving it extra headroom for the AI models that generate text, images, and predictions. Strong demand for Nvidia’s AI chips has already lifted its sales, which are projected to surge roughly 170% in the current quarter.
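
To put that memory figure in perspective, here is a rough, illustrative sketch of how much memory a model’s weights alone occupy at a couple of common numeric precisions. The parameter counts and precisions are assumptions chosen for the arithmetic, not figures from Nvidia’s announcement, and the estimate ignores activations, KV caches, and framework overhead.

```python
# Back-of-the-envelope check: how do model weight sizes compare with the
# H200's 141GB of HBM3e? Parameter counts and precisions are illustrative
# assumptions; activations, KV cache, and framework overhead are ignored.

H200_HBM3E_GB = 141  # per-GPU memory capacity from the announcement

BYTES_PER_PARAM = {"fp16/bf16": 2, "int8": 1}

def weight_footprint_gb(num_params: float, precision: str) -> float:
    """Approximate memory needed just to store the model weights, in GB."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for params in (7e9, 70e9, 175e9):  # example model sizes, not Nvidia figures
    for precision in BYTES_PER_PARAM:
        gb = weight_footprint_gb(params, precision)
        verdict = "fits on one H200" if gb <= H200_HBM3E_GB else "needs multiple GPUs"
        print(f"{params / 1e9:>4.0f}B params @ {precision:>9}: ~{gb:6.1f} GB -> {verdict}")
```

At 16-bit precision, a 70-billion-parameter model’s weights already come to roughly 140GB, which is why 141GB of capacity on a single GPU matters for large generative models.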

Major tech companies like Microsoft, Google, Amazon, and Oracle are interested in acquiring these GPUs.

Nvidia is launching the HGX H200 GPU – Key Points

  • Nvidia’s HGX H200 GPU offers roughly 1.4x the memory bandwidth and 1.8x the memory capacity of its predecessor, the H100.
  • The H200 is slated for release in the second quarter of 2024, though how widely available it will be at launch remains uncertain.
  • Nvidia is collaborating with global system manufacturers and cloud service providers to make the H200 accessible.
  • Nvidia’s HGX H200 GPU uses a new, faster memory spec called HBM3e, increasing memory bandwidth to 4.8 terabytes per second and total memory capacity to 141GB.
  • It is compatible with systems already supporting H100 GPUs, and major cloud providers like Amazon, Google, Microsoft, and Oracle will offer the H200.
  • Nvidia has confirmed that the H200 will be compatible with systems built for the previous H100 model, allowing a seamless upgrade path for AI companies.
  • It aims to fuel the acceleration of generative AI and large language models (LLMs), maintaining Nvidia’s dominant position in the AI hardware market.
  • While raw compute performance hasn’t changed significantly, memory-bound workloads such as LLMs benefit greatly from the added capacity, with the H200 delivering up to 18x higher performance than the original A100 on those workloads.
  • Nvidia’s HGX H200 GPU stands out with 141GB of HBM3e memory and 4.8 TB/s of total bandwidth per GPU, a significant improvement over the H100.
  • The H200 features six HBM3e stacks, providing 76% more memory capacity and 43% more bandwidth than the H100 (see the quick check after this list).
  • Nvidia also teased the upcoming Blackwell B100.
  • Additionally, there is a new GH200, which combines the H200 GPU with the Grace CPU, featuring a combined 624GB of memory.
  • Pricing details are not disclosed, but the prior-generation H100s are estimated to be expensive, ranging from $25,000 to $40,000 each.
  • Several supercomputers, including the Alps and Venado, will be powered by GH200, with the Jupiter supercomputer being one of the largest installations.
  • Nvidia expects over 200 exaflops of AI computational performance to come online in the next year with these new supercomputer installations.
  • Nvidia is actively adapting to export restrictions by introducing new chips tailored for China, such as the H20, to maintain its supply and market share in the region.
  • Nvidia’s H100 chips are in high demand for processing data required in training and operating generative AI models.
  • Nvidia plans to triple its production of H100 in 2024 to meet the growing demand for AI chips.
  • The rapid pace of semiconductor advancement points to newer, faster Nvidia AI chips ahead: the company is moving to a one-year release cadence, with the forthcoming B100 chip, based on the Blackwell architecture, expected in 2024.
  • Together, the H200 and the China-focused chips reflect Nvidia’s determination to navigate the challenging regulatory landscape while continuing to serve the growing demand for AI chips, including in China.
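
The memory comparisons above reduce to simple ratios and can be sanity-checked directly (the quick check referenced in the list). The sketch below assumes the commonly cited H100 SXM baseline of 80GB of HBM3 and roughly 3.35 TB/s of bandwidth; those baseline figures come from the H100’s general specifications rather than from this announcement.

```python
# Sanity-check the H200 vs. H100 memory comparisons quoted above.
# H200 figures come from the announcement; the H100 SXM baseline
# (80GB HBM3, ~3.35 TB/s) is an assumed reference point.

h200_capacity_gb, h200_bandwidth_tbps = 141, 4.8
h100_capacity_gb, h100_bandwidth_tbps = 80, 3.35  # assumed H100 SXM specs

capacity_ratio = h200_capacity_gb / h100_capacity_gb          # ~1.76x ("1.8x", +76%)
bandwidth_ratio = h200_bandwidth_tbps / h100_bandwidth_tbps   # ~1.43x ("1.4x", +43%)

print(f"Memory capacity : {capacity_ratio:.2f}x the H100 (+{(capacity_ratio - 1) * 100:.0f}%)")
print(f"Memory bandwidth: {bandwidth_ratio:.2f}x the H100 (+{(bandwidth_ratio - 1) * 100:.0f}%)")
```

The results line up with both the “1.4x / 1.8x” framing and the “43% / 76%” figures quoted above.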

Overall, Nvidia’s HGX H200 GPU with HBM3e memory technology represents a significant advancement in AI hardware, offering enhanced performance and memory capabilities. It addresses the evolving needs of the AI industry and competitive market dynamics.

Explore the vital role of AI chips in driving the AI revolution, from semiconductors to processors: key players, market dynamics, and future implications.
