Intel® Gaudi® 3 Platform with GIGABYTE solutions
Leap forward in performance and efficiency with an open, Ethernet-based AI system.
Always striving for the perfect balance between performance, efficiency, stability, and scalability, GIGABYTE has developed numerous designs to fit various use cases for these AI-era GPU powerhouses. For this newcomer to the GIGABYTE AI lineup, a robust 8U chassis with optimized thermal capabilities was designed to extract every ounce of its potential. It marks the first GIGABYTE server to adopt an 8U air-cooling solution that seamlessly fits into industry-standard air-cooled infrastructure.
By fully utilizing this Ethernet-centric, scalable solution on GIGAPOD – GIGABYTE’s well-optimized and proven rack solution – customers can quickly adopt the latest Intel Gaudi 3 solution with minimal verification required. The rack solution features a 4-server configuration with a Rear Door Heat Exchanger (RDHx), maximizing compute density for optimal utilization of limited facility space.
To learn more about GIGAPOD, please visit our GIGAPOD solution page.
Effortless adoption or migration of existing code with Intel Gaudi software, purpose-built for Gen AI with industry-leading software capabilities (a minimal migration sketch follows this list).
Built on open-standard Ethernet with 1200 GB/s of RoCE connectivity among accelerators, scaling cost-effectively to even the largest and most complex deployments.
A mix of 8 Matrix Multiplication Engines (MME) and 64 Tensor Processor Cores (TPC) on two interconnected compute dies, delivering optimal performance across a wide range of workloads.
A total of 128 GB of HBM and 96 MB of L2 cache, effectively addressing the memory bottlenecks often seen in AI training and inference and efficiently accelerating memory-intensive applications such as LLMs.
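As an illustration of the migration path mentioned above, here is a minimal sketch, assuming an existing PyTorch workload and the Intel Gaudi PyTorch bridge (`habana_frameworks.torch`). The general pattern is to move the model and tensors to the `hpu` device and mark graph steps; exact module names and call sites depend on the installed Gaudi software release.

```python
import torch
import habana_frameworks.torch.core as htcore  # Intel Gaudi PyTorch bridge

# Gaudi accelerators are exposed to PyTorch as the "hpu" device.
device = torch.device("hpu")

# An unmodified PyTorch model and optimizer, simply moved to the device.
model = torch.nn.Linear(1024, 1024).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

x = torch.randn(64, 1024, device=device)
target = torch.randn(64, 1024, device=device)

loss = torch.nn.functional.mse_loss(model(x), target)
loss.backward()
htcore.mark_step()   # flush the accumulated graph in lazy execution mode
optimizer.step()
htcore.mark_step()
print(loss.item())
```

The rest of the training loop (data loading, loss functions, logging) typically stays as-is, which is the basis of the "effortless adoption or migration" claim.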
| Model | Intel® Gaudi® 3 Accelerator |
|---|---|
| BF16/FP8 MME TFLOPs | 1835 |
| BF16 Vector TFLOPs | 28.7 |
| MME Units | 8 |
| TPC Units | 64 |
| HBM Capacity | 128 GB |
| HBM Bandwidth | 3.7 TB/s |
| On-die SRAM Capacity | 96 MB |
| On-die SRAM Bandwidth | 12.8 TB/s |
| Networking | 1200 GB/s bidirectional |
| Host Interface | PCIe Gen5 x16 |
| Media | 14 Decoders |
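The table above also allows a back-of-envelope check of the memory-bottleneck point made earlier; the sketch below is an illustrative roofline estimate using only the published peak numbers, not an official performance figure.

```python
# Back-of-envelope roofline estimate from the spec-table numbers above
# (illustrative only; real utilization depends on kernel efficiency).

PEAK_MME_TFLOPS = 1835   # BF16/FP8 MME peak, from the table
HBM_BW_TBPS = 3.7        # HBM bandwidth, from the table

# Ridge point: the arithmetic intensity (FLOPs per byte) at which a kernel
# shifts from memory-bound to compute-bound.
ridge = (PEAK_MME_TFLOPS * 1e12) / (HBM_BW_TBPS * 1e12)
print(f"Ridge point: ~{ridge:.0f} FLOPs/byte")

# Low-batch LLM decoding reads every weight roughly once per generated token,
# performing on the order of 2 FLOPs per weight byte -- far below the ridge
# point, i.e. decode throughput is limited by HBM bandwidth, not the MMEs.
decode_intensity = 2.0
print("Decode is memory-bound:", decode_intensity < ridge)
```

With a ridge point of roughly 500 FLOPs per byte, memory-intensive workloads such as LLM inference sit well on the memory-bound side, which is why the 3.7 TB/s of HBM bandwidth and 128 GB of capacity matter as much as raw MME throughput.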
Complex problem-solving in HPC applications relies on numerical methods, simulations, and computations to achieve significant insights. While traditionally less dependent on GPUs, the overwhelming parallel computing power of GPGPUs has greatly accelerated the development of HPC in recent years, making hybrid configurations a growing trend in modern supercomputers.
With the rapid adoption of AI, from general applications to fast-evolving deep learning, GPGPUs have become a game changer for the industry. The parallel processing capabilities of GPGPUs allow for the handling of massive datasets and complex algorithms, which are essential for training and deploying AI models. As a result, AI has become the key to making modern systems faster and “smarter” in the most efficient way.
In data-intensive applications such as big data and computational simulations, systems rely heavily on GPGPUs for high parallel processing, low latency, and high bandwidth to facilitate data mining and large-scale data processing. The ability of GPGPUs to handle vast amounts of data simultaneously not only accelerates the processing of massive datasets but also enables more accurate and timely insights, driving informed decision-making in fields like finance, healthcare, and scientific research.