NTHU’s Student Team Won 2nd Place in the ISC 2024 Student Cluster Competition

by TechNews
A student team from National Tsing Hua University (NTHU), led by Professor Jerry Chou, secured second place in the ISC 2024 Student Cluster Competition. Leveraging GIGABYTE servers and high-performance computing (HPC) solutions,
Professor Jerry Chou of NTHU (fourth from the left) and his student team (from left to right: Kuo Pin-Yi, Lin Chan-Yi, Wei Shih-Hsun, Weng Chun-Mu, Mou Chan-You, Yu Hao-Tien, and Bai Chen-An) won second place at the ISC 2024 Student Cluster Competition. (Source: TechNews)
High-performance computing (HPC) has long been a critical method for solving complex scientific problems, driving sustained research efforts across various fields. To encourage students to dive into the world of supercomputing, major competitions are held at ISC in Europe, SC in North America, and SCA in Asia, attracting university STEM student teams from around the world every year and creating intense competition. At the major European HPC competition, the ISC Student Cluster Competition, a team led by Professor Jerry Chou from National Tsing Hua University (NTHU) built a supercomputing system using GIGABYTE servers and ultimately earned the runner-up award.

Professor Chou from the Department of Computer Science at NTHU expressed that their goal has always been to offer students hands-on experience with HPC and AI challenges through competitions, broadening their knowledge beyond textbooks and nurturing talent in complex computing. The competition team this year included not only computer science students but also students from the College of Science, College of Engineering, and College of Arts. The diverse member backgrounds encouraged a multidimensional approach to problem-solving, a key factor in their success. The team was eager to express their gratitude toward GIGABYTE and Giga Computing for providing crucial supercomputing equipment and funding, enabling the students to compete on the world stage, and showcasing Taiwan’s hardware and software expertise in HPC.

Creating the Computing Solution to Win
Professor Chou's research encompasses distributed systems, cloud computing, HPC, big data, storage systems, and resource or data management. Recognizing the global importance of HPC, in addition to hosting courses on distributed system design and practical HPC cluster computing, Professor Chou also leads NTHU’s Large-Scale System Architecture Lab, aiming to cultivate cross-disciplinary HPC talent.

Professor Chou pointed out that the HPC field used to focus on architecture design, resource allocation, and even AI model training. In recent years, with AI technology advancing rapidly and the parameter counts of large language models growing beyond the capacity of standard research institutes, the focus has shifted to AI inference. International supercomputing competitions follow similar trends, with the ISC 2024 competition capping system power consumption at 6000W. The team therefore had to work out the optimal CPU-GPU combination for each type of challenging workload in the competition.
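
To make the trade-off behind the 6000W cap concrete, here is a minimal Python sketch that lists which CPU-node and GPU counts fit under the limit. Only the 6000W figure comes from the competition rules described above. The per-component wattages and the function name feasible_configs are hypothetical placeholders, not measurements from the NTHU system.

```python
from itertools import product

POWER_CAP_W = 6000  # competition limit stated in the article

# Hypothetical draws under load, in watts (illustrative placeholders only).
CPU_NODE_W = 700  # assumed draw of one CPU-only node
GPU_HOST_W = 900  # assumed draw of the GPU server chassis without GPUs
GPU_W = 350       # assumed draw of one GPU


def feasible_configs(max_cpu_nodes=6, max_gpus=10):
    """Return (cpu_nodes, gpus, watts) tuples that stay within the cap."""
    configs = []
    for cpu_nodes, gpus in product(range(max_cpu_nodes + 1), range(max_gpus + 1)):
        # The GPU host only counts toward the budget if at least one GPU is used.
        gpu_side_w = GPU_HOST_W + gpus * GPU_W if gpus > 0 else 0
        watts = cpu_nodes * CPU_NODE_W + gpu_side_w
        if watts <= POWER_CAP_W:
            configs.append((cpu_nodes, gpus, watts))
    return configs


if __name__ == "__main__":
    for cpu_nodes, gpus, watts in feasible_configs():
        print(f"{cpu_nodes} CPU nodes + {gpus} GPUs -> {watts} W")
```

With these placeholder numbers, three CPU nodes plus ten GPUs running flat out would exceed the cap, which is one reason a team might limit power per component or run fewer devices for a given workload.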

The competition covered molecular simulations, fluid dynamics, and weather modeling. Over three days, students had to use their configured supercomputing systems to complete several calculations for scientific applications. After thorough discussions with the student team, Giga Computing’s engineering team prepared the necessary GIGABYTE R183-S90 rack servers, along with a GIGABYTE G493-SB1 server populated with ten NVIDIA H100 Tensor Core GPUs, to meet the students' needs for their project.

For components directly impacting CPU and GPU performance, GIGABYTE used Micron’s DDR5 RDIMM 4800MT/s memory and 7450 PRO series NVMe SSDs. For inter-node communication, the Broadcom P1200G high-speed network card was used, with the Ufispace S9300-32D (32x400G) as the network switch.

The G493-SB1 stands out for supporting up to 10 GPUs, 8 of which are interconnected with NVIDIA NVLINK™ technology, achieving a significantly faster data transfer rate between GPUs than traditional PCIe connections. Compared to other teams relying solely on PCIe connections, this advantage had a positive impact on the team’s performance. Additionally, Giga Computing’s comprehensive technical support, especially during the pre-competition phase, fully covered the team’s needs and became a decisive factor in the team’s outstanding results.

Chan-You Mou, the coach supporting the students in this year's competition, noted that when GPU computing power is concentrated on a single node, data exchange time is reduced, which shortens processing time and leads to strong High-Performance Linpack (HPL) benchmark results. GIGABYTE’s server solutions are diverse, enabling the team to prepare for any unexpected issues and challenges during the competition.
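
The point about data exchange time can be illustrated with a simple latency-plus-bandwidth model. This is a rough sketch, not the team's analysis. The function names are hypothetical, and the bandwidth values in the usage lines are arbitrary placeholders rather than NVLINK or PCIe specifications.

```python
def transfer_seconds(gigabytes, bandwidth_gb_per_s, latency_s=0.0):
    """Estimate the time to move a buffer over a link: latency + size / bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + gigabytes / bandwidth_gb_per_s


def speedup(gigabytes, slow_bw_gb_per_s, fast_bw_gb_per_s):
    """Ratio of transfer times when the same data moves over a faster link."""
    slow = transfer_seconds(gigabytes, slow_bw_gb_per_s)
    fast = transfer_seconds(gigabytes, fast_bw_gb_per_s)
    return slow / fast


if __name__ == "__main__":
    # Placeholder bandwidths in GB/s, chosen only to show the arithmetic.
    print(transfer_seconds(64, 32))   # time over the slower link
    print(speedup(64, 32, 256))       # how many times faster the quicker link is
```

In this model, the less time each exchange takes, the larger the share of wall-clock time left for computation, which matches the coach's observation about keeping GPUs on one node.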

In the fluid dynamics simulation task, the team leveraged NVIDIA H100 GPUs and NVLINK™ to significantly improve the time to completion. The results matched their expectations, giving them a substantial advantage in the competition.

“We are delighted that Micron's DDR5 RDIMMs and 7450 PRO NVMe SSDs played a critical role for the National Tsing Hua University team at the ISC 2024 Student Cluster Competition. Through our collaboration with GIGABYTE, we utilized our high-performance solutions to help teams maximize the potential of their supercomputing systems to tackle complex scientific computing challenges. This collaboration demonstrates our determination and efforts to drive innovation in high-performance computing technologies within the education sector,” said Wai Leong Chan, Micron Sales Director.

"Broadcom is pleased its Ethernet adapter was a part of the 2024 ISC European Supercomputer Competition and congratulates National Tsing Hua University and GIGABYTE on their outstanding achievement," said Jas Tremblay, Vice President and General Manager of the Data Center Solutions Group at Broadcom. "Broadcom continues to be deeply committed to the open ecosystem to enable AI data centers and infrastructure with high-performing, power efficient solutions."

Vincent Wang, VP of Sales at Giga Computing, stated, “We are incredibly proud to collaborate with NTHU, supporting their outstanding achievement of securing second place at the 2024 ISC European Supercomputing Competition. This accomplishment not only highlights Giga Computing’s strength in high-performance computing solutions but also demonstrates the seamless integration of GIGABYTE servers with other components to maximize performance, underscoring our commitment to advancing AI and supercomputing technologies.”

▲ Professor Chou leads NTHU’s Distributed Systems Architecture Lab to cultivate interdisciplinary HPC talent. (Source: TechNews)
Built-in Server Management Tools Aid Students in Real-Time Performance Monitoring
To address the demands of the ISC 2024 competition, the NTHU team deployed three GIGABYTE R183-S90 general-purpose servers and one GIGABYTE G493-SB1 GPU server with ten GPUs. The G493-SB1 was chosen because it is a high-performance GPU server specifically designed for AI, deep learning, and HPC applications, using dual 5th generation Intel® Xeon® Scalable Processors and supporting up to 10 dual-slot GPUs, to deliver top-tier AI computing power. Equipped with 32 memory slots, with two DIMMs per channel (2DPC), this server also accommodates up to twelve 2.5”/3.5” Gen5 NVMe, SATA, or SAS-4 drives. Featuring advanced cooling and power solutions, the product design ensures operational stability, making it ideal for complex workloads like data analysis and scientific simulations.

The R183-S90, by contrast, is designed for applications that don’t leverage a GPU, and it is also built for 5th Gen Intel® Xeon® Scalable Processors. It includes up to 32 memory slots configured as 2DPC, providing impressive memory capacity and speed, making it suitable for various computing tasks such as data analysis, cloud computing, and virtualization. Both servers feature comprehensive management tools that allow team members to monitor operational performance in real time.

Using the standard IPMI interface, the NTHU team developed custom tools to monitor CPU and GPU temperatures and status via GIGABYTE’s server management tools. Given the strict 6000W power limit of the competition, the team also optimized cooling fan speeds to balance CPU/GPU chip temperatures against power consumption, which later became crucial to their strong performance.
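
The article does not describe the team's monitoring tools, so the following is only a minimal sketch of the general idea. It parses temperature readings from text in the pipe-separated layout that the ipmitool command "ipmitool sdr type Temperature" typically prints. The sensor names in SAMPLE_SDR, the 80 degree threshold, and the function names are hypothetical.

```python
import re

# Hypothetical sample in the layout typically printed by
# `ipmitool sdr type Temperature`; real sensor names vary by board.
SAMPLE_SDR = """\
CPU0_TEMP        | 30h | ok  |  3.1 | 45 degrees C
GPU1_TEMP        | 41h | ok  | 11.1 | 83 degrees C
DIMM_A0_TEMP     | 50h | ns  | 32.1 | No Reading
"""

READING_RE = re.compile(r"(-?\d+(?:\.\d+)?) degrees C$")


def parse_temperatures(sdr_text):
    """Return {sensor_name: degrees_celsius} for sensors reporting 'ok'."""
    readings = {}
    for line in sdr_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        name, _sensor_id, status, _entity, reading = fields
        match = READING_RE.match(reading)
        if status == "ok" and match:
            readings[name] = float(match.group(1))
    return readings


def over_threshold(readings, limit_c):
    """Return the sorted names of sensors hotter than limit_c."""
    return sorted(name for name, temp in readings.items() if temp > limit_c)


if __name__ == "__main__":
    temps = parse_temperatures(SAMPLE_SDR)
    print(temps)
    print(over_threshold(temps, 80))
```

On a live server, the text could be collected with subprocess.run(["ipmitool", "sdr", "type", "Temperature"], capture_output=True, text=True) and passed to parse_temperatures. The result could then feed whatever fan or power policy is in use.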

No hardware can be changed once the competition is underway, which makes contingency response a major challenge. For instance, on the competition’s final day, an application failed to run due to a faulty component that was limiting CPU performance. The team quickly identified the problem and modified the computing process, resolving the issue swiftly. Additionally, while testing software versions, the team found that the Intel version outperformed the initially preferred GNU Compiler Collection (GCC) version, further accelerating computations. These experiences sharpened the team’s troubleshooting skills and provided valuable lessons for future competitions.

Lastly, interacting with other amazing competitors enriched the team’s experience. Such interactions and knowledge exchange allowed participants to break through limitations of their own thinking, optimize system parameters, and further enhance their technical skills and knowledge in the HPC field.

▲ A team of NTHU students, led by Professor Chou, participated in the ISC 2024 Student Cluster Competition using GIGABYTE servers. (Source: TechNews)
Supporting NTHU's HPC Lab Project and Reinforcing Taiwan's Global Tech Capabilities
In the competition, GIGABYTE and Giga Computing provided full support to the NTHU team, showcasing Taiwan's hardware and software prowess in high-performance computing on a global stage. They also plan to further collaborate with Professor Chou in establishing a dedicated HPC lab. GIGABYTE and Giga Computing aim to equip NTHU with servers, switches, and other components, creating a realistic computing environment that will help students excel in future competitions.

Professor Chou noted that GIGABYTE servers, known for their superior quality, have become the preferred choice for laboratories worldwide and are pivotal in exhibiting Taiwan's technical strength in international competitions. Chou also highlighted that this collaboration not only encourages more students to engage in HPC research but also cultivates interdisciplinary talent. An NTHU Master’s student, Pin-Yi Kuo, shared that his ongoing involvement in HPC research has sparked his strong interest in GPU resource sharing and integration. He hopes that his future research will yield significant breakthroughs, contributing to greater advancements in technology.

(Source of main image: TechNews; Caption: Professor Jerry Chou of National Tsing Hua University (fourth from the left) with his student team (from left to right: Kuo Pin-Yi, Lin Chan-Yi, Wei Shih-Hsun, Weng Chun-Mu, Mou Chan-You, Yu Hao-Tien, and Bai Chen-An), who won second place at the ISC 2024 Student Cluster Competition)
