Dell EMC DSS 8440 Server Powered by NVIDIA RTX GPUs for HPC and AI Workloads
The Dell EMC DSS 8440 server is a two-socket, 4U server designed for High Performance Computing (HPC), Machine Learning (ML), and Deep Learning workloads. This article compares the performance of several GPUs in this system: the NVIDIA Volta V100S and NVIDIA Tesla T4 Tensor Core GPUs, and the NVIDIA Quadro RTX GPUs.
In this blog, we evaluate the performance of the cost-effective NVIDIA Quadro RTX 6000 and NVIDIA Quadro RTX 8000 GPUs against the top-tier V100S accelerator by using various industry-standard benchmarking tools. The tests cover both single-precision and double-precision workloads. While the Quadro series has existed for a long time, RTX GPUs based on the NVIDIA Turing architecture launched in late 2018. The specifications in Table 1 show that the RTX 8000 GPU has a larger memory configuration than the RTX 6000, so for workloads that require a higher memory capacity, the RTX 8000 is the better choice. Both the RTX 8000 and RTX 6000 GPUs, however, have higher power requirements than the V100S GPU.
Table 1. GPU specifications
Server | Dell EMC PowerEdge DSS 8440 |
Processor | 2 x Intel Xeon Gold 6248, 20 cores @ 2.5 GHz |
Memory | 24 x 32 GB @ 2933 MT/s (768 GB Total) |
GPU | 8 x Quadro RTX 6000; 8 x Quadro RTX 8000; 8 x Volta V100S-PCIe |
Storage | 1 x Dell Express Flash NVMe 1 TB 2.5" U.2 (P4500) |
Power Supplies | 4 x 2400 W |
Table 2. Server configuration details
BIOS | 2.5.4 |
OS | RHEL 7.6 |
Kernel | 3.10.0-957.el7.x86_64 |
System Profile | Performance Optimized |
CUDA Toolkit | 10.1 |
CUDA Driver | 440.33.01 |
Table 3. System firmware details
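Before running any benchmark, it can help to confirm that a test system actually matches the GPU count and driver version listed above. The following is a minimal sketch of such a check, not part of the original test procedure. It assumes the `nvidia-smi` utility that ships with the NVIDIA driver is on the PATH. The helper names (`parse_gpu_inventory`, `query_gpu_inventory`, `check_inventory`) are hypothetical and introduced here only for illustration; the default expected values are taken from the tables above.

```python
import csv
import io
import subprocess

# Properties requested from `nvidia-smi --query-gpu`.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_inventory(csv_text):
    """Parse `nvidia-smi --format=csv,noheader` output into a list of dicts."""
    gpus = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not record:  # skip blank lines
            continue
        values = [field.strip() for field in record]
        gpus.append(dict(zip(QUERY_FIELDS, values)))
    return gpus


def query_gpu_inventory():
    """Query the local host; requires an installed NVIDIA driver."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_gpu_inventory(result.stdout)


def check_inventory(gpus, expected_count=8, expected_driver="440.33.01"):
    """Return a list of mismatches; an empty list means the host matches."""
    problems = []
    if len(gpus) != expected_count:
        problems.append(f"expected {expected_count} GPUs, found {len(gpus)}")
    for index, gpu in enumerate(gpus):
        if gpu["driver_version"] != expected_driver:
            problems.append(
                f"GPU {index}: driver {gpu['driver_version']} != {expected_driver}"
            )
    return problems
```

On a configured host, `check_inventory(query_gpu_inventory())` returns an empty list when eight GPUs are visible and each reports the expected driver. Parsing is kept separate from the `nvidia-smi` call so that the check can also be run against saved output from another machine.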
Application | Version |
HPL | hpl_cuda_10.1_ompi-3.1_volta_pascal_kepler_3-14-19_ext; Intel MKL 2018 Update 4 |
LAMMPS | March 3 2020; OpenMPI 4.0.3 |
MLPerf | v0.6 Training; Docker 19.03 |