Intro

Like the Supek supercomputer, the Vrančić computer cluster, intended for advanced computing in the cloud, consists of several groups of servers with different purposes:

Purpose     | Number | CPU               | GPU             | RAM (GB)
CPU servers | 86     | 2 x AMD EPYC 7713 | -               | 512
Mem servers | 2      | 2 x AMD EPYC 7713 | -               | 2048
GPU servers | 4      | 1 x AMD EPYC 7713 | 4 x NVIDIA A100 | 512


Servers with processing resources

...

10 x server with additional local NVMe SSD + 76 x server = 10 x (2 x 64 core AMD EPYC 7713 @ 2.0 GHz) + 76 x (2 x 64 core AMD EPYC 7713 @ 2.0 GHz) = 86 x 128 CPU cores = 11,008 CPU cores
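
The core count above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and the values are the ones quoted in the text.

```python
# Values taken from the text; names are illustrative only.
CORES_PER_CPU = 64               # AMD EPYC 7713
CPUS_PER_SERVER = 2
SERVERS_WITH_LOCAL_NVME = 10
SERVERS_WITHOUT_LOCAL_NVME = 76

cores_per_server = CPUS_PER_SERVER * CORES_PER_CPU
total_servers = SERVERS_WITH_LOCAL_NVME + SERVERS_WITHOUT_LOCAL_NVME
total_cores = total_servers * cores_per_server

print(f"{total_servers} servers x {cores_per_server} cores = {total_cores} CPU cores")
```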


The specifications of the AMD EPYC 7713 processor are as follows:

  • Number of cores: 64
  • Number of threads: 128
  • Base clock: 2.00 GHz
  • Maximum clock: 3.675 GHz
  • Cache memory: L3 - 256 MB, L2 - 512 kB per core, L1 - 64 kB per core
  • TDP: 225 W
  • Supports DDR4 memory modules up to 3200 MHz
  • Supports up to eight channels of DDR4 memory
  • PCIe version: PCIe 4.0


Servers with memory resources

The memory resource group consists of 2 servers, each with 2 AMD EPYC 7713 @ 2.0 GHz processors and 16 GB of working memory per core.

Sum of all servers and associated resources:

  • 2 x server (2 x 64 core AMD EPYC 7713 @ 2.0 GHz) = 256 CPU cores
  • 256 CPU cores x 16 GB RAM = 4096 GB RAM
  • each server has 128 CPU cores and 2048 GB of RAM
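
As a quick check of the totals listed above, the following Python sketch derives the per-server and group-wide figures from the 16 GB per core rule. Variable names are illustrative, not part of any cluster tooling.

```python
# Values taken from the text; names are illustrative only.
SERVERS = 2
CPUS_PER_SERVER = 2
CORES_PER_CPU = 64               # AMD EPYC 7713
RAM_PER_CORE_GB = 16

cores_per_server = CPUS_PER_SERVER * CORES_PER_CPU
ram_per_server_gb = cores_per_server * RAM_PER_CORE_GB
total_cores = SERVERS * cores_per_server
total_ram_gb = total_cores * RAM_PER_CORE_GB

print(f"per server: {cores_per_server} cores, {ram_per_server_gb} GB RAM")
print(f"total:      {total_cores} cores, {total_ram_gb} GB RAM")
```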


Servers with graphics processors

The graphics resource group consists of 4 servers, each with 4 x NVIDIA A100 GPUs.

  • 4 x NVIDIA A100 GPU per server x 16 CPU cores per GPU = 64 CPU cores -> 1 x AMD EPYC 7713 @ 2.0 GHz

RAM:

  • 4 x 96 GB RAM = 384 GB RAM required -> 512 GB RAM installed per server
  • 32 x 64 GB RAM module = 2048 GB RAM in total; 2048 GB RAM / 4 servers = 512 GB RAM per server
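
The sizing of a GPU server can be sketched in Python as follows. The per-GPU figures (16 CPU cores and 96 GB RAM per GPU) are read off the bullets above and are treated here as sizing assumptions; the names are illustrative.

```python
# Sizing figures as they appear in the text; names are illustrative only.
SERVERS = 4
GPUS_PER_SERVER = 4
CPU_CORES_PER_GPU = 16           # from "4 x 16 CPU core"
RAM_PER_GPU_GB = 96              # from "4 x 96 GB RAM"
CPU_CORES_INSTALLED = 64         # 1 x AMD EPYC 7713
RAM_MODULES_TOTAL = 32
RAM_MODULE_GB = 64

cores_needed = GPUS_PER_SERVER * CPU_CORES_PER_GPU
ram_needed_gb = GPUS_PER_SERVER * RAM_PER_GPU_GB
ram_installed_gb = RAM_MODULES_TOTAL * RAM_MODULE_GB // SERVERS

print(f"CPU cores: {cores_needed} needed, {CPU_CORES_INSTALLED} installed")
print(f"RAM:       {ram_needed_gb} GB needed, {ram_installed_gb} GB installed")
```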


NVIDIA A100 40GB is a graphics card that is specially designed for performing demanding computing operations, such as scientific computing, machine learning and high performance computing. Thanks to its Ampere architecture, the NVIDIA A100 40GB provides improved data processing and performance compared to previous NVIDIA graphics cards. Its specification includes:

  • Architecture: Ampere
  • Processor: NVIDIA A100 Tensor Core GPU
  • Number of CUDA cores: 6,912
  • Multi-Instance GPU (MIG): various instance sizes, up to 7 instances @ 5 GB
  • Number of Tensor cores: 432
  • Memory: 40 GB
  • Memory type: HBM2
  • Memory bus: 5120-bit
  • Memory bandwidth: 1555 GB/s
  • TDP: 500 W (2000 W for the 4 GPUs in one server)


The NVIDIA A100 Tensor Core GPU consists of 6,912 CUDA cores and 432 Tensor cores. The two differ in their primary function. CUDA cores are general-purpose cores that run a wide range of algorithms in parallel, for image processing, scientific computing, and many other applications that can be parallelized. Tensor cores are specialized units that accelerate matrix multiply-accumulate operations on tensors, which is critical for demanding machine learning workloads.

Each graphics card has 40 GB of memory. Large datasets and models can therefore be kept in GPU memory, which reduces transfers from host memory and shortens the time required for computation.
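
To put the per-card figures in the context of the whole cluster, the sketch below multiplies them by the number of GPUs (4 servers with 4 cards each). It is plain arithmetic on the values quoted in the specification list; the names are illustrative.

```python
# Per-GPU figures from the specification list; names are illustrative only.
GPU_SERVERS = 4
GPUS_PER_SERVER = 4
CUDA_CORES_PER_GPU = 6912
TENSOR_CORES_PER_GPU = 432
MEMORY_PER_GPU_GB = 40
MAX_MIG_INSTANCES_PER_GPU = 7

total_gpus = GPU_SERVERS * GPUS_PER_SERVER
total_cuda_cores = total_gpus * CUDA_CORES_PER_GPU
total_tensor_cores = total_gpus * TENSOR_CORES_PER_GPU
total_gpu_memory_gb = total_gpus * MEMORY_PER_GPU_GB
max_mig_instances = total_gpus * MAX_MIG_INSTANCES_PER_GPU

print(f"GPUs: {total_gpus}, GPU memory: {total_gpu_memory_gb} GB")
print(f"CUDA cores: {total_cuda_cores}, Tensor cores: {total_tensor_cores}")
print(f"MIG instances (upper bound): {max_mig_instances}")
```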


Storage resources

Fast Storage Servers

In the configuration of these storage resources, there are 3 servers with a total capacity of 415 TB NVMe SSD evenly distributed across all servers.


Storage:

  • 27 NVMe SSD drives x 15.36 TB = 414.72 TB
  • 9 NVMe SSD drives per server x 10 CPU cores = 90 + 4 cores = 94 CPU cores -> 2 x AMD EPYC 7643 48-core @ 2.3 GHz (96 cores)

Working memory:

  • 9 NVMe SSD drives x 10 GB RAM = 90 GB + 16 GB additional RAM = 106 GB RAM -> each server has 256 GB RAM
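
The sizing rule used for the fast storage servers (a fixed amount of CPU and RAM per drive plus a fixed overhead) can be written as a small helper. This is a sketch of the arithmetic in the bullets above; `required` is a hypothetical helper name, not part of any existing tool.

```python
def required(drives, per_drive, overhead):
    """Resources needed for `drives` drives plus a fixed overhead."""
    return drives * per_drive + overhead

# Values taken from the text.
SERVERS = 3
DRIVES_TOTAL = 27
DRIVE_TB = 15.36
DRIVES_PER_SERVER = DRIVES_TOTAL // SERVERS

capacity_tb = round(DRIVES_TOTAL * DRIVE_TB, 2)
cores_needed = required(DRIVES_PER_SERVER, per_drive=10, overhead=4)
ram_needed_gb = required(DRIVES_PER_SERVER, per_drive=10, overhead=16)

cores_installed = 2 * 48         # 2 x AMD EPYC 7643
ram_installed_gb = 256

print(f"capacity: {capacity_tb} TB")
print(f"CPU cores: {cores_needed} needed, {cores_installed} installed")
print(f"RAM: {ram_needed_gb} GB needed, {ram_installed_gb} GB installed")
```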


Standard Storage Servers


In the configuration of these storage resources, there are 6 servers with a total of 3 PB HDD and 120 TB NVMe SSD evenly distributed across all servers.

Storage:

  • 168 HDD x 18 TB = 3024 TB
  • 18 NVMe SSD drives x 7.68 TB = 138.24 TB

Single server configuration:

  • 28 HDD x 0.5 CPU core = 14 cores + 4 cores = 18 CPU cores -> 1 x AMD EPYC 7543P 32-core @ 2.8 GHz
  • 28 HDD x 5 GB RAM = 140 GB RAM + 16 GB RAM = 156 GB RAM -> each server has 192 GB RAM
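
The figures for the standard storage servers can be checked the same way. The sketch below computes the raw capacity from the drive counts, the number of drives per server, and the headroom between the estimated and the installed CPU and RAM. Names are illustrative and the values are those quoted above.

```python
# Values taken from the text; names are illustrative only.
SERVERS = 6
HDD_TOTAL, HDD_TB = 168, 18
SSD_TOTAL, SSD_TB = 18, 7.68

hdd_capacity_tb = HDD_TOTAL * HDD_TB
ssd_capacity_tb = round(SSD_TOTAL * SSD_TB, 2)
hdd_per_server = HDD_TOTAL // SERVERS
ssd_per_server = SSD_TOTAL // SERVERS

# Per-server estimate: 0.5 CPU core and 5 GB RAM per HDD, plus overhead.
cores_needed = hdd_per_server * 0.5 + 4
ram_needed_gb = hdd_per_server * 5 + 16
cores_installed = 32             # 1 x AMD EPYC 7543P
ram_installed_gb = 192

print(f"raw capacity: {hdd_capacity_tb} TB HDD, {ssd_capacity_tb} TB NVMe SSD")
print(f"per server: {hdd_per_server} HDD, {ssd_per_server} NVMe SSD")
print(f"CPU headroom: {cores_installed - cores_needed} cores")
print(f"RAM headroom: {ram_installed_gb - ram_needed_gb} GB")
```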