btrzx4 (2018, deprecated)
General remarks
This cluster will be decommissioned on 10.02.2023.
The cluster btrzx4 is an extension of btrzx2. It went into operation in April 2018 and was purchased within the European Social Fund project HiPerSim4all, which aims to strengthen small and medium-sized businesses on a regional scale. Besides two login nodes, it features four node types with the properties compute20, ssd, phi, and cuda, all connected by an Intel Omni-Path network. The phi nodes are equipped with a single Intel Xeon Phi CPU; the compute20, ssd, and cuda nodes each contain two 10-core Intel Xeon CPUs, with the ssd nodes adding a 1 TB SSD and the cuda nodes an Nvidia Tesla GPU.
Login nodes
- btrzx4-1.rz.uni-bayreuth.de
- btrzx4-2.rz.uni-bayreuth.de
Compute nodes (state June 2018)
- 79 nodes (compute20)
2x Intel Xeon E5-2630 v4 @ 2.2GHz (Broadwell) with 2x 64GB RAM
(in total 20 physical cores, hyperthreading disabled)
- 5 nodes (ssd)
2x Intel Xeon E5-2630 v4 @ 2.2GHz (Broadwell) with 2x 64GB RAM
(in total 20 physical cores, hyperthreading disabled) and
1 TB SSD
- 2 nodes (phi)
1x Intel Xeon Phi 7210 @ 1.3GHz (Knights Landing) with 1x 256GB RAM
(in total 64 physical cores, 4 threads per core)
- 5 nodes (cuda)
2x Intel Xeon E5-2630 v4 @ 2.2GHz (Broadwell) with 2x 128GB RAM
(in total 20 physical cores, hyperthreading disabled) and
1x Nvidia Tesla P100 (12GB)
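The node list above can be summed up into cluster-wide totals. The following is a minimal sketch of that arithmetic; it assumes that "2x 64GB" denotes 128 GB per node (and likewise for the other node types), and the names `NODE_TYPES` and `totals` are chosen here for illustration only.

```python
# Node inventory as listed above (state June 2018).
# Assumption: "2x 64GB" means 128 GB of RAM per node, "2x 128GB" means 256 GB.
NODE_TYPES = {
    # property: (node count, physical cores per node, RAM per node in GB)
    "compute20": (79, 20, 2 * 64),
    "ssd": (5, 20, 2 * 64),
    "phi": (2, 64, 1 * 256),
    "cuda": (5, 20, 2 * 128),
}


def totals(node_types):
    """Return (nodes, cores, ram_gb) summed over all node types."""
    nodes = sum(n for n, _, _ in node_types.values())
    cores = sum(n * c for n, c, _ in node_types.values())
    ram_gb = sum(n * r for n, _, r in node_types.values())
    return nodes, cores, ram_gb


if __name__ == "__main__":
    nodes, cores, ram_gb = totals(NODE_TYPES)
    print(f"{nodes} nodes, {cores} physical cores, {ram_gb} GB RAM")
```

Under these assumptions the sketch yields 91 compute nodes with 1908 physical cores and 12544 GB of RAM in total.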
Queues
- default
Wall time limit: 100 hours
Restrictions: none
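As an illustration of how the 100-hour limit and the node properties might enter a job request, the sketch below builds the value of a Torque `-l` resource list. It assumes that node types are selected with Torque's usual `nodes=N:ppn=P:property` syntax; whether this cluster was configured that way is not stated here, and the helper name `torque_resources` is hypothetical.

```python
MAX_WALLTIME_HOURS = 100  # wall time limit of the default queue


def torque_resources(nodes, ppn, node_property, hours, minutes=0):
    """Build the value for a Torque '-l' resource request.

    Raises ValueError if the requested wall time exceeds the queue limit.
    """
    if hours * 60 + minutes > MAX_WALLTIME_HOURS * 60:
        raise ValueError("requested wall time exceeds the 100-hour queue limit")
    return (f"nodes={nodes}:ppn={ppn}:{node_property},"
            f"walltime={hours:02d}:{minutes:02d}:00")


if __name__ == "__main__":
    # Two full compute20 nodes (20 cores each) for 48 hours.
    print("#PBS -l " + torque_resources(2, 20, "compute20", hours=48))
```

The printed line could be placed at the top of a job script or passed to `qsub -l` directly; checking the wall time before submission avoids a rejected job.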
Network
- Intel Omni-Path
- 2-level Fat Tree (Blocking factor 2)
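To make the blocking factor concrete, the following sketch works through the arithmetic. It assumes that "blocking factor 2" means each edge switch has twice as many links down to nodes as up to the second level, that a link runs at the nominal Omni-Path rate of 100 Gbit/s, and it uses a hypothetical 48-port edge switch; none of these specifics are confirmed for this cluster.

```python
LINK_GBIT_S = 100  # assumed nominal Omni-Path link rate


def split_ports(total_ports, blocking_factor):
    """Split a switch's ports into (downlinks, uplinks) so that
    downlinks == blocking_factor * uplinks."""
    uplinks = total_ports // (blocking_factor + 1)
    downlinks = uplinks * blocking_factor
    return downlinks, uplinks


def worst_case_node_bandwidth(link_gbit_s, blocking_factor):
    """Per-node bandwidth if every node on a switch sends across the uplinks at once."""
    return link_gbit_s / blocking_factor


if __name__ == "__main__":
    down, up = split_ports(48, 2)  # hypothetical 48-port edge switch
    print(f"{down} node links, {up} uplinks")
    print(f"{worst_case_node_bandwidth(LINK_GBIT_S, 2)} Gbit/s per node in the worst case")
```

Traffic between nodes on the same edge switch is not affected by this ratio; the reduction applies only when many nodes communicate across switches simultaneously.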
User file space (network and local)
- ITS home directories (/home)
- Panasas file system (/data & /scratch)
- Local disk (/tmp, 840 GByte)
Commissioning & Extension
- Commissioning: Apr. 2018
Resource Manager & Scheduler
- PBS Torque
- Maui
Operating system
- CentOS
Node topology (likwid-topology -g)
- compute20, ssd, & cuda: 2x Intel Xeon E5-2630 v4
--------------------------------------------------------------------------------
CPU name:      Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
CPU type:      Intel Xeon Broadwell EN/EP/EX processor
CPU stepping:  1
********************************************************************************
Hardware Thread Topology
********************************************************************************
Sockets:           2
Cores per socket:  10
Threads per core:  1
--------------------------------------------------------------------------------
HWThread  Thread  Core  Socket  Available
0         0       0     0       *
1         0       1     0       *
2         0       2     0       *
3         0       3     0       *
4         0       4     0       *
5         0       5     0       *
6         0       6     0       *
7         0       7     0       *
8         0       8     0       *
9         0       9     0       *
10        0       10    1       *
11        0       11    1       *
12        0       12    1       *
13        0       13    1       *
14        0       14    1       *
15        0       15    1       *
16        0       16    1       *
17        0       17    1       *
18        0       18    1       *
19        0       19    1       *
--------------------------------------------------------------------------------
Socket 0:  ( 0 1 2 3 4 5 6 7 8 9 )
Socket 1:  ( 10 11 12 13 14 15 16 17 18 19 )
--------------------------------------------------------------------------------
********************************************************************************
Cache Topology
********************************************************************************
Level:         1
Size:          32 kB
Cache groups:  ( 0 ) ( 1 ) ( 2 ) ( 3 ) ( 4 ) ( 5 ) ( 6 ) ( 7 ) ( 8 ) ( 9 ) ( 10 ) ( 11 ) ( 12 ) ( 13 ) ( 14 ) ( 15 ) ( 16 ) ( 17 ) ( 18 ) ( 19 )
--------------------------------------------------------------------------------
Level:         2
Size:          256 kB
Cache groups:  ( 0 ) ( 1 ) ( 2 ) ( 3 ) ( 4 ) ( 5 ) ( 6 ) ( 7 ) ( 8 ) ( 9 ) ( 10 ) ( 11 ) ( 12 ) ( 13 ) ( 14 ) ( 15 ) ( 16 ) ( 17 ) ( 18 ) ( 19 )
--------------------------------------------------------------------------------
Level:         3
Size:          25 MB
Cache groups:  ( 0 1 2 3 4 5 6 7 8 9 ) ( 10 11 12 13 14 15 16 17 18 19 )
--------------------------------------------------------------------------------
********************************************************************************
NUMA Topology
********************************************************************************
NUMA domains:  2
--------------------------------------------------------------------------------
Domain:        0
Processors:    ( 0 1 2 3 4 5 6 7 8 9 )
Distances:     10 21
Free memory:   59312.3 MB
Total memory:  65441 MB
--------------------------------------------------------------------------------
Domain:        1
Processors:    ( 10 11 12 13 14 15 16 17 18 19 )
Distances:     21 10
Free memory:   62872.7 MB
Total memory:  65536 MB
--------------------------------------------------------------------------------
********************************************************************************
Graphical Topology
********************************************************************************
Socket 0:
+---------------------------------------------------------------------------------------------------------------+
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| |    0   | |    1   | |    2   | |    3   | |    4   | |    5   | |    6   | |    7   | |    8   | |    9   | |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| +-----------------------------------------------------------------------------------------------------------+ |
| |                                                   25 MB                                                   | |
| +-----------------------------------------------------------------------------------------------------------+ |
+---------------------------------------------------------------------------------------------------------------+
Socket 1:
+---------------------------------------------------------------------------------------------------------------+
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| |   10   | |   11   | |   12   | |   13   | |   14   | |   15   | |   16   | |   17   | |   18   | |   19   | |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |  32 kB | |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | | 256 kB | |
| +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+ |
| +-----------------------------------------------------------------------------------------------------------+ |
| |                                                   25 MB                                                   | |
| +-----------------------------------------------------------------------------------------------------------+ |
+---------------------------------------------------------------------------------------------------------------+
- phi: 1x Intel Xeon Phi 7210
--------------------------------------------------------------------------------
CPU name:      Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz
CPU type:      Intel Xeon Phi (Knights Landing) (Co)Processor
CPU stepping:  1
********************************************************************************
Hardware Thread Topology
********************************************************************************
Sockets:           1
Cores per socket:  64
Threads per core:  1
--------------------------------------------------------------------------------
HWThread  Thread  Core  Socket  Available
0         0       0     0       *
1         0       1     0       *
2         0       2     0       *
3         0       3     0       *
4         0       4     0       *
5         0       5     0       *
6         0       6     0       *
7         0       7     0       *
8         0       8     0       *
9         0       9     0       *
10        0       10    0       *
11        0       11    0       *
12        0       12    0       *
13        0       13    0       *
14        0       14    0       *
15        0       15    0       *
16        0       16    0       *
17        0       17    0       *
18        0       18    0       *
19        0       19    0       *
20        0       20    0       *
21        0       21    0       *
22        0       22    0       *
23        0       23    0       *
24        0       24    0       *
25        0       25    0       *
26        0       26    0       *
27        0       27    0       *
28        0       28    0       *
29        0       29    0       *
30        0       30    0       *
31        0       31    0       *
32        0       32    0       *
33        0       33    0       *
34        0       34    0       *
35        0       35    0       *
36        0       36    0       *
37        0       37    0       *
38        0       38    0       *
39        0       39    0       *
40        0       40    0       *
41        0       41    0       *
42        0       42    0       *
43        0       43    0       *
44        0       44    0       *
45        0       45    0       *
46        0       46    0       *
47        0       47    0       *
48        0       48    0       *
49        0       49    0       *
50        0       50    0       *
51        0       51    0       *
52        0       52    0       *
53        0       53    0       *
54        0       54    0       *
55        0       55    0       *
56        0       56    0       *
57        0       57    0       *
58        0       58    0       *
59        0       59    0       *
60        0       60    0       *
61        0       61    0       *
62        0       62    0       *
63        0       63    0       *
--------------------------------------------------------------------------------
Socket 0:  ( 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 )
--------------------------------------------------------------------------------
********************************************************************************
Cache Topology
********************************************************************************
Level:         1
Size:          32 kB
Cache groups:  ( 0 ) ( 1 ) ( 2 ) ( 3 ) ( 4 ) ( 5 ) ( 6 ) ( 7 ) ( 8 ) ( 9 ) ( 10 ) ( 11 ) ( 12 ) ( 13 ) ( 14 ) ( 15 ) ( 16 ) ( 17 ) ( 18 ) ( 19 ) ( 20 ) ( 21 ) ( 22 ) ( 23 ) ( 24 ) ( 25 ) ( 26 ) ( 27 ) ( 28 ) ( 29 ) ( 30 ) ( 31 ) ( 32 ) ( 33 ) ( 34 ) ( 35 ) ( 36 ) ( 37 ) ( 38 ) ( 39 ) ( 40 ) ( 41 ) ( 42 ) ( 43 ) ( 44 ) ( 45 ) ( 46 ) ( 47 ) ( 48 ) ( 49 ) ( 50 ) ( 51 ) ( 52 ) ( 53 ) ( 54 ) ( 55 ) ( 56 ) ( 57 ) ( 58 ) ( 59 ) ( 60 ) ( 61 ) ( 62 ) ( 63 )
--------------------------------------------------------------------------------
Level:         2
Size:          1 MB
Cache groups:  ( 0 1 ) ( 2 3 ) ( 4 5 ) ( 6 7 ) ( 8 9 ) ( 10 11 ) ( 12 13 ) ( 14 15 ) ( 16 17 ) ( 18 19 ) ( 20 21 ) ( 22 23 ) ( 24 25 ) ( 26 27 ) ( 28 29 ) ( 30 31 ) ( 32 33 ) ( 34 35 ) ( 36 37 ) ( 38 39 ) ( 40 41 ) ( 42 43 ) ( 44 45 ) ( 46 47 ) ( 48 49 ) ( 50 51 ) ( 52 53 ) ( 54 55 ) ( 56 57 ) ( 58 59 ) ( 60 61 ) ( 62 63 )
--------------------------------------------------------------------------------
********************************************************************************
NUMA Topology
********************************************************************************
NUMA domains:  1
--------------------------------------------------------------------------------
Domain:        0
Processors:    ( 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 )
Distances:     10
Free memory:   183806 MB
Total memory:  193180 MB
--------------------------------------------------------------------------------
********************************************************************************
Graphical Topology
********************************************************************************
Socket 0:
+----------------------------------------- ... -----------------------------------------+
| +-------+ +-------+ +-------+ +-------+ ... +-------+ +-------+ +-------+ +-------+ |
| |   0   | |   1   | |   2   | |   3   | ... |   60  | |   61  | |   62  | |   63  | |
| +-------+ +-------+ +-------+ +-------+ ... +-------+ +-------+ +-------+ +-------+ |
| +-------+ +-------+ +-------+ +-------+ ... +-------+ +-------+ +-------+ +-------+ |
| | 32 kB | | 32 kB | | 32 kB | | 32 kB | ... | 32 kB | | 32 kB | | 32 kB | | 32 kB | |
| +-------+ +-------+ +-------+ +-------+ ... +-------+ +-------+ +-------+ +-------+ |
| +-----------------+ +-----------------+ ... +-----------------+ +-----------------+ |
| |       1 MB      | |       1 MB      | ... |       1 MB      | |       1 MB      | |
| +-----------------+ +-----------------+ ... +-----------------+ +-----------------+ |
+----------------------------------------- ... -----------------------------------------+
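The compute20, ssd, and cuda topology listed above (two sockets, cores 0 to 9 on socket 0 and cores 10 to 19 on socket 1) determines which cores share an L3 cache and a NUMA domain. The sketch below shows how that numbering could be turned into a thread placement; the function names and the two policy names "compact" and "scatter" are illustrative choices, not part of likwid or of the cluster documentation.

```python
SOCKETS = 2            # from "Sockets: 2"
CORES_PER_SOCKET = 10  # from "Cores per socket: 10"


def cores_of_socket(socket):
    """Core IDs that belong to one socket (and thus one L3 cache / NUMA domain)."""
    first = socket * CORES_PER_SOCKET
    return list(range(first, first + CORES_PER_SOCKET))


def placement(n_threads, policy="compact"):
    """Return one core ID per thread.

    compact: fill socket 0 first, then socket 1.
    scatter: alternate between the sockets round-robin.
    """
    if not 0 < n_threads <= SOCKETS * CORES_PER_SOCKET:
        raise ValueError("thread count must be between 1 and 20")
    if policy == "compact":
        return list(range(n_threads))
    if policy == "scatter":
        return [(i % SOCKETS) * CORES_PER_SOCKET + i // SOCKETS
                for i in range(n_threads)]
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    print(placement(4, "compact"))
    print(placement(4, "scatter"))
```

On Linux, a list produced this way could be applied to the current process with `os.sched_setaffinity(0, cores)`. A compact placement keeps threads within one L3 cache, while a scatter placement gives them access to the memory bandwidth of both NUMA domains.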
Node-level performance (likwid-bench)
- Parallel data transfer rate - compute20
Parallel data transfer rate on btrzx2 compute20 nodes. In the 2-socket data set, both sockets are balanced.
- Vector triad - compute20
Performance of the vector triad on btrzx2 compute20 nodes.
- Parallel data transfer rate - phi
Parallel data transfer rate on btrzx2 compute8 nodes. In the 2-socket data set, both sockets are balanced.
- Vector triad - phi
Performance of the vector triad on btrzx2 compute8 nodes.
MPI benchmarks (OSU)
- MPI Bandwidth
Point-to-point (P2P) MPI bandwidth between processes on the same socket (by core), on the same node but different sockets (by socket), and on different nodes (by node), using the ofi or shm:ofi fabric interfaces.
- MPI Latency
Single-point measurements of the point-to-point MPI latency (limit of message size 0) between processes on the same socket (by core), on the same node but different sockets (by socket), and on different nodes in the same or different racks (by node).
Statistics
- 2018 - Job size [compute20]
Computation time spent, broken down by job size and runtime.