Forschungszentrum für wissenschaftliches Rechnen an der Universität Bayreuth

festus (under construction, 2024)

General remarks

The cluster "festus" (btrzx24) is expected to be available from November 2024 for the groups involved in the procurement. It consists of two management nodes, one virtualization server, two login nodes, several storage servers, and 73 compute nodes, connected by a 100 Gbit/s InfiniBand interconnect and a 25 Gbit/s Ethernet service network. "festus" uses Slurm (24.05) as its resource manager. For performance reasons, the ITS file server (e.g., the ITS home directory) is not mounted on the cluster; instead, every user has a separate home directory (5 GB) on the cluster's own NFS server.

Login

The login nodes of festus are reachable via SSH at festus.hpc.uni-bayreuth.de, but only from university networks; from outside the university, a VPN connection is required. If your login shell is (t)csh or ksh, you have to change it to bash or zsh in the ITS self-service portal.
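For convenience, an entry along the following lines can be added to your SSH client configuration (a sketch: the host alias "festus" and the username placeholder are illustrative, not site defaults):

```
# ~/.ssh/config entry (sketch) -- afterwards "ssh festus" suffices.
# <bt-username> is a placeholder for your university account name.
Host festus
    HostName festus.hpc.uni-bayreuth.de
    User <bt-username>
```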

Compute nodes


                          typA        typB         typC         typD         typE        typF
N                         62          4            1            1            3           2
CPU (x2)                  EPYC 9554   EPYC 9684X   Xeon® 8480+  Xeon® 8480+  EPYC 9554   EPYC 9554
Cores total               128         192          112          112          128         128
Max. CPU frequency        3.75 GHz    3.42 GHz     3.8 GHz      3.8 GHz      3.75 GHz    3.75 GHz
DDR5 RAM (4.8 GT/s)       24x 16 GB   24x 64 GB    16x 128 GB   16x 128 GB   24x 16 GB   24x 16 GB
Local /tmp space (NVMe)   ~200 GB     ~200 GB      ~14 TB       ~14 TB       ~3.84 TB    ~3.84 TB
GPU                       -           -            4x H100      4x MI210     2x L40      2x MI210
Partition                 normal      HighMem      AI           AI           normal      normal

Queues / Partitions

  • normal
    Priority: multifactor, weighted mainly by the group's financial share in the cluster and its consumed resources
    Wall time limit: 8 hours (default), 24 hours (max)
    Restrictions: CPU nodes only
  • HighMem
    Priority: multifactor, weighted mainly by the group's financial share in the cluster and its consumed resources
    Wall time limit: 8 hours (default), 24 hours (max)
    Restrictions: HighMem nodes only
  • AI
    Priority: multifactor, weighted mainly by the group's financial share in the cluster and its consumed resources
    Wall time limit: 8 hours (default), 24 hours (max)
    Restrictions: typC and typD nodes only
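A batch script requesting one of these partitions could look like the following sketch; the job name, task count, and program name are placeholders, and the #SBATCH values merely illustrate the wall time limits stated above:

```shell
#!/bin/bash
#SBATCH --job-name=example          # placeholder job name
#SBATCH --partition=normal          # or HighMem / AI (see above)
#SBATCH --time=08:00:00             # default wall time; 24:00:00 is the maximum
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=128       # typA/E/F nodes have 128 cores

srun ./my_program                   # placeholder executable
```

Submit the script with `sbatch`, e.g. `sbatch job.sh`.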

Network

  • Infiniband (100 Gbit/s)
    • 2-level Fat Tree (Blocking factor 2)
  • Ethernet (25 Gbit/s)

User file space (network and local)

  • NFS file system
    • /groups/<org-id>: Group directory (only for groups financially involved in the cluster)
    • /home: 5 GB per user
    • /workdir: no soft quota
      • no snapshots, no backup
      • data lifetime 60 days
  • BeeGFS
    • /scratch:
      • only for large MPI-IO or parallel HDF5 (phdf5) workloads
      • data lifetime 10 days
  • Local disk (/tmp):
    • typA/B: ~200GB
    • typC/D: ~14TB
    • typE/F: ~3.84TB
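Since the node-local NVMe /tmp space is much faster than the network file systems, jobs can stage their input there, compute locally, and copy results back. A minimal sketch (all paths and the `tr` stand-in for a real solver are illustrative; inside a job one would typically use /tmp/$USER.$SLURM_JOB_ID as the scratch directory):

```shell
# Stage data to fast node-local storage, run, and copy results back.
stage_and_run() {
    local work="$1"      # persistent directory, e.g. /workdir/$USER/run1
    local scratch="$2"   # fast local directory, e.g. /tmp/$USER.$SLURM_JOB_ID
    mkdir -p "$scratch"
    cp "$work/input.dat" "$scratch/"                              # stage in
    ( cd "$scratch" && tr 'a-z' 'A-Z' < input.dat > output.dat )  # stand-in for the real solver
    cp "$scratch/output.dat" "$work/"                             # stage out
    rm -rf "$scratch"                                             # clean up local space
}
```

Cleaning up the local directory at the end matters: /tmp is shared by all jobs on a node.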

Administrative limitations

  • system: at most 500 jobs per scheduling cycle (30 s) are queued
  • per shareholder account: max. 1000 jobs submitted, 6192 cores simultaneously
  • default account (overall): max. 1000 jobs submitted, 2048 cores simultaneously

Commissioning & Extension

  • November 2024

Resource Manager & Scheduler

  • Slurm 24.05

Operating system

  • RHEL 9.4 / RockyLinux 9.4

Responsible for content: Dr. rer. nat. Ingo Schelter
