Callan HPC Cluster

Callan is the general-access HPC cluster for Trinity researchers. Its hardware characteristics are:

  • 12 compute nodes, each with 64 Intel Xeon Gold 6430 CPU cores, 256GB of RAM and a 1.8TB local scratch (/tmp) disk.
  • 1 head node with 64 Intel Xeon Gold 6430 CPU cores, 64GB of RAM and a 1TB local scratch (/tmp) disk.
  • A 69TB shared parallel file system available at /home across all nodes in the cluster.
  • HDR (200 Gb/s) InfiniBand high-speed interconnect.

Access

To access the cluster you must have a Research IT account; please apply for one if you do not already have one.

To request access to the cluster, please email ops@tchpc.tcd.ie.

Login

To log in, connect to callan.tchpc.tcd.ie using the usual SSH instructions. The cluster is accessible from the College network, including the VPN. To connect from outside the College network, first log in to the College VPN or relay through rsync.tchpc.tcd.ie as per our instructions.
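
For example, replacing username with your own Research IT username:

> ssh username@callan.tchpc.tcd.ie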

Details of the Callan file system

  • There is a single file system, /home, which is accessible to all nodes in the cluster. The default quota is 250GB per user across all files in /home. (A quick way to check your usage is shown after this list.)

  • /home/users/$user - 50GB quota. Backed up for disaster recovery purposes. Accessible only to you.

  • /home/scratch/$user - Temporary scratch storage. Files here are automatically deleted 90 days after they are created. Do not use for long-term storage. Not backed up.

  • /home/projects/pi-$pi - 50GB quota. Not backed up. Accessible to all users in the $pi group. To be used for group collaborative data.
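
To see how much space you are currently using in each of these areas, you can run du against the paths above ($USER is the standard shell variable holding your username; du may take some time on large directories):

> du -sh /home/users/$USER
> du -sh /home/scratch/$USER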

Software

Software is installed with our usual modules system. You can view the available software with module avail and load software with module load, e.g. module load gcc/13.1.0-gcc-8.5.0-k3cddbg. The modgrep utility will search the available module files from the head node, e.g. modgrep conda will display any modules with conda in their name.
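
For example, from the head node:

> module avail
> modgrep conda
> module load gcc/13.1.0-gcc-8.5.0-k3cddbg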

Intel OneAPI

Suggested Intel modules to load:

module load tbb/latest compiler-rt/latest oclfpga/latest compiler/latest mpi/latest

Manual activation: source /home/support/intel/oneapi/2024.1.0/setvars.sh
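
After loading these modules (or sourcing setvars.sh), you can confirm that the Intel tools are on your PATH. For example, assuming the oneAPI C/C++ compiler in this release is icx:

> which icx
> mpirun --version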

Running jobs

All jobs must be run via the Slurm scheduler.

Batch job examples

Node sharing is enabled.

Create a batch submission script, e.g. run.sh, and submit it to the queue with the sbatch command:

> sbatch run.sh

Here are some example submission scripts:

  • 12 cores, 48GB of RAM from 1 node:
#!/bin/bash
#SBATCH -n 12
#SBATCH --mem=48GB
module load openmpi
echo "Starting"
./exe.x
  • 1 full node:
#!/bin/bash
#SBATCH -n 64
#SBATCH --mem=256000
module load openmpi
echo "Starting"
./exe.x
  • 2 full nodes:
#!/bin/bash
#SBATCH -n 128
#SBATCH --mem=256000
module load openmpi
echo "Starting"
./exe.x
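
Once a job has been submitted, it can be monitored and managed with the standard Slurm commands, for example (jobid below is a placeholder for the job ID printed by sbatch):

> squeue -u $USER
> scontrol show job jobid
> scancel jobid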

Interactive allocation

> salloc -n 12 --mem=48GB

This will automatically log you into the node once it has been assigned.
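
Once you are logged into the node, the workflow mirrors the batch examples above, e.g.:

> module load openmpi
> ./exe.x
> exit

Typing exit ends the interactive session and releases the allocation.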

Transferring data from Kelvin, Lonsdale, Parsons or rsync

If you need to transfer files that exist on the Kelvin, Lonsdale or Parsons HPC clusters, you can do so through our access host, rsync.tchpc.tcd.ie. Here is an example command that will transfer files:

> rsync -av rsync.tchpc.tcd.ie:source_directory destination_directory/

Update the source_directory path to the path where the data resides on rsync, and destination_directory/ to the path on Callan where the data should be transferred.
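
For instance, to pull a directory into your Callan home area (the directory names here are hypothetical; replace them with your own paths):

> rsync -av rsync.tchpc.tcd.ie:old_cluster_data/ /home/users/$USER/old_cluster_data/    # hypothetical paths

Note that a trailing slash on the source copies the contents of the directory rather than the directory itself.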

See our Transferring files page for more notes on this.

Further instructions

See the HPC clusters usage documentation for further instructions.