Tinney

Tinney is an HPC cluster purchased with funding from Science Foundation Ireland under the 2018 President of Ireland Future Research Leaders Award. The cluster's purpose is to facilitate research by the FRAILMatics group in The Irish Longitudinal Study on Ageing (TILDA). FRAILMatics aims to research and develop more accurate frailty tools that, without expert input, automatically identify subtle dysregulated responses to stressors across physiological systems.

Tinney cluster hardware overview

  • 1 head node with 110 TB of shared disk available to all compute nodes. Do not run computationally intensive jobs on the head node.

  • 11 compute nodes, each with two 12-core Intel Xeon Silver 4214 CPUs @ 2.20 GHz, i.e. 24 cores per node.

  • 6 nodes have 192 GB of RAM.

  • 5 nodes have 386 GB of RAM.

  • 1 of the 386 GB RAM nodes has 2 NVIDIA Tesla P40 GPUs.

  • Node interconnect and access to the shared file system are via 10 Gb Ethernet.

Who can access Tinney?

Users who have been granted access by the principal investigator of the FRAILMatics project. Please email ops@tchpc.tcd.ie if you want to ask whether you can be granted access.

How to access Tinney

You must have a Research IT account. If you don't have one, please apply for one from here.

Tinney is accessible via SSH at the address

tinney.tchpc.tcd.ie

We have more instructions on how to access via SSH.

Please note, Tinney is not directly accessible from the internet. If you are off campus and not connected to the College VPN, please log in first to rsync.tchpc.tcd.ie and from there:

> ssh tinney.tchpc.tcd.ie
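Alternatively, if your local SSH client supports the -J (ProxyJump) option, the two hops can be combined into a single command. This is only a sketch; it assumes your Research IT username (shown here as username) is the same on both hosts:

> ssh -J username@rsync.tchpc.tcd.ie username@tinney.tchpc.tcd.ie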

File transfers

Please see our instructions on file transfers; remember that tinney.tchpc.tcd.ie is the address to connect to.
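As a sketch, copying a file from your own machine to your Tinney home directory could look like the following (assuming you are on campus or connected to the College VPN; username and myfile.dat are placeholders):

scp myfile.dat username@tinney.tchpc.tcd.ie:~/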

Software Available

The modules system

The modules system is used to control access to the available software. Here are more details on using the modules system in general.

To list available modules

module av

To show what modules you currently have loaded:

module list

To search for available modules

To search for stata in a module name:

modgrep stata

Below are examples of some software installed via the modules system. (Please note that the trailing hashes in some module names may change.)

Stata

Stata 14 is installed. To load it into your environment, use:

module load stata/14
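As a sketch, a batch job that runs a Stata do-file could look like the following. The do-file name analysis.do is a placeholder, and this assumes the stata command provided by the module accepts the usual -b do batch-mode arguments (adjust to stata-mp or stata-se if that is what the module provides):

#!/bin/bash
#SBATCH -n 1            # a single core is enough for a serial Stata job
#SBATCH -J stata-example

# load the Stata module
module load stata/14

# run the do-file in batch mode; output is written to analysis.log
stata -b do analysis.do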

MPlus

Version 8.4 can be loaded into your environment with:

module load mplus/8.4

Unloading / Removing software from your environment

For example, to unload MPlus from your environment:

module unload mplus/8.4

Or

module rm mplus/8.4

To unload all modules:

module purge

Installing new software

Remember, you are free to install software into your home folder without needing Research IT to do so.
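As an illustrative sketch (the package name and version are hypothetical), a typical source install into your home directory looks like this:

# unpack, build and install a source package under $HOME/local
tar xzf somepackage-1.0.tar.gz
cd somepackage-1.0
./configure --prefix=$HOME/local
make
make install

# add the install location to your PATH
export PATH=$HOME/local/bin:$PATH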

If you want to request that new software be installed system wide, or you need help installing something, please get in contact with us at ops@tchpc.tcd.ie.

Slurm - the queueing system

Slurm is the resource manager, or queueing system, installed on Tinney. All computational work on the cluster must be done through it. Please see our guide on Slurm.
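As a minimal sketch, an interactive session on a compute node can be requested and released like this (assuming the default partition is acceptable for your work):

salloc -n 1        # request an allocation of a single core
srun hostname      # run a command inside the allocation
exit               # release the allocation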

Scheduling policy

The compute nodes are weighted as follows:

  • 192 GB RAM nodes have a low weight.

  • 386 GB RAM nodes have a medium weight.

  • The GPU node has a high weight.

The scheduler will assign the available node with the lowest weight that meets the allocation requirements. This means the high memory and GPU nodes will only be scheduled in two circumstances:

  1. All other nodes are already scheduled.

  2. The user explicitly requests a high memory and / or GPU node. See instructions below for how to do that.

Slurm command examples

Working with the queue

Show the queue status

squeue

Show only a particular user's jobs in the queue

squeue --user username

Show when jobs are estimated to start

squeue --start

Show when a user's jobs are expected to start

squeue --start --user username

Display long output about a user's jobs in the queue

squeue --user username -l

Getting info on resources available

Display queue/partition names, run times and available nodes

sinfo

More detailed view of resources available in each node

sinfo -Nel

Get info on a job

scontrol show jobid 108

Note, you will need the job number. You can use the squeue command to find those.

Cancel a running or pending job

scancel 108

Note, you will need the job number. You can use the squeue command to find those.

Some Tinney-specific references

High Memory Nodes

Requesting a high memory node interactively (the --mem value is specified in megabytes):

salloc -N 1 --mem=384000

High memory node via batch:

#SBATCH --mem=384000
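Putting it together, a minimal batch script sketch requesting a high memory node could look like this (the job name and executable are placeholders):

#!/bin/bash
#SBATCH -n 24           # one full node
#SBATCH --mem=384000    # request a high memory node (value in MB)
#SBATCH -J highmem-example

# run your work, modify this for your needs
./my_program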

GPU nodes

One node, tinney-n08, has 2 NVIDIA Tesla P40 GPU cards. The instructions on how to access them are as follows.

Requesting an interactive GPU allocation

salloc -N 1 --gres=gpu:1

Or to request 2 GPUs interactively:

salloc -N 1 --gres=gpu:2

Batch script submission parameters, 1 GPU:

#SBATCH --gres=gpu:1

Or for 2 GPUs:

#SBATCH --gres=gpu:2
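For example, a minimal batch script sketch that requests a single GPU might look like the following (the job name and executable are placeholders; load whichever CUDA or application modules your code actually needs):

#!/bin/bash
#SBATCH -n 1
#SBATCH --gres=gpu:1    # request one GPU on the GPU node
#SBATCH -J gpu-example

# load any modules your application requires (placeholder)
# module load <your-modules>

# run your GPU application (placeholder executable)
./my_gpu_program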

Get queue information including details of GPUs available and their state:

sinfo -o "%.5a %.10l %.6D %.6t %.20N %G"

Batch Jobs

Batch jobs are jobs that you submit to be run by the resource manager, rather than running interactively.

Example submission scripts for Batch Jobs

The following example:

  1. Requests 24 cores, i.e. 1 full node, since each node has 24 cores
  2. Gives the job a name
  3. Loads some modules
  4. Uses mpirun to run the cpie executable
  5. Not all applications need mpirun; run yours with whatever launcher is appropriate
  6. The cpie executable needs to exist for this to work

The submit-cpie.sh file could be as follows. Note that you can name your submission file (here, submit-cpie.sh) whatever you want.

#!/bin/bash
#SBATCH -n 24 # 24 cores or 1 node
#SBATCH -J "Job name"

# load modules, you will need to modify this for your needs
module load gcc-9.2.0-gcc-4.8.5-breo5ur apps openmpi/4.0.4

# run your work, you will need to modify this for your needs
mpirun ./cpie.x

To submit the job to the queue to be processed:

sbatch submit-cpie.sh

How to submit a job

To submit your job script (here called myscript.sh), run the following command:

sbatch myscript.sh

Warning: do not execute the script

The job submission script file is written to look like a bash shell script. However, you do NOT submit the job to the queue by executing the script.

In particular, the following is INCORRECT:

# this is the INCORRECT way to submit a job
./myscript.sh  # wrong! this will not submit the job!

The correct way is noted above (sbatch myscript.sh).

staskfarm

If you have multiple independent serial tasks, you can pack them together into a single Slurm job. This is suitable for simple task-farming.

This takes advantage of the fact that a single node in the cluster has many CPU cores available. For example, each Tinney node has 24 cores, so you can pack up to 24 tasks into a single job.

This can be done with the staskfarm utility that we use on our other clusters; running it on Tinney is only very slightly different.

Your job submission script could look something like this:

#!/bin/sh
#SBATCH -n 24           # 24 cores
#SBATCH -t 4-00:00:00   # 4 days
#SBATCH -p compute      # partition name
#SBATCH -J MM_HEALTH_LTA_6Class  # sensible name for the job

# load up the correct modules, if required
module load apps mplus staskfarm

# execute the commands via the slurm task farm wrapper
cd ~/MM_LTA_Patterns/HealthAssessmentResults/
staskfarm commands.txt

And the commands.txt file that it calls could look like the following:

mplus MM_HEALTH_LTA_10Clust.inp
mplus MM_HEALTH_LTA_2Clust.inp
mplus MM_HEALTH_LTA_3Clust.inp
mplus MM_HEALTH_LTA_4Clust.inp
mplus MM_HEALTH_LTA_5Clust.inp
mplus MM_HEALTH_LTA_6Clust.inp
mplus MM_HEALTH_LTA_7Clust.inp
mplus MM_HEALTH_LTA_8Clust.inp
mplus MM_HEALTH_LTA_9Clust.inp