Tinney
Tinney is an HPC cluster purchased with funding from Science Foundation Ireland under the 2018 President of Ireland Future Research Leaders Award. The cluster's purpose is to facilitate research by the FRAILMatics group in The Irish Longitudinal Study on Ageing (TILDA). FRAILMatics aims to research and develop more accurate frailty tools that, without expert input, automatically identify subtle dysregulated responses to stressors across physiological systems.
Tinney cluster hardware overview
- 1 head node with 110TB of shared disk available to all compute nodes. Do not run computationally intensive jobs on the head node.
- 11 compute nodes, each with two 12-core Intel Xeon Silver 4214 CPUs @ 2.20GHz, 24 cores total per node.
- 6 nodes have 192G of RAM.
- 5 nodes have 386G of RAM.
- 1 of the 386G RAM nodes has 2 NVIDIA P40 GPUs.
- Node interconnect and access to the shared file system is via 10G ethernet.
Who can access Tinney?
Users who have been granted access by the principal investigator of the FRAILMatics project. Please email ops@tchpc.tcd.ie if you would like to ask whether you can be granted access.
How to access Tinney
You must have a Research IT account. If you don't, please apply for one here.
Tinney is accessible via SSH at the address
tinney.tchpc.tcd.ie
We have more instructions on how to access via SSH.
Please note, Tinney is not directly accessible from the internet. If you are off campus and not connected to the College VPN, please log in first to rsync.tchpc.tcd.ie
and from there:
> ssh tinney.tchpc.tcd.ie
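If you connect from off campus often, you can let SSH make the hop through rsync.tchpc.tcd.ie for you. The following is a minimal sketch of an entry for your local ~/.ssh/config using OpenSSH's ProxyJump option; username is a placeholder for your Research IT username.
Host tinney
    HostName tinney.tchpc.tcd.ie
    User username
    ProxyJump username@rsync.tchpc.tcd.ie
With this in place, ssh tinney from your own machine connects in a single step.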
File transfers
Please see our instructions on file transfers, remember that tinney.tchpc.tcd.ie
is the address to connect to.
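For example, to copy a local file to your home directory on Tinney with rsync (scp works similarly); the file name and username here are placeholders:
rsync -av mydata.csv username@tinney.tchpc.tcd.ie:~/
If you are off campus and not on the VPN, the note above about going via rsync.tchpc.tcd.ie applies to file transfers too.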
Software Available
The modules system
The modules system is used to control access to the available software. Here are more details on using the modules system in general.
To list available modules
module av
To show what modules you currently have loaded:
module list
To search for available modules
To search for stata in a module name:
modgrep stata
Below are examples of some software installed via the modules system. (Please note that the trailing hashes in some module names may change.)
Stata
Stata 14 is installed, to load it into your environment use:
module load stata/14
MPlus
Version 8.4 can be loaded into your environment with:
module load mplus/8.4
Unloading / Removing software from your environment
E.g. to unload MPlus from your environment :
module unload mplus/8.4
Or
module rm mplus/8.4
Unload all modules
module purge
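As a quick recap, a typical session combining the commands above might look like this, using the Stata and MPlus modules listed earlier:
module load stata/14 mplus/8.4
module list
module rm mplus/8.4
module purge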
Installing new software
Remember, you are free to install software into your home folder without needing Research IT to do so.
If you want to request new software be installed system wide or need help installing something please get in contact with us at ops@tchpc.tcd.ie.
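As an illustration, a typical source build can be installed under your home folder by giving it a prefix there. This is only a sketch; somepackage and the paths are placeholders, and the real steps depend on the software's own build instructions.
tar -xzf somepackage-1.0.tar.gz
cd somepackage-1.0
./configure --prefix=$HOME/local
make
make install
# make the installed programs findable, e.g. by adding this to ~/.bashrc
export PATH=$HOME/local/bin:$PATH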
Slurm - the queueing system
Slurm is the resource manager, or queueing system, installed on Tinney. All computational work on the cluster must be done through it. Please see our guide on Slurm.
Scheduling policy
The compute nodes are weighted as follows:
- 192G RAM nodes have a low weight.
- 386G RAM nodes have a medium weight.
- The GPU node has a high weight.
The scheduler will assign the available node with the lowest weight that meets the allocation requirements. This means the high memory and GPU nodes will only be scheduled in two circumstances:
- All other nodes are already scheduled.
- The user explicitly requests a high memory and/or GPU node. See the instructions below for how to do that.
Slurm command examples
Working with the queue
Show the queue status
squeue
Show only a particular user's jobs in the queue
squeue --user user
Show when jobs are estimated to start
squeue --start
Show when a user's jobs are expected to start
squeue --start --user username
Display long output about my jobs in the queue
squeue --user username -l
Getting info on resources available
Display queue/partition names, run times and available nodes
sinfo
More detailed view of resources available in each node
sinfo -Nel
Get info on a job
scontrol show jobid 108
Note, you will need the job number. You can use the squeue command to find it.
Cancel a running or pending job
scancel 108
Note, you will need the job number. You can use the squeue command to find it.
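To cancel all of your own jobs in one go, you can pass your username instead of a job number (username is a placeholder here):
scancel --user username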
Some Tinney specific references
High Memory Nodes
Requesting a high memory node interactively:
salloc -N 1 --mem=384000
High memory node via batch:
#SBATCH --mem=384000
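For example, a complete batch script asking for a full high memory node might look like the sketch below. The core count, job name, module and program are placeholders to adapt to your own work.
#!/bin/bash
#SBATCH -n 24              # 24 cores, i.e. one full node
#SBATCH --mem=384000       # request one of the high memory nodes
#SBATCH -J highmem-example # sensible name for the job
# load any modules your program needs here, e.g. module load stata/14
# run your work; my_program is a placeholder for your own command
./my_program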
GPU nodes
One node, tinney-n08, has 2 NVIDIA TESLA P40 GPU cards. The instructions on how to access them are as follows.
Requesting an interactive GPU allocation
salloc -N 1 --gres=gpu:1
Or to request 2 GPU's interactively:
salloc -N 1 --gres=gpu:2
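Once the interactive allocation has been granted, you can confirm the GPUs are visible by running a command on the allocated node, for example nvidia-smi (assuming the NVIDIA driver tools are on the node's default path):
srun nvidia-smi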
Batch script submission parameters, 1 GPU:
#SBATCH --gres=gpu:1
Or for 2 GPU's:
#SBATCH --gres=gpu:2
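Putting that together, a minimal GPU batch script could look like the sketch below; the job name and the program being run are placeholders.
#!/bin/bash
#SBATCH -n 1           # CPU cores for the job
#SBATCH --gres=gpu:1   # request one GPU
#SBATCH -J gpu-example # sensible name for the job
# load any modules your GPU application needs here
# run your work; my_gpu_program is a placeholder for your own command
./my_gpu_program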
Get queue information including details of GPU's available and their state:
sinfo -o "%.5a %.10l %.6D %.6t %.20N %G"
Batch Jobs
Batch jobs are ones you submit to be run by the resource manager.
Example submission scripts for Batch Jobs
The following example:
- Requests 24 cores, which is one full node since each node has 24 cores
- Gives the job a name
- Loads some modules
- Uses mpirun to run the cpie executable
- Not all applications need mpirun; run yours with whatever launcher is appropriate
- The cpie executable needs to exist for this to work
The submit-cpie.sh file could be as follows. Note, you can call your submission file, submit-cpie.sh, whatever you want:
#!/bin/bash
#SBATCH -n 24 # 24 cores or 1 node
#SBATCH -J "Job name"
# load modules, you will need to modify this for your needs
module load gcc-9.2.0-gcc-4.8.5-breo5ur apps openmpi/4.0.4
# run your work, you will need to modify this for your needs
mpirun ./cpie.x
To submit the job to the queue to be processed:
sbatch submit-cpie.sh
How to submit a job
To submit this, run the following command:
sbatch myscript.sh
Warning: do not execute the script
The job submission script file is written to look like a bash shell script. However, you do NOT submit the job to the queue by executing the script.
In particular, the following is INCORRECT:
# this is the INCORRECT way to submit a job
./myscript.sh # wrong! this will not submit the job!
The correct way is noted above (sbatch myscript.sh).
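sbatch prints the ID of the newly submitted job; you can then watch its progress with the squeue commands shown earlier, for example:
sbatch myscript.sh
squeue --user username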
staskfarm
If you have multiple independent serial tasks, you can pack them together into a single Slurm job. This is suitable for simple task-farming.
This can take advantage of the fact that a single node in the cluster has many CPU cores available. For example, each Tinney node has 24 cores, so you can pack up to 24 tasks into a single job.
This can be done with the staskfarm utility we use on our other clusters; running it on Tinney is only very slightly different.
Your job submission script could look something like this:
#!/bin/sh
#SBATCH -n 24 # 24 cores
#SBATCH -t 4-00:00:00 # 4 days
#SBATCH -p compute # partition name
#SBATCH -J MM_HEALTH_LTA_6Class # sensible name for the job
# load up the correct modules, if required
module load apps mplus staskfarm
# execute the commands via the slurm task farm wrapper
cd ~/MM_LTA_Patterns/HealthAssessmentResults/
staskfarm commands.txt
And the commands.txt file that it calls could look like the following:
mplus MM_HEALTH_LTA_10Clust.inp
mplus MM_HEALTH_LTA_2Clust.inp
mplus MM_HEALTH_LTA_3Clust.inp
mplus MM_HEALTH_LTA_4Clust.inp
mplus MM_HEALTH_LTA_5Clust.inp
mplus MM_HEALTH_LTA_6Clust.inp
mplus MM_HEALTH_LTA_7Clust.inp
mplus MM_HEALTH_LTA_8Clust.inp
mplus MM_HEALTH_LTA_9Clust.inp
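If you have many similar input files you do not need to write commands.txt by hand. The following sketch builds it with a small shell loop, assuming your Mplus .inp files sit in the current directory:
# build a commands file with one mplus invocation per input file
rm -f commands.txt
for inp in *.inp; do
    echo "mplus $inp" >> commands.txt
done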