High Performance Cluster (HPC) Notes
This post contains notes on working with an HPC running SLURM.
It includes a guide to submitting multiple R/Python jobs and other odds and ends.
Some useful resources (and sources for this post):
Run R jobs
Run a function across multiple jobs with different parameters
This is an example of running an R function across many jobs, each with a different parameter combination.
Create function file
This is the R code that you want to run.
my_function.R:
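get_normal_sims <- function(n, mean, sd) {
  # Draw n samples from a normal distribution
  sims <- rnorm(n = n, mean = mean, sd = sd)
  # Save the simulations; the file name pattern is an assumption, chosen so
  # that jobs with different parameters do not overwrite each other
  saveRDS(sims, file = paste0("sims_n", n, "_mean", mean, "_sd", sd, ".rds"))
}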
Create CSV with parameter combinations of interest
Each row of this CSV contains one parameter combination, to be passed to the R function above.
job_params.csv:
n,mean,sd
1000,0,2
100,10,1
10000,50,1000
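Since every row of the CSV becomes one job, it can be worth checking the file before submitting anything. Below is a minimal sketch, assuming the three-column numeric layout shown above; check_params is a hypothetical helper name, not part of SLURM or R.

```shell
#!/bin/bash
# Sketch: check that every data row of a parameter CSV has exactly three
# numeric fields. check_params is a hypothetical helper name.
check_params() {
  local csv="$1"
  awk -F',' '
    NR == 1 { next }    # skip the header row
    NF != 3 {
      printf "line %d: expected 3 fields, got %d\n", NR, NF
      bad = 1
      next
    }
    {
      for (i = 1; i <= 3; i++) {
        if ($i !~ /^-?[0-9]+(\.[0-9]+)?$/) {
          printf "line %d: field %d is not numeric: \"%s\"\n", NR, i, $i
          bad = 1
        }
      }
    }
    END { exit (bad ? 1 : 0) }    # non-zero exit status if any row failed
  ' "$csv"
}
```

Running check_params job_params.csv before the submission step prints one message per problematic row and returns a non-zero status if any row fails. A stray Windows line ending would also be flagged, since the last field would no longer look numeric.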
Create job script to send one job to HPC
run_one_job.sh:
#!/bin/bash
#SBATCH --job-name=rnormalsim # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --partition=day-long-cpu
#SBATCH --ntasks=1 # total number of tasks across all nodes
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=4G # total memory per node (4 GB per cpu-core is default)
#SBATCH -o rnorm-sim-%j.out # output file name (%j is the job ID; %A and %a only apply to array jobs)
module purge
module load R
# '-' makes Rscript read the script from stdin (the heredoc below)
Rscript - << 'EEOF'
source('my_function.R')
get_normal_sims(n = as.integer(Sys.getenv('PARAM1')),
                mean = as.numeric(Sys.getenv('PARAM2')),
                sd = as.numeric(Sys.getenv('PARAM3')))
EEOF
Note that the output of Sys.getenv() is always a string, and thus needs to be converted to the type that get_normal_sims() expects.
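One related pitfall: Sys.getenv() returns an empty string for a variable that is not set, and converting an empty string to a number in R gives NA. A guard in the job script can catch this before R starts. This is a minimal sketch, assuming the PARAM1 to PARAM3 names used above; require_params is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: fail early if an expected environment variable is unset or empty.
# require_params is a hypothetical helper name.
require_params() {
  local name
  for name in "$@"; do
    # printenv only sees exported variables, which is also all that R's
    # Sys.getenv() can see
    if [ -z "$(printenv "$name")" ]; then
      echo "error: environment variable $name is not set" >&2
      return 1
    fi
  done
}

# In run_one_job.sh this could go before the R call:
# require_params PARAM1 PARAM2 PARAM3 || exit 1
```

With this in place, a job submitted without its parameters stops with a clear message in the output file instead of failing inside rnorm().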
Create shell script to run all the jobs
run_all_jobs.sh:
#!/bin/bash
tail -n +2 job_params.csv | while IFS=',' read -r param1 param2 param3
do
sbatch --export=PARAM1="$param1",PARAM2="$param2",PARAM3="$param3" run_one_job.sh
done
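Before submitting for real, a dry run can show which commands the loop would issue. The sketch below reuses the same tail and read pattern but only prints each sbatch command; print_submissions is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: print the sbatch command for each CSV row instead of submitting it.
# print_submissions is a hypothetical helper name.
print_submissions() {
  local csv="$1"
  # tail -n +2 skips the header; IFS=',' splits each row into three fields
  tail -n +2 "$csv" | while IFS=',' read -r param1 param2 param3; do
    echo "sbatch --export=PARAM1=$param1,PARAM2=$param2,PARAM3=$param3 run_one_job.sh"
  done
}
```

Calling print_submissions job_params.csv should print one line per parameter row. Note that read skips a final row that lacks a trailing newline, so a missing last line in the dry run usually points to that.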
Submit jobs
Finally, run the following bash commands on the cluster to submit all jobs:
chmod +x run_all_jobs.sh
./run_all_jobs.sh
Other Notes
Install R package from GitHub
This is an example of installing the current version of grmbayes from my GitHub.
Run the following code on the HPC:
module load R
Rscript -e "devtools::install_github('wyattgmadden/grmbayes')"
ssh into node
ssh NODE#
top -u wmadden
See current jobs
squeue -u wmadden
Cancel all jobs sharing a name
squeue -u wmadden --format="%i %j" | awk '$2=="job_name" {print $1}' | xargs -I {} scancel {}
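The awk step in the middle of that pipeline can be tried on its own, without touching the queue. This is a small sketch with made-up sample input in the same "job ID, job name" layout; job_ids_by_name is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: select the job IDs whose name matches exactly, given lines of
# "JOBID NAME". job_ids_by_name is a hypothetical helper name and the
# sample input below is made up.
job_ids_by_name() {
  awk -v name="$1" '$2 == name { print $1 }'
}

printf '%s\n' 'JOBID NAME' '101 rnormalsim' '102 other_job' '103 rnormalsim' |
  job_ids_by_name rnormalsim
```

On the sample input this prints 101 and 103, one per line. Because the comparison is an exact match, a job named rnormalsim2 would not be selected. scancel also has its own --name and --user options, which may make the pipeline unnecessary; see man scancel on your cluster.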
Check current use of private nodes
sinfo -p nodeowner -o "%10N %20C %20m %10F"
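where nodeowner is the username of the owner of the private nodes, used here as the partition name. The format string prints the node list (%N), CPU counts as allocated/idle/other/total (%C), memory per node (%m), and node counts as allocated/idle/other/total (%F).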
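The CPU column from that sinfo command is a single slash-separated field (allocated/idle/other/total), so pulling out one number takes a little extra work. A minimal sketch, using a made-up sample value; idle_cpus is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: extract the idle CPU count from a field in the form
# allocated/idle/other/total. idle_cpus is a hypothetical helper name and
# the sample value below is made up.
idle_cpus() {
  echo "$1" | cut -d'/' -f2
}

idle_cpus "12/20/0/32"   # prints 20
```

The same cut call with -f1, -f3, or -f4 would return the allocated, other, or total count.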