NOTE: A lot of software defaults to using all of the available cores on the machine. That's a sensible setting on a desktop, but it causes problems on the cluster if you've asked Grid Engine for only a few cores and your code tries to use all 20 cores on a node. Most software provides an option to specify the number of threads - use it, and make sure it matches the number of cores you requested.
```
#!/bin/sh
# where do you want STDOUT and STDERR to go?
#$ -o ~/sge-out
#$ -e ~/sge-out
# How much memory do you need **per core**?
#$ -l h_vmem=2G
# Number of cores you need
#$ -pe smp 2
# Which queues would you like to submit to?
#$ -q HighMemLongterm.q,HighMemShortterm.q,LowMemLongterm.q,LowMemShortterm.q
# load any modules you need
module load general/R/3.2.1
# Run thing that uses multiple cores
someprogram --threads 2 /my/input/file
```
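Rather than hard-coding `2` in both the `-pe smp` line and the `--threads` option, you can use the `NSLOTS` environment variable, which Grid Engine sets to the number of slots actually granted. A minimal sketch (`someprogram` is the placeholder from the script above; the fallback is only so the line runs outside the cluster):

```shell
# NSLOTS is set by Grid Engine to the number of slots granted via "-pe smp".
# Outside the cluster it is unset, so fall back to 1 for illustration.
THREADS="${NSLOTS:-1}"

# Using the same variable for the thread count means it can never drift
# out of sync with the "#$ -pe smp" request when you edit the script.
echo "someprogram --threads $THREADS /my/input/file"
```

This way, changing the `-pe smp` request is the only edit you ever need to make.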
If you're using MPI, you probably know what you're doing. The parallel environment you want is `-pe mpislots`; run `qconf -sp mpislots` for details.
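For completeness, an MPI submission script follows the same pattern as the one above. This is only a sketch: the module name and slot count are assumptions, not site defaults (`NSLOTS` is set by Grid Engine to the number of slots granted):

```shell
#!/bin/sh
# where do you want STDOUT and STDERR to go?
#$ -o ~/sge-out
#$ -e ~/sge-out
# How much memory do you need **per core**?
#$ -l h_vmem=2G
# Request 16 MPI slots (an example figure, not a recommendation)
#$ -pe mpislots 16
# Hypothetical module name - check "module avail" for what's installed
module load mpi/openmpi

# NSLOTS matches the -pe request above, so mpirun launches one rank per slot
mpirun -np $NSLOTS /my/mpi/program
```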
If you want to do the same thing to a whole load of similar files, Grid Engine provides a shorthand way of submitting them:
```
#!/bin/sh
# where do you want STDOUT and STDERR to go?
#$ -o ~/sge-out
#$ -e ~/sge-out
# How much memory do you need (per-core)?
#$ -l h_vmem=2G
# Which queues would you like to submit to?
#$ -q HighMemLongterm.q,HighMemShortterm.q,LowMemLongterm.q,LowMemShortterm.q
# If you have lots of similar jobs to run, you might want to use an array job.
# This gives you an SGE_TASK_ID environment variable which you can use in your
# command to do the same thing to lots of files.
#$ -t 1-10
# Constrain the number of concurrently running tasks (e.g. so you don't swamp
# the cluster, or so you avoid hundreds of tasks concurrently trying to write
# to the same file). Limiting to 2 as an example - obviously you can use more!
#$ -tc 2
echo "filename$SGE_TASK_ID.txt"
```
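The `echo` above just shows the substitution. When your files aren't conveniently numbered, a common pattern (a sketch, not an official SGE feature) is to list them once and have each task pick its own line; `files.txt` is a hypothetical list you'd build beforehand, e.g. with `ls /my/input/dir > files.txt`:

```shell
# Build a throwaway file list so the snippet runs outside the cluster.
printf 'sample_a.txt\nsample_b.txt\nsample_c.txt\n' > files.txt

# Inside a real array job Grid Engine sets SGE_TASK_ID; fake it here.
SGE_TASK_ID="${SGE_TASK_ID:-2}"

# sed -n "Np" prints only line N, so each task gets its own input file.
INPUT=$(sed -n "${SGE_TASK_ID}p" files.txt)
echo "$INPUT"
```

With `#$ -t 1-10`, task 1 processes line 1 of the list, task 2 line 2, and so on.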