.. _job_script_examples:

===================
Job script examples
===================


Basic examples
==============

General blueprint for a job script
----------------------------------

You can save the following example to a file (e.g. run.sh) on scicluster.
Comment out the two ``cp`` commands, which are for illustrative purposes only
(lines 46 and 55), and change the SBATCH directives where applicable. You can
then run the script by typing::

   $ sbatch run.sh

Please note that all values that you define with SBATCH directives are hard
limits. If, for example, you ask for 6000 MB of memory (``--mem=6000MB``) and
your job uses more than that, the workload manager will automatically kill
the job.

Important: in the blueprint, the standard output and standard error streams
of the code are redirected to a file even though stdout and stderr are also
specified for the job itself. This matters unless your code writes less than
a few MB to stdout/stderr. The job output is spooled locally on the execution
node and copied to the user's working directory only after the job completes.
Since the spool size is small (a few GB), you can overfill the disk and crash
all the jobs on the node. With the redirection approach you avoid this, and
you can also monitor ``out.txt`` while the job runs.

.. literalinclude:: files/slurm-blueprint.sh
   :language: bash


.. _job_arrays:

Running many sequential jobs in parallel using job arrays
---------------------------------------------------------

In this example we wish to run many similar sequential jobs in parallel using
job arrays. We take Python as an example, but the language does not matter
for job arrays:

.. literalinclude:: files/test.py
   :language: python

Save this to a file called "test.py" and try it out::

   $ python test.py

   start at 15:23:48
   sleep for 10 seconds ...
   stop at 15:23:58

Good. Now we would like to run this script 8 times at the same time. For this
we use the following script:

.. literalinclude:: files/slurm-job-array.sh
   :language: bash

Submit the script and after a short while you should see 8 output files in
your submit directory::

   $ ls -l output*.txt

   -rw------- 1 user user 60 Oct 14 14:44 output_1.txt
   -rw------- 1 user user 60 Oct 14 14:44 output_10.txt
   -rw------- 1 user user 60 Oct 14 14:44 output_11.txt
   -rw------- 1 user user 60 Oct 14 14:44 output_12.txt
   -rw------- 1 user user 60 Oct 14 14:44 output_13.txt
   -rw------- 1 user user 60 Oct 14 14:44 output_14.txt
   -rw------- 1 user user 60 Oct 14 14:44 output_15.txt
   -rw------- 1 user user 60 Oct 14 14:44 output_16.txt


Packaging smaller parallel jobs into one large parallel job
-----------------------------------------------------------

There are several ways to package smaller parallel jobs into one large
parallel job. The preferred way is to use job arrays; many examples of how to
do this are available on the web. Here we present a more pedestrian
alternative which can give a lot of flexibility.

In this example we imagine that we wish to run 2 MPI jobs at the same time,
each using 4 tasks, for a total of 8 tasks. Once they finish, we wish to do a
post-processing step and then resubmit another set of 2 jobs with 4 tasks
each:

.. literalinclude:: files/slurm-smaller-jobs.sh
   :language: bash

The ``wait`` commands are important here: the run script will only continue
once all commands started with ``&`` have completed.


.. _allocated_entire_memory:

Example of how to allocate the entire memory on one node
--------------------------------------------------------

.. literalinclude:: files/slurm-big-memory.sh
   :language: bash


How to recover files before a job times out
-------------------------------------------

You may want to clean up the work directory or recover files for a restart in
case a job times out. In this example we ask Slurm to send a signal to our
script 120 seconds before it times out, to give us a chance to perform
clean-up actions.

.. literalinclude:: files/slurm-timeout-cleanup.sh
   :language: bash


OpenMP and MPI
==============

You can copy and paste the examples given here to a file (e.g. run.sh) and
start it with:

.. code-block:: bash

   $ sbatch run.sh


Example for an OpenMP job
-------------------------

.. literalinclude:: files/slurm-OMP.sh
   :language: bash


Example for an MPI job
----------------------

.. literalinclude:: files/slurm-MPI.sh
   :language: bash


Example for a hybrid MPI/OpenMP job
-----------------------------------

.. literalinclude:: files/slurm-MPI-OMP.sh
   :language: bash

If you want to start more than one MPI rank per node you can use
``--ntasks-per-node`` in combination with ``--nodes``:

.. code-block:: bash

   #SBATCH --nodes=2 --ntasks-per-node=2 --cpus-per-task=8

This will start 2 MPI tasks on each of the 2 nodes (4 tasks in total), where
each task can use up to 8 threads.


Example for a GPU job
---------------------

.. literalinclude:: files/slurm-gpu.sh
   :language: bash
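
The included script above is the reference for this cluster. As a rough
sketch only, a minimal job that checks which GPU it was given might look like
the fragment below. It assumes that GPUs are requested through Slurm generic
resources (``--gres=gpu:N``), that the nodes have NVIDIA GPUs with
``nvidia-smi`` installed, and that no particular partition or account is
required. The job name and time limit are placeholders. None of these
assumptions is confirmed on this page, so compare with the included script
before using it.

.. code-block:: bash

   #!/bin/bash
   # Hypothetical job name and time limit, chosen for illustration only.
   #SBATCH --job-name=gpu-check
   #SBATCH --ntasks=1
   #SBATCH --time=00:05:00
   # Assumption: GPUs are exposed as the generic resource "gpu".
   #SBATCH --gres=gpu:1

   # Slurm usually sets CUDA_VISIBLE_DEVICES to the GPUs allocated to the job;
   # print "unset" if this cluster is configured differently.
   echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-unset}"

   # Assumption: NVIDIA driver tools are available on the compute node.
   nvidia-smi

Running a short check like this before a long computation can show early
whether the job actually received a GPU.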