Pawsey

Info about Pawsey can be found here.

There are three supercomputers available for use: magnus, zeus, and zythos.
magnus is the supercomputer used for running parallel applications such as WRF.
zeus and zythos are intended more for single-processor and memory-intensive jobs.

Extensive documentation for the supercomputers can be found here.

Detailed documentation for magnus can be found here.

To log-on to magnus, zeus, or zythos:

ssh -Y user@magnus.ivec.org
ssh -Y user@zeus.ivec.org
ssh -Y user@zythos.ivec.org

data.pawsey.org.au houses our data.

If you want to access our data, you can do this via a web interface (go to https://data.pawsey.org.au/ and click on the "My Data" link at the top of the page), or you can use the command line tool to access the data from magnus. Instructions for the command line tool are available from data.pawsey.org.au under the "Tools" link at the top of the page.

(If you are not confident with Linux, please refer to our Linux page.)

Filesystems

  • All 3 supercomputers (magnus, zeus, and zythos) share the same 3 filesystems: /home, /scratch, and /group
    • /home/your_username/ contains system files such as your .bashrc file
    • /scratch/y98 is where we run jobs. /scratch has a 30-day purge policy, so nothing you want to keep should stay here. Use /scratch to process data and run jobs, then move the results somewhere else (see the example after this list).
    • /group/y98 has group read, write, and execute permissions by default, to make sharing easier, which /scratch does not. Currently we only have an allocation of 1 TB on /group, so this space fills up quickly.
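
As a sketch of what "move the results somewhere else" can look like (the run directory, tar name, and username below are placeholders, not a real run), finished output can be bundled up on /scratch and copied into /group before the 30-day purge removes it:

# Hypothetical example: archive finished output from /scratch to /group before the purge.
# Paths and filenames are placeholders.
cd /scratch/y98/your_username/my_run
tar cfvp my_run_output.tar wrfout_d0*                    # bundle the output files
cp my_run_output.tar /group/y98/your_username/           # /group is not purged, but is only 1 TB
chmod g+rw /group/y98/your_username/my_run_output.tar    # keep it readable/writable by the group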

Software

  • All 3 supercomputers use the module tool (see the example after this list)
    • module list - list currently loaded modules
    • module avail - list available modules
    • module load module_name / module unload module_name - add or remove modules
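
For example, a typical session on magnus before running WRF might look like the following (the module names are the ones used in the job script further down this page; the versions available on the system may differ):

module list                            # show currently loaded modules
module avail cray-netcdf               # check which netCDF modules are available
module swap PrgEnv-cray PrgEnv-intel   # switch from the Cray to the Intel programming environment
module load cray-netcdf                # load the Cray netCDF library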

Running Jobs

There is extensive documentation available online for running jobs on magnus. The Magnus user guide can be found on the Pawsey web site.

Unlike the old supercomputer, epic, which used PBS Pro as its job scheduler, magnus uses SLURM. Magnus-specific information on SLURM can be found on the SLURM pages at ivec.org. Here is a quick example of what a job script for magnus should look like.

#!/bin/bash -l
#SBATCH --account=y98
#SBATCH --ntasks=168
#SBATCH --ntasks-per-node=24
#SBATCH --time=18:00:00
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=juliaandrys@gmail.com
#SBATCH --export=NONE
# Set up the Intel programming environment and netCDF for WRF
module swap PrgEnv-cray PrgEnv-intel
module load cray-netcdf
export WRFIO_NCD_LARGE_FILE_SUPPORT=1
export NETCDF=/opt/cray/netcdf/4.3.0/INTEL/130/
# Copy this month's boundary, nudging, input, and lower-boundary files into the run directory
cd /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/
cp /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/run_real_2000/wrfbdy_d01_2000_06 /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/wrfbdy_d01
cp /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/run_real_2000/wrffdda_d01_2000_06 /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/wrffdda_d01
cp /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/run_real_2000/wrfinput_d01_2000_06 /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/wrfinput_d01
cp /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/run_real_2000/wrfinput_d02_2000_06 /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/wrfinput_d02
cp /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/run_real_2000/wrfinput_d03_2000_06 /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/wrfinput_d03
cp /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/run_real_2000/wrflowinp_d01_2000_06 /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/wrflowinp_d01
cp /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/run_real_2000/wrflowinp_d02_2000_06 /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/wrflowinp_d02
cp /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/run_real_2000/wrflowinp_d03_2000_06 /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/wrflowinp_d03
cp tmp_namelist_2000_06 namelist.input
# Run WRF; aprun -B reuses the core counts requested in the SBATCH directives
aprun -B ./wrf.exe >& wrf_2000_06.out </dev/null
# Keep this month's first rsl logs and clean up the rest
mv rsl.error.0000 rsl_error_2000_06; mv rsl.out.0000 rsl_out_2000_06; rm rsl.*
# Kick off the next month's run
./run_wrf_2000_07 >& run_wrf_2000_07.out; exit
  • This job will run on 7 nodes (168 cores / 24 cores per node). The queue that the job will run on is determined by how you submit the job to the job scheduler. Assuming I am running on magnus and want the job to run on the magnus work queue, I would submit this job using the following:
sbatch era_MI_2000_06
  • If you want to submit this job when you are logged into another supercomputer, you need to specify which system and which job queue you want the job to run on. It is good practice to always fully specify where you want a job to run so that any scripts you run will always work. To submit the above script to magnus using specified options:
sbatch -M magnus -p workq era_MI_2000_06
  • To query jobs on either work queue:
squeue -M magnus -p workq -u <your username>
squeue -M zeus -p work -u <your username>

Moving Data

  • If you have a lot of data to move, whether from data.pawsey.org.au or from any other source, you need to use the copy queue (copyq) on zeus. Here is a script that moves some data from /scratch to data.pawsey.org.au:
#!/bin/bash -l
#SBATCH --account=y98
#SBATCH --ntasks=1 --ntasks-per-node=1
#SBATCH --time=12:00:00
#SBATCH --mail-type=END --mail-type=FAIL
#SBATCH --mail-user=juliaandrys@gmail.com
#SBATCH --export=NONE
# Move this month's wrfout files into a yearly directory, then tar up the restart and output files
cd /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/
mkdir -p /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/2000
mv wrfout_d0*_2000-04-* /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/2000/
tar cfvp era_wrfrst_2000-04-01.tar wrfrst_d0*_2000-04-01_00:00:00 >& tar_era_wrfrst_2000-04-01.out
tar cfvp era_wrfrst_2000-04-15.tar wrfrst_d0*_2000-04-15_00:00:00 >& tar_era_wrfrst_2000-04-15.out
cd /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/2000/
tar cfvp era_wrfout_d01-2000-04.tar wrfout_d01_2000-04-* >& tar_era_wrfout_d01-2000-04.out
tar cfvp era_wrfout_d02-2000-04.tar wrfout_d02_2000-04-* >& tar_era_wrfout_d02-2000-04.out
tar cfvp era_wrfout_d03-2000-04.tar wrfout_d03_2000-04-* >& tar_era_wrfout_d03-2000-04.out
# Upload the tar files to data.pawsey.org.au with ashell's "put" command (java is loaded first for ashell)
cd /home/julia/bin/
module load java
ashell.py "cf /projects/SWWA Downscaled Climate/WRF-CLIM-OUT/ERA-10Y-MI/wrf_rst + put /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/era_wrfrst_2000-04-01.tar" >& /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/rsync_wrfrst_2000_04-01.out
ashell.py "cf /projects/SWWA Downscaled Climate/WRF-CLIM-OUT/ERA-10Y-MI/wrf_rst + put /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/era_wrfrst_2000-04-15.tar" >& /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/rsync_wrfrst_2000_04-15.out
ashell.py "cf /projects/SWWA Downscaled Climate/WRF-CLIM-OUT/ERA-10Y-MI/wrf_out + put /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/2000/era_wrfout_d01-2000-04.tar" >& /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/rsync_wrfout_d01_2000_04.out
ashell.py "cf /projects/SWWA Downscaled Climate/WRF-CLIM-OUT/ERA-10Y-MI/wrf_out + put /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/2000/era_wrfout_d02-2000-04.tar" >& /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/rsync_wrfout_d02_2000_04.out
ashell.py "cf /projects/SWWA Downscaled Climate/WRF-CLIM-OUT/ERA-10Y-MI/wrf_out + put /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/2000/era_wrfout_d03-2000-04.tar" >& /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/rsync_wrfout_d03_2000_04.out
  • To submit this job to the copy queue:
sbatch -M zeus -p copyq <job name>

data.pawsey.org.au

data.pawsey.org.au is the new interface through which we will access our data stored on Pawsey infrastructure. Data can be viewed, managed and shared via the online interface accessible at data.pawsey.org.au or it can be accessed from magnus and zeus using a program called ashell.py. This program, and the instructions on how to use it, can be found in the Tools section here. Some more detailed documentation on ashell can be found via the help pages on the Pawsey website. Look for the "How do I use the Command Line Tool" button at the bottom of the page.

The documentation available on the Pawsey website is very detailed with respect to ashell so I am not going to go over it here. If you do use ashell, you will need to set up a delegate. Information on how to do this is available on the help page, near the bottom. This will allow you to use ashell without having to log in all the time.
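
To summarise the two patterns used in the scripts on this page (the paths below are placeholders): ashell chains its commands together with "+". In the upload script, "cf" changes to the destination folder on data.pawsey.org.au before "put" sends a local file; in the download script further down, "cd" points at the local /scratch directory before "get" fetches a remote file by its full path. If in doubt about the exact behaviour of "cd" versus "cf", check the ashell documentation linked above.

# Upload: change to the remote project folder, then put a local file there (placeholder paths)
ashell.py "cf /projects/<project folder> + put /scratch/y98/<path to local file>"
# Download: change to the local directory the file should land in, then get the remote file
ashell.py "cd /scratch/y98/<local directory> + get /projects/<path to remote file>"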

The script above uses the "put" command in ashell to move data to data.pawsey.org.au (here's a line of that script):

ashell.py "cf /projects/SWWA Downscaled Climate/WRF-CLIM-OUT/ERA-10Y-MI/wrf_rst + put /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/era_wrfrst_2000-04-01.tar" >& /scratch/y98/julia/WRF-ERA-10Y/WRFV3/run/MI/rsync_wrfrst_2000_04-01.out

and here is an example of a script which uses "get" to download data from data.pawsey.org.au:

#!/bin/bash
#SBATCH --account=y98
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=02:00:00
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=juliaandrys@gmail.com
cd $SLURM_SUBMIT_DIR

# Month numbers and the number of days (i.e. expected daily files) in each month of 1970
mm=(01 02 03 04 05 06 07 08 09 10 11 12)
dd=(31 28 31 30 31 30 31 31 30 31 30 31)
for i in {0..11}
do
        # Download one month's tar of hourly output from data.pawsey.org.au, then unpack it
        cd /home/julia/bin
        ashell.py "cd /scratch/y98/julia/WRF_CLIM_PROCESSING/CCSM + get /projects/SWWA Downscaled Climate/WRF-CLIM-OUT/CCSM-20C/wrf_hrly/20C_wrfhrly_d02-1970-${mm[$i]}.tar"
        cd /scratch/y98/julia/WRF_CLIM_PROCESSING/CCSM
        tar -xvf 20C_wrfhrly_d02-1970-${mm[$i]}.tar
        rm 20C_wrfhrly_d02-1970-${mm[$i]}.tar
        # Check that the right number of files was extracted (one per day of the month)
        files=$(ls -l wrfhrly_d02_1970-${mm[$i]}* | wc -l)
        if [ "$files" -ne ${dd[$i]} ]; then
                echo "File issue in ${mm[$i]} 1970"
                echo "File issue in ${mm[$i]} 1970" | mailx -s "ISSUE WITH FILES" juliaandrys@gmail.com
        fi
        # Check that the files form a consecutive daily sequence (characters 21-22 of the
        # filename hold the day); issequential.sh is a helper script - a sketch is given below
        if ./issequential.sh "wrfhrly_d02_1970-${mm[$i]}*" 21-22; then
                echo "Files are in sequence"
        else
                echo "Files are not in sequence"
                echo "File sequence issue in ${mm[$i]} 1970" | mailx -s "ISSUE WITH FILE SEQUENCE" juliaandrys@gmail.com
        fi
done
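
issequential.sh is a small helper script that is not reproduced on this page. Based on how it is called above (a quoted filename pattern plus a character range that holds a number, here the day of month), a minimal sketch of what it might look like is given below; treat it as an illustration only, not the original script:

#!/bin/bash
# Sketch of issequential.sh (illustration only, not the original helper).
# Usage: ./issequential.sh "<filename pattern>" <start>-<end>
# Exits 0 if the number at character positions <start>-<end> of each matching
# filename increases by exactly 1 from one file to the next, and 1 otherwise.
pattern=$1
range=$2
prev=""
for f in $pattern; do                                   # unquoted so the pattern expands to the sorted file list
        [ -e "$f" ] || continue                         # skip if nothing matched the pattern
        num=$((10#$(echo "$f" | cut -c "$range")))      # e.g. -c 21-22 pulls out the day; 10# strips the leading zero
        if [ -n "$prev" ] && [ "$num" -ne $((prev + 1)) ]; then
                exit 1                                  # gap in the sequence
        fi
        prev=$num
done
exit 0                                                  # files are in sequence (or nothing matched)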