Python

Installing Python libraries on epic

Only the core Python library, plus numpy, scipy, and a few others, are already available on epic. Any other library will need to be installed manually by the user, as there are simply too many Python libraries out there for all of them to be provided.
To install a library, download its source with wget and, for consistency, place it under:

 /group/y98/Downloads/

cd into the extracted source directory, then run:

python setup.py install --home=/group/$IVEC_PROJECT/pythonlibs
export PYTHONPATH=/group/$IVEC_PROJECT/pythonlibs/lib/python:$PYTHONPATH
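
For example, here is a sketch of the full sequence for a hypothetical library, foolib-1.0 (the name, version, and URL are placeholders, not a real package):

# foolib-1.0 is a stand-in for whatever library you actually need
cd /group/y98/Downloads/
wget http://example.org/foolib-1.0.tar.gz
tar -xzf foolib-1.0.tar.gz
cd foolib-1.0
python setup.py install --home=/group/$IVEC_PROJECT/pythonlibs
export PYTHONPATH=/group/$IVEC_PROJECT/pythonlibs/lib/python:$PYTHONPATH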

Note that, in retrospect, putting the Python libs directly under /group/y98/ was not a good idea. This is because only files under /group/y98/$IVEC_USER/ get the right group permissions, whereas files created directly under /group/y98/ do not. So, every time you add new libs, you will have to chmod the directory manually.
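
For example, a minimal sketch of that manual fix, assuming the libs live under /group/y98/pythonlibs (the path used in the job script below):

# grant the group read access (and execute on directories) recursively
chmod -R g+rX /group/y98/pythonlibs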

Installing Python libraries on Magnus

Details TBA soon.

CORDEX Python post-processor

A Python-based post-processor for processing WRF output files to CORDEX requirements has been developed at UNSW. It is a very handy tool for computing means, maxima, and minima of various variables from WRF output files.

Running the post-processor on epic

The code is a git repository, located at:

/group/y98/jatinkala/Python-PostProc/narclim_scripts/NARCliM_postprocess

This was git-cloned by Jatin. If you want to use the post-processor, you should probably get your own clone. Ask Jatin and he will put you in touch with the UNSW team, and they will add you to the project.

There are several important things to understand before using this processor:

  • Go read the README first.
  • The code looks for at least one wrfout_d0* file to read static 2D information from, such as lat-lon coordinates. You therefore need at least one of these in your input directory, which is specified in the namelist file.
  • This code uses joblib, a Python library for parallelising Python scripts. While it works fine on Linux servers that do not have a scheduler, it does not seem to work on a machine such as epic, which is a proper cluster with a scheduler (PBS). So, if you get a clean copy of the code, you will need to change "njobs" in "postprocess_modules.py" so that "njobs=1" (it is already changed in my copy of the code); otherwise we seem to encounter memory problems. A quick check of this setting is shown after this list. It should be possible to work out how to make joblib work within this code on epic/magnus, but I don't have time.
  • The number of files per year for each wrf{out,dly,xtrm,hrly} is hard-wired in the code, and you need to change it accordingly in postprocess_modules.py. Here is how I changed it for my copy of the code (make sure you understand this; a quick way to inspect the result is shown after this list):
def get_filefreq(filet):
  """ Method to get information about the type of file
      filet: type of file (wrfhrly, wrfout,wrfxtrm, wrfdly)
      ---
      - period: number of years of the period.
      - n_files: number of files per period. If n_files=-1 then
      it assumes that there is one file per day. Leap years are
      corrected accordingly.
      - time_step: hours between two time steps
      - file_freq: string defining the frequency.
      - tbounds: whether the output file should include time bounds.
  """
  file_info={}

  if filet=='wrfhrly':
    file_info['n_files']=-1
    file_info['time_step']=1 #hours between two time steps
    file_info['file_freq']='01H'
    file_info['tbounds']=False
    file_info['period']=1

  elif filet=='wrfout':
    file_info['n_files']=-1
    file_info['time_step']=1 #hours between two time steps
    file_info['file_freq']='01H'
    file_info['tbounds']=False
    file_info['period']=5

  elif filet=='wrfxtrm':
    file_info['n_files']=60
    file_info['time_step']=24 #hours between two time steps
    file_info['file_freq']='DAY'
    file_info['tbounds']=True
    file_info['period']=5

  elif filet=='wrfdly':
    file_info['n_files']=60
    file_info['time_step']=24 #hours between two time steps
    file_info['file_freq']='DAY'
    file_info['tbounds']=True
    file_info['period']=5

  else:
    # assumed completion: the original snippet was truncated here,
    # so this simply rejects unknown file types
    raise ValueError("Unknown file type: %s" % filet)

  return file_info
  • The code is hard-wired to process/produce 5- and 10-year files. This does not mean that you cannot process fewer years, just that some output files will be created which contain erroneous data. The post-processor will print out warnings letting you know which files are erroneous, so this is not really an issue. To keep life simple, maybe process in lots of 10 years at a time.
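
As a quick sanity check of the joblib setting mentioned above, you can grep your copy before submitting jobs (this assumes the variable is literally spelled njobs in postprocess_modules.py):

grep -n "njobs" postprocess_modules.py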
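
To inspect the file-frequency settings after editing get_filefreq, one option is a shell one-liner. This is only a sketch: it assumes you run it from the directory containing postprocess_modules.py, with the same python/numpy/scipy modules loaded as in the job script below, and that the module imports cleanly outside a batch job:

python -c "from postprocess_modules import get_filefreq; print get_filefreq('wrfxtrm')"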

Here is a sample PBS script which runs the post-processor:

#!/bin/bash
#PBS -W group_list=y98
#PBS -q routequeue
#PBS -l select=1:ncpus=1:mem=23GB:mpiprocs=1
#PBS -l walltime=03:00:00
#PBS -m ae
#PBS -M Jatin.Kala.JK@gmail.com
cd $PBS_O_WORKDIR
export PYTHONPATH=/group/y98/pythonlibs/lib/python:$PYTHONPATH
module unload openmpi/1.6.5
module unload intel/12.1.7
module load gcc/4.4.7
module load python/2.7.5
module load numpy/1.6.2
module load scipy/0.11.0 
python postprocess_NARCliM.py -i Jatin_NARCliM_post_ERA-Int-1981-2010.input >& test_Jatin.out
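
Assuming the script above is saved as, say, run_postproc.pbs (a placeholder name), submit it with:

qsub run_postproc.pbs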