Python

Installing Python on the Linux Server

The easiest way to install and manage Python and Python libraries is by using conda, which can be downloaded from the Miniconda web page. Always select Python 3.X, as Python 2.X is no longer maintained and most libraries have already been ported to Python 3.X. To install on the Linux server, get the 64-bit installer for Linux.

By default, conda will install into your $HOME directory. Since there is limited space on $HOME, you should instead install to something like:

 /data/$USERNAME/miniconda3
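For example, a minimal sketch of fetching and launching the installer (the exact installer filename may differ for newer Miniconda releases):

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh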

Enter the path during the install process. After installation, conda will by default activate the so-called base environment every time you log on.
We do not want this, so type the following command after installation to prevent it:

conda config --set auto_activate_base false
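To confirm the setting took effect, you can check it with:

conda config --show auto_activate_base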

Now, when you log on, the conda base environment will not be loaded. Instead, you manually load whichever environment you want to use; in practice you will mostly use only one or two environments, rather than many.

Next, we create a conda environment, which we will use predominantly to analyse WRF outputs. We may call this environment wrf_analysis, for example:

conda create --name wrf_analysis
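If you want to pin a particular Python version in the new environment, you can specify it at creation time (the version number here is only an example):

conda create --name wrf_analysis python=3.10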

Once we have created an environment, we can load it anytime by using conda activate:

conda activate wrf_analysis
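When you are finished, you can leave the environment again, and at any time list the environments you have created:

conda deactivate
conda env list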

To install packages into this environment, you simply run conda install. The following packages will cover almost everything you need:

conda install -c conda-forge wrf-python xarray netcdf4 scipy pyngl pynio ncl nco cdo gdal

This should install all of the above packages into your wrf_analysis conda environment and covers almost all software you would use for climate analysis (the dependencies are such that numpy and other widely used Python libraries will be installed too). Note that cdo and ncl are not Python packages but entirely different pieces of software; it does not really matter, conda will happily install everything for us!
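As a quick sanity check after the install, you can try importing a few of the key libraries from within the activated environment (this is just an illustrative check):

python -c "import wrf, xarray, netCDF4, scipy; print(xarray.__version__)"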

Later on, should you wish to install more libraries, you can do so at any time. Note that you should try to install from conda-forge for consistency, rather than from the default anaconda channel, as mixing channels can cause issues.
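For example, to add a plotting library later on (matplotlib here is just an example, substitute whatever you need):

conda install -c conda-forge matplotlib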

To find out which Python libraries are installed, you simply run:

conda list
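To check whether a specific package is present, you can pipe the output through grep, for example:

conda list | grep netcdf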

Installing Python libraries on Magnus

Details TBA soon.

CORDEX Python post-processor

A Python-based post-processor, which processes WRF output files to meet CORDEX requirements, has been developed at UNSW. It is a very handy tool when you need to compute means, maxima, and minima of various variables from WRF output files.

Running the post-processor on epic

The code is a git repository located at:

/group/y98/jatinkala/Python-PostProc/narclim_scripts/NARCliM_postprocess

This was git-cloned by Jatin. If you want to use the post-processor, you should probably get your own "clone". Ask Jatin and he will put you in touch with the UNSW team, who will add you to the project.
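Once you have access, getting your own copy is a standard git clone (the repository URL below is a placeholder; use whatever address the UNSW team gives you):

git clone <repository-URL-from-UNSW>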

There are several important things to understand before using this processor:

  • Go read the README first.
  • The code will look for at least one wrfout_d0* file to get static 2D information from, such as latitude and longitude. So you need to have one of these in your input directory, which is specified in the namelist file.
  • This code uses joblib, a Python library used to parallelise Python scripts. Whilst it works fine on Linux servers which do not have a scheduler (such as PBS), it does not seem to work on a machine such as epic, which is a proper cluster with a scheduler. So, if you get a clean copy of the code, you will need to change "njobs" in "postprocess_modules.py" such that "njobs=1" (it is already changed in my copy of the code; see the sketch just after this list). Otherwise we seem to encounter memory problems. It should be possible to work out how to make joblib work within this code on epic/magnus, but I have not had time to do so.
  • The number of files per year for each wrf{out,dly,xtrm,hrly} is hard-wired in the code, and you need to change it accordingly in postprocess_modules.py. Here is how I changed it for my copy of the code (make sure you understand this):
def get_filefreq(filet):
  """ Method to get information about the type of file
      filet: type of file (wrfhrly, wrfout,wrfxtrm, wrfdly)
      ---
      - period: number of years of the period.
      - n_files: number of files per period. If n_files=-1 then
      it assumes that there is one file per day. Leap years are
      corrected accordingly.
      - time_step: hours between two time steps
      - file_freq: string defining the frequency.
      - tbounds: whether the output file should include time bounds.
  """
  file_info={}

  if filet=='wrfhrly':
    file_info['n_files']=-1
    file_info['time_step']=1 #hours between two time steps
    file_info['file_freq']='01H'
    file_info['tbounds']=False
    file_info['period']=1

  elif filet=='wrfout':
    file_info['n_files']=-1
    file_info['time_step']=1 #hours between two time steps
    file_info['file_freq']='01H'
    file_info['tbounds']=False
    file_info['period']=5

  elif filet=='wrfxtrm':
    file_info['n_files']=60
    file_info['time_step']=24 #hours between two time steps
    file_info['file_freq']='DAY'
    file_info['tbounds']=True
    file_info['period']=5

  elif filet=='wrfdly':
    file_info['n_files']=60
    file_info['time_step']=24 #hours between two time steps
    file_info['file_freq']='DAY'
    file_info['tbounds']=True
    file_info['period']=5

  else:
    # Unrecognised file type (minimal placeholder error handling)
    raise ValueError('Unknown file type: %s' % filet)

  return file_info
  • The code is hard-wired to process/produce 5 and 10 year files. This does not mean that you cannot process fewer years, just that some output files will be created which contain erroneous data. The post-processor will print warnings letting you know which files are erroneous, so this is not really an issue. To keep life simple, maybe process in blocks of 10 years at a time.
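As mentioned in the joblib item above, a quick way to set njobs to 1 is a one-line edit; the sed command below is only a sketch and assumes the assignment in postprocess_modules.py literally reads njobs=<number>:

sed -i 's/njobs=[0-9]*/njobs=1/' postprocess_modules.py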

Here is a sample script which runs the post-processor:

#!/bin/bash
#PBS -W group_list=y98
#PBS -q routequeue
#PBS -l select=1:ncpus=1:mem=23GB:mpiprocs=1
#PBS -l walltime=03:00:00
#PBS -m ae
#PBS -M Jatin.Kala.JK@gmail.com
cd $PBS_O_WORKDIR
export PYTHONPATH=/group/y98/pythonlibs/lib/python:$PYTHONPATH
module unload openmpi/1.6.5
module unload intel/12.1.7
module load gcc/4.4.7
module load python/2.7.5
module load numpy/1.6.2
module load scipy/0.11.0 
python postprocess_NARCliM.py -i Jatin_NARCliM_post_ERA-Int-1981-2010.input >& test_Jatin.out
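Assuming the script above is saved as, say, run_postproc.pbs (the name is arbitrary), it is submitted to the queue in the usual PBS way:

qsub run_postproc.pbs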