How do I implement the Hybrid Single Particle Lagrangian Integrated Trajectory Model (HYSPLIT) for parallel processing on a computing cluster?

Introduction

HYSPLIT is not open-source software. It is produced by the National Oceanic and Atmospheric Administration (NOAA). The Unix version of the HYSPLIT model is available from NOAA to researchers affiliated with NOAA research projects.

The figure below is derived from the output of the HYSPLIT application. It shows the forecasted concentrations of SO2 on the Big Island of Hawaii. (source: Prof. Businger’s Vog Measurement and Prediction (VMAP) Project)

The project plan is as follows:

  1. Download and install the source code for HYSPLIT and the libraries on which it depends;
  2. Implement HYSPLIT as a serial application;
    1. Compile FORTRAN source code;
    2. Install in appropriate directories;
    3. Verify serial implementation using test data provided.
  3. Implement HYSPLIT as a parallel application using MPI.
    1. Recompile FORTRAN source code for MPI.
    2. Install in appropriate directories.
    3. Construct PBS script for parallel runs on fractal.
    4. Test and refine PBS script.
    5. Demonstrate successful parallel runs.
  4. Compute speedup of parallelization for 1, 2, 4, and 8 processors.
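
For step 4, speedup and parallel efficiency follow directly from wall-clock timings: S(p) = T(1)/T(p) and E(p) = S(p)/p. The sketch below uses placeholder timings, not measured results, just to show the computation:

```shell
# Speedup S(p) = T(1)/T(p); efficiency E(p) = S(p)/p.
# The timings below are hypothetical placeholders, not measurements.
t1=800   # serial wall-clock seconds (illustrative)
for p_tp in "1 800" "2 420" "4 230" "8 130"; do
  set -- $p_tp
  awk -v t1="$t1" -v p="$1" -v tp="$2" \
    'BEGIN { printf "p=%d  speedup=%.2f  efficiency=%.2f\n", p, t1/tp, t1/(tp*p) }'
done
```

Efficiency below 1.0 indicates parallel overhead; it typically falls as processor count grows.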

Dependencies

The HYSPLIT application depends on three major libraries:

  1. netCDF — a set of libraries that support machine-independent data formats for scientific computation, maintained and distributed by Unidata, a program of the University Center for Atmospheric Research, of which UH is a member;
  2. IOAPI — an input/output API used on other NOAA projects, currently maintained by Baron Advanced Meteorological Systems, distributed under the GNU Lesser General Public License 2.1; and
  3. GRIB2 — libraries to access “gridded” data in the World Meteorological Organization (WMO) GRIB2 format.

These in turn have additional dependencies. One of the challenges is to determine the dependency tree for HYSPLIT and successfully install all of these libraries.

Especially significant among these additional dependencies is HDF5, which supports version 5 of the Hierarchical Data Format, another well-known standard for storing and exchanging scientific data.

Building the dependent libraries was challenging, especially HDF5 and netCDF. Errors in building the parallel version of the HDF5 library were not resolved as of the last attempt, so the serial version of HDF5 was built instead. Building netCDF also took many attempts; the final successful build was achieved by skipping the tests for a remote data-access feature called OPeNDAP.

I chose to build the dependent libraries myself rather than install binaries so that I would become more familiar with them. I suspect that some of the problems people encounter are quite subtle and stem from incompatibilities or missing features among the dependent libraries; building from source helped me learn the options and features of each library.

Building HYSPLIT was also challenging. The MPI-based programs did not build by default: the exec/Makefile had to be edited to specify where the MPI libraries, compilers, and run commands were located, and make rules for the MPI-based programs had to be inserted. Certain Fortran modules also had to be located and copied into the main source directory.

Setting Initial Parameters

Initially, running HYSPLIT with multiple processors produced a surprising result: no speedup at all. In fact, runs took the same amount of time, or slightly longer, as processors were added.

The cause was the SO2 release rate specified in the input files. It was so low that only one particle was released per minute. The parallel algorithm distributes each minute's particles across the MPI instances of the program; with only one particle per minute, every instance was assigned the same particle and did identical work.
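
The effect can be seen with a little arithmetic. The numbers below are illustrative, assuming the particles released each minute are divided among the MPI ranks as described above:

```shell
# With n particles released per minute and p MPI ranks, each rank gets
# ceil(n/p) particles. The values of n here are illustrative only.
p=8
n=1
echo "per-rank: $(( (n + p - 1) / p ))"   # 1: every rank repeats the same particle
n=2400
echo "per-rank: $(( (n + p - 1) / p ))"   # 300: distinct work per rank, 8x less each
```

Raising the release rate (or particle count) gives each rank distinct work, which is what makes the parallel runs worthwhile.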

Build Journal

The libraries were built "bottom up" with respect to the dependency tree: each library was built before anything that depends on it.

curl library

./configure --prefix=/home/mpiuser002/project

make

make install

No apparent problems.

zlib library

./configure --prefix=/home/mpiuser002/project

make test
make install prefix=/home/mpiuser002/project

No apparent problems.

szip (szlib) library

./configure --prefix=/home/mpiuser002/project

make
make check
make install

No apparent problems.

hdf5 library

CC=mpicc ./configure --prefix=/share/huina/mlgonsal/project \
  --enable-fortran --with-szlib=/share/huina/mlgonsal/project \
  && make && make check && make install && make check-install

fractal

From Albert

Please try:

CC=/usr/lib64/openmpi/bin/mpicc

./configure --enable-parallel | tee configure-prl.log

—————-

CC=mpicc \
./configure --prefix=/home/mpiuser002/project \
  --enable-fortran \
  --with-szlib=/home/mpiuser002/project

(better results when shared libraries are turned off)

make
make check # run test suite.
make install
make check-install # verify installation.

bin/deploy NEW_DIR

RUNPARALLEL = /usr/lib64/openmpi/bin/mpiexec -n $${NPROCS:=6}

*** Unresolved errors in the MPI functionality tests. ***

===================================
MPI functionality tests
===================================
——————————–
Proc 0: *** MPIO 1 write Many read test…
——————————–
Proc 0: hostname=fractal
Proc 0: MPI_File_open failed (MPI_ERR_OTHER: known error not in list)
***FAILED with 6 total errors
——————————–
Proc 0: *** MPIO File size range test…
——————————–
MPI_Offset is signed 8 bytes integeral type
Proc 0: *** Parallel ERROR ***
VRFY (MPI_FILE_OPEN) failed at line 296 in t_mpi.c
aborting MPI processes
Proc 1: hostname=fractal
Proc 1: MPI_File_open failed (MPI_ERR_OTHER: known error not in list)
Proc 1: *** Parallel ERROR ***
VRFY (MPI_FILE_OPEN) failed at line 296 in t_mpi.c
aborting MPI processes
Proc 2: hostname=fractal
Proc 2: MPI_File_open failed (MPI_ERR_OTHER: known error not in list)
Proc 2: *** Parallel ERROR ***
VRFY (MPI_FILE_OPEN) failed at line 296 in t_mpi.c
aborting MPI processes
Proc 3: hostname=fractal
Proc 3: MPI_File_open failed (MPI_ERR_OTHER: known error not in list)
Proc 3: *** Parallel ERROR ***
VRFY (MPI_FILE_OPEN) failed at line 296 in t_mpi.c
aborting MPI processes
Proc 4: hostname=fractal
Proc 4: MPI_File_open failed (MPI_ERR_OTHER: known error not in list)
Proc 4: *** Parallel ERROR ***
VRFY (MPI_FILE_OPEN) failed at line 296 in t_mpi.c
aborting MPI processes
Proc 5: hostname=fractal
Proc 5: MPI_File_open failed (MPI_ERR_OTHER: known error not in list)
Proc 5: *** Parallel ERROR ***
VRFY (MPI_FILE_OPEN) failed at line 296 in t_mpi.c
aborting MPI processes
————————————————————————–
mpiexec has exited due to process rank 4 with PID 18059 on
node fractal exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
————————————————————————–
5 more processes have sent help message help-mpi-api.txt / mpi-abort
[fractal:18054] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
0.05user 0.11system 0:01.17elapsed 15%CPU (0avgtext+0avgdata 22304maxresident)k
0inputs+3536outputs (29major+11611minor)pagefaults 0swaps
make[4]: *** [t_mpi.chkexe_] Error 1
make[4]: Leaving directory `/home/mpiuser002/project/src/hdf5-1.8.9/testpar’
make[3]: *** [build-check-p] Error 1
make[3]: Leaving directory `/home/mpiuser002/project/src/hdf5-1.8.9/testpar’
make[2]: *** [test] Error 2
make[2]: Leaving directory `/home/mpiuser002/project/src/hdf5-1.8.9/testpar’
make[1]: *** [check-am] Error 2
make[1]: Leaving directory `/home/mpiuser002/project/src/hdf5-1.8.9/testpar’
make: *** [check-recursive] Error 1

Forgetting parallel, using serial IO options:

./configure --prefix=/home/mpiuser002/project --enable-fortran --with-szlib=/home/mpiuser002/project

make
make check # run test suite.
make install
make check-install # verify installation.

SUCCESS!

netCDF library

Note that for shared libraries, you may need to add the install directory to the LD_LIBRARY_PATH environment variable. See the netCDF FAQ for more details on using shared libraries.
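
The note above can be applied concretely. A minimal sketch using the install prefix from this journal (the `${...:+...}` expansion avoids a trailing colon when the variable starts out empty):

```shell
# Prepend the project's lib directory so the dynamic linker finds the
# freshly built shared libraries. Path matches the prefix used throughout.
PREFIX=/home/mpiuser002/project
export LD_LIBRARY_PATH="${PREFIX}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
echo "$LD_LIBRARY_PATH"
```

Adding this to the shell profile makes it persist across login sessions.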

After HDF5 is done, build netcdf, specifying the location of the HDF5, zlib, and (if built into HDF5) the szip header files and libraries in the CPPFLAGS and LDFLAGS environment variables. For example:

CPPFLAGS=-I/home/ed/local/include LDFLAGS=-L/home/ed/local/lib ./configure --prefix=/home/ed/local
make check install

CPPFLAGS=-I/home/mpiuser002/project/include LDFLAGS=-L/home/mpiuser002/project/lib
./configure --prefix=/home/mpiuser002/project \
  --enable-parallel \
  --enable-pnetcdf

CPPFLAGS=-I/home/mpiuser002/project/include LDFLAGS=-L/home/mpiuser002/project/lib
./configure --prefix=/home/mpiuser002/project

make check install

Build of non-parallel version failed 1 of 8 tests:

Cannot locate test server
FAIL: test_partvar

Building from serial HDF5:

CPPFLAGS=-I/home/mpiuser002/project/include LDFLAGS=-L/home/mpiuser002/project/lib
./configure --prefix=/home/mpiuser002/project

make

libtool: compile: gcc -DHAVE_CONFIG_H -I. -I.. -I../include -I../oc -g -O2 -MT libnetcdf4_la-nc4file.lo -MD -MP -MF .deps/libnetcdf4_la-nc4file.Tpo -c nc4file.c -fPIC -DPIC -o .libs/libnetcdf4_la-nc4file.o
nc4file.c: In function ‘nc4_create_file’:
nc4file.c:290: error: ‘H5F_LIBVER_18’ undeclared (first use in this function)
nc4file.c:290: error: (Each undeclared identifier is reported only once
nc4file.c:290: error: for each function it appears in.)
make[2]: *** [libnetcdf4_la-nc4file.lo] Error 1
make[2]: Leaving directory `/home/mpiuser002/project/src/netcdf-4.2.1.1/libsrc4'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/mpiuser002/project/src/netcdf-4.2.1.1'
make: *** [all] Error 2

The compile error above most likely occurred because CPPFLAGS and LDFLAGS were assigned on a separate line but never exported, so ./configure did not see them and picked up an older system HDF5 that lacks H5F_LIBVER_18. Exporting both variables fixed the compile:

CPPFLAGS=-I/home/mpiuser002/project/include LDFLAGS=-L/home/mpiuser002/project/lib
export LDFLAGS CPPFLAGS
./configure --prefix=/home/mpiuser002/project

make

make check

PASS: test_vara
Cannot locate test server
FAIL: test_partvar
*** Test: varm on URL: http://motherlode.ucar.edu/thredds/dodsC/testdods/coads_climatology.nc
*** Testing: stride case 1
*** Pass: stride case 1
*** Testing: stride case 2
*** Pass: stride case 2
*** Testing: stride case 3
*** Pass: stride case 3
PASS: test_varm3
*** Test: var conversions on URL: file:///home/mpiuser002/project/src/netcdf-4.2.1.1/ncdap_test/testdata3/test.02
*** testing: ch_data
*** testing: int8_data
*** testing: uint8_data
*** testing: int8toint32_data
*** testing: int82float32_data
*** testing: int16_data
*** testing: int16toint32_data
*** testing: int162float32_data
*** testing: int32_data
*** testing: int32tofloat32_data
*** testing: int32toilong_data
*** testing: float32_data
*** testing: float64_data

PASS: t_dap3a

1 of 8 tests failed

Please report to support-netcdf@unidata.ucar.edu

make[4]: *** [check-TESTS] Error 1
make[4]: Leaving directory `/home/mpiuser002/project/src/netcdf-4.2.1.1/ncdap_test'
make[3]: *** [check-am] Error 2
make[3]: Leaving directory `/home/mpiuser002/project/src/netcdf-4.2.1.1/ncdap_test'
make[2]: *** [check-recursive] Error 1
make[2]: Leaving directory `/home/mpiuser002/project/src/netcdf-4.2.1.1/ncdap_test'
make[1]: *** [check] Error 2
make[1]: Leaving directory `/home/mpiuser002/project/src/netcdf-4.2.1.1/ncdap_test'
make: *** [check-recursive] Error 1

Bypassing the OPeNDAP remote-test check:

CPPFLAGS=-I/home/mpiuser002/project/include LDFLAGS=-L/home/mpiuser002/project/lib
export LDFLAGS CPPFLAGS
./configure --prefix=/home/mpiuser002/project \
  --disable-dap-remote-tests

make

make check

SUCCESS!

netCDF-fortran

DIR1=/home/mpiuser002/project
export DIR1
export LD_LIBRARY_PATH=${DIR1}/lib:${LD_LIBRARY_PATH}
CPPFLAGS=-I${DIR1}/include LDFLAGS=-L${DIR1}/lib ./configure --prefix=${DIR1}
make check
make install

SUCCESS!

PVM

export PVM_ROOT=$HOME/project/src/pvm3

make

SUCCESS!

IOAPI

Build netCDF before this.

export BIN=Linux2_x86_64gfort

BASEDIR = ${HOME}/project

IODIR = ${BASEDIR}/src/ioapi

# OBJDIR = ${IODIR}/../lib
# OBJDIR = ${IODIR}/../${BIN}
OBJDIR = ${BASEDIR}/${BIN}

#INSTDIR = ${INSTALL}/${BIN}
INSTDIR = ${BASEDIR}/lib

FIXDIR = ${IODIR}/fixed_src

SUCCESS! (after building PVM)

HYSPLIT

#———————————————————

# Master Makefile for the ../cmaq/mcip2arl directory

# Last Revised: 21 Nov 2005

# 21 Dec 2005 – put hysplit lib at end

# 09 Aug 2007 – support for gfortran

# 15 Jul 2010 – directory mod for repository

# 03 Nov 2010 – added conc2cdf (no IOAPI)

#———————————————————

SHELL = /bin/sh

SRC = .

EXE = ../exec

# LIBRARIES

LIB = ../library

# MLG – assuming API is location of IOAPI libraries

#API = /usr/local/lib

API = /home/mpiuser002/project/Linux2_x86_64gfort

#INC = /usr/local/include

INC = /home/mpiuser002/project/include

#CDF = /usr/local/lib

CDF = /home/mpiuser002/project/lib

[mpiuser002@fractal hysplit]$ bash update.sh

Repository location (local, server):

server

Enter command (export, checkout, log, version):

checkout

svn: OPTIONS of ‘https://svn.arl.noaa.gov:8443/svn/hysplit’: could not connect to server (https://svn.arl.noaa.gov:8443)

svn: OPTIONS of ‘https://svn.arl.noaa.gov:8443/svn/hysplit’: could not connect to server (https://svn.arl.noaa.gov:8443)

mv: cannot stat `version1.inc’: No such file or directory

mv: cannot stat `version2.inc’: No such file or directory

Edit makefiles to add “-lnetcdff” wherever “-lnetcdf” is used.
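
Rather than editing each makefile by hand, the same substitution can be scripted. A sketch using GNU sed (the actual makefile names vary by directory, so the echo below just demonstrates the transformation on a sample link line):

```shell
# Insert -lnetcdff before every standalone -lnetcdf link flag.
# GNU sed assumed; the \b word boundary keeps an existing -lnetcdff intact.
echo 'LIBS = -lioapi -lnetcdf -lm' \
  | sed 's/-lnetcdf\b/-lnetcdff -lnetcdf/g'
# prints: LIBS = -lioapi -lnetcdff -lnetcdf -lm
```

Applied in place, the same expression works as `sed -i '...' Makefile` on each makefile that links netCDF. The Fortran bindings live in libnetcdff, so it must precede libnetcdf in the link order.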

[mpiuser002@fractal trunk]$ ./compile.sh

./compile.sh: line 35: dos2unix: command not found
(the same "dos2unix: command not found" error repeated 26 more times, for compile.sh lines 37 through 73)
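
Since dos2unix was not installed on fractal, an equivalent line-ending fix (if it were ever needed) is a sed one-liner. The file here is a throwaway sample, not a HYSPLIT script:

```shell
# dos2unix stand-in: strip trailing carriage returns (CRLF -> LF) in place.
# GNU sed assumed. The sample file is created just for demonstration.
printf 'set -e\r\nmake all\r\n' > compile-sample.sh
sed -i 's/\r$//' compile-sample.sh
cat compile-sample.sh
rm -f compile-sample.sh
```

In this build the missing dos2unix turned out to be harmless, since compilation continued successfully.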

Building fcsubs library

Building hysplit source library

Building exec directory programs

Building the ascii2shp converter

Building the dbf editor

Building the WRF-ARW decoder (requires NetCDF)

Building the CMAQ converters (requires NetCDF & IOAPI)

[mpiuser002@fractal trunk]$

SUCCESS!
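
With the build complete, plan step 3.3 calls for a PBS script for parallel runs on fractal. The sketch below is a configuration template, not a tested script from this project: the job name, node/walltime limits, and the MPI executable name (hymodelm here) are placeholders to adjust for the local HYSPLIT installation; the mpirun path matches the OpenMPI location used earlier on fractal.

```shell
#!/bin/sh
# PBS job sketch for a parallel HYSPLIT run on fractal.
# Resource limits and the executable name below are placeholders.
#PBS -N hysplit-par
#PBS -l nodes=1:ppn=8
#PBS -l walltime=01:00:00
#PBS -j oe

# Run from the directory the job was submitted from (where CONTROL,
# SETUP.CFG, and the meteorology files live).
cd $PBS_O_WORKDIR
/usr/lib64/openmpi/bin/mpirun -np 8 ./hymodelm
```

Submitting with `qsub` and varying `ppn` and `-np` over 1, 2, 4, and 8 produces the timings needed for the speedup computation in step 4.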
