Build libraries with Spack¶
Here are some up-to-date instructions on installing a software stack for GEOS-Chem Classic or HEMCO with Spack.
Note
If you will be using GCHP, please see gchp.readthedocs.io for instructions on how to download required libraries with Spack.
Initial Spack setup¶
Install spack to your home directory¶
Spack can be installed with Git, as follows:
$ cd ~
$ git clone https://github.com/spack/spack.git
Initialize Spack¶
To initialize Spack type these commands:
$ export SPACK_ROOT=${HOME}/spack
$ source ${SPACK_ROOT}/share/spack/setup-env.sh
Make sure the default compiler is in compilers.yaml¶
Tell Spack to search for compilers:
$ spack compiler find
You can confirm that the default compiler was found by inspecting the compilers.yaml file with your favorite editor, e.g.:
$ emacs ~/.spack/linux/compilers.yaml
For example, the default compiler on my cloud instance was the GNU Compiler Collection 7.4.0. This collection contains C (gcc), C++ (g++), and Fortran (gfortran) compilers. These are specified in the compilers.yaml file as:
compilers:
- compiler:
    spec: gcc@7.4.0
    paths:
      cc: /usr/bin/gcc-7
      cxx: /usr/bin/g++-7
      f77: /usr/bin/gfortran-7
      fc: /usr/bin/gfortran-7
    flags: {}
    operating_system: ubuntu18.04
    target: x86_64
    modules: []
    environment: {}
    extra_rpaths: []
As you can see, the default compiler executables are located in the /usr/bin folder, where many of the system-supplied executables reside.
Build the GCC 10.2.0 compilers¶
Let’s build a newer compiler version with Spack. In this case we’ll build the GNU Compiler Collection 10.2.0 using the default compilers.
$ spack install gcc@10.2.0 target=x86_64 %gcc@7.4.0
$ spack load gcc@10.2.0
Update compilers.yaml¶
In order for Spack to use this new compiler to build other packages,
the compilers.yaml
file must be updated using these commands:
$ spack load gcc@10.2.0
$ spack compiler find
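After the second spack compiler find, an entry for gcc@10.2.0 should appear in compilers.yaml alongside the original one. As a rough, hypothetical sketch (the install prefix, hash, and OS name are placeholders and will differ on your system), the new entry might look like:

```yaml
- compiler:
    spec: gcc@10.2.0
    paths:
      cc: /home/username/spack/opt/spack/linux-ubuntu18.04-x86_64/gcc-7.4.0/gcc-10.2.0-<hash>/bin/gcc
      cxx: /home/username/spack/opt/spack/linux-ubuntu18.04-x86_64/gcc-7.4.0/gcc-10.2.0-<hash>/bin/g++
      f77: /home/username/spack/opt/spack/linux-ubuntu18.04-x86_64/gcc-7.4.0/gcc-10.2.0-<hash>/bin/gfortran
      fc: /home/username/spack/opt/spack/linux-ubuntu18.04-x86_64/gcc-7.4.0/gcc-10.2.0-<hash>/bin/gfortran
    flags: {}
    operating_system: ubuntu18.04
    target: x86_64
```

Note that the compiler paths now point into the Spack installation tree rather than /usr/bin.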
Install required libraries for GEOS-Chem¶
Now that we have installed the GNU Compiler Collection 10.2.0, we can use it to build the required libraries for GEOS-Chem Classic and HEMCO.
HDF5¶
Now we can start installing libraries. First, let’s install HDF5, which is a dependency of netCDF.
$ spack install hdf5%gcc@10.2.0 target=x86_64 +cxx+fortran+hl+pic+shared+threadsafe
$ spack load hdf5%gcc@10.2.0
The +cxx+fortran+hl+pic+shared+threadsafe string specifies the necessary variants (build options) for HDF5.
netCDF-Fortran and netCDF-C¶
Now that we have installed HDF5, we may proceed to installing netCDF-Fortran (which will install netCDF-C as a dependency).
$ spack install netcdf-fortran%gcc@10.2.0 target=x86_64 ^hdf5+cxx+fortran+hl+pic+shared+threadsafe
$ spack load netcdf-fortran%gcc@10.2.0
$ spack load netcdf-c%gcc@10.2.0
We tell Spack to use the same HDF5 installation that we just built by appending ^hdf5+cxx+fortran+hl+pic+shared+threadsafe to the spack install command. Otherwise, Spack would build a new HDF5 with default options (which is not what we want).
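You can preview this behavior before installing anything: the standard spack spec command prints the dependency tree that a given spec would resolve to, so you can confirm that the existing HDF5 installation will be reused (the output format and details vary by Spack version and system):

```
$ spack spec netcdf-fortran%gcc@10.2.0 ^hdf5+cxx+fortran+hl+pic+shared+threadsafe
```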
ncview¶
Ncview is a convenient viewer for browsing netCDF files. Install it with:
$ spack install ncview%gcc@10.2.0 target=x86_64 ^hdf5+cxx+fortran+hl+pic+shared+threadsafe
$ spack load ncview%gcc@10.2.0
nco (The netCDF Operators)¶
The netCDF operators (nco) are useful programs for manipulating netCDF files and attributes. Install nco with:
$ spack install nco%gcc@10.2.0 target=x86_64 ^hdf5+cxx+fortran+hl+pic+shared+threadsafe
$ spack load nco%gcc@10.2.0
cdo (The Climate Data Operators)¶
The Climate Data Operators (cdo) are utilities for processing data in netCDF files. Install cdo with:
$ spack install cdo%gcc@10.2.0 target=x86_64 ^hdf5+cxx+fortran+hl+pic+shared+threadsafe
$ spack load cdo%gcc@10.2.0
flex¶
The flex library is a lexical analyzer generator. It is a dependency of the Kinetic PreProcessor (KPP).
$ spack install flex%gcc@10.2.0 target=x86_64
$ spack load flex%gcc@10.2.0
gdb and cgdb¶
Gdb is the GNU Debugger. Cgdb is a visual, user-friendly interface for gdb.
$ spack install gdb@9.1%gcc@10.2.0 target=x86_64
$ spack load gdb%gcc@10.2.0
$ spack install cgdb%gcc@10.2.0 target=x86_64
$ spack load cgdb%gcc@10.2.0
cmake and gmake¶
Cmake and gmake are used to build source code into executables.
$ spack install cmake%gcc@10.2.0 target=x86_64
$ spack load cmake%gcc@10.2.0
$ spack install gmake%gcc@10.2.0 target=x86_64
$ spack load gmake%gcc@10.2.0
Installing optional packages¶
These packages are useful but not strictly necessary for GEOS-Chem.
OpenJDK (Java)¶
Some programs might need the openjdk Java Runtime Environment:
$ spack install openjdk%gcc@10.2.0
$ spack load openjdk%gcc@10.2.0
TAU performance profiler¶
The Tuning and Analysis Utilities (tau) let you profile GEOS-Chem and HEMCO in order to locate computational bottlenecks:
$ spack install tau%gcc@10.2.0 +pthread+openmp~otf2
$ spack load tau%gcc@10.2.0
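As a sketch of one common TAU workflow (the executable name gcclassic is used here only for illustration; consult the TAU documentation for the options appropriate to your build), you can run the model under tau_exec and then summarize the resulting profile files with pprof:

```
$ tau_exec ./gcclassic     # run the executable under TAU instrumentation
$ pprof                    # print a text summary of the profile.* files
```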
Loading Spack packages at startup¶
Creating an environment file for Spack¶
Once you have finished installing libraries with Spack, you can create an environment file to load the Spack libraries whenever you start a new Unix shell. Here is a sample environment file that can be used (or modified) to load the Spack libraries described above.
#==============================================================================
# %%%%% Clear existing environment variables %%%%%
#==============================================================================
unset CC
unset CXX
unset EMACS_HOME
unset FC
unset F77
unset F90
unset NETCDF_HOME
unset NETCDF_INCLUDE
unset NETCDF_LIB
unset NETCDF_FORTRAN_HOME
unset NETCDF_FORTRAN_INCLUDE
unset NETCDF_FORTRAN_LIB
unset OMP_NUM_THREADS
unset OMP_STACKSIZE
unset PERL_HOME
#==============================================================================
# %%%%% Load Spack packages %%%%%
#==============================================================================
echo "Loading gfortran 10.2.0 and related libraries ..."
# Initialize Spack
# In the examples above /path/to/spack was ${HOME}/spack
export SPACK_ROOT=/path/to/spack
source $SPACK_ROOT/share/spack/setup-env.sh
# List each Spack package that you want to load
# (add the backslash after each new package that you add)
pkgs=(                          \
  gcc@10.2.0                    \
  cmake%gcc@10.2.0              \
  openmpi%gcc@10.2.0            \
  netcdf-fortran%gcc@10.2.0     \
  netcdf-c%gcc@10.2.0           \
  hdf5%gcc@10.2.0               \
  gdb%gcc@10.2.0                \
  flex%gcc@10.2.0               \
  openjdk%gcc@10.2.0            \
  cdo%gcc@10.2.0                \
  nco%gcc@10.2.0                \
  ncview%gcc@10.2.0             \
  perl@5.30.3%gcc@10.2.0        \
  tau%gcc@10.2.0                \
)
# Load each Spack package
for f in "${pkgs[@]}"; do
    echo "Loading $f"
    spack load $f
done
#==============================================================================
# %%%%% Settings for OpenMP parallelization %%%%%
#==============================================================================
# Max out the stack memory for OpenMP
# Asking for a huge number will just give you the max available
export OMP_STACKSIZE=500m
# By default, set the number of threads for OpenMP parallelization to 1
export OMP_NUM_THREADS=1
# Redefine the number of threads for OpenMP parallelization:
# (a) If in a SLURM partition, set OMP_NUM_THREADS = SLURM_CPUS_PER_TASK
# (b) Or, set OMP_NUM_THREADS to the optional first argument that is passed
if [[ -n "${SLURM_CPUS_PER_TASK+1}" ]]; then
    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
elif [[ "$#" -eq 1 ]]; then
    if [[ "x$1" != "xignoreeof" ]]; then
        export OMP_NUM_THREADS=${1}
    fi
fi
echo "Number of OpenMP threads: $OMP_NUM_THREADS"
#==============================================================================
# %%%%% Define relevant environment variables %%%%%
#==============================================================================
# Compiler environment variables
export FC=gfortran
export F90=gfortran
export F77=gfortran
export CC=gcc
export CXX=g++
# Machine architecture
export ARCH=`uname -s`
# netCDF paths
export NETCDF_HOME=`spack location -i netcdf-c%gcc@10.2.0`
export NETCDF_INCLUDE=${NETCDF_HOME}/include
export NETCDF_LIB=${NETCDF_HOME}/lib
# netCDF-Fortran paths
export NETCDF_FORTRAN_HOME=`spack location -i netcdf-fortran%gcc@10.2.0`
export NETCDF_FORTRAN_INCLUDE=${NETCDF_FORTRAN_HOME}/include
export NETCDF_FORTRAN_LIB=${NETCDF_FORTRAN_HOME}/lib
# Other important paths
export GCC_HOME=`spack location -i gcc@10.2.0`
export MPI_HOME=`spack location -i openmpi%gcc@10.2.0`
export TAU_HOME=`spack location -i tau%gcc@10.2.0`
#==============================================================================
# %%%%% Echo relevant environment variables %%%%%
#==============================================================================
echo
echo "Important environment variables:"
echo "CC (C compiler) : $CC"
echo "CXX (C++ compiler) : $CXX"
echo "FC (Fortran compiler) : $FC"
echo "NETCDF_HOME : $NETCDF_HOME"
echo "NETCDF_INCLUDE : $NETCDF_INCLUDE"
echo "NETCDF_LIB : $NETCDF_LIB"
echo "NETCDF_FORTRAN_HOME : $NETCDF_FORTRAN_HOME"
echo "NETCDF_FORTRAN_INCLUDE : $NETCDF_FORTRAN_INCLUDE"
echo "NETCDF_FORTRAN_LIB : $NETCDF_FORTRAN_LIB"
Save this script to your home folder with a name such as ~/.spack.env. The . in front of the name makes it a hidden file, like your .bashrc or .bash_aliases.
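As a hypothetical example of why the exported netCDF variables are useful, a small Fortran test program (here called test_nc.f90, a name chosen purely for illustration) could be compiled against the Spack-built netCDF libraries like so:

```
$ gfortran test_nc.f90 -o test_nc          \
    -I${NETCDF_FORTRAN_INCLUDE}            \
    -L${NETCDF_FORTRAN_LIB} -lnetcdff      \
    -L${NETCDF_LIB} -lnetcdf
```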
Loading Spack-built libraries¶
Whenever you start a new Unix session (either by opening a terminal
window or running a new job), your .bashrc
and
.bash_aliases
files will be sourced, and the commands
contained within them applied. You should then load the Spack
modules by typing at the terminal prompt:
$ source ~/.spack.env
You can also add some code to your .bash_aliases so that this will be done automatically:
if [[ -f ~/.spack.env ]]; then
    source ~/.spack.env
fi
In either case, this will load the packages for you. You should see output similar to:
Loading gfortran 10.2.0 and related libraries ...
Loading gcc@10.2.0
Loading cmake%gcc@10.2.0
Loading openmpi%gcc@10.2.0
Loading netcdf-fortran%gcc@10.2.0
Loading netcdf-c%gcc@10.2.0
Loading hdf5%gcc@10.2.0
Loading gdb%gcc@10.2.0
Loading flex%gcc@10.2.0
Loading openjdk%gcc@10.2.0
Loading cdo%gcc@10.2.0
Loading nco%gcc@10.2.0
Loading ncview%gcc@10.2.0
Loading perl@5.30.3%gcc@10.2.0
Loading tau%gcc@10.2.0
Number of OpenMP threads: 1
Important environment variables:
CC (C compiler) : gcc
CXX (C++ compiler) : g++
FC (Fortran compiler) : gfortran
NETCDF_HOME : /net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/spack/opt/spack/linux-centos7-x86_64/gcc-10.2.0/netcdf-c-4.7.4-22bkbtqledcaipqc2zrgun4qes7kkm5q
NETCDF_INCLUDE : /net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/spack/opt/spack/linux-centos7-x86_64/gcc-10.2.0/netcdf-c-4.7.4-22bkbtqledcaipqc2zrgun4qes7kkm5q/include
NETCDF_LIB : /net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/spack/opt/spack/linux-centos7-x86_64/gcc-10.2.0/netcdf-c-4.7.4-22bkbtqledcaipqc2zrgun4qes7kkm5q/lib
NETCDF_FORTRAN_HOME : /net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/spack/opt/spack/linux-centos7-x86_64/gcc-10.2.0/netcdf-fortran-4.5.3-mtuoejjcl3ozbvd6prgqm44k5jre3hne
NETCDF_FORTRAN_INCLUDE : /net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/spack/opt/spack/linux-centos7-x86_64/gcc-10.2.0/netcdf-fortran-4.5.3-mtuoejjcl3ozbvd6prgqm44k5jre3hne/include
NETCDF_FORTRAN_LIB : /net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/spack/opt/spack/linux-centos7-x86_64/gcc-10.2.0/netcdf-fortran-4.5.3-mtuoejjcl3ozbvd6prgqm44k5jre3hne/lib
Once you see this output, you can then start using programs that rely on these Spack-built libraries.
Setting the number of cores for OpenMP¶
If you type:
$ source ~/.spack.env
by itself, OMP_NUM_THREADS will be set to 1. This variable sets the number of computational cores (threads) that OpenMP should use. You can change this by passing the desired number of cores as an argument, e.g.:
$ source ~/.spack.env 6
which will set OMP_NUM_THREADS to 6. In this case, GEOS-Chem Classic (and other programs that use OpenMP parallelization) will parallelize with 6 cores.
If you are using the SLURM scheduler and you source .spack.env in your job script, then OMP_NUM_THREADS will automatically be set to SLURM_CPUS_PER_TASK, which is the number of cores that you requested from SLURM. If you are not using SLURM, you should instead add e.g.
export OMP_NUM_THREADS=6
(or however many cores you wish to use) to your run script.
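The thread-selection behavior described in this section can be sketched as a standalone shell function, assuming the same precedence rules as the environment file above (SLURM_CPUS_PER_TASK wins, then the optional argument, then a default of 1). This is an illustration only, not part of the distributed script:

```shell
#!/bin/bash
# Sketch of the OMP_NUM_THREADS selection logic from .spack.env
set_omp_threads() {
    OMP_NUM_THREADS=1                                  # default: 1 thread
    if [[ -n "${SLURM_CPUS_PER_TASK+1}" ]]; then
        OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}         # SLURM allocation wins
    elif [[ $# -eq 1 && "$1" != "ignoreeof" ]]; then
        OMP_NUM_THREADS=$1                             # optional argument
    fi
    export OMP_NUM_THREADS
}

# Example: no SLURM variable set, pass 6 as the argument
unset SLURM_CPUS_PER_TASK
set_omp_threads 6
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"    # prints OMP_NUM_THREADS=6
```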