The GEOS-Chem Input Data portal

The main GEOS-Chem Input Data portal is hosted at the AWS S3 bucket s3://geos-chem. From here you may download the data required to run GEOS-Chem Classic, GCHP, or HEMCO standalone simulations.

Data organization

The GEOS-Chem Input Data portal is structured into the following categories:

  1. Initial conditions input data (aka Restart files)

  2. Chemistry input data

  3. Emissions input data

  4. Meteorology input data

Initial conditions input data

Initial conditions include initial species concentrations (aka Restart files) used to start a GEOS-Chem simulation.

Chemistry input data

Chemistry input data includes:

  • Tables of aerosol optical properties

  • Quantum yields and cross sections for photolysis using either Cloud-J or legacy FAST-JX

  • Climatology data for Linoz stratospheric ozone chemistry

  • Boundary conditions for UCX stratospheric chemistry routines

Emissions input data

Emissions input data includes:

  • Emissions inventories

  • Input data for HEMCO Extensions

  • Input data for GEOS-Chem specialty simulations

  • Scale factors

  • Mask definitions

  • Surface boundary conditions

  • Leaf area indices

  • Land cover map

Meteorology input data

GEOS-Chem Classic can be driven by the following meteorology products:

  1. MERRA-2

  2. GEOS-FP

  3. GEOS-IT

  4. GCAP 2.0 (available at the atmos.earth.rochester.edu data portal)

Attention

We are still evaluating GEOS-Chem with the new NASA GEOS-IT meteorology product. For the time being, you should use one of the other meteorology options.

Data access

You may access the GEOS-Chem Input Data portal in several ways, as described below.

AWS S3 Explorer

You can browse the contents of the GEOS-Chem Input Data portal with the AWS S3 Explorer interface. Simply point your web browser to the following link:

This is an easy way for you to familiarize yourself with the directory structure. Before downloading large amounts of data, we recommend that you use the AWS S3 Explorer to find the path to the relevant data directories.

AWS CLI (command-line interface)

You can also use the AWS command-line interface (aka AWS CLI) to browse and download data from the GEOS-Chem Input Data portal. For example, use this command to get a data listing:

$ aws s3 ls s3://geos-chem/   # Get a directory listing

For detailed instructions about using AWS CLI, please see: Tutorial: Accessing GEOS-Chem Input Data using AWS CLI.
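If you prefer to script these calls, the following is a minimal sketch (not an official GEOS-Chem tool) that assembles the argument list for an `aws s3 ls` or `aws s3 sync` command against the s3://geos-chem bucket. The helper name `build_aws_command` is hypothetical, the `ExtData/...` sub-path simply follows the example directory tree on this page, and the use of `--no-sign-request` assumes that the bucket allows anonymous access. Please verify these against the tutorial before relying on them.

```python
import shlex

BUCKET = "s3://geos-chem"  # bucket name given in the text above


def build_aws_command(action, subpath="", dest=None, anonymous=True, dry_run=False):
    """Return the argv list for an `aws s3 ls` or `aws s3 sync` call.

    Hypothetical helper for illustration; it only builds the command.
    """
    if action not in ("ls", "sync"):
        raise ValueError("action must be 'ls' or 'sync'")
    source = f"{BUCKET}/{subpath.strip('/')}/" if subpath else f"{BUCKET}/"
    cmd = ["aws", "s3", action]
    if anonymous:
        # Assumption: the bucket accepts unsigned (anonymous) requests.
        cmd.append("--no-sign-request")
    if action == "sync":
        if dest is None:
            raise ValueError("sync needs a local destination directory")
        if dry_run:
            # --dryrun makes `aws s3 sync` list what it would copy.
            cmd.append("--dryrun")
        cmd += [source, dest]
    else:
        cmd.append(source)
    return cmd


if __name__ == "__main__":
    # Print the commands rather than running them.
    print(shlex.join(build_aws_command("ls")))
    print(shlex.join(build_aws_command(
        "sync", "ExtData/CHEM_INPUTS/CLOUD-J", "./ExtData/CHEM_INPUTS/CLOUD-J",
        dry_run=True)))
```

To execute a command, pass the returned list to `subprocess.run(cmd, check=True)`. Keeping `dry_run=True` for the first attempt lets you confirm the file list before a large transfer starts.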

HTTP or wget download

You can also access the GEOS-Chem Input Data portal via the alternate web link http://geoschemdata.wustl.edu.

As with the AWS S3 Explorer, you can navigate through the web interface to find the data sets that you wish to download. You can then use the wget command to download the data.
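As an alternative to wget for single files, the sketch below uses only the Python standard library to fetch one file while mirroring the portal's directory layout on disk. The helper names `portal_url` and `download` are hypothetical, and the sketch assumes that the paths you found in the web interface are relative to the site root (for example, paths starting with `ExtData/`).

```python
import os
import urllib.request
from urllib.parse import quote

PORTAL = "http://geoschemdata.wustl.edu"  # web link given in the text above


def portal_url(relative_path):
    """Map a path such as 'ExtData/HEMCO/...' to its URL on the portal."""
    # quote() leaves '/' intact and escapes characters such as spaces.
    return f"{PORTAL}/{quote(relative_path.lstrip('/'))}"


def download(relative_path, local_root="."):
    """Fetch one file and store it under local_root with the same sub-path."""
    target = os.path.join(local_root, relative_path.lstrip("/"))
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    urllib.request.urlretrieve(portal_url(relative_path), target)
    return target
```

Call `download()` with a file path that you copied from the web interface. For whole directories, wget's recursive mode is likely the more convenient choice.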

Dry-run simulation (GEOS-Chem Classic and HEMCO standalone only)

If you plan to run a GEOS-Chem Classic or HEMCO standalone simulation, we recommend first performing a dry-run simulation. The dry-run simulation workflow is as follows:

  1. Configure your GEOS-Chem Classic or HEMCO standalone simulation.

  2. Run GEOS-Chem Classic or HEMCO standalone with the --dryrun flag. This will generate a list of required data files.

  3. Pass this list to a Python script, which will download the data to your computer system or AWS EC2 instance.
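To make steps 2 and 3 more concrete, here is a minimal sketch of how a list of missing files could be extracted from dry-run output. It is not the download script that ships with GEOS-Chem. The marker text `FILE NOT FOUND`, the sample log lines, and the function name `missing_files` are all assumptions for illustration; inspect your own dry-run log and adjust the marker accordingly.

```python
import re


def missing_files(log_lines, marker="FILE NOT FOUND"):
    """Return unique paths that follow `marker`, in first-seen order.

    The marker string is an assumption about the log format.
    """
    pattern = re.compile(re.escape(marker) + r"\s*(\S.*)$", re.IGNORECASE)
    found = []
    for line in log_lines:
        match = pattern.search(line.rstrip("\n"))
        if match is None:
            continue
        path = match.group(1).strip()
        if path not in found:
            found.append(path)
    return found


if __name__ == "__main__":
    # Hypothetical log excerpt, not copied from a real run.
    sample = [
        "HEMCO: Opening ./HEMCO_Config.rc",
        "HEMCO: REQUIRED FILE NOT FOUND /data/ExtData/HEMCO/example_a.nc",
        "HEMCO: REQUIRED FILE NOT FOUND /data/ExtData/HEMCO/example_a.nc",
    ]
    for path in missing_files(sample):
        print(path)
```

Removing duplicates matters because the same input file may be requested more than once during a run.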

For more information, please see the following links:

Globus

Many institutions use the Globus file transfer utility, which typically transfers data much faster than ordinary SSH or HTTP connections.

If your institution uses Globus, you can download data from the GEOS-Chem Data (WashU) endpoint to your computer system.

Bashdatacatalog

We have created the bashdatacatalog tool to facilitate downloading large amounts of data from the GEOS-Chem Input Data portal. Please see our Manage a data archive with bashdatacatalog guide for usage instructions.

Example directory structure

The directory structure of the GEOS-Chem Input Data portal adheres to the format listed below. You can easily browse through the portal using one of the following web links:

ExtData/
│
├── GEOSCHEM-RESTARTS/
│   ├── GC_14.2.0/
│   ├── GC_14.3.0/
│   └── ...
│
├── CHEM_INPUTS/
│   ├── CLOUD-J/
│   ├── FAST-JX/
│   └── ...
│
├── HEMCO/
│   ├── UVALBEDO/
│   └── ...
│
├── GEOS_0.5x0.625/
│   ├── MERRA2/
│   │   ├── 2023/
│   │   ├── 2024/
│   │   └── ...
│   └── ...
│
├── GEOS_0.25x0.3125/
│   ├── GEOS_FP/
│   │   ├── 2023/
│   │   ├── 2024/
│   │   └── ...
│   ├── GEOS_FP_Raw/
│   └── ...
│
└── ...
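The layout above lends itself to building meteorology paths programmatically. The sketch below covers only the two product directories shown in the tree; the function name `met_year_dir` and the lookup table are hypothetical, and any deeper levels (elided as `...` above) are deliberately left out.

```python
from pathlib import PurePosixPath

# Resolution directory for each product, as shown in the tree above.
MET_RESOLUTION_DIR = {
    "MERRA2": "GEOS_0.5x0.625",
    "GEOS_FP": "GEOS_0.25x0.3125",
}


def met_year_dir(product, year, root="ExtData"):
    """Return root/<resolution>/<product>/<year> for a listed product."""
    try:
        res_dir = MET_RESOLUTION_DIR[product]
    except KeyError:
        raise ValueError(f"no directory known for product {product!r}") from None
    return PurePosixPath(root) / res_dir / product / str(year)


if __name__ == "__main__":
    print(met_year_dir("MERRA2", 2023))
    print(met_year_dir("GEOS_FP", 2024))
```

Using `PurePosixPath` keeps forward slashes on every platform, so the result can be appended to either a local data root or a portal address.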