GISS GCM ModelE

HOW-TO Document for the GISS GCM

This document outlines how to achieve some commonly sought results with the new modelE. Some of the procedures will be the same as previously, but it is more likely that the details will have become slightly more straightforward (hopefully!). However, due to the modularization that has been done, you are no longer free to do things in an arbitrary way, see the notes below for more details. Please feel free to add to this document.

This is only part of the documentation, you can find additional information in the glossary of terms (not yet available), the Frequently Asked Questions (FAQ) file, the description of current functionality and options, or the index of all prognostic variables (determined using 'gmake htmldoc' (see below).

The document is set up in four parts: Part 0 deals with getting the code, Part I deals with running the model for basic users who do not need to know how to alter the code. Part II dealing with examining the diagnostics, while Part III deals with more technical issues involved in adding functionality or diagnostics.

PART 0: Getting the code and configuring the model

  1. HOW-TO get the code from the CVS repository
  2. HOW-TO configure modelE on your machine
  3. System requirements

PART I: Basic user information

  1. HOW-TO setup and run the GISS GCM
    1. How to select a rundeck name
    2. How to compile the model
    3. How to interact with CVS to make sure the run is reproducible
    4. How to run the model
    5. How to run the model on multiple processors
    6. How to specify individual compilation options for certain files
  2. HOW-TO set rundeck parameters
    1. How to tune a specified ocean run (using U00ice/HRMAX)
    2. How to set up a Q-flux model with a single ocean layer
    3. How to set up a Q-flux model with diffusion into a deep ocean
    4. How to use them to control numerical instabilities
    5. How to use IRANDI to create an ensemble of runs
    6. How to fall back to a smaller time step automatically
    7. How to choose which ISTART to use
  3. HOW-TO stop and restart the GCM
  4. HOW-TO change boundary conditions
    1. HOW-TO alter the topography file consistently
    2. HOW-TO run the model with a new land mask
  5. HOW-TO change trace gas amounts
  6. HOW-TO have a perpetual winter
  7. HOW-TO travel back/forward in time
  8. HOW-TO move to Mars
  9. HOW-TO move to Wonderland
  10. HOW-TO report a problem
  11. HOW-TO keep up-to-date
  12. HOW-TO move the model offsite
  13. HOW-TO add to this documentation

Part II: Getting information from the diagnostics

  1. HOW-TO look at the output
  2. HOW-TO change what is included in the printout
  3. HOW-TO produce daily/monthly or seasonal diagnostics
  4. HOW-TO control the format of the binary output (i.e. GISS/netcdf/hdf etc.)
  5. HOW-TO calculate a model score

Part III: The model code

  1. HOW-TO find where a variable is defined
  2. HOW-TO follow the program structure
  3. HOW-TO use or add to the automatic documentation
  4. HOW-TO ensure that your code meets the new standards
  5. HOW-TO add functionality
  6. HOW-TO read in an external file
  7. HOW-TO add new variables to the restart files
  8. HOW-TO access variables that are already calculated somewhere else
  9. HOW-TO deal with the inevitable bugs you introduce
  10. HOW-TO deal with restart problems (non-reproducibility of results)
  11. HOW-TO get an innovation included in everybody's versions

Appendix: Some Technical details

  1. HOW-TO create and manage cvs revisions

Finally, CREDITS where CREDITS are due:

  • For ongoing development (who to blame for the current shambles):
    Gavin Schmidt, Reto Ruedy, Jean Lerner, Max Kelley, Igor Aleinov, Gary Russell, Ye Cheng, Andy Lacis, Valdar Oinas, M. S. Yao., Larissa Nazarenko, Nick Tausnev, Shan Sun, Greg Faluvegi, Susanne Bauer, Dorothy, Koch, Nadine Bell, Jan Perlwitz
  • Historical development (stuff we haven't rewritten yet):
    Bob Suozzo, Suki Sheth, Hans-Peter Zinn, Rick Healy, Frank Abramopoulos, Greg Hartke
  • Inspiration (the people paying for this):
    David Rind, Jim Hansen, Tony Del Genio, NASA GSFC

Part 0: Getting the code and configuring the model

1) HOW-TO get the code from the CVS repository

The modelE code includes source code, scripts and some documentation. This is stored through the CVS revision control system, and copies can be checked-out to your home directory using the following commands:

export CVSROOT=/u/cmrun/cvs/repository
cvs checkout -r modelE1 modelE

This will create a directory modelE/ in your directory. The code will correspond to modelE1 - the frozen version for 2004. Please check to see if this actual is the latest release. To place the model in a different directory, use the -d dirname option. Depending on your level of skill in using CVS, and whether you are a developer, you may want to set up a private branch where you can store your own changes and additions without interfering with the main development process. Be careful here, because committing changes from your private branch is difficult and error prone, and so if you anticipate contributing to the group effort, you are advised to set up at least one version that is on the main development tree (i.e. with no branch). Branches are useful for keeping track of special code that you need for particular sensitivity experiments you might want to perform.

To set up a private branch:

cd modelE
cvs tag -b your_tag_here
cvs update -r your_tag_here

Please subscribe to the mailing list 'giss-gcm-users-l@giss.nasa.gov' by sending a message to Majordomo@giss.nasa.gov with a line

SUBSCRIBE giss-gcm-users-l

in the body of the text. This is to ensure that you are notified of bug fixes and new releases. Also this is a forum for you to ask questions about the model. If you are a developer, then subscribe to the giss-gcm-develop-l list. This will make sure that you are notified of all changes to the code and any specific messages related to the development process.

2) HOW-TO configure modelE on your machine

For modelE to work properly you have to create a configuration file .modelErc in your home directory. This file keeps some system-wide information (such as your directory structure) which is common to all modelE runs you do on current machine. To create this file go to the directory modelE/decks and execute:

gmake config

This will create .modelErc file in your home directory. It will contain default settings, which are adjusted for ra.giss.nasa.gov. So if you are working on GISS ra or poncho then you are set. On all other machines you should edit the file ~/.modelErc to describe your working environment. .modelErc also allows you to set some personal preferences like to which email address you want the notifications to be sent when the program stops or whether you want a verbose output during the compilation. It contains short descriptions for all configurable parameters.

Here is the list of currently supported options (with their default values) which one can set in ~/.modelErc:

  • DECKS_REPOSITORY - a directory for permanent storage of run info. All rundecks that you create will be copied to this directory. This directory exists mostly for archival purposes.
    (default: DECKS_REPOSITORY=/u/cmrun/modelE/decks)
  • CMRUNDIR - directory to which all run directories will be linked. This directory will be searched by most scripts for locations of specific runs. It can coincide with SAVEDISK
    (default: CMRUNDIR=/u/cmrun)
  • GCMSEARCHPATH - directory to search for gcm input files. All necessary input files should be copied or linked to this directory.
    (default: GCMSEARCHPATH=/u/cmrun)
  • EXECDIR - path to directory with modelE scripts. This directory should contain the scripts from modelE/exec. Auxiliary binaries which are not run-dependent should also be put into this directory.
    (default: EXECDIR=/u/exec)
  • NETCDFHOME - path to directory which contains netcdf library. If netcdf is not installed on your computer you should leave it blank. modelE will work fine without netcdf but you will not able to compile some auxiliary binaries.
    (default: NETCDFHOME=/usr/local/netcdf-3.4/lib64)
  • SAVEDISK - a directory where all run directories will be created. These run directories will contain all files written by the model during the run (i.e. rsf, acc, PRT etc.). So SAVEDISK should be big enough to accommodate all model output.
    (default: SAVEDISK=/raid1)
  • MAILTO - email address of the user. When the program ends/crashes all notifications will be sent to this address. If not specified `whoami` will be used.
    (default: MAILTO= )
  • OVERWRITE - if set to YES it will allow gmake to overwrite certain files (like rundecks in repository) which by default are protected. It may be useful when you do a lot of testing and don't want to add OVERWRITE=YES on the command line all the time. In general it should be always set to NO. Leave it at its default unless you know what you are doing.
    (default: OVERWRITE=NO)
  • OUTPUT_TO_FILES - if set to YES all errors and warnings will be sent to files with the names source_name.ERR. This is user's preference and can be set according to personal taste.
    (default: OUTPUT_TO_FILES=YES)
  • VERBOSE_OUTPUT - if set to YES gmake will show compilation commands and some other information while compilation is in process. Otherwise most of the output will be suppressed. Set it according to your taste.
    (default: VERBOSE_OUTPUT=NO)
  • MP - multiprocessing support. If set to YES gmake will compile the code with OpenMP instructions. Don't set it to YES unless you are really planning to run the model on several processors, otherwise it will be unnecessary waste of resources (and your model can actually run slower on one cpu). Also keep in mind that one can not compile OpenMP objects with an non-OpenMP linker, so if you change from MP=YES to MP=NO you have to recompile the entire model. This option has effect currently on SGI and Compaq only, but can be set to work for dual-processor Linux systems depending on the compiler.
    (default: MP=NO)

3) System requirements

The set-up code and automatic compilation rely heavily on gmake (version 3.79 or higher) and perl (version 5.005 or higher). Both of these programs are public domain and available for almost all platforms. Up-to-date versions are required to be able to set up and compile the model.

The code requires a FORTRAN 90/95 compiler. It has been tested with the SGI, IBM, and COMPAQ/DEC compilers on their respective workstations. For Linux, Macs or PCs, the choice of compiler is wider, and we have not been able to test all possibilities. The Absoft ProFortran compiler works well, as do the Lahey/Fujitsu and Portland Group compilers. We have been unable (as yet) to get the model to compile with the Intel or VAST compilers due to internal errors in those products. Please let us know if you succeed (or more importantly, fail) in compiling the code using any other compiler.

Note that the parallelization used in the code is based on the OpenMP shared memory architecture. This is not appropriate for multi-processing on a distributed memory platform (such as a Beowlf cluster). For machines that share a number of processors per board, the OpenMP directives will work up to that number of processors. We are moving towards a domain decomposition/MPI approach (which will be clear if you look at the code), but this effort is not yet complete or functional.

Many GCM subroutines declare temporary arrays which, in FORTRAN, reside in a part of the computer's memory known in computer parlance as the "stack." It is essential that a user's stack size be large enough to accommodate these arrays; a minimum value of 8192 kilobytes is recommended, with stack requirements increasing with model resolution. Default stack sizes vary among different computing environments, but can be easily adjusted. Unfortunately this parameter must be explicitly set by the user rather than by the GCM setup scripts. Stack size is set by commands native to a user's $SHELL; for example, a bash user would type ulimit -s 32768 to get a stack size of 32768 kb. Mysterious crashes in the radiation are quite often related to a too small stack.


Part I: Basic user tasks

1) HOW-TO setup and run the GISS GCM

In order to run the GCM a 'rundeck' file (with a suffix .R) must be created. This contains information about the run you would like to perform, the fortran code that you intend to use, pre-processing options that are required, the boundary and initial conditions and any run-specific parameters (such as the time step, length of run etc.). Most of the specifics have been automated using 'gmake'. The sub-directory 'decks' is the main control directory, most commands will be issued in this directory. Typing 'gmake' will produce some documentation. There are a number of sample rundecks in the model/ directory (E001*.R). These represent runs with the standard fixed SST atmosphere only run (E001.R), a Qflux run (E001q.R), a coupled ocean-atmosphere run (E001o.R), a varying SST (AMIP style) run (E001a.R), a Qflux with deep diffusion (E001qd.R), a stratospheric model (E001M23.R), a tracer model (E001tr.R) etc. These can be used as templates for any runs you do.
  1. How to select a rundeck name

    In order to make a rundeck that you can use and modify, you must first select a unique name for your run. Note that there is a 16 character limit on the length of the rundeck name due to limitations in the batch processing system we normally use. There is a central database for storing user's rundecks (along with other pertinent information such as the time of compilation, the compiler version, and the current CVS branch) in ra:/u/cmrun/modelE/decks (configurable in ~/.modelErc). This allows us to automatically tell whether a rundeck name has already been used. However, in order to minimise any confusion, we propose the following naming convention:

    • Each rundeck should start with 'E' (to distinguish it from other models in the building)
    • The next three characters should be a counter (starting at '001') which the individual user will update as they perform successive runs.
    • The next part consists of the users initials (or other unique identifier.
    • If the user is experimenting with variable resolutions, it may be helpful to add 'M12' or 'M23' or 'F32' etc. to the end of the name to distinguish between the Medium resolution 12 or 23 layer model, or the Fine resolution, 32 layer model etc.

    Note that this implies that there would be not necessarily be any connection between runs E034xxx and E034yyy, since the numbers now refer to the run sequence, rather than the model version sequence.

  2. How to compile the model

    Once a rundeck name has been chosen (for example E001xyz), the rundeck will be created using

         gmake rundeck RUN=E001xyz
    
    This will make a rundeck E001xyz.R, that will be a copy of E001.R. To make a copy based on another of the sample decks, use, for instance,
         gmake rundeck RUN=E001xyz RUNSRC=E001q
    
    A copy of the new rundeck will be stored in the rundeck depository on ra. If there is a clash with an existing rundeck name (including perhaps one of your previous runs), there will be an error message. If you want to insist on overwriting a previous version say, then give the command
         gmake rundeck RUN=E001xyz OVERWRITE=YES
    
    If you create a rundeck in any other fashion, ie. by directly copying one. There is no chance to verify its uniqueness. However, a secondary check is made when you attempt a 'setup' (see below) which should catch any inadvertent duplication.

    When making a new rundeck, please add a documentation line, and edit the label to reflect your model experiment!

    The list of fortran files within the rundeck determines which files are compiled into the executable. The compilation is automatic, and will take account of all dependencies, so that if a file is modified it will be re-compiled, along with any files that directly or indirectly depend on modules or routines within it. The executable is made using

         gmake gcm RUN=E001xyz
    
    To actually run the model, you must first 'setup' the run by running the first hour (or specifically the first source time step).
        gmake setup RUN=E001xyz
    
    If this runs successfully there will be a message saying so. Likewise if there is a problem.

  3. How to interact with CVS to make sure the run is reproducible

    It is of some importance that runs be reproducible, either by other people, or later on by the users themselves. With version control now using the CVS system, it implies that CVS MUST know about any code variations that make it into any production run (i.e. one that is not simply a compilation test or debugging). We know that test runs and production runs are not necessarily easily distinguishable a priori (i.e. a run is a test run until it works, at which point it retroactively turned out to be the production run). So the best way to ensure thats the relevant information is saved is through frequent 'commits' to the cvs repository of all substantial changes you make to the code. By this we mean at minimum all changes that have effects on the prognostic variables (as opposed to diagnostic printout).

    You can also define tags within the CVS tree that can be personal to you, which just mark a particular set of code as a branch point, which can then be retrieved at any point.

    There is no cost to this procedure (although you should be careful to ensure that code at least compiles before any commits are made.)

    At the 'setup' stage (or at the 'exe' stage if you are recreating an executable files) further information is added to the rundeck depository indicating any changes to the rundeck since it was made, along with full CVS information on what was actually compiled. To retrieve this exact code, it is sufficient to know the time the run was compiled, and the CVS branch that the user was on.

  4. How to run the model

    Once a successful 'setup' is done, use the command

      runE E001xyz
    
    to set the model running for the specified length of the run.

    If you need to stop the run at any point, use the 'sswE E001xyz' command to do it in a controlled fashion (ensuring that all variables are saved correctly, and that no printout gets stuck in a buffer). Using 'runE E001xyz' subsequently picks up from where you left off.

    If you would like to update the executable (putting in new printout, or fixing an error etc.) without starting from 'setup' all over again, you can use

      gmake exe RUN=E001xyz
    
    which will recompile the fortran, and replace the executable file in the run directory. Then 'runE E001xyz' will pick up from where you stopped previously, but with the new code. Any changes to the CVS information between the initial setup and the new code will be recorded in the central rundeck repository.

  5. How to run the model on multiple processors

    ModelE uses OpenMP application program interface. It consists in a set of instructions (starting with C$OMP) which tell the compiler how to parallelize the code. To be able to run the model on multiple processors one has to compile it with enabled OpenMP. This is done by appending MP=YES to the compile line, i.e.

      gmake gcm RUN=E001xyz MP=YES 
    
    It is important to keep in mind that one can't mix OpenMP objects with a non-OpenMP compilation. This means that one has to do gmake vclean when switching from OpenMP to non-OpenMP compilation. The option MP=YES can be set in ~/.modelErc file (see Part 0.2). In that case one can skip it on the command line.

    In the setup stage, the number of processors is defined by the relevant environmental variable $MP_SET_NUMTHREADS. However, when using runE, it is similar to the way it is used for the non-OpenMP model, except that one has to specify the number of processors as a second argument to runE. For instance if one wants to run the model E001xyz on 4 processors one starts it with

      runE E001xyz 4 
    
    On batch systems (such as halem or charney), the number of processors is more automatic, and currently, only four can be used (since that is the number of processors per shared-memeory node. No additional argument is therefore needed. Slightly different run commands are available on those systems related to the different possible queues.

  6. How to specify individual compilation options for certain files

    In some circumstances you may need to specify individual compilation options for certain files. For example, you may want to compile some files with debugging information or with REAL*8 for default reals etc. This can be done inside the rundeck by adding all necessary options in vertical slashes (||) right after the name of the file. All options specified between the slashes will be appended to the compile line during compilation. For instance, the following line in the rundeck

      GHY_COM GHY_DRV|-g| GHY|-g -r8|
    
    will tell gmake to compile GHY_COM.f with default options, compile GHY_DRV.f with debugging information and compile GHY.f with debugging information and with default reals set to REAL*8.

    One should keep in mind that changing an option inside the rundeck doesn't force automatic recompilation. One needs to 'touch' corresponding source files to tell gmake that they need to be recompiled.

2) HOW-TO set rundeck parameters

There are 2 types of parameters that may be set in the rundeck: DATA BASE (dp) parameters and NAME LIST (nl) parameters The dp-params are set between &&PARAMETERS and &&END_PARAMETERS The nl-params are set between &INPUTZ and &END The dp-params are saved in record 3 of the save files (acc,rsf) They are tuning or control parameters. the nl-params are not saved except for IYEAR1 (which is saved in record 1). Most of the others set start and stop date for the run: (YEAR/MONTH/DATE/HOUR)I or E or IHOURE or determine the (debug) printout: KDIAG,QDIAG,IWRITE,...,QCHECK ISTART determines whether this is an initial- or restart IRANDI may be used to perturb the current temperatures. See NL-PARAMS for more information. When a run starts, parameters not set in the rundeck take their default values. On restarts, YEAR/MONTH/DATE/HOURI are ignored; all other parameters are determined first by the rundeck, then by the restart file if they were saved, else use the defaults. So the priorities are: 1. rundeck, 2. restart file, 3. defaults. To start a run, the following parameters HAVE to be explicitly set: ISTART=2 | 4 5 6 7 or 8 to start from obs | model Initial State YEARI = year in which model starts YEARE (or IHOURE) = year (or hour) at which model stops To change parameters after a run was started, do the following: 1: edit the file 'I' in the run directory, 2: stop and restart the run, 3: re-edit I to ready it for the next restart (If the change affects the results, attach a log of all such changes to the end of the rundeck)

A: How to TUNE a specified ocean run (using U00ice/HRMAX)

It is desirable to run a specified ocean model in such a way that it is in RADIATIVE BALANCE, i.e. that the net flux of energy into the ocean (which has no effect, since ocean data are specified) is zero if averaged over a whole year. One way to roughly achieve this goal is to pick the right values for U00ice/U00wtrX/HRMAX. These parameters influence the cloud formation where the temperature is below/above 0C and the cloud formation in the boundary layer respectively, and can therefore be used to change the radiative balance.

Run the model for a year and find the annual mean net heating at the ground (GLOBAL: NET HEAT AT Z0, or OCEAN: NET HEAT Z0+RVR budget pages). That quantity can change from year to year within .5 W/m2. Increasing U00ice by .01 decreases the cloud cover and planetary albedo. Increasing U00wtrX to be larger than 1 reduces cloud cover in the low latitudes. HRMAX can vary between 200 and 1000 m. Increasing HRMAX increases the cloud cover in the boundary layer which has a warming effect. Note HRMAX is only affective if the do_BLU00 option is set (which is not be default). Rerun that year with the new values and hope that the annual mean net heating at the ground averages out to less than .3 W/m2 in absolute value over the next few years.

Other obsolete ways to handle the rad. balance problem: Change the solar constant until rad. balance results (most people hate that way), change the solar irradiance into the ocean when running the Qflux model (using the old SRCOR parameter) offsetting the imbalance (advantage: no need to repeat the control run, disadvantage: Jim hated it, no longer part of the code).

B: How to set up a Q-flux model with a single ocean layer

To run the Q-flux model it is necessary to supply the implied horizontal heat transports and proper initial condition as input files. To calculate the horizontal ocean heat transports it is necessary to run the model with prescribed ocean mixed layer temperature and sea ice extent for 5-6 years. It is desirable to have this control run well balanced (Net Heat at z0 is supposed to be close to 0 for annual mean. This Net Heat at z0 can be found on the first budget page for the global quantities. To get the balanced control run, you can adjust U00ice/HRMAX (see 1b). In your control run-decks, you have to have Kvflxo=1. Then the model will save the data for the heat transport. Note that if you would like to have dynamic ice in the Q-flux model, you must include the dynamic ice modules in the climatological SST run, otherwise the implied fluxes will be wrong.

Based on the saved VFLXOmmmyyyy data, a new input file (OTSPEC...) has to be created that specifies the horizontal heat transports in the ocean; in addition, the ocean temperatures at the annual maximum mixed layer depth and the mean temperature between that level and the bottom of the current mixed layer depth have to be computed for the final time of the preliminary run and added to the final restart file of that preliminary run. That augmented restart file will serve as initial state file for the new run.

For the ocean heat transport calculation, it is necessary to skip the first year of this control run because of the spin-up of the atmosphere. For this calculation you need to do next (suppose your control run is RunIDcontrol):

  1. Go to the ../decks directory and do gmake auxqflux RUN=RunIDcontrol. This will create executables for the ocean heat transport calculation and it will put them to the directory ../modelE/decks/RunIDcontrol_bin.
  2. Go to the ../aux directory and run the script
        mkOTSPEC.E001.M250D RunIDcontrol first_month first_year last_month last_year
    
    where months and years are integers and span the months that you wish to be used for the heat transport calculation. Usually first_month=1, first_year=1951, last_month=12, last_year=1955 if your control model went for 6 years starting Jan.1 1950. Note that the surface heat imbalance number is the ratio of the incoming energy to the outgoing. A number less than one denotes a net heating of the ocean. If the number is with 1% of unity, there is no need to worry.
  3. After this you will get two files (in the original run directory) which you will use as input data for your Q-flux model:
    1. OTSPEC.RunIDcontrol.M250D.first_year-last_year This file is the implied ocean heat transport data. In your Q-flux run-deck the input file OHT has to have this name, in other words OHT=OTSPEC.RunIDcontrol.M250D.first_year-last_year. You should move this file to /u/cmrun directory.
    2. IMJAN.RunIDcontrol.O250D is the modified restart for the Q-flux model. This restart is basically is the same restart as your control model restart but it also has corrected temperatures of the ocean which were not used in the control run. These temperatures are the temperature between the mixed layer and the annual maximal mixed layer depths (TOCEAN(2,im,jm)) and the temperature at annual maximal mixed layer depth (TOCEAN(3,im,jm)). You have to specify AIC=IMJAN.RunIDcontrol.O250D in your Q-flux run-deck. You have to move this file to /u/cmrun directory as well.
    3. In your aux directory you will also get the line-plot file ohtRunIDcontrol.first_year-last_year.lpl which you can plot and compare with observational data if you like. You will also get few PRT files which you can remove.
    In your Q-flux model run-decks, you have to
    1. specify KOCEAN=1;
    2. make sure that U00ice/HRMAX are the same as in your control run. Then your Q-flux model will be as well balanced as your control run.
    3. specify AIC and OHT as described above.

    During the first few years of the q-flux run, monitor the time series of global annual means of Ocean Ice Cover and Surface Air Temperature. Hopefully, they will no longer change after a few years (10-25). The model is said to be in balance as soon as that happens. If this is not the case, it may be possible to adjust some of the ice-related parameters to improve the sea ice distribution and hence the stability. Two parameters can be used 'oi_ustar0' and 'silmfac'. These control the basal and lateral melt of sea ice. Since they are not used in a fixed SST/sea ice run they can be used to tune the q-flux model. Increasing 'oi_ustar0' (default=5d-3) reduces the temperature difference between the ocean and sea ice and increases the basal heat flux, decreasing ice thickness and ice extent. Increasing 'silmfac' (default=1.4d-8) increases lateral melt and therefore ice concentration. Note that 'oi_ustar0' is only effective with non-dynamic ice.

C: How to set up a Q-flux model with diffusion into a deep ocean

Qflux model with Diffusion into the deep ocean

Such a run needs a preliminary Q-flux model running for 10 years in equilibrium, saving (KCOPY=3) the data necessary for diffusion computations. A new file has to be created ('TG3M' in the rundeck) using 'gmake auxdeep RUN=E001xyz' and then running 'mkdeep *.oda.*'. This creates the initial conditions and climatology at the base of the mixed layer. Use E001qd.R as a rundeck template.

The deep ocean layers 2-LMOM (currently = 12) start at the annual maximal mixed layer depth. The thickness of layer 2 (DZ2) is 10m and the layer thicknesses increase geometrically s.t. DZ1X+DZ2+DZ3+...+DZlmom=5000m (lmom=12); i.e. DZ1X=10/Q, DZ3=10*Q, DZ4=DZ3*Q, ... and DZ1=ann.max.mixed layer depth DZ1X is not used; the dzo's to compute the vertical gradients are given by : dzo1=10/sqrt(Q) , dzo2=dzo1*Q , dzo3=dzo2*Q , etc Q=1.701...

The diffusion into the deep ocean is computed once every day using the temperature anomaly with respect to 'climatology' as tracer. The climatology is found by taking a 10-year mean of the Qflux model without a deep ocean. That run is also used to initialize the deep ocean run.

More detailed description:

T1 = temperature of mixed layer (C)                       =TOCEAN(1)
T2 = T between mixed layer and ann.max mixed layer depth  =TOCEAN(2)
T3 = temperature at annual maximal mixed layer depth.     =TOCEAN(3)

RTGO(L=1->lmom) is set to 0 at the beginning of the run

RTGO(L=2->LMOM) = temperature anomaly for deep ocean layer L 
RTGO(1)=12-month-mean anomaly of T3 w.r. to climatology, i.e.:
    before the diffusion computation it is reset to the difference
    dT3ann="mean_T3_over_last_12_full_months - 10year_mean_of_prel.run";
    dT3ann is kept constant throughout the month. The diffusion
    computation changes RTGO(1); that change is added to T1,T2,T3,
    Before the next diffusion computation, RTGO(1) is reset to dT3ann.  

The heat flux at interface of layers L and L+1 is given by
    EDO*(RTGO(L)-RTGO(L+1))/dzo(L)  L=1,...,8
however, a semi-implicit scheme is used to compute these fluxes.

The following auxiliary arrays are used to carry out this scheme:
STG3 is set to 0 at the beginning of each month, at the end of
    each day  T3  is added; i.e. STG3=SumD_T3 (summed over days) 

TG3M(M) initially contains a 10-year mean of SumD_T3 for month M
    (the sum is taken over each day of month M, not averaged yet)
    of the Q-flux run without deep ocean that is used to start
    up the current run, at the end of month M it is set to STG3.
    From month 13 on, TG3M contains SumD_T3 for each of the last
    12 months.

DTG3 is set to 0 initially and is updated at the end of month M
    by adding (STG3-TG3M(M)). Then STG3 overwrites TG3M(M), so
    that next year this STG3 will be subtracted and cancels out:
    after 1 year: DTG3 = (sumD_T3_for_yr1 - SumM orig.TG3M) and
    after that the oldest month is subtracted, the newest added,
    hence DTG3 = (sumD_T3 over last 12 months - SumM orig.TG3M); so:
ADTG3 = DTG3/365 = 12-month running mean of T3 - T3clim ;
    this is used to initialize RTGO(.,.,1) before the diffusion.

Compared to the Qflux model without deep ocean, the model with deep ocean needs one extra input file (EDDY = diffusivity coefficients) and uses the ODEEP module instead of OCNML. If starting from a restart file from the qflux run, you must use ISTART=4.

D: How to use parameters to control numerical instabilities

Parameter affects: depends on:
DT stability at poles hor. resolution, AVRX
NIsurf stability at the surface vert.resolution near surf
XcdLM stability at the top model top pressure PMTOP
M/NFILTR checker board patterns dynamics scheme

E: How to use IRANDI to create an ensemble of runs

If ISTART<10 and IRANDI/=0, all atmospheric initial temperatures are randomly perturbed by at most 1C . The perturbations depend on the value of IRANDI.

So to create an ensemble of N runs whose only difference is the initial state they start up from, pick N (odd) numbers and create N copies of the rundeck that only differ in the runID and the value set for IRANDI.

It is good practice to put IRANDI on the same line as ISTART so it will not affect later extensions or reruns, since it is then automatically removed from the file 'I'.

F: How to fall back to a smaller time step automatically

The atmospheric dynamics code uses a fixed time step which is defined by the rundeck parameter DT. If the program stops because of the atmospheric instability due to the fact that DT is too big, one has to restart it with a smaller time step, then stop and restart it again with a normal time step.

This process can be done automatically. To set the model for an automatic restart one has to add to the rundeck a line

DTFIX=new_time_step

after the label but before the &&PARAMETERS instruction. If such line is present and the model stops due to too big DT then the model will restart with DT=new_time_step, compute 48 hours of model time, then stop and restart with the normal time step. Corresponding .PRT files will be saved.

F: How to choose which ISTART to use

ISTART controls how the model first picks up the initial conditions to start the run. We have a number of options depending on how much information is already available.

  1. ISTART=1 Default start. This sets atmospheric variables to constants and requires input files for ground values (a GIC file), and ocean values (OIC) if required.
  2. ISTART=2 Observed start. This sets atmospheric values to observations, based on a particular format of AIC file. As for ISTART=1, input files are required for ground and ocean variables.
  3. ISTART=3 Not used.
  4. ISTART=4 A restart from an rsf file from a previous run, but the ocean is reinitialised. Needs an initial OIC file (for fully coupled models).
  5. ISTART=5 A restart from an rsf file from a previous run, but no tracers. This is only useful for tracer runs that need to be initialised with a particular model state.
  6. ISTART=6 A restart from an rsf file from a previous run that might not have had the same land-ocean mask. This makes sure to reset snow values, pbl values and ocean values accordingly.
  7. ISTART=7 A restart from an rsf file from a previous run with the same land-ocean mask. This still makes sure to set snow values and ocean values. This is used mainly for converted model II' data.
  8. ISTART=8 This is a restart from a model configuration identical to the run now starting. This is for perturbation experiments etc.
  9. ISTART=9 This is a restart from this model run. (i.e. a continuation, or a backtrack to an older rsf file).
  10. ISTART=10 This is used internally to pick up from the instantaneous rsf file (the later of fort.1 and fort.2). Does not ever need to be set in the rundeck.
  11. ISTART=11 This is used internally to pick up from an instantaneous rsf file (fort.1). Only rarely used.
  12. ISTART=12 This is used internally to pick up from an instantaneous rsf file (fort.2). Only rarely used.
  13. ISTART=13 This is used internally to pick up from the instantaneous rsf file (the earlier of fort.1 and fort.2). Only rarely used.
  14. ISTART<0 This option is used by the post-processing program to run the model to generate nice diagnostics. This should never need to be set manually.
Please note that if you get this wrong i.e. you use an ISTART that is not compatible with the AIC, GIC or OIC files in the rundeck, the model is not guaranteed to crash nicely. In fact, the opposite is likely. Strange crashes on start up are very often related to this.

3) HOW-TO stop and restart the GCM

Use the command 'sswE E001xyz' to stop the model run in a controlled fashion. Restart it with the command 'runE E001xyz'. If you would like the run-time printout to be suppressed (to save filespace) use the command 'runE -q E001xyz'. The monthly/seasonal/annual printout can always be recreated if needed.

4) HOW-TO change boundary conditions

Simply alter the file declarations in the rundeck appropriately!

  1. HOW-TO alter the topography file consistently

    The topography file (TOPO) in the rundeck, must have a degree of internal consistency. Therefore if changes are made to the ocean mask (FOCEAN), for instance, other changes are required for the other fields (FLAKE,FGRND,FGICE etc.). Note that in the model FGRND=FEARTH, and FLICE=FGICE for historical reasons not worth going into. This file is made up of 9 fields (in the standard TITLE*80,DATA(IM,JM) format):

    1. FOCEAN fraction of ocean
    2. FLAKE fraction of lake
    3. FGRND (or FEARTH) fraction of non-glaciated land
    4. FGICE (or FLICE) fraction of glaciated land
    5. ZATMO mean topography (m)
    6. HOCEAN depth of ocean (m) (not used except for GISS ocean)
    7. HLAKE depth of lake (m)
    8. HGICE depth of glacier (m)(not yet used)
    9. ZSOLID not sure! (not used except for GISS ocean)
    In particular:
           IF (FOCEAN(I,J).gt.0) THEN
             FLAKE(I,J)=0.   ! lake and ocean cannot co-exist at the same grid point
             HLAKE(I,J)=0.
             FLAND=1.-FOCEAN(I,J)
           ELSE
             FLAND=1.-FLAKE(I,J)
    	 HLAKE(I,J)=MAX(1.,HLAKE(I,J))
           END IF
           FGRND(I,J)=MAX(0.,FLAND-FGICE(I,J))! land ice fraction takes precedence over ground
           FGICE(I,J)=FLAND-FGRND(I,J)
           
    If land boxes are added close to the Antarctica or Greenland ice sheets, FGICE should be set to FLAND (i.e. 1-FOCEAN), FGRND, FLAKE, and HLAKE to zero. Note as well that the polar boxes must be homogeneous (i.e. FOCEAN(2:IM,JM)=FOCEAN(1,JM) etc.)

    Note that if FLAKE(I,J)>0, then HLAKE must be greater than 1.

  2. HOW-TO run the model with a new land mask

    If you changed the topography file you may have created new land cells. These cells are missing both boundary conditions and prognostic variables related to the land surface model. One has to extend the land surface data to these cells before the model can run.

    To remedy this problem we created a set of "extended" boundary conditions files. These files are the same as original ones except the data was spread all over the ocean cells using a nearest-neighbor algorithm, so that each cell contains some reasonable land surface data. If you are working on Ra then these files are at the same location and have the same name as originals with the suffix .ext appended to them. Here is the list of the files which needed extension as they would appear in the rundeck:

         CDN=CD4X500S.ext                     ! - roughness length
         RVR=RD4X525.RVR.ext                  ! - river routing
         SOIL=S4X50093.ext                    ! - soil data
         VEG=V72X46.1.cor.ext                 ! - vegetation data
         TOP_INDEX=top_index_72x46.ij.ext     ! - topographic index
        
    The above files provide boundary conditions. One also needs to use a compatible restart file with extended prognostic variables, currently
        AIC=DEC1958.rsfB394M12.modelE.17.ext  ! - initial conditions
        
    One should keep in mind that the data should be extended in a compatible way. At least the data in SOIL and AIC should be extended from the same "nearest neighbors". To preserve the information about this "extension" process the file conv_index.ext was created (located on Ra in /u/cmrun). It contains the coordinates of "nearest neighbors" used when extending the data. More precisely, the data array is extended as follows:
        integer :: conv_index(im,jm,2)
        open(1,file='conv_index.ext')
        read(1) conv_index
        do j=1,jm
          do i=1,im
            if ( conv_index(i,j,1) .ne. 0 ) then
              i0 = conv_index(i,j,1)
              j0 = conv_index(i,j,2)
              array(i,j)=array(i0,j0)
            endif
          end do
        end do
        
    One can use this algorithm to extend any land surface data if necessary. To create a new extended AIC file with conv_rsf one has to pass to it conv_index.ext as a third parameter. For current setup on Ra it will look:
        conv_rsf /u/cmrun/DEC1958.rsfB394M12 new_rsf_file /u/cmrun/conv_index.ext
        

    To summarize, when using a non-standard land mask one has to replace the above mentioned files in the rundeck with their .ext equivalents. One also has to use ISTART=6 (instead of ISTART=7) to force regeneration of PBL data.

5) HOW-TO change trace gas amounts

Trace gas amounts can be set in two ways. Firstly from a scenario (or historical data) as a function of time. This is useful for transient experiments. Secondly, you can set most trace gas amounts as multiples of the default values in order to look at equilibrium sensitivities for instance.

  1. Setting the Trace Gas Year:

    The file GHG in the rundeck can be used to define a trace gas scenario (the default file gives observed values from 1850 to 2000). However, any other scenario can be defined and used. If the variable 'ghg_yr' is set in the rundeck, then that year's values are used throughout the run. If a transient scenario is required, set ghg_yr=0 and the years will be taken from the model year. (A similar procedure works for aerosols (Aero_yr), solar forcing (sol_yr), volcanoes (volc_yr) and ozone (O3_yr) as well). Note that stratospheric water vapour amounts are related to the ghg_yr (and the methane concentration) if H2ObyCH4=1 (the default is not to include the methane related flux).

  2. Changing fixed trace gas concentrations:

    If you want to test the sensitivity to various hypothetical trace gas amounts, there are some multiplier parameters that can be used for the three principle greenhouse gases (CO2, CH4 and stratospheric water vapour). These variables CO2X, CH4X and H2OstratX are by default 1. Since the default ghg_yr is 1951, changing the values of these multipliers gives a tracer gas concentration of (CO2X) times the 1951 CO2 level. These defaults for reference are, CO2=311.098 and CH4=1.10298 (but really depend on the 1951 values in the GHG file). H2OstratX multiplies whatever the stratospheric water vapour amount is.

  3. For other gases (CFCs, N2O, O3) there are no rundeck parameters available, but they can be changed based on the 'FULGAS' array in init_RAD in a similar way to CO2.

6) HOW-TO have a perpetual winter

Orbital parameters must be fixed and so 'ORBIT' in DAILY should be given the same JDAY every day. (i.e. JDAY0=1 for JAN 1st conditions). Though this will have consequences for the polar regions (since there will be no seasonality there). Possibly there is another way....

7) HOW-TO travel back/forward in time

A number of things can change depending on the time period concerned.

  1. orbital parameters. These can be changed directly in CONST.f, or by setting 'calc_orb_par=1' in the rundeck, and then using 'paleo_orb_yr' to define a year (Before Present (i.e. 1950)) from which the values can be calculated.
  2. trace gas amounts. (see Q6) Note that for really large changes, the mean atmospheric composition will change (and thus so will RVAP etc. See CONST.f for details)
  3. continental configuration. Change the boundary condition files in the run deck. (Also for land use/cover changes, ice sheets etc.)
  4. Mean atmospheric pressure. Change PSF etc. in the RES_xyz file.
  5. Speed of rotation. A little tricky since this effects OMEGA (in CONST.f) but also SDAY, and depending on how you define an 'hour', the number of seconds in an hour, or the number of hours in the day. Remember that dtsrc must divide into sday exactly.
  6. Solar constant. This can be set by the multiplicative factor S0X within the rundeck.
  7. Aerosol/dust amounts. These can be changed within RAD_DRV.f or by changing the RADN input files.

8) HOW-TO move to Mars

Like travelling back in time, only more complicated. Oh, and no water. Note that the length of year is set by EDPERY in CONST.f.

9) HOW-TO move to Wonderland

The wonderland model is a sector version of the GCM that is used for very fast or very long runs that do not have to be as true to the real climate as normal. Essentially, the globe is split into three sectors, and only one sector is prognostically calculated. You must alter the resolution (IM will be 24), change the definition of DLON (in GEOM_B.f), and make sure that the radiation calculation has been modified to synchronise the diurnal cycle over the sector. Note that this has never been tested.

10) HOW-TO get help or report a problem

There is a mailing list 'giss-gcm-users-l@giss.nasa.gov' where queries, problems and comments on the gcm can be aired. Sending a message here is probably more likely to get a quick response than emailing any one person.

Subscribe to this by sending a message to Majordomo@giss.nasa.gov with a line "SUBSCRIBE giss-gcm-users-l" in the body of the text.

11) HOW-TO keep up-to-date

Periodically, we will release new, and hopefully improved versions of the model code. You will be notified through the mailing list, and you can decide whether to upgrade your code, or wait until a more suitable time. None of your changes will be lost if you upgrade. However, some conflicts may occur, though they should be minor. Detailed instructions will be given at the time.

12) HOW-TO move the model offsite

The model code itself is self-contained. Thus it should be transportable offsite. However, certain directories and scripts need to be configured for the local set up. Use 'gmake config' to create the '.modelErc' file in your home directory. Edit this to customise your structure, including the central directory (which is /u/cmrun/ by default). The input data files are not part of the release, and will need to be downloaded separately.

The biggest problem may be if the machine you wish to run the code on has a different fortran compiler. Some variation has been fully allowed for (i.e. the standard SGI or IBM compilers, Absoft Profortran on a Linux system and the DEC/COMPAQ version), however, we do not have a stable testbed for any other compilers and therefore cannot guarantee it will always work.

General problems will arise due to arbitrary compiler decisions on what to name compiled module files (which affects the determination of the inter-routine dependencies). Edit model/Rules.make to fix this. The code is compilable without using the 'static' option, which removes many of the problems that used to occur. Most of the compiler dependent code (that we have discovered) can be found in SYSTEM.f. These involve system calls to the CPU clock, the calculation of pseudo-random numbers, and the setting of exit codes. We would be very interested in further compiler dependencies if found. Different compilers are dealt with using preprocessor directives from the Makefile.

13) HOW-TO add to this documentation

This documentation is part of the model code, and so changes to files in the modelE/doc directory can be uploaded to the main repository. Please contact one of the developers if you have some changes that you would like to commit.

14) HOW-TO keep up-to-date with the latest releases of the model

From time to time relatively stable versions of the model will be "released", i.e. that they will be marked and assigned some specific name, or "tag" by which they can be checked out from CVS. The typical name for such released versions will be "modelE-x-y", where x and y are some numbers. It is recommended that those who are not involved into active development on the main trunk of the CVS use only these "released" versions.

If you are currently working on your private branch and want to upgrade the model to a new release, you can do it automatically, provided that for your previous update you also used a "released" version. To upgrade your model to the release modelE-a-b go to decks directory and execute

    gmake update RELEASE=modelE-a-b
This will merge into your branch all the changes which were introduced into the model since your last update until the release modelE-a-b. Automatic update doesn't mean automatic resolution of all conflicts. If such arise they have to be checked and fixed manually.

9) HOW-TO nudge the model

The model can be run in a nudged mode, whereby the horizontal wind components are nudged towards reanalysis data sets (NUDGE.f). The following nudging formula is used:

f(n+1) = f(n) + a delta b / (1 + a delta)
where f(n+1) is the variable to be nudged at the new timestep (n+1) and at the old timestep f(n), respectively. delta is the model time step and b is the reanalysis value. a is the nudging coefficient. a depends on the model timestep and a nudging timescale ANUDGEU, ANUDGEV (for the u and v-velocities, respectively)

ANUDGEU and ANUDGEV are set in the input parameter list in the rundeck file. ANUDGEU=0.01 ANUDGEV=0.01 was tested to give reasonable results. The preprocessor directive #define NUDGE_ON must be set to turn the nudging on.

The reanalysis data sets must be available at the horizontal grid resolution of the model an a 6 hourly time step. The vertical interpolation is done in the model, so that the same reanalysis input files can be used for different vertical model resolutions. Netcfd input is required.


Part II: Getting information from the diagnostics

1) HOW-TO look at the output

You can look at the output in two distinct ways: using the run-time printout, or using the post-processing program, pdE.

  1. The run-time printout.

    If you run the model using 'runE' (as opposed to 'runE -q' which suppresses the printout) the model will produce a $RUNID.PRT file which can contain copious amounts of diagnostic information. It is split into a number of sections: (see next question for a list). This is produced each month (but can be controlled using the KDIAG switches and NMONAV in the NAMELIST).

  2. Post-processed output:

    Each month the program produces an 'acc' accumulated diagnostics file which contains all of the diagnostic information from the previous month. The program 'pdE' (in the exec directory) is an alternate entry point into the model that reads in any number of these files and a) creates a printout (as above) for the time period concerned and b) creates binary output of many more diagnostics. This can be used simply to recreate the monthly printout, but also to create longer term means (annual, seasonal, monthly climatologies etc.).

    For example, to recreate the printout in /u/cmrun/$RUNID;

    • for a single month: pdE $RUNID JAN1987.acc$RUNID
    • for all Januaries: pdE $RUNID JAN*.acc$RUNID
    • for a whole year: pdE $RUNID *1987.acc$RUNID

    For pdE to work properly, the directory /u/cmrun/$RUNID has to exist and contain at least ${RUNID}ln ${RUNID}uln ${RUNID}.exe . The output files will end up in the PRESENT WORKING DIRECTORY which may be /u/cmrun/$RUNID or any other directory; names and order of the inputfiles are irrelevant (as long as the format of the files is compatible with the model ${RUNID}).

    It is possible to use pdE to average acc-files from several runs, e.g. average over an ensemble of runs. Although the numbers that are produced are fine, subroutine aPERIOD will not be able to create the proper labels: the runID will be taken from the last file that was read in and the number of runs averaged will be interpreted as successive years, so averaging years 1951-1955 of runs runA runB runC runD will produce the label ANN1951-1970.runD rather than ANN1951-1955.runA-D. Some titles will also suffer from that 'bug', but it should be easy to fix it manually afterwards, if anybody cares.

Note that the output can be controlled (a little) by the settings in 'Ipd' (which is created if it does not yet exist in the present working directory). A number of files will be created whose names contain the accumulated time period. (monyear[-year] where mon is a 3-letter acronym for a period of 1-12 consecutive months).

        monyear.PRT      the printout
        monyear.j$RUNID  zonal budget pages (ASCII Aplot format)
        monyear.jk$RUNID latitude-height binary file
        monyear.il$RUNID longitude-height binary file
        monyear.ij$RUNID lat-lon binary file
        monyear.wp$RUNID Wave power binary file
        monyear.oij$RUNID lat-lon binary file for ocean diagnostics
        monyear.ojl$RUNID lat-depth binary file for ocean diagnostics
        monyear.oht$RUNID lat ASCII file for ocean heat transports 

which can be read using the appropriate graphical software.

2) HOW-TO change what is included in the printout

The switches KDIAG which are set in the NAMELIST control which subgroups of diagnostics are calculated and output. At present, there is no simple way to pick and choose exactly which diagnostics appear, although we are working on a scheme to do just that. Some of the following options are rarely used, and may soon be made obsolete. These switches can also be used in 'Ipd' input for pdE to control which output to produce. If QDIAG=.true. some binary files are created accompanying the printed output. If QDIAG_ratios=.true. fields which are the products of q1 and q2 and whose titles are of the form "q1 x q2" are replaced by the ratios field/q2, the title being shortened to "q1" (default).

Current options are:

KDIAG(1)  < 9 print Zonal/Regional budget pages
   = 8 only print regional diagnostics
   2-7 only print global zonal means
   = 1 print all zonal diagnostics

KDIAG(2)  = 0 print Latitude-height diagnostics
          = 1 print at most fields listed in file Ijk
              (rearranging the lines of Ijk has no effect)
          = 8 Only print spectral analysis of standing eddies
          = 9 skip all Latitude-height diagnostics
          2-7 same as 1

KDIAG(3)  = 0 standard printout, all binary files if QDIAG=.true.
          = 1 ij-fields as in list Iij, all ijk-fields
          = 2 ij and ijk fields are handled according to list Iij
          = 7 ij-fields as in list Iij, no ijk-fields are produced
          = 8 full list Iij of available fields is produced; this list may
              be edited : don't touch lines starting with 'List' or 'list'
                          you may delete any other lines
                          you may rearrange the remaining ij-fields
                          you may add blank maplets
          = 9 no ij and ijk diagnostics are produced
          3-6 same as 2

KDIAG(4)  < 9 print time history table of energies
KDIAG(5)  < 9 print spectral analysis tables

KDIAG(6)  < 9 print diurnal cycle diagnostics at selected points (up to 4)
          1-4 print out for first 4-KDIAG(6) points (will soon be obsolete)
        -1 - -4 print out for -KDIAG(6) point only.

KDIAG(7)  < 9 print wave power tables

KDIAG(8)  < 9 print tracer diagnostics
          = 0 print all tracer diagnostics
          = 1 skip lat/lon tracer diagnostics
          = 2 skip tracer conservation diagnostics and lat/lon diagnostics

KDIAG(9)  < 9 print out conservation diagnostics
KDIAG(10) < 9 print out Longitude-height diagnostics
KDIAG(11) < 9 print out river runoff diagnostics
KDIAG(12) < 9 print out ocean diagnostics (if there is an ocean)

There is also one section of special lat-lon diagnostics which can be optionally accumulated and output. These are the isccp_diags which calculate some cloud properties as if they were being observed by satellite. This is set as an option in the rundeck (isccp_diags=1) to calculate them. By default these diags are not calculated.

3) HOW-TO produce daily/monthly or seasonal diagnostics

Controlling the frequency of output of the ACC files is done using the NAMELIST parameter NMONAV. This sets the number of months over which the diagnostics is accumulated. If it is set to 3, seasonal acc files are produced. If set to 12, you will get annual means. Higher numbers could be chosen, but are probably not very useful.

Producing more frequent diagnostics is controlled by the SUBDAILY module. The frequency of saving is controlled by the variable Nsubdd which is the number of hours between each snapshot (i.e. Nsubdd=1 implies hourly diags, =24 daily). The variables saved are controlled by the string SUBDD, which is a space seperated list of names for the required diagnostics. These are currently limited to instantaneous values of some standard fields (SAT, PS, PREC, SLP, QS, LCLD, MCLD, HCLD, PTRO, QLAT, QSEN, SWD, SWU, LWD, STX, STY, ICEF, SNOWD, TCLD, SST, SIT), the geopotential height, relative humidity and temperature on any (or all) of the following fixed pressure levels (if available): 1000, 850, 700, 500, 300, 100, 30 , 10, 3.4, 0.7, .16, .07, .03, and velocities on any model level. The fixed pressure level diags are set using Z850, R700 and T100, for instance, to request the 850 mb geopotential height, 700 mb relative humidity and 100 mb temperatures, respectively. Requests for ZALL or TALL, will produce individual files of all available levels. The velocities are set using U1, U5, U12, for instance, to get the 1st, 5th and 12th layer U velocities, respectively. If "ALL" levels are requested for a velocity, only one file per variable will be output, containing all requested levels. To limit the number of output levels, set the LmaxSUBDD parameter in the rundeck, and only levels L=1,LmaxSUBDD will be output.

More options can be added as cases in subroutine get_subdd in DIAG.f if required. For example, including the following string in the rundeck (i.e. SUBDD="SLP SAT Z500 R850 U5 V5" will produce diags of the sea level pressure, surface air temperature, 500mb geopotential heights, 850 mb level relative humidity and the 5th layer U, and V fields. Each timeseries will be output one month at a time in files named SLPmmmyyyy and SATmmmyyyy (for example) for each month and year. Restarts will find the appropriate file and continue to append records as would be expected.

You can easily keep track of the running average of all the printout diagnostics using the NIPRNT switch in the parameter database. By setting this to a positive number, the diagnostics will be printed out that number of hours. (i.e. setting NIPRNT=10, will print out the running average of the accumulated diagnostics at the end of each of the next ten hours).

4) HOW-TO control the format of the binary output (i.e. GISS/netcdf/hdf etc.)

The binary output created from 'pdE' can be in a number of formats. Currently there are two options, the traditional GISS formats, and netcdf output. This is set by choosing to compile the model with POUT (for GISS-style output) or POUT_netcdf (for netcdf output). This is set in the rundeck. Note that the choice is made AT THE TIME OF COMPILATION OF THE MODEL EXECTUABLE. If you subsequently change your mind, you must edit the rundeck and recreate the executable (using 'gmake exe RUN=E001xyz').

Other formats (HDF?) could be defined as you like with a suitable POUT_xyz.f file.

5) HOW-TO calculate a model score

There is an RMS.f program in the aux/ directory which calculates a 'score' for a model run based on the RMS difference with some selected observations. Additionally, it calculates the Arcsin Mielke score (AMS), which is a non-dimensional statistic (between -1,1 (1 being perfect)) which also includes effects of mean bias and variance. It is compiled whenever the 'gmake aux RUN=....' command is run. Like CMPE001 or qc, it is technically model dependent (since it uses im,jm etc.), but in practice you will not need to recompile it for every version. The executable will be put into the RUNNAME_bin directory (i.e. for gmake aux RUN=E001xyz, the executable will be E001xyz_bin/RMS). Running RMS on its own gives its usage.

You should produce diagnostic files over at least a five year mean. So for a particular run, use pdE to create the JAN and JUL averages

      cd E001xyz
      pdE E001xyz JAN195[1-5].accE001xyz    # => JAN1951-1955.ijE001xyz
      pdE E001xyz JAN195[1-5].accE001xyz    # => JUL1951-1955.ijE001xyz

Make sure that the TOPO file is linked:

      E001xyzln

and then run RMS

      ~/modelE/decks/E001xyz_bin/RMS E001xyz JAN1951-1955.ijE001xyz JUL1951-1955.ijE001xyz

This will create files 'E001xyz' in the directories /u/cmrun/modelE/RMS and /u/cmrun/modelE/AMS, which contain the RMS statistics and the Arcsin Mielke score for each selected field,respectively. You can then compare these scores with other existing versions.


Part III: The model code

1) HOW-TO find where a variable is defined

Variables (other than purely local variables) are defined in three main ways. Physical constants (real constants, not tuning parameters), are defined in const.f, model related variables (i.e. variables to do with how the model is run) are defined in MODEL_COM.f, and all other variables are defined in the physics MODULES. Variables that are on the model grid will be defined in the PHYS_COM module, while parameters and options for the local physics will be defined in the PHYS module. Thus decide to which physics module your variable is most related to and look there.

For more brute force efforts, run the command 'gmake htmldoc RUN=RUNNAME'. This will create a html document that indexes and defines all important variables (well not all, but most, ok some!). Alternatively grep -i "varname.*=" *.[fSU] will probably find it for you.

2) HOW-TO follow the program structure

The model is essentially on three levels. MAIN in MODELE.f is the top level and controls the time-stepping and deals with most of the model overhead. The time-step loop in MAIN is the principal path that the model follows. This is a very linear loop, and therefore is not difficult to understand.

The level below this is the level of the drivers. These are routines that translate model variables (ie. those defined on the model grid), to local variables (usually, but not always, vertical arrays), save diagnostics and save variables/fluxes for later use.

Finally the third level is the level at which the actual physics gets done. This level generally is oblivious to the previous two levels and is code that could be used in many different contexts. It is very important to keep this true. DO NOT write directly to model arrays within a local physics subroutine. (Some outstanding cases may appear to do that, but this is being written out of the model).

3) HOW-TO use or add to the automatic documentation

You will have noticed that modelE has multiple documentation directives such as !@var or !@sum. These are key words that can be read by a script and turned into useful documentation. Each subroutine, and each important variable should have such a declaration (which should include a definition, units, etc.). The command 'gmake htmldoc' makes html output from a rundeck containing (and sorting out) all this information. The various directives are all defined in the main model common block. If you introduce new variables or subroutines, you MUST define similar documentation.

Currently defined keywords and syntax are:

@sum     UNITNAME Brief summary/description of the current program unit or file
@auth    Author
@ver     Version (a number plus any relevant comments)
@calls   List of routines called in this program unit
@cont    List of program units contained within a file or module + other info.
@fun     FUNCNAME Denotes a function
@param   PARAMNAME Denotes a parameter (in the FORTRAN sense)
@var     VARNAME Denotes a variable (in the FORTRAN sense)
@dbparam VARNAME Denotes a database parameter (in the modelE sense)
@nlparam VARNAME Denotes a NAMELIST parameter (in the modelE sense)
@+       Continuation line for @sum/@calls/@cont

4) HOW-TO ensure that your code meets the new standards

Please check the document Coding standards for ModelE and try to conform as nearly as possible to the (not very onerous) conditions. Unfortunately, we do not have anybody whose job it is to make sure that added code conforms. However, every time this happens it makes it harder to keep the model consistent, and leads to the creation of unnecessary complications in keeping things straight.

5) HOW-TO add functionality

  • A: Variations on existing code

    Since all code is now kept within CVS, code history is preserved and can be retrieved at any time. So for minor modifications to the code (i.e. slightly different formulation of a particular term), go straight ahead and modify your version. Compare the results from that with a standard control run. As you extend your modifications, make sure to 'commit' your changes at appropriate points (on your branch only) so that you can retrieve the code from any intermediate (but functional) point.

  • B: Introducing new sub-modules

    If you want to add a completely new physical module or radically change an existing one, you will be best off creating a new 'plug and play' option. Follow the example of the ATURB code (which can replace DRYCNV if required). The issue is mainly one of keeping the interface clean and consistent. Unless what you propose will directly influence multiple parts of the model, it is best to keep your new code as modular (single column for instance) as possible, and only interact with the main code through the relevant drivers.

    In particular, if there is an existing call to an equivalent routine, keep the call statement the same for the old and new versions (and you can add dummy arguments to the old code if required). If you want to add a new call for which there is no analog, you may want to add a dummy routine for use in the old version (i.e as in RES_M12.f with regard to some of the stratospheric code). Put this in a file that you are making obsolete, or even create a new dummy file.

    Another way to do this is by using pre-processing directives set from the rundeck. This is not encouraged except where it would make inserting a new option much more straightforward. If it is used, limit its use to the absolute minimum (i.e. to decide whether to call the new routine or not). We specifically do not want multiply threaded options with any of the physics routines since that makes maintenance much more problematic.

    In any file where the directive is to used, there must be an

    #include "rundeck_opts.h"

    statement at the top of the file. The coding then follows standard practice:

    #ifdef NEW_FUNCTION
    call new_function(a,b,c)
    #endif

    To set this option, add a line to the preprocessing option part of the rundeck.

    Preprocessor Options
    #define NEW_FUNCTION ! do new function
    End Preprocessor Options

    The gmake command will recompile the relevant routines every time you change the option. Note that if options become too commonplace, the entire model will need to be recompiled every time. Therefore, our preferred solution is 'plug and play'.

    Note that for each new sub-module or pre-processor directive you create, you should modify the OPTIONS.html file to document the details.

  • C: Introducing a new parameter

    Introduce a new parameter if you would like to be able to vary it's value at run time without recompiling the model (i.e. an adjustable parameter, or an option switch for instance). Firstly, decide who should own the parameter. That is likely to be one of the physics modules. Since it is likely to be a fundamental part of a local physics process, it should be declared and defined in the local physics module. You MUST give it a default value, and ensure it is tagged with a '@dbparam' comment that explains what it does.

    In the initialisation for this module, you need to add a line

     call sync_param("NEWPARAM",newparam)
    

    which will set the value of newparam from the parameter database, if it has been set from the input rundeck, or from a restart. (see the FAQ for more details on how this works).

    In the rundeck you need only add to the &&PARAMETERS...&&END_PARAMETERS block a line like this:

    NEWPARAM=1        ! NEWPARAM sets something for XYZ
    

    Of course, you should use a more obvious name! If no value is set in the rundeck, the default value will be used.

  • D: Adding new diagnostics

    The diagnostic system that the GCM uses is very sophisticated and is almost certainly sufficient for any new need for diagnostics that arises. However, it can seem complex and difficult to use. As improvements in ease of use occur, it will get easier to modify.

    Adding a new conservation diagnostic: These arrays keep track of the zonal average of any quantity, and note which routines change those values. They are useful in understanding the overall mass/energy/tracer budgets. To add a new conservation quantity, first write a routine that returns the zonally averaged quantity in question. Secondly, in the 'init' subroutine for the relevant physical module, insert a call to 'SET_CON' this sets up the points at which the diagnostic is called, and how it is output. (see init_DIAG for more detailed instructions). Thirdly, you must increase NQUANT by 1, and KCON by 2+number of points at which you are saving the value. Setting the equivalent for tracer conservation diagnostics is similar to the above.

    Adding a new lat-lon diagnostic: These are the most common diagnostics to add and are relatively straightforward. There are three components to a new AIJ diagnostic: the declaration, the definition, and the accumulation. Everything else is done automatically. First of all, define a variable in DIAG_COM.f which will be an index for this diagnostic such as ij_xyz (where xyz is some clear mnemonic for your new diagnostic). Secondly, in DEFACC.f (subroutine ij_defs), add a definition:

          k=k+1 ! new diagnostic XYZ 
          ij_xyz = k
          lname_ij(k) = 'XYZ DIAGNOSTIC'
          units_ij(k) = 'XYZ_UNIT'
          name_ij(k) = 'XYZ'
          ia_ij(k) = ia_src
          scale_ij(k) = 1.
    

    The counter k ensures that each ij diagnostic has a unique index (which you don't need to know) which is set to the variable you previously defined. The lname and name are a long and short name strings which are used in TITLEs and 'get-by-name' code respectively. The units should be the units for the scaled output. The ia_ij index denotes how often the diagnostic is accumulated (in this case once every source time step, although this can very depending on where in the code it is done).

    The ij diagnostics can also be weighted (by surface type area, cloud amount, pressure etc.). This is signified by a string such as ' x POICE' in the long name of the array for a variable weighting, or by setting iw_ij(k) for a fixed weighting. Look in ij_defs for examples.

    The ij diagnostics can optionally appear in the ascii printout as well as the binary output. For this to happen, you must set a 'colorbar' for the pritnout by setting ir_ij to a suitable bar. Look at the LEGEND array in DIAG_COM.f for possible values. For the output to appear, you need to specifically ask for that to happen in DIAG_PRT.f. The array 'iord' in DIAGIJ defines all the ij printouts. You can either add to it (and make sure to increase the number of 'kmaplets' accordingly), or simply replace an existing maplet.

    Finally, put in the code that accumulates your chosen variable somewhere in the model. Make sure to USE the correct index from DAGCOM. Remember that all ij diagnostics will appear in the *.ijE001xyz file after using 'pdE'.

    Adding a new zonal 'budget' page diagnostic: These diagnostics store the zonal mean of various quantities and fluxes. The routine 'j_defs' in DEFACC.f controls their printout. The diags are output in the order they appear in the listing, with the exception of derived composite ratios which appear where specified in DIAG_PRT.f. If you want to add a simple scaled diag, define the index 'J_XYZ' in DIAG_COM.f, set up the meta-information in DEFACC.f, and accumulate it at the relevant point in the code. The output format for the values can be optionally changed in DIAG_PRT.f using the name you give the diagnostic. Ratios are a little more complicated.

    Other diagnostics will be explained if and when we work them out.

  • E: Adding extra timing information

    If you want more information on how long each part of the code takes (for instance if you put in some tracer code, or extra physics) the way to do it is to increase the number of timing variables such as are output by the program 'qc'.

    Initially, call the routine

           CALL SET_TIMER("NEW PHYSICS",MPHYS)
           
    Where the text is what will appear in the printout, and MPHYS is a locally defined integer variable. At the beginning of code you want to profile, add a call to TIMER(MNOW,MORIG) where MORIG refers to the timing counter current at this point (in order to zero the counter for your routine. At the end of your code, call TIMER(MNOW,MPHYS). That's it.

  • F: Coupling an ocean model

    Please read the coupling documentation for details of the fluxes, fields and entry points required for any ocean code.

9) HOW-TO read in an external file

ModelE has introduced a FILEMANAGER system that controls the assignment of unit numbers and the opening of files. To add a new file to be read in, first define a short mnemonic for the file (i.e. "TRSRC" for the tracer source file). This is the name that you will assign in the rundeck to the file you wish to open. Secondly, prior to reading in the file, use the routine 'openunit' to assign a unit number for the file using the mnemonic you gave it. ie.

     USE FILEMANAGER
     INTEGER iu_trc1
     ....
     call openunit("TRSRC",iu_trc1,.TRUE.,.TRUE.)
     READ(iu_trc1) DATA1,DATA2....
     call closeunit(iu_trc1)

The first logical variable argument is true if the file is to be opened for unformatted (binary) read/write, .FALSE. for ASCII input. The second logical variable denotes whether this is an already existing file or not. These arguments are optional, and default to TRUE. The line in the rundeck should look something like (depending on the filename of course):

TRSRC=tracer_sources.1970_1980.dat

The setup script looks automatically in /u/cmrun, /u/raid and in the current directory. If the file is elsewhere an absolute or relative pathname is needed.

10) HOW-TO add new variables to the restart files

Each module now has only one place that needs to be changed to accommodate extra variables in the rsf files: io_xyz (where xyz is a descriptor for the module like ocean, rad, lakes etc.). All input/output is controlled by these routines. Thus to add a new variable, simply add it to the list of variables for both the read and the write statements. Note that you MUST change the MODULE_HEADER character string if you change what is in the rsf (ie. increment the number by 1 each time you do this). This is necessary to ensure that we can distinguish various versions.

There are some complications for variables that can be set in the NAMELIST, namely, you must decide how this variable gets set as a function of the initialisation method. Look at io_rad as an example to follow.

11) HOW-TO access variables that are already calculated somewhere else

One of the main changes to modelE was to replace common blocks with Fortran 90 modules. The functionality is much the same, but since they do not assume anything about where in memory variables are kept, many of the problems associated with common blocks are alleviated. In order to make use of a variable from somewhere else in the code, that variable must be in the relevant MODULE (usually called PHYS_COM, for ease of remembering). Then in the routine of the module where you wish to use it you must specifically declare what you want using the USE statement. For clarity, we are insisting upon using 'USE, only' so that each variable that you require is explicitly declared, ie.

    USE PHYS_COM, only : var1,var2

If the name of the variable conflicts with a local variable, you can use the 'points-to' feature to map the PHYS_COM variable to a new name (as you may have done with common blocks), i.e.

    USE PHYS_COM, only : var1,var2_phys=>var2

The arrow should be read as 'points to', and thus var2_phys will be an alias for var2 in this routine/module only.

12) HOW-TO deal with the inevitable bugs you introduce

Nobody's code is perfect (least of all ours) and so when coding it is inevitable that errors are introduced. Some are minor and cause no obvious sign of distress (these are very difficult to find!), while others cause the model to crash and burn.

Let's say you've changed something in the model that seemed innocuous, and then the model crashes. There are three common points that seem to trap errors: the dynamics, the radiation and the PBL scheme. These errors will be flagged as either "A>1 in AADVTX" or similar (for crashes in the dynamics), "In Radia: TL out of range" or similar (in the radiation), and finally a "stop rtsafe", for a crash in the PBL.

Do not go looking for errors in any of these routines. What has generally happened is that you have introduced some wild fluctuation in some main variable (the temperature, humidity or velocities), or overwritten some the model variables that control advection and the like. Check that coming out of the routine you modified, everything is as it should be (set QCHECK in the rundeck, and include a call to CHECKT immediately before and after). Also do not bother commenting out the routine that actually crashed to see if the model can continue on okay. This routine is the symptom, not the disease.

One additional problem that arises mainly with tracer code is that the advection code can sometimes produce wild oscillations of the tracer concentration, which for sensitive tracer physics, can cause no end of problems. Again, the advection is not to blame! What has happened here is that the second order moments for the tracer field (the first and second derivatives at the centre of the grid-box) have somehow become uncoupled from the mean tracer amount, giving rise to an implied sub-grid scale profile that is seriously wrong. The biggest mistake is not to reduce the gradients by the same fraction by which the mean was reduced in some process. See the document Using the GISS GCM tracer moments for more details on how to do this properly. The CLOUDS.f tracer code is a good example of what to do.

13) HOW-TO deal with restart problems (non-reproducibility of results)

Occasionally after making coding changes the model ceases to be able to reproduce results. For instance, after a crash, a restart from the last fort.[12] file does not crash at all, or does so at a different point. This is a symptom of a restart problem. It should definitely be fixed, since it is clear evidence that the prognostic code has a real bug (one that could change the results).

These problems can arise in a number of circumstances:

  1. the rsf file is incomplete (a prognostic variable has not been saved). This is quite easy to spot and fix.
  2. a variable is not being properly initiallised. This is quite tricky to find. What happens is that the first time it is used, it is zero, but subsequently it is not. Thus at a restart it is set to zero again, changing the results. Try compiling with an option (-DEBUG:trap_uninitialized=ON on the SGI compiler) that sets all un-initialised numbers to NaN, therefore causing a crash if they are used.
  3. out of bounds memory addressing. Occasionally this can also result in a restart problem if the memory that is being overwritten is used prior to the over-writing. Thus the first time through it is clean, but subsequently has the value from the incorrectly addressed array. Unfortunately, you cannot set the compiler to catch this, since overwriting of arrays, is sometimes done on purpose in the code (though we are trying to eliminate that).
  4. parallelisation issues. This is a little trickier and is to do with the essentially random order in which calculations are done across the processors. For instance, if a random number is used inside a parallel loop, the results are irreproducible since random numbers are produced in sequence and thus are dependent on the order in which they are used.

Restart problems can be caught though. There are three times at which they occur, on the source time step cycle, on the radiation time-step cycle, or at the daily cycle. Occasionally problems can arise mid-month if any climatologies are being used. However, most problems can be detected within the first two hours.

Start from a any fort.[12] file. Save it, and call it rsf0. In the rundeck (the 'I' file in the run directory), set NDISK=1 in the database parameter list. This will produce an rsf file every source time step. Set the end time to be two hours from later than rsf0. Copy rsf0 to both fort.1 and fort.2. Run the model and save the two final rsf files as rsf1 and rsf2. Copy rsf1 to both fort.1 and fort.2, and run the model again. Save the new restart files as rsf1a and rsf2a.

Now you have two copies of the first and second hour rsf files. In order to compare them, use the program CMPE001 (compiled using 'gmake aux RUN=E001xyx').

CMPE001 rsf1 rsf1a

will go through the rsf files array by array and output any discrepancies. If no numbers are output, the arrays are identical. If the first hour rsf files are different, the error must be within INPUT (or one of the init_XYZ routines). Look at the relevant routine for the arrays that are highlighted. If however, the first hour checks out fine, but the second does not, then the error is in the main loop somewhere. Again, judge by the arrays that are affected which routine it is likely to be in. If neither set of rsf files differ, then you need to do this procedure again either starting at hour 4 (assuming a radiation related problem and a 5 hour radiation time step), or at hour 23 (assuming a daily related problem).

14) HOW-TO get an innovation included in everybody's versions

Since you are working on the code within CVS, it is possible to upload your changes to the main development branch. However, this is highly unrecommended for the average user, and so I am not going to tell you how to do it! Instead, please discuss it with one of the development team. If you wish to make a lot of changes and be part of the ongoing development of modelE, then please read the developers guide and then come and talk to me (gavin@giss.nasa.gov).

Appendix: Some Technical Notes

1) HOW-TO create and manage cvs revisions

Every so often when the code is at a relatively stable stage, a revision will be declared and users will have access to any fixes/new physics/ etc. up to that point. This is done using (in the working directory):

cvs tag modelE-x

However, it sometimes happens that bugs are discovered that affect an already frozen version, but that upgrading to a new frozen version is impractical for some reason. Thus we will make an offshoot of the previously frozen version that only contains the fixes. This is done by checking out the latest frozen version (as outlined in Part I). Use a branch name such as update-x. Once you are on the branch, make the fixes that you require and commit them. Then, declare a revision (with a suitable revision name). cvs tag modelE-x-1

Now people should be able to update to the fixed release with little or no problems using (in their working dir.):

cvs update -j modelE-x-1