GISS GCM ModelE
Note that this model documentation refers to the ca. 2004 version of GISS modelE. More up-to-date documentation will be available at the main ModelE page.
HOW-TO Document for the GISS GCM
This document outlines how to achieve some commonly sought results with the new modelE. Some of the procedures are the same as before, but in most cases the details have become slightly more straightforward (hopefully!). However, due to the modularization that has been done, you are no longer free to do things in an arbitrary way; see the notes below for more details. Please feel free to add to this document.
This is only part of the documentation; you can find additional information in the glossary of terms (not yet available), the Frequently Asked Questions (FAQ) file, the description of current functionality and options, or the index of all prognostic variables (generated using 'gmake htmldoc'; see below).
The document is set up in four parts: Part 0 deals with getting the code, Part I with running the model for basic users who do not need to alter the code, Part II with examining the diagnostics, and Part III with more technical issues involved in adding functionality or diagnostics.
PART 0: Getting the code and configuring the model
- HOW-TO get the code from the CVS repository
- HOW-TO configure modelE on your machine
- System requirements
PART I: Basic user information
- HOW-TO setup and run the GISS GCM
- HOW-TO set rundeck parameters
- How to tune a specified ocean run (using U00ice/HRMAX)
- How to set up a Q-flux model with a single ocean layer
- How to set up a Q-flux model with diffusion into a deep ocean
- How to use parameters to control numerical instabilities
- How to use IRANDI to create an ensemble of runs
- How to fall back to a smaller time step automatically
- How to choose which ISTART to use
- HOW-TO stop and restart the GCM
- HOW-TO change boundary conditions
- HOW-TO change trace gas amounts
- HOW-TO have a perpetual winter
- HOW-TO travel back/forward in time
- HOW-TO move to Mars
- HOW-TO move to Wonderland
- HOW-TO report a problem
- HOW-TO keep up-to-date
- HOW-TO move the model offsite
- HOW-TO add to this documentation
Part II: Getting information from the diagnostics
- HOW-TO look at the output
- HOW-TO change what is included in the printout
- HOW-TO produce daily/monthly or seasonal diagnostics
- HOW-TO control the format of the binary output (i.e. GISS/netcdf/hdf etc.)
- HOW-TO calculate a model score
Part III: Adding functionality or diagnostics
- HOW-TO find where a variable is defined
- HOW-TO follow the program structure
- HOW-TO use or add to the automatic documentation
- HOW-TO ensure that your code meets the new standards
- HOW-TO add functionality
- HOW-TO read in an external file
- HOW-TO add new variables to the restart files
- HOW-TO access variables that are already calculated somewhere else
- HOW-TO deal with the inevitable bugs you introduce
- HOW-TO deal with restart problems (non-reproducibility of results)
- HOW-TO get an innovation included in everybody's versions
Appendix: Some Technical details
Finally, CREDITS where CREDITS are due:
- For ongoing development (who to blame for the current shambles):
Gavin Schmidt, Reto Ruedy, Jean Lerner, Max Kelley, Igor Aleinov, Gary Russell, Ye Cheng, Andy Lacis, Valdar Oinas, M. S. Yao, Larissa Nazarenko, Nick Tausnev, Shan Sun, Greg Faluvegi, Susanne Bauer, Dorothy Koch, Nadine Bell, Jan Perlwitz
- Historical development (stuff we haven't rewritten yet):
Bob Suozzo, Suki Sheth, Hans-Peter Zinn, Rick Healy, Frank Abramopoulos, Greg Hartke
- Inspiration (the people paying for this):
David Rind, Jim Hansen, Tony Del Genio, NASA GSFC
Part 0: Getting the code and configuring the model
1) HOW-TO get the code from the CVS repository
The modelE code includes source code, scripts and some documentation. This is stored through the CVS revision control system, and copies can be checked-out to your home directory using the following commands:
export CVSROOT=/u/cmrun/cvs/repository
cvs checkout -r modelE1 modelE
This will create a directory modelE/ in your directory. The code
will correspond to modelE1, the frozen version for 2004. Please check
that this is actually the latest release. To place the model in a
different directory, use the -d dirname option. Depending
on your level of skill in using CVS, and whether you are a developer,
you may want to set up a private branch where you can store your own
changes and additions without interfering with the main development
process. Be careful here, because committing changes from your private
branch is difficult and error prone, and so if you anticipate
contributing to the group effort, you are advised to set up at least
one version that is on the main development tree (i.e. with no
branch). Branches are useful for keeping track of special code that
you need for particular sensitivity experiments you might want to
perform.
To set up a private branch:
cd modelE
cvs tag -b your_tag_here
cvs update -r your_tag_here
Please subscribe to the mailing list 'giss-gcm-users-l@giss.nasa.gov' by sending a message to Majordomo@giss.nasa.gov with a line
SUBSCRIBE giss-gcm-users-l
in the body of the text. This is to ensure that you are notified of
bug fixes and new releases. Also this is a forum for you to ask
questions about the model. If you are a developer, then subscribe to
the giss-gcm-develop-l
list. This will make sure that you are
notified of all changes to the code and any specific messages related
to the development process.
2) HOW-TO configure modelE on your machine
For modelE to work properly you have to create a configuration file
.modelErc
in your home directory. This file keeps some system-wide
information (such as your directory structure) which is common to all
modelE runs you do on the current machine. To create this file, go to the
directory modelE/decks
and execute:
gmake config
This will create the .modelErc file in your home directory. It will
contain default settings, which are adjusted for ra.giss.nasa.gov, so
if you are working on GISS ra or poncho then you are set. On all other
machines you should edit the file ~/.modelErc to describe your working
environment. .modelErc also allows you to set some personal
preferences, such as the email address to which notifications are sent
when the program stops, or whether you want verbose output during
compilation. It contains short descriptions of all configurable
parameters.
Here is the list of currently supported options (with their default values) which one can set in ~/.modelErc:
- DECKS_REPOSITORY - a directory for permanent storage of run info. All rundecks that you create will be copied to this directory. This directory exists mostly for archival purposes. (default: DECKS_REPOSITORY=/u/cmrun/modelE/decks)
- CMRUNDIR - directory to which all run directories will be linked. This directory will be searched by most scripts for the locations of specific runs. It can coincide with SAVEDISK. (default: CMRUNDIR=/u/cmrun)
- GCMSEARCHPATH - directory to search for GCM input files. All necessary input files should be copied or linked to this directory. (default: GCMSEARCHPATH=/u/cmrun)
- EXECDIR - path to the directory with modelE scripts. This directory should contain the scripts from modelE/exec. Auxiliary binaries which are not run-dependent should also be put into this directory. (default: EXECDIR=/u/exec)
- NETCDFHOME - path to the directory which contains the netcdf library. If netcdf is not installed on your computer you should leave it blank. modelE will work fine without netcdf, but you will not be able to compile some auxiliary binaries. (default: NETCDFHOME=/usr/local/netcdf-3.4/lib64)
- SAVEDISK - a directory where all run directories will be created. These run directories will contain all files written by the model during the run (i.e. rsf, acc, PRT etc.), so SAVEDISK should be big enough to accommodate all model output. (default: SAVEDISK=/raid1)
- MAILTO - email address of the user. When the program ends or crashes, all notifications will be sent to this address. If not specified, `whoami` will be used. (default: MAILTO=)
- OVERWRITE - if set to YES, it allows gmake to overwrite certain files (like rundecks in the repository) which by default are protected. It may be useful when you do a lot of testing and don't want to add OVERWRITE=YES on the command line all the time. In general it should always be set to NO; leave it at its default unless you know what you are doing. (default: OVERWRITE=NO)
- OUTPUT_TO_FILES - if set to YES, all errors and warnings will be sent to files with the names source_name.ERR. This is a user preference and can be set according to personal taste. (default: OUTPUT_TO_FILES=YES)
- VERBOSE_OUTPUT - if set to YES, gmake will show compilation commands and some other information while compilation is in progress. Otherwise most of the output will be suppressed. Set it according to your taste. (default: VERBOSE_OUTPUT=NO)
- MP - multiprocessing support. If set to YES, gmake will compile the code with OpenMP instructions. Don't set it to YES unless you are really planning to run the model on several processors, otherwise it is an unnecessary waste of resources (and your model can actually run slower on one CPU). Also keep in mind that one cannot combine OpenMP objects with a non-OpenMP linker, so if you change from MP=YES to MP=NO you have to recompile the entire model. This option currently has effect on SGI and Compaq only, but can be set to work for dual-processor Linux systems depending on the compiler. (default: MP=NO)
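For reference, a minimal ~/.modelErc edited for a hypothetical local machine might look like the following (the directory paths and email address here are placeholders, not recommended values):

```
# ~/.modelErc -- per-machine modelE configuration (example values only)
DECKS_REPOSITORY=/home/username/modelE/decks_repository
CMRUNDIR=/home/username/cmrun
GCMSEARCHPATH=/home/username/cmrun
EXECDIR=/home/username/exec
NETCDFHOME=                      # leave blank if netcdf is not installed
SAVEDISK=/scratch/username
MAILTO=username@example.gov
OVERWRITE=NO
OUTPUT_TO_FILES=YES
VERBOSE_OUTPUT=NO
MP=NO
```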
3) System requirements
The set-up code and automatic compilation rely heavily on
gmake
(version 3.79 or higher) and perl
(version
5.005 or higher). Both of these programs are public domain and
available for almost all platforms. Up-to-date versions are required
to be able to set up and compile the model.
The code requires a FORTRAN 90/95 compiler. It has been tested with the SGI, IBM, and COMPAQ/DEC compilers on their respective workstations. For Linux, Macs or PCs, the choice of compiler is wider, and we have not been able to test all possibilities. The Absoft ProFortran compiler works well, as do the Lahey/Fujitsu and Portland Group compilers. We have been unable (as yet) to get the model to compile with the Intel or VAST compilers due to internal errors in those products. Please let us know if you succeed (or more importantly, fail) in compiling the code using any other compiler.
Note that the parallelization used in the code is based on the OpenMP shared memory architecture. This is not appropriate for multi-processing on a distributed memory platform (such as a Beowulf cluster). For machines that share a number of processors per board, the OpenMP directives will work up to that number of processors. We are moving towards a domain decomposition/MPI approach (which will be clear if you look at the code), but this effort is not yet complete or functional.
Many GCM subroutines declare temporary arrays which, in FORTRAN, reside in
a part of the computer's memory known in computer parlance as the "stack."
It is essential that a user's stack size be large enough to accommodate these
arrays; a minimum value of 8192 kilobytes is recommended, with stack requirements
increasing with model resolution.
Default stack sizes vary among different computing environments, but
can be easily adjusted. Unfortunately this parameter must be
explicitly set by the user rather than by the GCM setup scripts.
Stack size is set by commands native to a user's $SHELL;
for example, a bash
user would type ulimit -s 32768
to get a stack
size of 32768 kb. Mysterious crashes in the radiation are quite often related to
a stack that is too small.
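As a purely illustrative aside (this is not part of the model scripts), one can check from Python whether the current soft stack limit meets the recommended minimum; `RLIM_INFINITY` means the stack is unlimited, which is always sufficient:

```python
import resource

MIN_STACK_KB = 8192  # recommended minimum from the text above

def stack_ok(soft_limit_bytes, min_kb=MIN_STACK_KB):
    """Return True if the soft stack limit (in bytes) meets min_kb kilobytes.

    resource.RLIM_INFINITY denotes an unlimited stack, which always passes.
    """
    if soft_limit_bytes == resource.RLIM_INFINITY:
        return True
    return soft_limit_bytes >= min_kb * 1024

# Query the current soft/hard limits of this process and report
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("stack limit sufficient:", stack_ok(soft))
```

Remember that the limit must be raised in the shell (e.g. `ulimit -s 32768`) before the model is started; a child process cannot raise it for its parent.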
Part I: Basic user tasks
1) HOW-TO setup and run the GISS GCM
In order to run the GCM a 'rundeck' file (with a suffix .R) must be created. This contains information about the run you would like to perform, the fortran code that you intend to use, pre-processing options that are required, the boundary and initial conditions and any run-specific parameters (such as the time step, length of run etc.). Most of the specifics have been automated using 'gmake'.
The sub-directory 'decks' is the main control directory, most commands will be issued in this directory. Typing 'gmake' will produce some documentation.
There are a number of sample rundecks in the model/ directory (E001*.R). These represent runs with the standard fixed SST atmosphere only run (E001.R), a Qflux run (E001q.R), a coupled ocean-atmosphere run (E001o.R), a varying SST (AMIP style) run (E001a.R), a Qflux with deep diffusion (E001qd.R), a stratospheric model (E001M23.R), a tracer model (E001tr.R) etc. These can be used as templates for any runs you do.
- How to select a rundeck name
In order to make a rundeck that you can use and modify, you must first select a unique name for your run. Note that there is a 16 character limit on the length of the rundeck name due to limitations in the batch processing system we normally use. There is a central database for storing users' rundecks (along with other pertinent information such as the time of compilation, the compiler version, and the current CVS branch) in ra:/u/cmrun/modelE/decks (configurable in ~/.modelErc). This allows us to tell automatically whether a rundeck name has already been used. However, in order to minimise any confusion, we propose the following naming convention:
- Each rundeck should start with 'E' (to distinguish it from other models in the building).
- The next three characters should be a counter (starting at '001') which the individual user will update as they perform successive runs.
- The next part consists of the user's initials (or other unique identifier).
- If the user is experimenting with variable resolutions, it may be helpful to add 'M12' or 'M23' or 'F32' etc. to the end of the name to distinguish between the Medium resolution 12 or 23 layer model, or the Fine resolution, 32 layer model etc.
Note that this implies that there would not necessarily be any connection between runs E034xxx and E034yyy, since the numbers now refer to the run sequence, rather than the model version sequence.
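The naming convention above can be sketched as a small check. The regular expression here is an illustrative reading of the convention, not a rule enforced by the modelE scripts:

```python
import re

# 'E' + three-digit counter + initials + optional resolution tag (M12, M23, F32, ...).
# This pattern is an illustrative reading of the convention, not enforced by gmake.
RUN_NAME = re.compile(r"^E\d{3}[A-Za-z]+(?:[MF]\d{2})?$")

def valid_run_name(name):
    """Check the proposed rundeck naming convention and the 16-character limit."""
    return len(name) <= 16 and RUN_NAME.match(name) is not None

print(valid_run_name("E001xyz"))     # plain run name
print(valid_run_name("E001xyzM23"))  # with a resolution tag
print(valid_run_name("my_run_E001"))  # wrong form
```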
- How to compile the model
Once a rundeck name has been chosen (for example E001xyz), the rundeck will be created using
gmake rundeck RUN=E001xyz
This will make a rundeck E001xyz.R, that will be a copy of E001.R. To make a copy based on another of the sample decks, use, for instance,
gmake rundeck RUN=E001xyz RUNSRC=E001q
A copy of the new rundeck will be stored in the rundeck depository on ra. If there is a clash with an existing rundeck name (including perhaps one of your previous runs), there will be an error message. If you want to insist on overwriting a previous version say, then give the command
gmake rundeck RUN=E001xyz OVERWRITE=YES
If you create a rundeck in any other fashion, i.e. by directly copying one, there is no chance to verify its uniqueness. However, a secondary check is made when you attempt a 'setup' (see below), which should catch any inadvertent duplication.
When making a new rundeck, please add a documentation line, and edit the label to reflect your model experiment!
The list of fortran files within the rundeck determines which files are compiled into the executable. The compilation is automatic, and will take account of all dependencies, so that if a file is modified it will be re-compiled, along with any files that directly or indirectly depend on modules or routines within it. The executable is made using
gmake gcm RUN=E001xyz
To actually run the model, you must first 'setup' the run by running the first hour (or specifically the first source time step):
gmake setup RUN=E001xyz
If this runs successfully there will be a message saying so. Likewise if there is a problem.
- How to interact with CVS to make sure
the run is reproducible
It is of some importance that runs be reproducible, either by other people, or later on by the users themselves. With version control now using the CVS system, CVS MUST know about any code variations that make it into any production run (i.e. one that is not simply a compilation test or debugging exercise). We know that test runs and production runs are not necessarily easily distinguishable a priori (i.e. a run is a test run until it works, at which point it retroactively turns out to have been the production run). So the best way to ensure that the relevant information is saved is through frequent 'commits' to the CVS repository of all substantial changes you make to the code. By this we mean at minimum all changes that have effects on the prognostic variables (as opposed to the diagnostic printout).
You can also define tags within the CVS tree that can be personal to you, which just mark a particular set of code as a branch point, which can then be retrieved at any point.
There is no cost to this procedure (although you should be careful to ensure that code at least compiles before any commits are made.)
At the 'setup' stage (or at the 'exe' stage if you are recreating an executable file), further information is added to the rundeck depository, indicating any changes to the rundeck since it was made, along with full CVS information on what was actually compiled. To retrieve the exact code, it is sufficient to know the time the run was compiled and the CVS branch that the user was on.
- How to run the model
Once a successful 'setup' is done, use the command
runE E001xyz
to set the model running for the specified length of the run.
If you need to stop the run at any point, use the 'sswE E001xyz' command to do it in a controlled fashion (ensuring that all variables are saved correctly, and that no printout gets stuck in a buffer). Using 'runE E001xyz' subsequently picks up from where you left off.
If you would like to update the executable (putting in new printout, or fixing an error etc.) without starting from 'setup' all over again, you can use
gmake exe RUN=E001xyz
which will recompile the fortran, and replace the executable file in the run directory. Then 'runE E001xyz' will pick up from where you stopped previously, but with the new code. Any changes to the CVS information between the initial setup and the new code will be recorded in the central rundeck repository.
- How to run the model on multiple
processors
ModelE uses the OpenMP application program interface. It consists of a set of instructions (starting with C$OMP) which tell the compiler how to parallelize the code. To be able to run the model on multiple processors, one has to compile it with OpenMP enabled. This is done by appending MP=YES to the compile line, i.e.
gmake gcm RUN=E001xyz MP=YES
It is important to keep in mind that one can't mix OpenMP objects with a non-OpenMP compilation. This means that one has to do
gmake vclean
when switching from OpenMP to non-OpenMP compilation. The option MP=YES can be set in the ~/.modelErc file (see Part 0.2); in that case one can skip it on the command line.
In the setup stage, the number of processors is defined by the environment variable $MP_SET_NUMTHREADS. When using runE, everything works as for the non-OpenMP model, except that one has to specify the number of processors as a second argument to runE. For instance, to run the model E001xyz on 4 processors, one starts it with
runE E001xyz 4
On batch systems (such as halem or charney), the number of processors is handled more automatically; currently only four can be used (since that is the number of processors per shared-memory node), so no additional argument is needed. Slightly different run commands are available on those systems, related to the different possible queues.
- How to specify individual compilation options for certain files
In some circumstances you may need to specify individual compilation options for certain files. For example, you may want to compile some files with debugging information, or with REAL*8 for default reals etc. This can be done inside the rundeck by adding the necessary options between vertical bars (|...|) right after the name of the file. All options specified between the bars will be appended to the compile line during compilation. For instance, the following line in the rundeck
GHY_COM GHY_DRV|-g| GHY|-g -r8|
will tell gmake to compile GHY_COM.f with default options, compile GHY_DRV.f with debugging information, and compile GHY.f with debugging information and with default reals set to REAL*8.
One should keep in mind that changing an option inside the rundeck doesn't force automatic recompilation. One needs to 'touch' the corresponding source files to tell gmake that they need to be recompiled.
2) HOW-TO set rundeck parameters
There are 2 types of parameters that may be set in the rundeck: DATA BASE (dp) parameters and NAME LIST (nl) parameters. The dp-params are set between &&PARAMETERS and &&END_PARAMETERS. The nl-params are set between &INPUTZ and &END.
The dp-params are saved in record 3 of the save files (acc, rsf). They are tuning or control parameters.
The nl-params are not saved, except for IYEAR1 (which is saved in record 1). Most of the others set the start and stop dates for the run ((YEAR/MONTH/DATE/HOUR)I or E, or IHOURE) or determine the (debug) printout (KDIAG, QDIAG, IWRITE, ..., QCHECK). ISTART determines whether this is an initial start or a restart; IRANDI may be used to perturb the current temperatures. See NL-PARAMS for more information.
When a run starts, parameters not set in the rundeck take their default values. On restarts, YEAR/MONTH/DATE/HOURI are ignored; all other parameters are determined first by the rundeck, then by the restart file if they were saved, else use the defaults. So the priorities are: 1. rundeck, 2. restart file, 3. defaults.
To start a run, the following parameters HAVE to be explicitly set:
ISTART = 2 | 4 5 6 7 or 8, to start from observed | model Initial State
YEARI = year in which the model starts
YEARE (or IHOURE) = year (or hour) at which the model stops
To change parameters after a run was started, do the following:
1. Edit the file 'I' in the run directory.
2. Stop and restart the run.
3. Re-edit 'I' to ready it for the next restart.
(If the change affects the results, attach a log of all such changes to the end of the rundeck.)
A: How to TUNE a specified ocean run (using U00ice/HRMAX)
It is desirable to run a specified ocean model in such a way that it is in RADIATIVE BALANCE, i.e. that the net flux of energy into the ocean (which has no effect, since ocean data are specified) is zero if averaged over a whole year. One way to roughly achieve this goal is to pick the right values for U00ice/U00wtrX/HRMAX. These parameters influence the cloud formation where the temperature is below/above 0C and the cloud formation in the boundary layer respectively, and can therefore be used to change the radiative balance.
Run the model for a year and find the annual mean net heating at the ground (GLOBAL: NET HEAT AT Z0, or OCEAN: NET HEAT Z0+RVR on the budget pages). That quantity can vary from year to year within .5 W/m2. Increasing U00ice by .01 decreases the cloud cover and planetary albedo. Increasing U00wtrX to be larger than 1 reduces cloud cover in the low latitudes. HRMAX can vary between 200 and 1000 m. Increasing HRMAX increases the cloud cover in the boundary layer, which has a warming effect. Note that HRMAX is only effective if the do_BLU00 option is set (which it is not by default). Rerun that year with the new values and hope that the annual mean net heating at the ground averages out to less than .3 W/m2 in absolute value over the next few years.
Other, obsolete, ways to handle the radiative balance problem: change the solar constant until radiative balance results (most people hate that way), or change the solar irradiance into the ocean when running the Qflux model (using the old SRCOR parameter) to offset the imbalance (advantage: no need to repeat the control run; disadvantage: Jim hated it; no longer part of the code).
B: How to set up a Q-flux model with a single ocean layer
To run the Q-flux model it is necessary to supply the implied horizontal heat transports and a proper initial condition as input files. To calculate the horizontal ocean heat transports it is necessary to run the model with prescribed ocean mixed layer temperature and sea ice extent for 5-6 years. It is desirable to have this control run well balanced (Net Heat at z0 is supposed to be close to 0 in the annual mean; this Net Heat at z0 can be found on the first budget page for the global quantities). To get a balanced control run, you can adjust U00ice/HRMAX (see section A above). In your control rundecks, you have to set Kvflxo=1; then the model will save the data needed for the heat transport calculation. Note that if you would like to have dynamic ice in the Q-flux model, you must include the dynamic ice modules in the climatological SST run, otherwise the implied fluxes will be wrong.
Based on the saved VFLXOmmmyyyy data, a new input file (OTSPEC...) has to be created that specifies the horizontal heat transports in the ocean; in addition, the ocean temperatures at the annual maximum mixed layer depth and the mean temperature between that level and the bottom of the current mixed layer depth have to be computed for the final time of the preliminary run and added to the final restart file of that preliminary run. That augmented restart file will serve as initial state file for the new run.
For the ocean heat transport calculation, it is necessary to skip the first year of this control run because of the spin-up of the atmosphere. For this calculation you need to do the following (suppose your control run is RunIDcontrol):
- Go to the ../decks directory and do gmake auxqflux RUN=RunIDcontrol. This will create executables for the ocean heat transport calculation and put them in the directory ../modelE/decks/RunIDcontrol_bin.
- Go to the ../aux directory and run the script
mkOTSPEC.E001.M250D RunIDcontrol first_month first_year last_month last_year
where the months and years are integers and span the months that you wish to be used for the heat transport calculation. Usually first_month=1, first_year=1951, last_month=12, last_year=1955 if your control run went for 6 years starting Jan. 1, 1950. Note that the surface heat imbalance number is the ratio of the incoming energy to the outgoing; a number less than one denotes a net heating of the ocean. If the number is within 1% of unity, there is no need to worry.
- After this you will get two files (in the original run directory) which you will use as input data for your Q-flux model:
- OTSPEC.RunIDcontrol.M250D.first_year-last_year is the implied ocean heat transport data. In your Q-flux rundeck the input file OHT has to have this name, in other words OHT=OTSPEC.RunIDcontrol.M250D.first_year-last_year. You should move this file to the /u/cmrun directory.
- IMJAN.RunIDcontrol.O250D is the modified restart file for the Q-flux model. It is basically the same restart as your control model restart, but it also has corrected temperatures of the ocean which were not used in the control run. These are the mean temperature between the mixed layer and the annual maximal mixed layer depth (TOCEAN(2,im,jm)) and the temperature at the annual maximal mixed layer depth (TOCEAN(3,im,jm)). You have to specify AIC=IMJAN.RunIDcontrol.O250D in your Q-flux rundeck, and move this file to the /u/cmrun directory as well.
- In your aux directory you will also get the line-plot file ohtRunIDcontrol.first_year-last_year.lpl, which you can plot and compare with observational data if you like. You will also get a few PRT files, which you can remove.
In your Q-flux rundeck:
- specify KOCEAN=1;
- make sure that U00ice/HRMAX are the same as in your control run (then your Q-flux model will be as well balanced as your control run);
- specify AIC and OHT as described above.
During the first few years of the q-flux run, monitor the time series of global annual means of Ocean Ice Cover and Surface Air Temperature. Hopefully, they will no longer change after a few years (10-25). The model is said to be in balance as soon as that happens. If this is not the case, it may be possible to adjust some of the ice-related parameters to improve the sea ice distribution and hence the stability. Two parameters can be used 'oi_ustar0' and 'silmfac'. These control the basal and lateral melt of sea ice. Since they are not used in a fixed SST/sea ice run they can be used to tune the q-flux model. Increasing 'oi_ustar0' (default=5d-3) reduces the temperature difference between the ocean and sea ice and increases the basal heat flux, decreasing ice thickness and ice extent. Increasing 'silmfac' (default=1.4d-8) increases lateral melt and therefore ice concentration. Note that 'oi_ustar0' is only effective with non-dynamic ice.
C: How to set up a Q-flux model with diffusion into a deep ocean
Such a run needs a preliminary Q-flux model running for 10 years in equilibrium, saving (KCOPY=3) the data necessary for diffusion computations. A new file has to be created ('TG3M' in the rundeck) using 'gmake auxdeep RUN=E001xyz' and then running 'mkdeep *.oda.*'. This creates the initial conditions and climatology at the base of the mixed layer. Use E001qd.R as a rundeck template.
The deep ocean layers 2-LMOM (currently LMOM=12) start at the annual maximal mixed layer depth. The thickness of layer 2 (DZ2) is 10 m, and the layer thicknesses increase geometrically such that DZ1X + DZ2 + DZ3 + ... + DZ(LMOM) = 5000 m; i.e. DZ1X = 10/Q, DZ3 = 10*Q, DZ4 = DZ3*Q, etc., with Q = 1.701... DZ1 is the annual maximum mixed layer depth, and DZ1X itself is not used. The dzo's used to compute the vertical gradients are given by dzo1 = 10/sqrt(Q), dzo2 = dzo1*Q, dzo3 = dzo2*Q, etc.
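The geometric spacing can be reproduced numerically by solving for the ratio Q that makes DZ1X + DZ2 + ... + DZ12 sum to 5000 m, with DZ1X = 10/Q, DZ2 = 10 m and DZ(k) = 10*Q^(k-2) for the deeper layers. This is a sketch under a literal reading of the formula above; the actual code's layer convention may differ in detail, so the Q found here need not agree with the quoted 1.701... to all digits:

```python
LMOM = 12          # number of ocean layers
DZ2_M = 10.0       # thickness of layer 2 in metres
TOTAL_M = 5000.0   # nominal total depth

def total_depth(q):
    """DZ1X + DZ2 + DZ3 + ... + DZ(LMOM) for a given geometric ratio q."""
    dz1x = DZ2_M / q
    # layers 2..LMOM: DZ(k) = 10 * q**(k-2)
    return dz1x + sum(DZ2_M * q ** (k - 2) for k in range(2, LMOM + 1))

# Bisection: total_depth(q) is increasing in q over this bracket
lo, hi = 1.1, 2.5
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if total_depth(mid) < TOTAL_M:
        lo = mid
    else:
        hi = mid
Q = 0.5 * (lo + hi)
print("Q =", round(Q, 4))
```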
The diffusion into the deep ocean is computed once every day using the temperature anomaly with respect to 'climatology' as tracer. The climatology is found by taking a 10-year mean of the Qflux model without a deep ocean. That run is also used to initialize the deep ocean run.
More detailed description:
T1 = temperature of the mixed layer (C) = TOCEAN(1)
T2 = temperature between the mixed layer and the annual max mixed layer depth = TOCEAN(2)
T3 = temperature at the annual maximal mixed layer depth = TOCEAN(3)
RTGO(L=1->LMOM) is set to 0 at the beginning of the run.
RTGO(L=2->LMOM) = temperature anomaly for deep ocean layer L.
RTGO(1) = 12-month mean anomaly of T3 with respect to climatology; i.e. before the diffusion computation it is reset to the difference dT3ann = "mean_T3_over_last_12_full_months - 10year_mean_of_prel.run", and dT3ann is kept constant throughout the month. The diffusion computation changes RTGO(1); that change is added to T1, T2, T3. Before the next diffusion computation, RTGO(1) is reset to dT3ann.
The heat flux at the interface of layers L and L+1 is given by EDO*(RTGO(L)-RTGO(L+1))/dzo(L), L=1,...,8; however, a semi-implicit scheme is used to compute these fluxes. The following auxiliary arrays are used to carry out this scheme:
- STG3 is set to 0 at the beginning of each month; at the end of each day T3 is added, i.e. STG3 = SumD_T3 (summed over days).
- TG3M(M) initially contains a 10-year mean of SumD_T3 for month M (the sum is taken over each day of month M, not averaged) from the Q-flux run without deep ocean that is used to start up the current run; at the end of month M it is set to STG3. From month 13 on, TG3M contains SumD_T3 for each of the last 12 months.
- DTG3 is set to 0 initially and is updated at the end of month M by adding (STG3 - TG3M(M)). Then STG3 overwrites TG3M(M), so that next year this STG3 will be subtracted and cancels out. After 1 year, DTG3 = (SumD_T3_for_yr1 - SumM orig.TG3M); after that, the oldest month is subtracted and the newest added, hence DTG3 = (SumD_T3 over the last 12 months - SumM orig.TG3M).
So ADTG3 = DTG3/365 = (12-month running mean of T3) - T3clim; this is used to initialize RTGO(1) before the diffusion.
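The STG3/TG3M/DTG3 bookkeeping amounts to maintaining a 12-month running mean of T3 relative to climatology. A stand-alone sketch, using synthetic data; the variable names mirror the description above, not the actual Fortran:

```python
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # ignoring leap years

def run_anomaly(daily_t3_by_month, tg3m_clim):
    """Feed one year of daily T3 values (a list of 12 lists, one per month)
    through the STG3/TG3M/DTG3 bookkeeping and return ADTG3 = DTG3/365.

    tg3m_clim[m] is the climatological sum of daily T3 over month m,
    playing the role of the initial TG3M array.
    """
    tg3m = list(tg3m_clim)
    dtg3 = 0.0
    for m in range(12):
        stg3 = sum(daily_t3_by_month[m])  # STG3: T3 summed over the month's days
        dtg3 += stg3 - tg3m[m]            # DTG3 updated with the monthly anomaly
        tg3m[m] = stg3                    # STG3 overwrites TG3M(m) for next year
    return dtg3 / 365.0                   # ADTG3: 12-month running mean anomaly

# Synthetic check: a constant +0.5 C offset from climatology should give ADTG3 = 0.5
clim_daily = 10.0
tg3m_clim = [clim_daily * d for d in DAYS_IN_MONTH]
year = [[clim_daily + 0.5] * d for d in DAYS_IN_MONTH]
print(run_anomaly(year, tg3m_clim))  # -> 0.5
```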
Compared to the Qflux model without deep ocean, the model with deep ocean needs one extra input file (EDDY = diffusivity coefficients) and uses the ODEEP module instead of OCNML. If starting from a restart file from the qflux run, you must use ISTART=4.
D: How to use parameters to control numerical instabilities
Parameter   affects:                   depends on:
DT          stability at poles         horizontal resolution, AVRX
NIsurf      stability at the surface   vertical resolution near the surface
XcdLM       stability at the top       model top pressure PMTOP
M/NFILTR    checkerboard patterns      dynamics scheme
E: How to use IRANDI to create an ensemble of runs
If ISTART<10 and IRANDI/=0, all atmospheric initial temperatures are randomly perturbed by at most 1 C. The perturbations depend on the value of IRANDI.
So to create an ensemble of N runs whose only difference is the initial state they start up from, pick N (odd) numbers and create N copies of the rundeck that only differ in the runID and the value set for IRANDI.
It is good practice to put IRANDI on the same line as ISTART so it will not affect later extensions or reruns, since it is then automatically removed from the file 'I'.
F: How to fall back to a smaller time step automatically
The atmospheric dynamics code uses a fixed time step defined by the rundeck parameter DT. If the program stops because of an atmospheric instability caused by DT being too big, one has to restart it with a smaller time step, then stop and restart it again with the normal time step.
This process can be done automatically. To set the model for an automatic restart one has to add to the rundeck a line
DTFIX=new_time_step
after the label but before the &&PARAMETERS instruction. If such a line is present and the model stops because DT is too big, the model will restart with DT=new_time_step, compute 48 hours of model time, then stop and restart with the normal time step. The corresponding .PRT files will be saved.
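For example, a rundeck modified for automatic fallback might begin as follows (a sketch only: the run label and the 225-second value are illustrative, not recommendations):

```
E001xyz (runID and description label)
DTFIX=225.
...
&&PARAMETERS
```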
F: How to choose which ISTART to use
ISTART controls how the model first picks up the initial conditions to start the run. We have a number of options depending on how much information is already available.
- ISTART=1: Default start. This sets atmospheric variables to constants and requires input files for ground values (a GIC file), and ocean values (OIC) if required.
- ISTART=2: Observed start. This sets atmospheric values to observations, based on a particular format of AIC file. As for ISTART=1, input files are required for ground and ocean variables.
- ISTART=3: Not used.
- ISTART=4: A restart from an rsf file from a previous run, but the ocean is reinitialised. Needs an initial OIC file (for fully coupled models).
- ISTART=5: A restart from an rsf file from a previous run, but no tracers. This is only useful for tracer runs that need to be initialised with a particular model state.
- ISTART=6: A restart from an rsf file from a previous run that might not have had the same land-ocean mask. This makes sure to reset snow values, pbl values and ocean values accordingly.
- ISTART=7: A restart from an rsf file from a previous run with the same land-ocean mask. This still makes sure to set snow values and ocean values. This is used mainly for converted model II' data.
- ISTART=8: This is a restart from a model configuration identical to the run now starting. This is for perturbation experiments etc.
- ISTART=9: This is a restart from this model run (i.e. a continuation, or a backtrack to an older rsf file).
- ISTART=10: This is used internally to pick up from the instantaneous rsf file (the later of fort.1 and fort.2). Does not ever need to be set in the rundeck.
- ISTART=11: This is used internally to pick up from an instantaneous rsf file (fort.1). Only rarely used.
- ISTART=12: This is used internally to pick up from an instantaneous rsf file (fort.2). Only rarely used.
- ISTART=13: This is used internally to pick up from the instantaneous rsf file (the earlier of fort.1 and fort.2). Only rarely used.
- ISTART<0: This option is used by the post-processing program to run the model to generate nice diagnostics. This should never need to be set manually.
3) HOW-TO stop and restart the GCM
Use the command 'sswE E001xyz' to stop the model run in a controlled fashion. Restart it with the command 'runE E001xyz'. If you would like the run-time printout to be suppressed (to save filespace) use the command 'runE -q E001xyz'. The monthly/seasonal/annual printout can always be recreated if needed.
4) HOW-TO change boundary conditions
Simply alter the file declarations in the rundeck appropriately!
- HOW-TO alter the topography file
consistently
The topography file (TOPO) in the rundeck must have a degree of internal consistency. Therefore, if changes are made to the ocean mask (FOCEAN), for instance, compensating changes are required in the other fields (FLAKE, FGRND, FGICE, etc.). Note that in the model FGRND=FEARTH and FLICE=FGICE, for historical reasons not worth going into. This file is made up of 9 fields (in the standard TITLE*80, DATA(IM,JM) format):
- FOCEAN fraction of ocean
- FLAKE fraction of lake
- FGRND (or FEARTH) fraction of non-glaciated land
- FGICE (or FLICE) fraction of glaciated land
- ZATMO mean topography (m)
- HOCEAN depth of ocean (m) (not used except for GISS ocean)
- HLAKE depth of lake (m)
- HGICE depth of glacier (m)(not yet used)
- ZSOLID not sure! (not used except for GISS ocean)
IF (FOCEAN(I,J).gt.0) THEN
  FLAKE(I,J)=0.  ! lake and ocean cannot co-exist at the same grid point
  HLAKE(I,J)=0.
  FLAND=1.-FOCEAN(I,J)
ELSE
  FLAND=1.-FLAKE(I,J)
  HLAKE(I,J)=MAX(1.,HLAKE(I,J))
END IF
FGRND(I,J)=MAX(0.,FLAND-FGICE(I,J)) ! land ice fraction takes precedence over ground
FGICE(I,J)=FLAND-FGRND(I,J)
If land boxes are added close to the Antarctica or Greenland ice sheets, FGICE should be set to FLAND (i.e. 1-FOCEAN), FGRND, FLAKE, and HLAKE to zero. Note as well that the polar boxes must be homogeneous (i.e. FOCEAN(2:IM,JM)=FOCEAN(1,JM) etc.)
Note that if FLAKE(I,J)>0, then HLAKE must be greater than 1.
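The consistency rules above can be summarized in a small sanity check. This is a sketch, not model code; the function name and tolerance are ours:

```python
def check_topo_cell(focean, flake, fgrnd, fgice, hlake, tol=1e-6):
    """Sanity checks for one TOPO grid cell, following the rules above."""
    # the four surface fractions must cover the cell exactly
    assert abs(focean + flake + fgrnd + fgice - 1.0) < tol
    # lake and ocean cannot co-exist at the same grid point
    assert focean == 0.0 or flake == 0.0
    # any lake present must be at least 1 m deep
    assert flake == 0.0 or hlake >= 1.0
```

Running such a check over every (I,J) of an edited TOPO file before starting a run can save a crash at startup.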
- HOW-TO run the model with a new land mask
If you changed the topography file you may have created new land cells. These cells are missing both boundary conditions and prognostic variables related to the land surface model. One has to extend the land surface data to these cells before the model can run.
To remedy this problem we created a set of "extended" boundary condition files. These files are the same as the original ones except that the data was spread over the ocean cells using a nearest-neighbor algorithm, so that each cell contains some reasonable land surface data. If you are working on Ra, these files are at the same location and have the same names as the originals, with the suffix '.ext' appended. Here is the list of the files which needed extension, as they would appear in the rundeck:

CDN=CD4X500S.ext                  ! - roughness length
RVR=RD4X525.RVR.ext               ! - river routing
SOIL=S4X50093.ext                 ! - soil data
VEG=V72X46.1.cor.ext              ! - vegetation data
TOP_INDEX=top_index_72x46.ij.ext  ! - topographic index
The above files provide boundary conditions. One also needs to use a compatible restart file with extended prognostic variables, currently:

AIC=DEC1958.rsfB394M12.modelE.17.ext  ! - initial conditions
One should keep in mind that the data should be extended in a compatible way. At least the data in SOIL and AIC should be extended from the same "nearest neighbors". To preserve the information about this "extension" process, the file conv_index.ext was created (located on Ra in /u/cmrun). It contains the coordinates of the "nearest neighbors" used when extending the data. More precisely, the data array is extended as follows:

integer :: conv_index(im,jm,2)
open(1,file='conv_index.ext',form='unformatted')
read(1) conv_index
do j=1,jm
  do i=1,im
    if ( conv_index(i,j,1) .ne. 0 ) then
      i0 = conv_index(i,j,1)
      j0 = conv_index(i,j,2)
      array(i,j) = array(i0,j0)
    endif
  end do
end do
One can use this algorithm to extend any land surface data if necessary. To create a new extended AIC file with conv_rsf, one has to pass conv_index.ext to it as a third parameter. For the current setup on Ra this will look like:

conv_rsf /u/cmrun/DEC1958.rsfB394M12 new_rsf_file /u/cmrun/conv_index.ext
To summarize: when using a non-standard land mask one has to replace the above-mentioned files in the rundeck with their '.ext' equivalents. One also has to use ISTART=6 (instead of ISTART=7) to force regeneration of PBL data.
5) HOW-TO change trace gas amounts
Trace gas amounts can be set in two ways. Firstly from a scenario (or historical data) as a function of time. This is useful for transient experiments. Secondly, you can set most trace gas amounts as multiples of the default values in order to look at equilibrium sensitivities for instance.
- Setting the Trace Gas Year:
The file GHG in the rundeck can be used to define a trace gas scenario (the default file gives observed values from 1850 to 2000). However, any other scenario can be defined and used. If the variable 'ghg_yr' is set in the rundeck, then that year's values are used throughout the run. If a transient scenario is required, set ghg_yr=0 and the years will be taken from the model year. (A similar procedure works for aerosols (Aero_yr), solar forcing (sol_yr), volcanoes (volc_yr) and ozone (O3_yr) as well). Note that stratospheric water vapour amounts are related to the ghg_yr (and the methane concentration) if H2ObyCH4=1 (the default is not to include the methane related flux).
- Changing fixed trace gas concentrations:
If you want to test the sensitivity to various hypothetical trace gas amounts, there are multiplier parameters for the three principal greenhouse gases (CO2, CH4 and stratospheric water vapour). The variables CO2X, CH4X and H2OstratX are 1 by default. Since the default ghg_yr is 1951, changing these multipliers gives a trace gas concentration of, for example, (CO2X) times the 1951 CO2 level. For reference, these defaults are CO2=311.098 and CH4=1.10298 (but they really depend on the 1951 values in the GHG file). H2OstratX multiplies whatever the stratospheric water vapour amount is.
- For other gases (CFCs, N2O, O3) there are no rundeck parameters available, but they can be changed via the 'FULGAS' array in init_RAD, in a similar way to CO2.
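As a worked example of how the multipliers combine with the 1951 defaults quoted above (a sketch of the arithmetic only; the model applies the scaling internally in the radiation code):

```python
co2_1951 = 311.098   # ppm, default 1951 CO2 level from the GHG file
CO2X = 2.0           # rundeck multiplier (default is 1)
co2_used = CO2X * co2_1951
print(co2_used)      # 622.196 ppm seen by the radiation code
```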
6) HOW-TO have a perpetual winter
Orbital parameters must be fixed, so 'ORBIT' in DAILY should be given the same JDAY every day (i.e. JDAY0=1 for JAN 1st conditions). Note that this will have consequences for the polar regions (since there will be no seasonality there). Possibly there is another way....
7) HOW-TO travel back/forward in time
A number of things can change depending on the time period concerned.
- orbital parameters. These can be changed directly in CONST.f, or by setting 'calc_orb_par=1' in the rundeck, and then using 'paleo_orb_yr' to define a year (Before Present (i.e. 1950)) from which the values can be calculated.
- trace gas amounts. (see Q6) Note that for really large changes, the mean atmospheric composition will change (and thus so will RVAP etc. See CONST.f for details)
- continental configuration. Change the boundary condition files in the run deck. (Also for land use/cover changes, ice sheets etc.)
- Mean atmospheric pressure. Change PSF etc. in the RES_xyz file.
- Speed of rotation. A little tricky, since this affects OMEGA (in CONST.f) but also SDAY, and, depending on how you define an 'hour', the number of seconds in an hour or the number of hours in the day. Remember that dtsrc must divide into sday exactly.
- Solar constant. This can be set by the multiplicative factor S0X within the rundeck.
- Aerosol/dust amounts. These can be changed within RAD_DRV.f or by changing the RADN input files.
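The divisibility constraint on dtsrc mentioned above can be checked ahead of time; a minimal sketch (the values are illustrative choices, not defaults):

```python
sday  = 86400   # seconds per day (Earth value; see CONST.f)
dtsrc = 1800    # source time step in seconds (illustrative choice)
assert sday % dtsrc == 0, "dtsrc must divide into sday exactly"
```

If you change the rotation rate (and hence sday), rerun this check for your chosen dtsrc before starting the model.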
8) HOW-TO move to Mars
Like travelling back in time, only more complicated. Oh, and no water. Note that the length of year is set by EDPERY in CONST.f.
9) HOW-TO move to Wonderland
The wonderland model is a sector version of the GCM that is used for very fast or very long runs that do not have to be as true to the real climate as normal. Essentially, the globe is split into three sectors, and only one sector is prognostically calculated. You must alter the resolution (IM will be 24), change the definition of DLON (in GEOM_B.f), and make sure that the radiation calculation has been modified to synchronise the diurnal cycle over the sector. Note that this has never been tested.
10) HOW-TO get help or report a problem
There is a mailing list 'giss-gcm-users-l@giss.nasa.gov' where queries, problems and comments on the gcm can be aired. Sending a message here is probably more likely to get a quick response than emailing any one person.
Subscribe to this by sending a message to Majordomo@giss.nasa.gov with a line "SUBSCRIBE giss-gcm-users-l" in the body of the text.
11) HOW-TO keep up-to-date
Periodically, we will release new, and hopefully improved versions of the model code. You will be notified through the mailing list, and you can decide whether to upgrade your code, or wait until a more suitable time. None of your changes will be lost if you upgrade. However, some conflicts may occur, though they should be minor. Detailed instructions will be given at the time.
12) HOW-TO move the model offsite
The model code itself is self-contained. Thus it should be transportable offsite. However, certain directories and scripts need to be configured for the local set up. Use 'gmake config' to create the '.modelErc' file in your home directory. Edit this to customise your structure, including the central directory (which is /u/cmrun/ by default). The input data files are not part of the release, and will need to be downloaded separately.
The biggest problem may be if the machine you wish to run the code on has a different fortran compiler. Some variation has been fully allowed for (i.e. the standard SGI or IBM compilers, Absoft Profortran on a Linux system and the DEC/COMPAQ version), however, we do not have a stable testbed for any other compilers and therefore cannot guarantee it will always work.
General problems will arise due to arbitrary compiler decisions on what to name compiled module files (which affects the determination of the inter-routine dependencies). Edit model/Rules.make to fix this. The code is compilable without using the 'static' option, which removes many of the problems that used to occur. Most of the compiler dependent code (that we have discovered) can be found in SYSTEM.f. These involve system calls to the CPU clock, the calculation of pseudo-random numbers, and the setting of exit codes. We would be very interested in further compiler dependencies if found. Different compilers are dealt with using preprocessor directives from the Makefile.
13) HOW-TO add to this documentation
This documentation is part of the model code, and so changes to files in the modelE/doc directory can be uploaded to the main repository. Please contact one of the developers if you have some changes that you would like to commit.
14) HOW-TO keep up-to-date with the latest releases of the model
From time to time relatively stable versions of the model will be "released", i.e. they will be marked and assigned a specific name, or "tag", by which they can be checked out from CVS. The typical name for such released versions will be "modelE-x-y", where x and y are numbers. It is recommended that those who are not involved in active development on the main trunk of the CVS repository use only these "released" versions.
If you are currently working on your private branch and want to upgrade the model to a new release, you can do it automatically, provided that for your previous update you also used a "released" version. To upgrade your model to the release modelE-a-b, go to the decks directory and execute

gmake update RELEASE=modelE-a-b

This will merge into your branch all the changes introduced into the model between your last update and the release modelE-a-b. An automatic update doesn't mean automatic resolution of all conflicts; if any arise, they have to be checked and fixed manually.
15) HOW-TO nudge the model
The model can be run in a nudged mode, whereby the horizontal wind components are nudged towards reanalysis data sets (NUDGE.f). The following nudging formula is used:
f(n+1) = (f(n) + a delta b) / (1 + a delta)

where f(n+1) and f(n) are the nudged variable at the new and old timesteps, respectively, delta is the model time step, b is the reanalysis value, and a is the nudging coefficient. a depends on the model timestep and on a nudging timescale, ANUDGEU or ANUDGEV (for the u- and v-velocities, respectively).
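The update can be sketched as follows (a minimal illustration of the formula above; the function name is ours, not the model's):

```python
def nudge(f, b, a, dt):
    """One implicit nudging step: solves f_new = f + a*dt*(b - f_new),
    i.e. backward-Euler relaxation of the model value f toward the
    reanalysis value b with nudging coefficient a over time step dt."""
    return (f + a * dt * b) / (1.0 + a * dt)
```

With a=0 the model value is untouched; as a grows, the updated value approaches the reanalysis value b.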
ANUDGEU and ANUDGEV are set in the input parameter list in the rundeck file. ANUDGEU=0.01 ANUDGEV=0.01 was tested to give reasonable results. The preprocessor directive #define NUDGE_ON must be set to turn the nudging on.
The reanalysis data sets must be available at the horizontal grid resolution of the model and at a 6-hourly time step. The vertical interpolation is done in the model, so the same reanalysis input files can be used for different vertical model resolutions. NetCDF input is required.
Part II: Getting information from the diagnostics
1) HOW-TO look at the output
You can look at the output in two distinct ways: using the run-time printout, or using the post-processing program, pdE.
- The run-time printout:
If you run the model using 'runE' (as opposed to 'runE -q' which suppresses the printout) the model will produce a $RUNID.PRT file which can contain copious amounts of diagnostic information. It is split into a number of sections: (see next question for a list). This is produced each month (but can be controlled using the KDIAG switches and NMONAV in the NAMELIST).
- Post-processed output:
Each month the program produces an 'acc' accumulated diagnostics file which contains all of the diagnostic information from the previous month. The program 'pdE' (in the exec directory) is an alternate entry point into the model that reads in any number of these files and a) creates a printout (as above) for the time period concerned and b) creates binary output of many more diagnostics. This can be used simply to recreate the monthly printout, but also to create longer term means (annual, seasonal, monthly climatologies etc.).
For example, to recreate the printout in /u/cmrun/$RUNID;
- for a single month: pdE $RUNID JAN1987.acc$RUNID
- for all Januaries: pdE $RUNID JAN*.acc$RUNID
- for a whole year: pdE $RUNID *1987.acc$RUNID
For pdE to work properly, the directory /u/cmrun/$RUNID has to exist and contain at least ${RUNID}ln, ${RUNID}uln and ${RUNID}.exe. The output files will end up in the PRESENT WORKING DIRECTORY, which may be /u/cmrun/$RUNID or any other directory; the names and order of the input files are irrelevant (as long as the format of the files is compatible with the model ${RUNID}).
It is possible to use pdE to average acc-files from several runs, e.g. average over an ensemble of runs. Although the numbers that are produced are fine, subroutine aPERIOD will not be able to create the proper labels: the runID will be taken from the last file that was read in and the number of runs averaged will be interpreted as successive years, so averaging years 1951-1955 of runs runA runB runC runD will produce the label ANN1951-1970.runD rather than ANN1951-1955.runA-D. Some titles will also suffer from that 'bug', but it should be easy to fix it manually afterwards, if anybody cares.
Note that the output can be controlled (a little) by the settings in 'Ipd' (which is created if it does not yet exist in the present working directory). A number of files will be created whose names contain the accumulated time period. (monyear[-year] where mon is a 3-letter acronym for a period of 1-12 consecutive months).
monyear.PRT          the printout
monyear.j$RUNID      zonal budget pages (ASCII Aplot format)
monyear.jk$RUNID     latitude-height binary file
monyear.il$RUNID     longitude-height binary file
monyear.ij$RUNID     lat-lon binary file
monyear.wp$RUNID     wave power binary file
monyear.oij$RUNID    lat-lon binary file for ocean diagnostics
monyear.ojl$RUNID    lat-depth binary file for ocean diagnostics
monyear.oht$RUNID    lat ASCII file for ocean heat transports
which can be read using the appropriate graphical software.
2) HOW-TO change what is included in the printout
The switches KDIAG which are set in the NAMELIST control which subgroups of diagnostics are calculated and output. At present, there is no simple way to pick and choose exactly which diagnostics appear, although we are working on a scheme to do just that. Some of the following options are rarely used, and may soon be made obsolete. These switches can also be used in 'Ipd' input for pdE to control which output to produce. If QDIAG=.true. some binary files are created accompanying the printed output. If QDIAG_ratios=.true. fields which are the products of q1 and q2 and whose titles are of the form "q1 x q2" are replaced by the ratios field/q2, the title being shortened to "q1" (default).
Current options are:
KDIAG(1)  < 9  print Zonal/Regional budget pages
          = 8  only print regional diagnostics
          2-7  only print global zonal means
          = 1  print all zonal diagnostics
KDIAG(2)  = 0  print Latitude-height diagnostics
          = 1  print at most fields listed in file Ijk
               (rearranging the lines of Ijk has no effect)
          = 8  only print spectral analysis of standing eddies
          = 9  skip all Latitude-height diagnostics
          2-7  same as 1
KDIAG(3)  = 0  standard printout, all binary files if QDIAG=.true.
          = 1  ij-fields as in list Iij, all ijk-fields
          = 2  ij and ijk fields are handled according to list Iij
          = 7  ij-fields as in list Iij, no ijk-fields are produced
          = 8  full list Iij of available fields is produced; this list
               may be edited: don't touch lines starting with 'List' or
               'list'; you may delete any other lines; you may rearrange
               the remaining ij-fields; you may add blank maplets
          = 9  no ij and ijk diagnostics are produced
          3-6  same as 2
KDIAG(4)  < 9  print time history table of energies
KDIAG(5)  < 9  print spectral analysis tables
KDIAG(6)  < 9  print diurnal cycle diagnostics at selected points (up to 4)
          1-4  print out for first 4-KDIAG(6) points (will soon be obsolete)
          -1 to -4  print out for the -KDIAG(6)th point only
KDIAG(7)  < 9  print wave power tables
KDIAG(8)  < 9  print tracer diagnostics
          = 0  print all tracer diagnostics
          = 1  skip lat/lon tracer diagnostics
          = 2  skip tracer conservation diagnostics and lat/lon diagnostics
KDIAG(9)  < 9  print out conservation diagnostics
KDIAG(10) < 9  print out Longitude-height diagnostics
KDIAG(11) < 9  print out river runoff diagnostics
KDIAG(12) < 9  print out ocean diagnostics (if there is an ocean)
There is also one section of special lat-lon diagnostics which can be optionally accumulated and output. These are the isccp_diags, which calculate some cloud properties as if they were being observed by satellite. Set isccp_diags=1 in the rundeck to calculate them; by default these diags are not calculated.
3) HOW-TO produce daily/monthly or seasonal diagnostics
Controlling the frequency of output of the ACC files is done using the NAMELIST parameter NMONAV, which sets the number of months over which the diagnostics are accumulated. If it is set to 3, seasonal acc files are produced; if set to 12, you will get annual means. Higher numbers could be chosen, but are probably not very useful.
Producing more frequent diagnostics is controlled by the SUBDAILY module. The frequency of saving is controlled by the variable Nsubdd, which is the number of hours between each snapshot (i.e. Nsubdd=1 implies hourly diags, =24 daily). The variables saved are controlled by the string SUBDD, which is a space-separated list of names for the required diagnostics. These
are currently limited to instantaneous values of some standard fields
(SAT, PS, PREC, SLP, QS, LCLD, MCLD, HCLD, PTRO, QLAT, QSEN, SWD,
SWU, LWD, STX, STY, ICEF, SNOWD, TCLD, SST, SIT), the
geopotential height, relative humidity and temperature on any (or all)
of the following fixed pressure levels (if available): 1000, 850, 700,
500, 300, 100, 30 , 10, 3.4, 0.7, .16, .07, .03, and velocities on any
model level. The fixed pressure level diags are set using Z850, R700
and T100, for instance, to request the 850 mb geopotential height, 700
mb relative humidity and 100 mb temperatures, respectively. Requests
for ZALL or TALL, will produce individual files of all available
levels. The velocities are set using U1, U5, U12, for instance, to get
the 1st, 5th and 12th layer U velocities, respectively. If "ALL"
levels are requested for a velocity, only one file per variable will
be output, containing all requested levels. To limit the number of
output levels, set the LmaxSUBDD
parameter in the rundeck,
and only levels L=1,LmaxSUBDD will be output.
More options can be
added as cases in subroutine get_subdd in DIAG.f if required. For
example, including the following string in the rundeck
(i.e. SUBDD="SLP SAT Z500 R850 U5 V5") will produce diags of
will produce diags of
the sea level pressure, surface air temperature, 500mb geopotential
heights, 850 mb level relative humidity and the 5th layer U, and V
fields. Each timeseries will be output one month at a time in files
named SLPmmmyyyy and SATmmmyyyy (for example) for each month and
year. Restarts will find the appropriate file and continue to append
records as would be expected.
You can easily keep track of the running average of all the printout diagnostics using the NIPRNT switch in the parameter database. Setting this to a positive number prints the diagnostics for that number of hours (i.e. setting NIPRNT=10 will print out the running average of the accumulated diagnostics at the end of each of the next ten hours).
4) HOW-TO control the format of the binary output (i.e. GISS/netcdf/hdf etc.)
The binary output created from 'pdE' can be in a number of formats. Currently there are two options: the traditional GISS formats, and netcdf output. This is set by choosing to compile the model with POUT (for GISS-style output) or POUT_netcdf (for netcdf output). This is set in the rundeck. Note that the choice is made AT THE TIME OF COMPILATION OF THE MODEL EXECUTABLE. If you subsequently change your mind, you must edit the rundeck and recreate the executable (using 'gmake exe RUN=E001xyz').
Other formats (HDF?) could be defined as you like with a suitable POUT_xyz.f file.
5) HOW-TO calculate a model score
There is an RMS.f program in the aux/ directory which calculates a 'score' for a model run based on the RMS difference with some selected observations. Additionally, it calculates the Arcsin Mielke score (AMS), which is a non-dimensional statistic (between -1,1 (1 being perfect)) which also includes effects of mean bias and variance. It is compiled whenever the 'gmake aux RUN=....' command is run. Like CMPE001 or qc, it is technically model dependent (since it uses im,jm etc.), but in practice you will not need to recompile it for every version. The executable will be put into the RUNNAME_bin directory (i.e. for gmake aux RUN=E001xyz, the executable will be E001xyz_bin/RMS). Running RMS on its own gives its usage.
You should produce diagnostic files over at least a five year mean. So for a particular run, use pdE to create the JAN and JUL averages
cd E001xyz
pdE E001xyz JAN195[1-5].accE001xyz  # => JAN1951-1955.ijE001xyz
pdE E001xyz JUL195[1-5].accE001xyz  # => JUL1951-1955.ijE001xyz
Make sure that the TOPO file is linked:
E001xyzln
and then run RMS
~/modelE/decks/E001xyz_bin/RMS E001xyz JAN1951-1955.ijE001xyz JUL1951-1955.ijE001xyz
This will create files 'E001xyz' in the directories /u/cmrun/modelE/RMS and /u/cmrun/modelE/AMS, which contain the RMS statistics and the Arcsin Mielke score for each selected field, respectively. You can then compare these scores with other existing versions.
Part III: The model code
1) HOW-TO find where a variable is defined
Variables (other than purely local variables) are defined in three main ways. Physical constants (real constants, not tuning parameters) are defined in CONST.f, model related variables (i.e. variables to do with how the model is run) are defined in MODEL_COM.f, and all other variables are defined in the physics MODULES. Variables that are on the model grid will be defined in the PHYS_COM module, while parameters and options for the local physics will be defined in the PHYS module. Thus decide which physics module your variable is most related to, and look there.
For more brute force efforts, run the command 'gmake htmldoc RUN=RUNNAME'. This will create a html document that indexes and defines all important variables (well not all, but most, ok some!). Alternatively grep -i "varname.*=" *.[fSU] will probably find it for you.
2) HOW-TO follow the program structure
The model is essentially on three levels. MAIN in MODELE.f is the top level and controls the time-stepping and deals with most of the model overhead. The time-step loop in MAIN is the principal path that the model follows. This is a very linear loop, and therefore is not difficult to understand.
The level below this is the level of the drivers. These are routines that translate model variables (ie. those defined on the model grid), to local variables (usually, but not always, vertical arrays), save diagnostics and save variables/fluxes for later use.
Finally the third level is the level at which the actual physics gets done. This level generally is oblivious to the previous two levels and is code that could be used in many different contexts. It is very important to keep this true. DO NOT write directly to model arrays within a local physics subroutine. (Some outstanding cases may appear to do that, but this is being written out of the model).
3) HOW-TO use or add to the automatic documentation
You will have noticed that modelE has multiple documentation directives such as !@var or !@sum. These are key words that can be read by a script and turned into useful documentation. Each subroutine, and each important variable should have such a declaration (which should include a definition, units, etc.). The command 'gmake htmldoc' makes html output from a rundeck containing (and sorting out) all this information. The various directives are all defined in the main model common block. If you introduce new variables or subroutines, you MUST define similar documentation.
Currently defined keywords and syntax are:
@sum UNITNAME     Brief summary/description of the current program unit or file
@auth             Author
@ver              Version (a number plus any relevant comments)
@calls            List of routines called in this program unit
@cont             List of program units contained within a file or module + other info
@fun FUNCNAME     Denotes a function
@param PARAMNAME  Denotes a parameter (in the FORTRAN sense)
@var VARNAME      Denotes a variable (in the FORTRAN sense)
@dbparam VARNAME  Denotes a database parameter (in the modelE sense)
@nlparam VARNAME  Denotes a NAMELIST parameter (in the modelE sense)
@+                Continuation line for @sum/@calls/@cont
4) HOW-TO ensure that your code meets the new standards
Please check the document Coding standards for ModelE and try to conform as nearly as possible to the (not very onerous) conditions. Unfortunately, we do not have anybody whose job it is to make sure that added code conforms. However, every time this happens it makes it harder to keep the model consistent, and leads to the creation of unnecessary complications in keeping things straight.
5) HOW-TO add functionality
- A: Variations on existing code
Since all code is now kept within CVS, code history is preserved and can be retrieved at any time. So for minor modifications to the code (i.e. slightly different formulation of a particular term), go straight ahead and modify your version. Compare the results from that with a standard control run. As you extend your modifications, make sure to 'commit' your changes at appropriate points (on your branch only) so that you can retrieve the code from any intermediate (but functional) point.
- B: Introducing new sub-modules
If you want to add a completely new physical module or radically change an existing one, you will be best off creating a new 'plug and play' option. Follow the example of the ATURB code (which can replace DRYCNV if required). The issue is mainly one of keeping the interface clean and consistent. Unless what you propose will directly influence multiple parts of the model, it is best to keep your new code as modular (single column for instance) as possible, and only interact with the main code through the relevant drivers.
In particular, if there is an existing call to an equivalent routine, keep the call statement the same for the old and new versions (and you can add dummy arguments to the old code if required). If you want to add a new call for which there is no analog, you may want to add a dummy routine for use in the old version (i.e. as in RES_M12.f with regard to some of the stratospheric code). Put this in a file that you are making obsolete, or even create a new dummy file.
Another way to do this is by using pre-processing directives set from the rundeck. This is not encouraged except where it would make inserting a new option much more straightforward. If it is used, limit its use to the absolute minimum (i.e. to decide whether or not to call the new routine). We specifically do not want multiply threaded options within any of the physics routines, since that makes maintenance much more problematic.
In any file where the directive is to be used, there must be an
#include "rundeck_opts.h"
statement at the top of the file. The coding then follows standard practice:
#ifdef NEW_FUNCTION
      call new_function(a,b,c)
#endif
To set this option, add a line to the preprocessing option part of the rundeck.
Preprocessor Options
#define NEW_FUNCTION    ! do new function
End Preprocessor Options
The gmake command will recompile the relevant routines every time you change the option. Note that if options become too commonplace, the entire model will need to be recompiled every time. Therefore, our preferred solution is 'plug and play'.
Note that for each new sub-module or pre-processor directive you create, you should modify the OPTIONS.html file to document the details.
- C: Introducing a new parameter
Introduce a new parameter if you would like to be able to vary its value at run time without recompiling the model (i.e. an adjustable parameter, or an option switch for instance). Firstly, decide who should own the parameter. That is likely to be one of the physics modules. Since it is likely to be a fundamental part of a local physics process, it should be declared and defined in the local physics module. You MUST give it a default value, and ensure it is tagged with a '@dbparam' comment that explains what it does.
In the initialisation for this module, you need to add a line
call sync_param("NEWPARAM",newparam)
which will set the value of newparam from the parameter database if it has been set in the input rundeck, or from a restart (see the FAQ for more details on how this works).
In the rundeck you need only add to the &&PARAMETERS...&&END_PARAMETERS block a line like this:
NEWPARAM=1 ! NEWPARAM sets something for XYZ
Of course, you should use a more obvious name! If no value is set in the rundeck, the default value will be used.
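Putting these pieces together, a minimal sketch might look like the following (the module name PHYS_COM, the parameter name newparam and the init routine name are illustrative, not actual modelE code):

```fortran
      MODULE PHYS_COM
!@dbparam newparam controls the strength of the XYZ process
      REAL*8 :: newparam = 1.    ! default value, used if the rundeck sets nothing
      END MODULE PHYS_COM

      SUBROUTINE init_PHYS
      USE PHYS_COM, only : newparam
      ! pick up the rundeck/restart value from the parameter database, if any
      call sync_param("NEWPARAM",newparam)
      END SUBROUTINE init_PHYS
```

The only points that matter are the default value, the @dbparam tag, and the sync_param call in the module's initialisation.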
- D: Adding new diagnostics
The diagnostic system that the GCM uses is very sophisticated and is almost certainly sufficient for any new diagnostic need that arises. However, it can seem complex and difficult to use. As improvements in ease of use occur, it will become easier to modify.
Adding a new conservation diagnostic: These arrays keep track of the zonal average of any quantity, and note which routines change those values. They are useful in understanding the overall mass/energy/tracer budgets. To add a new conservation quantity, first write a routine that returns the zonally averaged quantity in question. Secondly, in the 'init' subroutine for the relevant physical module, insert a call to 'SET_CON'; this sets up the points at which the diagnostic is called, and how it is output (see init_DIAG for more detailed instructions). Thirdly, you must increase NQUANT by 1, and KCON by 2 plus the number of points at which you are saving the value. Setting the equivalent for tracer conservation diagnostics is similar to the above.
Adding a new lat-lon diagnostic: These are the most common diagnostics to add and are relatively straightforward. There are three components to a new AIJ diagnostic: the declaration, the definition, and the accumulation. Everything else is done automatically. First of all, define a variable in DIAG_COM.f which will be an index for this diagnostic such as ij_xyz (where xyz is some clear mnemonic for your new diagnostic). Secondly, in DEFACC.f (subroutine ij_defs), add a definition:
      k = k+1                         ! new diagnostic XYZ
      ij_xyz = k
      lname_ij(k) = 'XYZ DIAGNOSTIC'
      units_ij(k) = 'XYZ_UNIT'
      name_ij(k)  = 'XYZ'
      ia_ij(k)    = ia_src
      scale_ij(k) = 1.
The counter k ensures that each ij diagnostic has a unique index (which you do not need to know), whose value is assigned to the variable you previously defined. The lname and name are long and short name strings, used in TITLEs and in the 'get-by-name' code respectively. The units should be the units for the scaled output. The ia_ij index denotes how often the diagnostic is accumulated (in this case once every source time step, although this can vary depending on where in the code it is done).
The ij diagnostics can also be weighted (by surface type area, cloud amount, pressure etc.). This is signified by a string such as ' x POICE' in the long name of the array for a variable weighting, or by setting iw_ij(k) for a fixed weighting. Look in ij_defs for examples.
The ij diagnostics can optionally appear in the ascii printout as well as the binary output. For this to happen, you must set a 'colorbar' for the printout by setting ir_ij to a suitable bar. Look at the LEGEND array in DIAG_COM.f for possible values. For the output to appear, you need to specifically ask for that to happen in DIAG_PRT.f. The array 'iord' in DIAGIJ defines all the ij printouts. You can either add to it (and make sure to increase the number of 'kmaplets' accordingly), or simply replace an existing maplet.
Finally, put in the code that accumulates your chosen variable somewhere in the model. Make sure to USE the correct index from DIAG_COM. Remember that all ij diagnostics will appear in the *.ijE001xyz file after using 'pdE'.
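The accumulation step is usually a one-liner; a sketch (AIJ as the ij accumulation array, and the loop bounds and the quantity xyz_value are illustrative — check DIAG_COM.f for the actual names) might look like:

```fortran
      USE DIAG_COM, only : aij, ij_xyz
      ...
      DO J=1,JM
        DO I=1,IM
          ! add this time step's contribution; scaling happens at output time
          AIJ(I,J,IJ_XYZ) = AIJ(I,J,IJ_XYZ) + xyz_value(I,J)
        END DO
      END DO
```

Note that you accumulate the raw quantity; the scale_ij and ia_ij settings defined above take care of converting the accumulated sum to the output units.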
Adding a new zonal 'budget' page diagnostic: These diagnostics store the zonal mean of various quantities and fluxes. The routine 'j_defs' in DEFACC.f controls their printout. The diags are output in the order they appear in the listing, with the exception of derived composite ratios which appear where specified in DIAG_PRT.f. If you want to add a simple scaled diag, define the index 'J_XYZ' in DIAG_COM.f, set up the meta-information in DEFACC.f, and accumulate it at the relevant point in the code. The output format for the values can be optionally changed in DIAG_PRT.f using the name you give the diagnostic. Ratios are a little more complicated.
Other diagnostics will be explained if and when we work them out.
- E: Adding extra timing information
If you want more information on how long each part of the code takes (for instance if you have put in some tracer code, or extra physics), the way to do it is to increase the number of timing variables, such as those output by the program 'qc'.
Initially, call the routine
CALL SET_TIMER("NEW PHYSICS",MPHYS)
where the text is what will appear in the printout, and MPHYS is a locally defined integer variable. At the beginning of the code you want to profile, add a call to TIMER(MNOW,MORIG), where MORIG refers to the timing counter current at that point (in order to zero the counter for your routine). At the end of your code, call TIMER(MNOW,MPHYS). That's it.
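As a sketch (MPHYS, MNOW and MORIG follow the text above; the routine new_physics is an illustrative name, not actual modelE code):

```fortran
      INTEGER :: MPHYS           ! index for the new timing entry
      ...
      CALL SET_TIMER("NEW PHYSICS",MPHYS)   ! once, during initialisation
      ...
      CALL TIMER(MNOW,MORIG)     ! charge time so far to the preceding section
      CALL new_physics           ! the code being profiled
      CALL TIMER(MNOW,MPHYS)     ! charge its run time to "NEW PHYSICS"
```

The first TIMER call closes out the previous section's counter, so the second one measures only the code in between.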
- F: Coupling an ocean model
Please read the coupling documentation for details of the fluxes, fields and entry points required for any ocean code.
9) HOW-TO read in an external file
ModelE has introduced a FILEMANAGER system that controls the assignment of unit numbers and the opening of files. To add a new file to be read in, first define a short mnemonic for the file (i.e. "TRSRC" for the tracer source file). This is the name that you will assign in the rundeck to the file you wish to open. Secondly, prior to reading in the file, use the routine 'openunit' to assign a unit number for the file using the mnemonic you gave it, i.e.
      USE FILEMANAGER
      INTEGER iu_trc1
      ....
      call openunit("TRSRC",iu_trc1,.TRUE.,.TRUE.)
      READ(iu_trc1) DATA1,DATA2....
      call closeunit(iu_trc1)
The first logical argument is .TRUE. if the file is to be opened for unformatted (binary) read/write, .FALSE. for ASCII input. The second logical argument denotes whether this is an already existing file or not. These arguments are optional, and default to .TRUE.. The line in the rundeck should look something like this (depending on the filename, of course):
TRSRC=tracer_sources.1970_1980.dat
The setup script looks automatically in /u/cmrun, /u/raid and in the current directory. If the file is elsewhere, an absolute or relative pathname is needed.
10) HOW-TO add new variables to the restart files
Each module now has only one place that needs to be changed to accommodate extra variables in the rsf files: io_xyz (where xyz is a descriptor for the module, like ocean, rad, lakes etc.). All input/output is controlled by these routines. Thus to add a new variable, simply add it to the list of variables in both the read and the write statements. Note that you MUST change the MODULE_HEADER character string if you change what is in the rsf (i.e. increment the number by 1 each time you do this). This is necessary to ensure that we can distinguish the various versions.
There are some complications for variables that can be set in the NAMELIST, namely, you must decide how this variable gets set as a function of the initialisation method. Look at io_rad as an example to follow.
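As a sketch of the pattern (the routine name, version string and variable names here are illustrative, not actual modelE code — follow io_rad for the real layout):

```fortran
      ! in io_phys: newvar added to BOTH statements, header incremented
      MODULE_HEADER = "PHYS02"   ! was "PHYS01" before newvar was added
      ...
      WRITE (kunit,err=10) MODULE_HEADER, oldvar1, oldvar2, newvar
      ...
      READ  (kunit,err=10) HEADER, oldvar1, oldvar2, newvar
```

Incrementing the header string is what allows the input code to detect (and, if coded, handle) older rsf files that lack the new variable.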
11) HOW-TO access variables that are already calculated somewhere else
One of the main changes in modelE was to replace common blocks with Fortran 90 modules. The functionality is much the same, but since modules do not assume anything about where in memory variables are kept, many of the problems associated with common blocks are alleviated. In order to make use of a variable from somewhere else in the code, that variable must be in the relevant MODULE (usually called PHYS_COM, for ease of remembering). Then, in the routine or module where you wish to use it, you must specifically declare what you want using the USE statement. For clarity, we insist upon 'USE, only' so that each variable that you require is explicitly declared, i.e.
USE PHYS_COM, only : var1,var2
If the name of the variable conflicts with a local variable, you can use the 'points-to' feature to map the PHYS_COM variable to a new name (as you may have done with common blocks), i.e.
USE PHYS_COM, only : var1,var2_phys=>var2
The arrow should be read as 'points to', and thus var2_phys will be an alias for var2 in this routine/module only.
12) HOW-TO deal with the inevitable bugs you introduce
Nobody's code is perfect (least of all ours) and so when coding it is inevitable that errors are introduced. Some are minor and cause no obvious sign of distress (these are very difficult to find!), while others cause the model to crash and burn.
Let's say you've changed something in the model that seemed innocuous, and then the model crashes. There are three common points that seem to trap errors: the dynamics, the radiation and the PBL scheme. These errors will be flagged as either "A>1 in AADVTX" or similar (for crashes in the dynamics), "In Radia: TL out of range" or similar (in the radiation), and finally a "stop rtsafe", for a crash in the PBL.
Do not go looking for errors in any of these routines. What has generally happened is that you have introduced some wild fluctuation in some main variable (the temperature, humidity or velocities), or overwritten some of the model variables that control advection and the like. Check that, coming out of the routine you modified, everything is as it should be (set QCHECK in the rundeck, and include a call to CHECKT immediately before and after). Also, do not bother commenting out the routine that actually crashed to see if the model can continue on okay: that routine is the symptom, not the disease.
One additional problem that arises mainly with tracer code is that the advection code can sometimes produce wild oscillations of the tracer concentration, which, for sensitive tracer physics, can cause no end of problems. Again, the advection is not to blame! What has happened here is that the second order moments for the tracer field (the first and second derivatives at the centre of the grid-box) have somehow become uncoupled from the mean tracer amount, giving rise to an implied sub-grid scale profile that is seriously wrong. The biggest mistake is failing to reduce the gradients by the same fraction by which the mean was reduced in some process. See the document Using the GISS GCM tracer moments for more details on how to do this properly. The CLOUDS.f tracer code is a good example of what to do.
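The rule of thumb is: whenever a process removes a fraction of the tracer mean in a box, scale the moments by the same factor. A sketch (trm and trmom as the tracer mean and moment arrays, and f as the fraction remaining, are illustrative names — see CLOUDS.f for the real pattern):

```fortran
      ! a process removes a fraction (1-f) of the tracer in box (i,j,l)
      trm(i,j,l,n)     = f * trm(i,j,l,n)
      ! scale the sub-grid moments by the same fraction, so the implied
      ! sub-grid profile remains consistent with the reduced mean
      trmom(:,i,j,l,n) = f * trmom(:,i,j,l,n)
```

Skipping the second line is exactly the mistake described above: the moments then imply gradients far too large for the remaining mean, and the next advection step produces the oscillations.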
13) HOW-TO deal with restart problems (non-reproducibility of results)
Occasionally after making coding changes the model ceases to be able to reproduce results. For instance, after a crash, a restart from the last fort.[12] file does not crash at all, or does so at a different point. This is a symptom of a restart problem. It should definitely be fixed, since it is clear evidence that the prognostic code has a real bug (one that could change the results).
These problems can arise in a number of circumstances:
- the rsf file is incomplete (a prognostic variable has not been saved). This is quite easy to spot and fix.
- a variable is not being properly initialised. This is quite tricky to find. What happens is that the first time it is used, it is zero, but subsequently it is not. Thus at a restart it is set to zero again, changing the results. Try compiling with an option (-DEBUG:trap_uninitialized=ON on the SGI compiler) that sets all uninitialised numbers to NaN, thereby causing a crash if they are used.
- out of bounds memory addressing. Occasionally this can also result in a restart problem if the memory that is being overwritten is used prior to the over-writing. Thus the first time through it is clean, but subsequently has the value from the incorrectly addressed array. Unfortunately, you cannot set the compiler to catch this, since overwriting of arrays is sometimes done on purpose in the code (though we are trying to eliminate that).
- parallelisation issues. This is a little trickier and is to do with the essentially random order in which calculations are done across the processors. For instance, if a random number is used inside a parallel loop, the results are irreproducible since random numbers are produced in sequence and thus are dependent on the order in which they are used.
Restart problems can be caught though. There are three times at which they occur, on the source time step cycle, on the radiation time-step cycle, or at the daily cycle. Occasionally problems can arise mid-month if any climatologies are being used. However, most problems can be detected within the first two hours.
Start from any fort.[12] file. Save it, and call it rsf0. In the rundeck (the 'I' file in the run directory), set NDISK=1 in the database parameter list; this will produce an rsf file every source time step. Set the end time to be two hours later than rsf0. Copy rsf0 to both fort.1 and fort.2. Run the model and save the two final rsf files as rsf1 and rsf2. Copy rsf1 to both fort.1 and fort.2, and run the model again. Save the new restart files as rsf1a and rsf2a.
Now you have two copies of the first and second hour rsf files. In order to compare them, use the program CMPE001 (compiled using 'gmake aux RUN=E001xyz').
CMPE001 rsf1 rsf1a
will go through the rsf files array by array and output any discrepancies. If no numbers are output, the arrays are identical. If the first hour rsf files are different, the error must be within INPUT (or one of the init_XYZ routines). Look at the relevant routine for the arrays that are highlighted. If, however, the first hour checks out fine but the second does not, then the error is in the main loop somewhere. Again, judge by the arrays that are affected which routine it is likely to be in. If neither set of rsf files differs, then you need to repeat this procedure starting either at hour 4 (assuming a radiation-related problem and a 5 hour radiation time step), or at hour 23 (assuming a daily-related problem).
14) HOW-TO get an innovation included in everybody's versions
Since you are working on the code within CVS, it is possible to upload your changes to the main development branch. However, this is strongly discouraged for the average user, and so I am not going to tell you how to do it! Instead, please discuss it with one of the development team. If you wish to make a lot of changes and be part of the ongoing development of modelE, then please read the developers guide and then come and talk to me (gavin@giss.nasa.gov).
Appendix: Some Technical Notes
1) HOW-TO create and manage cvs revisions
Every so often when the code is at a relatively stable stage, a revision will be declared and users will have access to any fixes/new physics/ etc. up to that point. This is done using (in the working directory):
cvs tag modelE-x
However, it sometimes happens that bugs are discovered that affect an already frozen version, but upgrading to a new frozen version is impractical for some reason. In that case we make an offshoot of the previously frozen version that contains only the fixes. This is done by checking out the latest frozen version (as outlined in Part I) onto a branch, using a branch name such as
update-x
Once you are on the branch, make the fixes that you require and commit them. Then declare a revision (with a suitable revision name):
cvs tag modelE-x-1
Now people should be able to update to the fixed release with little or no trouble using (in their working dir.):
cvs update -j modelE-x-1