Dr. Gavin A. Schmidt

Accumulated GCM Wishlists

Here is a selection of comments by various people that you might like to browse through. There is some duplication, but there are also some very good ideas.

For reference, here are the links to the NCAR CCM4 standards for coding and netCDF output. There may be some ideas that we could use.

  • CCM4 Coding Standard
  • CSM netCDF Convention

    Finally, a quote from Duane:

    Just because it's called code, it doesn't mean it has to be cryptic

    gavin@giss.nasa.gov


    
    Preliminary wishlist:
    =====================
    
    DOCUMENTATION:
    
    - definitions of every variable,
    - a description of the purpose of each subroutine,
    - a step map through the GCM's normal code operation (e.g. through a
    one-month simulation) and, finally,
    - a manual describing how to actually run the model. (Mark)
    
    DIAGNOSTICS:
    
    - a routine should exist to transfer diagnostics from the TITLE,DATA
    format to netCDF for easier transfer of data and more efficient use of
    standard graphics packages. Possibly this should be an option in the
    diagnostic programs. (Duane)
    
    - a systematic way should be devised to output regional data at
    relatively high frequency for use in initialising regional models (Len)
    
    - we need an easy way to change the model to produce daily or 6 hourly
    diagnostics (Ron/Reha)
    
    - the diagnostic output (accumulation or restart files) should be
    readable in a boot-strap way (i.e. all necessary information should be
    contained in a standard header) (Gavin)
    
    - it should be easy to alter which variables are saved, at what frequency 
    they are collected, and over what region the variables are 
    accumulated. For users (not developers) there should be a simple way 
    to select various options for a specified set of standard 
    diagnostics. (Mark)
    
    - the standard diagnostic accumulation files (.acc files) need to
    include header information at the beginning of the file AND labels
    within the file, the purpose being to make these files much easier to
    work with when it comes to extracting and averaging subsets of
    information. (Mark)
    
    - it would be useful to alter the standard diagnostic ".PRT" files
    so that different viewers can be used to look at the output, and make
    it less cumbersome for higher resolution models. (Mark)
    
    - Include surface pressure as an IJ diagnostic (Jean)
    
    MODEL COMPILATION AND SETUP:
    
    - possibly we should abandon the priority system for linking
    subroutines, since this can lead to errors if non-standard compilation
    is used (Shan)
    
    - Month/day/year input should be used instead of Tau for model start
    and finish times (Drew)
    
    - An automatic makefile should be created during the setup process to
    prevent linking of out-of-date object files. (Gavin)
    
    - use version control software so that all changes to the code are
    formally tracked and documented at the time that changes are
    made. (Mark/Igor/Duane) 
    
    - Create a GISS-wide database that collects all pertinent info about 
    simulations (with a short description). (Mark)
    
    - The GISS Model setup relies on a local environment that does not
    translate easily elsewhere. Either this should be made easier, or it
    should be slimmed down and replaced with standard UNIX tools. (Mark)
    
    MODEL ROUTINES:
    
    - Calculations for fluxes etc. should be done in only one place, so that
    they can be easily adjusted or replaced (Shan)
    
    - the main model should contain gravity wave drag and the
    parameterizations that depend on it (David)
    
    - all physical constants should be in a parameter common block (Gavin/Reto)
    
    - Fortran 90 modules could be used instead of includes (better
    control of exactly which variables are wanted etc.) (Max)
    
    - All physical quantities should have meaningful names - the blocks
    O/G/GH/BL DATA etc. should only be used for I/O and inter-routine
    standardization. (Jean/Gavin)
    
    - All hangovers from Fortran 66 should be eliminated (i.e. JMM1=JM-1,
    etc.) (Jean)
    
    - atmospheric mass, P**kapa and P*T**kapa should be in common blocks
    and kept up to date. (Jean)
    
    - the standard model should have all the parallelization optimisations
    as standard features (Jean)
    
    - WORK common blocks should not be used to pass variables (Jean)
    
    - All common blocks should be named (Max)
    
    - PTOP, PLE, SIGE should have more correct names (any other mis-named
    variables?) (Jean)
    
    - INPUT should be written to allow more flexible starting options (Jean)
    
    - Need JMONTH variable (Jean)
    
    - Naming conventions should be defined and adhered to! 
     
    - roughness length should be read in only once in input (Gavin)
    
    - pbl and cloud arrays should be part of the model common block (Gavin)
    
    - SURFCE, PRECIP, GROUND should be split up and used to call separate
    routines for each surface type (Gavin)
    
    - Lakes (open water + ice on land) should be completely divorced from
    open ocean water/ice and kept as part of atmospheric model. (Gavin)
    
    - Drastically reduce the number of GO TO statements in the model (Mark)
    
    - try to rewrite the radiation code to avoid using ENTRY points (Mark)
    
    
    EXTRA Stuff added in afterthought:
    
    - fixed arrays should not need to be recalculated all the time (Max)
    
    - implicit none should be standard
    
    - diagnostic arrays should not be referenced by number but by an integer
    variable which explains what they are (e.g. IDSLP for the SLP
    diagnostic); see the sketch after this list.
    
    - equivalences should only be over a whole common block or not at all!
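    
    As a purely hypothetical sketch of the named-index idea above (only IDSLP comes from the wishlist item; the slot numbers and the second name are invented):
    
        ! hypothetical module collecting diagnostic indices by name
        module diag_indices
          implicit none
          integer, parameter :: idslp  = 17   ! sea-level pressure slot in AIJ
          integer, parameter :: idprec = 18   ! precipitation slot in AIJ
        end module diag_indices
    
        ! accumulation then reads as
        !     aij(i,j,idslp) = aij(i,j,idslp) + slp(i,j)
        ! rather than the anonymous aij(i,j,17) = ...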
    
    

    From shosein@giss.nasa.gov Mon May  8 14:55:45 2000
    Date: Mon, 08 May 2000 14:49:16 -0400
    To: gavin@isis.giss.nasa.gov
    From: Mail Delivery Subsystem (by way of Sabrina Hosein)
    Subject: Returned mail: User unknown

    Hi Gavin, I do believe that the GCM would benefit from the inclusion of gravity wave drag parameterizations, as we have in the stratospheric model. In addition, the version of the stratospheric model Jeff's running includes the new high cloud parameterization based on the presence/absence of (parameterized) gravity waves passing through. Even if one doesn't want to use the gravity wave drag per se, using this parameterization requires that the gravity wave source/propagation be included. Of course, neither of these parameterizations is necessary - they just improve the model's temperature structure tremendously.

    David


    From rwebb@cdc.noaa.gov Mon May  8 15:49:37 2000
    Date: Mon, 8 May 2000 13:49:34 -0600
    To: Gavin Schmidt
    From: "Robert S. Webb"
    Subject: model enhancements

    Gavin, Since you asked. I sort of gave up trying to simulate the impact of orbital forcing on the Asian Monsoon system because the persistence of a Tibetan snowfield in the control run of Model 2' messed up the sensitivity of the model. There was a response, but it was shifted to the east and complicated any data-model comparison. I am not sure if this problem still exists in the model, but if it does, trying to identify and then correct the underlying processes would be of benefit to me. thanks -- Robin

    Robert S. Webb
    NOAA/OAR/CDC
    325 Broadway
    Boulder, CO USA 80303
    Office ph. (303) 497 6967  Fax. (303) 497 7013
    e-mail: rwebb@cdc.noaa.gov


    From LDruyan@giss.nasa.gov Tue May  9 11:44:12 2000
    Date: Tue, 9 May 2000 11:46:17 -0400
    To: gavin@giss.nasa.gov
    From: Len Druyan
    Cc: jhansen@giss.nasa.gov, cddhr@giss.nasa.gov (David Rind), cdpgl@nasagiss.giss.nasa.gov, Matthew Fulakeza
    Subject: GCM codes

    Gavin- In response to Sabrina's note: I have been working with Matthew Fulakeza and Pat Lonergan to make regional model simulations that can use GISS GCM results as lateral boundary conditions. This technique "downscales" GISS GCM results to 50 km horizontal resolution over selected domains. The potential contribution is significant for many research programs at GISS. Matthew and Patrick have developed procedures to glean the needed data from GCM simulations on an ad hoc basis. I suggest that more permanent code be written which, when activated, would automatically save the GCM data for selected target areas and at the high temporal frequency required by the regional model (perhaps six times daily). If this option is selected when making a GCM run, the resulting data would be immediately available for a regional model run. Patrick and Matthew can contribute to this effort should there be a consensus to implement it.

    Len Druyan  Rm 516  ext: 5564


    From jlerner@giss.nasa.gov Wed May 10 16:21:52 2000
    Date: Wed, 10 May 2000 16:21:35 -0400
    From: Jean Lerner
    Organization: Goddard Institute for Space Studies
    To: Gavin Schmidt
    Subject: model remodeling

    Gavin, I hope that the 'frozen' model does not mean we have to freeze in errors. The last time I talked to Reto he was not using the latest CB265.S. I found my old wish list. Here are some things on it. Note: when I say the 'C' array, I am talking about the combined real, integer, and character components.

    *) The names of source modules (.S files) should only reflect resolution dependence if they are in fact resolution dependent. For example, DB112M9.S has nothing in it that depends on 9 layers, so the '9' should not be in the name. Likewise, the 'M' probably does not belong..(?).
    *) Get rid of LMM1, JMM1, etc. in COMMON. Remove from code wherever possible. If usage is impossible to eliminate, put in PARAMETER.
    *) Expand the 'C-array'. This will make obsolete dozens of current programs. Think about a way to maintain backward compatibility.
    *) Most constants in the C-array should really be put into PARAMETER statements. Actually, this may eliminate the need to expand C.
    *) We need an integer in C that tells us what month we are in. This is currently computed in DAILY but is not available to other routines.
    *) ISTART. Perhaps it could be an array or mask, so that different starting conditions could be combined. This may require two different variables, one for restarting (istart=10,11,12 (and 4?)) and one for initial conditions. Currently, INPUT is a confusing tangle of this and that, depending on such and such, with repetitious code and confusing GOTOs. I'm thinking more in terms of a 'Chinese menu'.
    *) Rename PLE as PLB, so it is clear that it refers to the bottom edge.
    *) Rename SIGE as SIGB for the same reason.
    *) PTOP used to refer to the actual top of the model, but no longer. It would be useful, maybe in INPUT, to print the actual top somewhere with 3 significant digits. This may even be saved in C so that off-line programs don't have to calculate it.
    *) Arrays like GDATA, GHDATA, BLDATA, FDATA, ODATA should be equivalenced to arrays with more meaningful names. In fact, they should appear in the code only for I/O.
    *) PBLPAR, PBLOUT commons should probably be put into an INCLUDE, since they appear in more than 2 routines.
    *) The names of COMMONs should be systematized and regulated. 'WORK' means 'WORK'. This could result in some wasted memory space, but not necessarily. Currently, people sometimes avoid putting arrays in a 'work' common because they are afraid to clobber something, so they make up another common or just use DIMENSION.
    *) Add a 3-d air mass array. It seems silly in the parts of the atmosphere where there is constant pressure, but it would greatly simplify the code. Also, code that is optimized for parallel processing would be the same as code that is optimized for non-parallel processing.
    *) P**kapa and P*T**kapa are frequently recalculated. Maybe they should be in common.
    *) If we continue to group several related routines into modules, I suggest that we separate the dynamics routines from MAIN and INPUT. Do ORBIT and DAILY stay with MAIN? AVRX should be grouped with DYNAM, PGF, AFLUX, etc. Since GWDRAG and DEFORM are for the stratosphere model only, it's possible they should be on their own...on the other hand, maybe not.
    *) No unnamed files, except, perhaps, fort.99. Or maybe a rule that says no unnamed files in routines other than MAIN and INPUT.
    *) Radiation should be initialized in INPUT.
    *) Currently some versions of the model have 3 different namelists, called from 3 different routines. I'm not sure this is a good idea. Maybe all this should be done in INPUT, in which case separate namelists would not be necessary. However, this would mean the user would need to cook up a way to get the namelist info to the routine that needs it.
    *) I have a note that 'Shapiro filter should have arg. list'. I don't remember my train of thought on this.
    *) We sometimes need the surface pressure or air mass in the layers for post processing. The best we can do now is use the apj array, which is zonal. It may be a good idea to save on the acc file an IJ array of mean surface pressure as well.
    *) Apropos the remodelling project, if we are to produce documentation we should all get on the same page. For example, Gary's recent 'MODELDATA.TXT', which is fine and good for Gary, but does not speak for everyone, and does not even acknowledge the need for the information needed in a file to produce a plot from data with an irregular grid. If standards are to be adopted, they should not be unilateral. I personally dislike -999999. for missing data, and prefer something like -1.e20. In any case, the exact number for missing data should also depend on the data itself. You don't want a number that can be confused with the data (like zero, which many nincompoops use). The existence of Reto's wonderful 'fcop' program, unix commands like 'cat', and the many other file query and manipulation tools we've developed over the years make strict rules like 'Model output should be organized with separate datafiles for each climate variable' totally unnecessary. They are reminiscent of the days of MVS--the dark ages before unix. I don't know if you were planning on getting into these kinds of issues....
    *) Processing of model output: the techniques I use in the 'pd' procedure to obtain 'plot-ready' output that is fully scaled can be extended and even put 'on line', so that the model writes the files at the end of each month. Personally, I think this is very wasteful, since usually people are interested in annual and seasonal means, which are best obtained by averaging the acc files. It would be a simple matter to 'turn on' this option with a namelist array. I could work on it. The same could be done with netcdf, but that would be even more wasteful. Actually, one could replace the acc files completely with scaled output, such as pd does. The hang up is the non-linearity of many of the quantities that are printed. Do you average the individual terms over many years, or the monthly printout?

    Gavin Schmidt wrote:
    > I agree. I think we should probably retain the acc files in something
    > similar to today's format and concentrate on giving the post-processing
    > programs as much flexibility as possible (including netcdf output/seasonal
    > annual means/ regional etc).
    >
    > Since so many people have brought up issues with the diagnostics, we should
    > think about this very carefully before we make the changes. It is unlikely
    > that everyone will be satisfied, but we can but try!
    >
    > Thanks
    >
    > Gavin

    The beauty of the 'pd' process is that it is so flexible. Usually, all you need is to know the diagnostics module that was used. But certainly, making a new pd is confusing for the uninitiated, and we could work on that. netcdf could be added to pd as an option. -Jean


    From sun@venus2.giss.nasa.gov Mon May 15 14:27:40 2000
    Date: Mon, 15 May 2000 14:28:30 -0400 (EDT)
    From: Shan Sun
    To: gavin@isis.giss.nasa.gov
    Subject: Re: GCM wishlists

    Gavin, Thank you for making such an attempt! I would like to emphasize modularity and a user manual. I know it would not be practical to expect a highly complicated atmospheric model to be "point and click", but there is definitely room for improvement. NCAR CCM3 has done a lot over the years toward making it a community model, so that a person with basic Fortran knowledge can download the CCM3 code from the website and run it guided by the manual. This may or may not be our goal, given the shortage of our personnel, but how they handled this problem can be a guideline for us. For example, in CCM3, the latent heat calculation is in a subroutine where the formula can be replaced easily by the choice of a user (see the sketch below). The same applies to the convection or radiation scheme.

    A few minor suggestions: (1) some commonly used fields are not in arrays; for example, wind stresses are expressed as "RTAUUS" and "RTAUVS". (2) some subroutines, after being updated, occur twice, with both the new and old versions present, and when I compile differently (alphabetically), I may link to the old version by mistake. For example, setsur appears in both R99G and R99E, where R99G has the latest version. So the order of linking all subroutines is crucial, as the program picks up the first one and ignores the rest with the same name. I have been a victim several times, and consider this a potential hazard. I suggest eliminating all unused routines from the program.

    That is all I can think of for now. Let me know if I can help. Shan

    ----------------------------------------------------------------------------
    Shan Sun, Ph.D          NASA/Goddard Institute for Space Studies
    tel: 212-678-6031       2880 Broadway
    fax: 212-678-5622       New York, NY 10025
    email: ssun@giss.nasa.gov
    ----------------------------------------------------------------------------
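
    As a minimal sketch of the kind of isolation Shan describes (the routine name and the constant-L value here are placeholders, not CCM3 code): the latent-heat formula lives in one routine, so replacing it means editing a single file.

        ! hypothetical wrapper: all latent-heat evaluations go through here,
        ! so the formula can be swapped without touching any caller
        subroutine latent_heat_evap(t, lh)
          implicit none
          real*8, intent(in)  :: t    ! temperature (K); unused by this constant version
          real*8, intent(out) :: lh   ! latent heat of evaporation (J/kg)
          lh = 2.5d6                  ! replace with a temperature-dependent formula at will
        end subroutine latent_heat_evap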


    From mchandler@giss.nasa.gov Tue May 16 10:52:47 2000
    Date: Mon, 15 May 2000 18:52:28 -0500
    To: Gavin Schmidt
    From: Mark Chandler
    Subject: Re: GCM wishlists

    Sorry to be slow on this. Thanks for putting this stuff together.

    Documentation and standard means of working with the model: It seems to me that the most important forms of documentation required are also the most basic. These include:

    - definitions of every variable,
    - a description of the purpose of each subroutine,
    - a step map through the GCM's normal code operation (e.g. through a one month simulation) and, finally,
    - a manual describing how to actually run the model.

    Some of the above already exist, but the items need to be augmented and brought together in a consistent (and concise) format. The process becomes more involved if the purpose of this exercise includes making it easier for individuals outside of GISS to operate and alter the model (as opposed to a cleaning exercise designed to optimize model code for use by experienced GISS GCM programmers). Many operations involving the GCM are invoked by running scripts that are specific to the systems at GISS. The scripts that deal with rundecks, update decks, etc. do not translate well outside of the GISS local environment. Some of these long-time GISS-specific techniques date back to the days of the mainframe and could be replaced with more conventional methods of working with multi-routine software that is developed by many people. Simple changes would involve the use of makefiles for linking and compiling the code, while more detailed adjustments would involve employing version control software to track the many changes that are made to the GCM on a regular basis.

    Diagnostics: It would be very useful to alter the way diagnostics are saved. Three changes are needed most:

    1) The ability to alter which variables are saved, at what frequency they are collected, and over what region the variables are accumulated. For users (not developers) there should be a simple way to select various options for a specified set of standard diagnostics. By "various options" I do not mean "budget pages", JK tables, and IJ maps; rather, there should be a means by which a user could select specific, common climate variables that are then saved in a readily accessible format.

    2) Improving the current "standard" output: the standard diagnostic accumulation files (.acc files) need to include header information at the beginning of the file AND labels within the file, the purpose being to make these files much easier to work with when it comes to extracting and averaging subsets of information.

    3) In addition, it would be useful to alter the standard diagnostic ".PRT" files that are intended for the line printers. Restrictions of the line printer format make these standard files cumbersome for higher resolution versions of the model. This has, so far, been dealt with by printing maps on multiple pages or by "skipping" or averaging the grid cells that are actually reported (e.g. the budget pages for the 2x2.5 GCM report only every fourth latitude zone). Furthermore, though the files are in text format, the necessity of having control characters and extra lines in order to create "overstrikes" on the line printers makes the files illegible if viewed electronically (as opposed to printing them on the line printers).

    Other suggestions (some are wishful thinking, of course):

    - Remove excessive data statements from the code (e.g. radiation) and substitute input files.
    - Create a user-friendly (JAVA-based?) technique for running the model and for extracting basic climate variable information.
    - Have GISS scientists who work on model development use version control software so that all changes to the code are formally tracked and documented at the time that changes are made. This is the only absolute way to avoid having to repeat the current exercise again in ten years.
    - Drastically reduce the number of GO TO statements in the model, in general, and try to rewrite the radiation code to avoid using ENTRY points. These things make it very difficult for others (besides the authors) to evaluate the GISS GCM parameterizations.
    - Make sure that a "standard version" of the GCM includes all code necessary to run with any of the major ocean parameterizations used regularly at GISS (including the qflux, the qflux w/deep ocean, Gary's dynamic ocean, and the modified MOM model).
    - Create a GISS-wide database that collects all pertinent info about simulations (with a short description). The last time we had such a thing was when Reto made everyone fill out a page in a notebook describing the run BEFORE you were assigned a run number. This was a minor inconvenience that kept things far more organized.
    - Finally, make sure that changes are communicated to, and implemented in, other versions of the model (stratosphere, coupled O-A, Mars (egad!)).

    Those are the major things that come to mind. We're continuing to try and put together a version of the GCM that runs on a Mac. Model II was fairly simple, but si99 is proving to be more stubborn - the radiation code is by far the most difficult thing to move to a new platform, it seems. Bye, Mark


    From ialeinov@simplex.giss.nasa.gov Tue May 16 21:38:11 2000
    Date: Tue, 16 May 2000 21:35:35 -0400 (EDT)
    From: ialeinov@simplex.giss.nasa.gov
    To: gavin@isis.giss.nasa.gov (Gavin Schmidt)
    Cc: rruedy@giss.nasa.gov, jhansen@giss.nasa.gov
    Subject: GCM structure suggestions

    Gavin, Jim, Reto, Here are some of my thoughts on how to improve the structure of the GISS GCM program and on some of the techniques which are used to work with it. I can explain it in more detail if there is an interest. Igor

    ---------------------------------------------------------------------
    General

    The entire program should be split into several logical modules. Each module is characterized by the specific functions it performs and has its own data which is hidden from other modules. The exchange of data between the modules is performed by passing parameters to corresponding subroutines (use of common blocks for this purpose should not be allowed).

    Proposed list of modules:

    - MAIN - main program. Performs the general management: calls input/output and initialization procedures, does the main loop over the time steps. Should be very short and very simple. Basically it should look like:

        program main
          ! read parameters for current run
          call read_run_params
          ! read all input data and store it in a database
          call input_data
          ! initialize all the modules
          call init_module_timestep
          call init_module_soils
          call init_module_radiation
          ! ... call init_module_... for the remaining modules
          ! now do the main loop
          do while ( time < time_end )
            ! the following call should invoke all the subroutines
            ! which do the computations during the time step
            call time_step
            ! the following should write a restart file
            if ( some_condition ) call write_restart_file
            ! maybe write some diagnostics here
            if ( some_condition_1 ) call output_diagnostics
          enddo
        end program main

      Basically that's all that should be present in the MAIN module. All the computations should be hidden inside the TIME_STEP module and all input/output should be performed by the DATABASE module.

    - DATABASE - performs reading and writing of data files and also maintains the database of all the data. The data is read into the internal structure of ``DATABASE'' and is provided to other modules upon request. Such requests should be made when modules are initialized. All memory allocation should be performed by ``DATABASE'' so that it knows which data to dump when it is writing a restart file.

    - TIME_STEP - this is the module where all the computational subroutines are called. All the global data which has to be exchanged between the modules should be stored here. It should be requested from the ``DATABASE'' when init_module_timestep is called. All data exchange between the modules should be performed here by means of formal parameters. No COMMON blocks should be allowed.

    - computational modules - i.e. soils, radiation, atmosphere dynamics, etc. At the current stage, for most parts of the GCM such modules will be just wrappers which fill common blocks when init_module is called and copy data from formal parameters to common blocks and back when some of the module programs are called from the TIME_STEP module. Such common blocks should be gradually replaced by direct passing of parameters to subroutines or by use of global data in Fortran 90 style.

    On the format of restart files and other data files

    Some general format should be adopted for all binary data files being read / written by the GCM and data processing programs. Those files should have the structure of a simple database. One possible example of such a structure is the ij.* files which are currently used by the diagnostics utilities. The following format is proposed: each data unit (like an array) should be written as a separate record, and the structure of the record should be approximately as follows:

        | label area | description | binary data |
        +------------+-------------+-------------+

    where the fields are:

        label       - some short information describing the type and the length
                      of the data (may be just two integer numbers)
        description - human readable text describing the data, say, 80 characters
                      long (for example: ``snow water content (m)'')
        binary data - the data itself

    Such a format will have the following advantages:

    - allows easy extraction of data from any data file without looking into the code of the program which has written such a file.
    - provides an easy way to add data to the restart file while preserving compatibility with other versions.
    - simplifies writing post-processing utilities.
    - allows more flexible diagnostic output, since data can easily be added to the diagnostic output or removed from it without changing post-processing routines.

    Some notes on programming languages and other computing tools

    Compilation of the program should be done using the standard ``makefile'' approach. This will eliminate the danger of using obsolete object files. It will also make the process of compilation much more flexible in terms of specifying options, directories, libraries, etc. Some version control utilities can also be included in the makefile. I have a script which creates a Makefile from a *.R file if that can make the transition easier.

    About .U files: As far as I understand, they were introduced to save disk space, so that only the difference between the current version and the control version is stored on disk. The method which is being used now relies on specific information in the files (numbers after the 72nd position) and is non-portable and very inconvenient. It will not even work with the native Fortran 90 format (free format). I suggest that the diff / patch commands be used for this purpose, as is done almost universally in UNIX. They work with any text file and their output is portable between all UNIX systems. Also, since disk space is not as much of an issue now, I would suggest that the full text of programs be kept for the current version (and maybe for some recent versions) to eliminate possible confusion.

    I would strongly suggest that the ``upper level'' modules (MAIN, DATABASE, TIME_STEP) be written in some modern language, preferably C/C++. Fortran 90 would be a minimum requirement, since Fortran 77 doesn't support any serious data management (pointers, memory allocation, structures, global data, etc.). I want to stress that once some ``upper level'' program is written, all programs which are called from it have to be of the same level of abstraction or lower. I.e. if the MAIN program is written in C++ then all languages (Fortran 90, Fortran 77, C, C++) can be used in the package. But if MAIN is written in Fortran 90 the use of C++/C becomes much more difficult.
    ------------------------------------------------------------------------
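
    A sketch of what writing one record of the proposed label/description/data layout might look like in Fortran (the routine name and field sizes are illustrative, not part of Igor's proposal):

        ! illustrative writer for one self-describing record:
        ! | label (type code + length) | 80-char description | binary data |
        subroutine write_record(iunit, itype, descr, ndata, values)
          implicit none
          integer, intent(in) :: iunit, itype, ndata
          character(len=80), intent(in) :: descr
          real*8, intent(in) :: values(ndata)
          write(iunit) itype, ndata, descr, values   ! one unformatted sequential record
        end subroutine write_record

        ! a post-processor can read itype, ndata and descr from the front of each
        ! record and decide whether to extract or skip the data that follows,
        ! without knowing which program wrote the file.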


    From rhealy@whoi.edu Wed May 17 10:14:20 2000
    Date: Wed, 17 May 2000 07:34:24 -0400
    To: Gavin Schmidt
    From: Richard Healy
    Subject: Re: To all GCM users: GCM Wishlists - Initial response and reminder

    Gavin, This pretty much covers most of the things I wished for working with the model over the years. I might add the following:

    - The paleo year (year Before Present) as an input variable (in NAMELIST if we keep that) so that the insolation is automatically calculated.
    - Get rid of NAMELIST as an input and use a standard text file configuration so that all parameterizations are explicit and explanations of each are accessible inside the file.
    - I like the ideas about the acc and rsf files. To expand on them, I suggest creating multiple records inside the rsf and acc files, where each record contains standard header information describing the length and type of quantities in the rest of the record, so that one can read the acc/rsf file no matter what model generated it. The first record will tell how many records are in the file. Of course, strict adherence to this format would be necessary for it to work.
    - The database of models should be just that - a database. There is a free version of a SQL standard database called PostgreSQL which is easily installed on unix and can have a web interface.

    I'd like to attend this meeting. As I mentioned, I'll be coming down Memorial Day weekend and I can drop by the Tuesday after.

    Gavin, I just thought of a couple more suggestions (I've been writing up some of the modifications to the DEC version of the model).

    - Change array INDEX in DB112M9.S to something else (NDEX?). INDEX is an intrinsic function in Fortran.
    - Set up platform-specific meta blocks (#ifdef - #endif) in the code and use the precompiler option -D when compiling.
    - Use a file manager system as in the GFDL MOM 2 model for opening and closing files and put the file names inside the configuration file (with an appropriate conf name). The configuration file is the same as I mentioned in the previous email.
    - Let the tracegas constants be defined as an input parameter.

    There may be more as I think of them. -Rick


    From smenon@giss.nasa.gov Wed May 17 10:30:33 2000
    Date: Wed, 17 May 2000 09:40:02 -0400
    To: gavin@isis.giss.nasa.gov
    From: Surabi Menon
    Subject: GCM

    Hi Gavin, The list seems pretty good. One suggestion is that it would be a good idea to know what everyone does with the GCM, e.g. the users could provide 4-5 lines stating what they use it for. This could go in the documentation manual. I am quite unaware of what each person does and what modules do what. E.g. we have 5 tracers for sulfate chemistry and we have organics and sea salt sources, which we use to study the indirect effect. This would be helpful in that if people were interested in some aspect of the code we have (or vice versa), we would know that it is available right here and can be incorporated. Thanks, and good that this is being worked on. Cheers surabi

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Surabi Menon
    NASA GISS/Columbia Univ
    2880 Broadway
    New York, NY 10025
    Tel: 212 678 5592  Fax: 212 678 5552
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


    From acyxc@kirk.giss.nasa.gov Thu May 18 11:05:34 2000
    Date: Thu, 18 May 2000 11:05:47 -0400
    From: acyxc@kirk.giss.nasa.gov (Ye Cheng)
    To: gavin@isis.giss.nasa.gov

    Hi, Gavin, About the GCM documentation, perhaps we can develop the following scheme (similar to javadoc):

    (1) Keep the doc and the code in one place (in the source code). At the beginning of every subroutine, the writer of the subroutine is required to write comments in a given format, like the following:

        c/**
        c @summary general comment here
        c @author  name(s) of author(s)
        c @version versioning
        c @param   list and explain each parameter
        c @param   list continued
        c @calling list all subroutines to be called by this subroutine
        c @calling list continued
        c @other   stuff
        c*/

    In the comments, HTML tags can be used, including links. All the comments between c/** and c*/ will be retrieved by a utility (see below); other comments will not.

    (2) Develop a doc utility (call it, say, gcmdoc) to automatically retrieve the formatted comments (like the one above) from all subroutines, and turn them (together with other info) into a nicely structured HTML file. Any user can run gcmdoc on any rundeck at any time to get an updated, professional-looking HTML document, view it using a browser, and, with the built-in hyper-links, jump freely from one module to another.

    (3) Traditional documentation suffers from the problem that the code and the doc diverge over time. But the above scheme is free of this problem. Also, imposing on the writers a given format of comments will effectively (I hope) drive the writers to add more decent comments (realizing that their comments will be easily retrieved and actually used by other users).

    (4) The proposed doc utility may first look into a rundeck, get the development hierarchy, get all the modules, and all the subroutines' names, signatures and formatted comments. Put together, it will be an HTML document ready in the browser. The usage may be:

        gcmdoc B567M12

    and in a few moments, B567M12.html will be ready. When viewing it, you will first see a tree structure, indicating B567M12's parent, grand, and grand-grand parents. By clicking a link, you will see all the modules used by B567M12 (together with other resources used, like initial files), and see their relation to each other. You continue by clicking a module, then you see all the subroutines; when clicking a subroutine, you see all its parameters' definitions, which calls which, and so on.

    Regards, Ye, 5-19-00


    From kelley@giza.giss.nasa.gov Thu May 18 16:19:01 2000
    Date: Thu, 18 May 2000 16:19:00 -0400 (EDT)
    From: Maxwell Kelley
    To: gavin@isis.giss.nasa.gov
    Subject: GCM wishlist

    Hi Gavin, I finished that list, trying not to sound like too much of a complainer. You can open this in your browser, I hope.

    GCM $.02 USD
    Maxwell Kelley

    IMPLICIT NONE
    Mortal programmers such as myself are frequently burned by implicit typos. The introduction of IMPLICIT NONE could be done on a subroutine-by-subroutine basis, as part of any efforts to document them; the best way to document the model is to write self-documenting code, I think. Those who believe in the principle of parsimony should consider how much time they have to spend explaining verbally to people how the GCM works.

    There are a couple of ways to automate the tedious task of enumerating the untyped variables in a particular subroutine. The simplest method is to let the compiler do the work. Compile a subroutine using IMPLICIT NONE, and utilize the compiler's list of complaints to compile a list of untyped variables in that subroutine, which is usually stored in a file subroutine.ERR:

    f90-113 mfef90: ERROR MYSUB, File = mysub.f90, Line = 19, Column = 3 
    IMPLICIT NONE is specified in the local scope, therefore an explicit
    type must be specified for data object "X".
    
    With grep and a search-and-replace action in a text editor one can strip away all but the names of the offending variables, and then use a utility such as sort to alphabetize the names and deduce their types. That's the easy part. Then it of course takes someone who actually knows the subroutine to describe the purpose of the variables.
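
    For what it's worth, a two-line illustration of the payoff (names are invented): once IMPLICIT NONE is in place, a misspelled variable is a compile-time error rather than a silently created implicit real.

        subroutine demo_area(zlength, area)
          implicit none                    ! every variable used below must be declared
          real*8, intent(in)  :: zlength
          real*8, intent(out) :: area
          area = zlength**2                ! a typo such as "z1ength" would now be
                                           ! rejected by the compiler instead of
                                           ! quietly becoming a new, untyped variable
        end subroutine demo_area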

    air mass array
    Mortal programmers such as myself make mistakes when having to compute pij*dsig(l) all the time. It would be nice to have the GCM provide this as a courtesy to subroutines which are not changing the air masses. There is absolutely no reason for any subroutine which is not changing the air masses to know how they were computed in the first place. This would probably be advantageous if GISS ever adopted a vertical coordinate which would not allow the air mass to be computed so simply, e.g. a smoothly varying hybrid sigma-pressure coordinate instead of the current scheme which has an abrupt transition between the two at the "tropopause." It would also be nice if the array had mks units instead of millibars. I've been burned in the past when I forgot to convert between the two. Introduction of p(i,j,l) and/or pk(i,j,l)=p**kapa arrays would also be nice. RAM is large enough now that these arrays would not be a burden.
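
    A sketch of such a courtesy array in mks units, following the pij*dsig(l) relation quoted above (the routine and array names are invented, and the conversion assumes p is in millibars):

        ! sketch: fill a 3-d air-mass array (kg/m2) once per dynamics step so that
        ! physics routines never recompute pij*dsig(l) themselves
        subroutine calc_airmass(im, jm, lm, p, dsig, grav, am)
          implicit none
          integer, intent(in) :: im, jm, lm
          real*8, intent(in)  :: p(im,jm)      ! column pressure (mb) in the sigma domain
          real*8, intent(in)  :: dsig(lm), grav
          real*8, intent(out) :: am(im,jm,lm)  ! layer air mass per unit area (kg/m2)
          integer :: i, j, l
          do l=1,lm
            do j=1,jm
              do i=1,im
                am(i,j,l) = p(i,j)*dsig(l)*100.d0/grav   ! mb -> Pa, then divide by g
              enddo
            enddo
          enddo
        end subroutine calc_airmass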

    workspace
    It would be nice if important two-dimensional arrays like prec,tprec,precss,cosz existed in a sensible location, rather than in workspace common blocks. The relative cost of storing a few extra two-dimensional arrays outside of workspace becomes negligible as the vertical resolution of the model increases.

    An example of (my) confusion arising from the storage of important arrays like precss in workspace may be found in subroutine EARTH. The amount of supersaturation precipitation is a quantity passed by EARTH to the land surface routine. Although subroutine CONDSE goes to the trouble of saving the amount of supersaturation precipitation in addition to the total precipitation, it saves the precss array in a workspace common block which is subsequently overwritten in the radiation code before it can be accessed by EARTH. The latter subroutine declares an array for this purpose but the array is never accessed by CONDSE and (probably) contains nothing but zeros. Did someone decide that perhaps it was better that the land surface routine receive all convective precipitation?

    physical constants
    It would be nice if physical "constants" such as lhe,grav,stbo,twopi were made into centrally declared PARAMETERs rather than residing in common blocks or being declared (and duplicated) in individual subroutines. When I am postprocessing GCM output and need to call GEOM, it would be convenient if GEOM did not expect to find constants like twopi stored in a common block.

    There will of course be a discussion on which "constants" are actually constants. I think it is safe to say that variable "constants" will not improve the performance of the current GCM by much, excepting those "constants" having to do with snow and ice.
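
    A minimal sketch of centrally declared constants (the module name is invented; the values are the usual textbook ones and would of course be set to whatever the GCM blesses):

        ! hypothetical central constants module; routines USE it instead of
        ! declaring (and duplicating) lhe, grav, stbo, twopi themselves
        module phys_const
          implicit none
          real*8, parameter :: grav  = 9.81d0    ! acceleration of gravity (m/s2)
          real*8, parameter :: lhe   = 2.5d6     ! latent heat of evaporation (J/kg)
          real*8, parameter :: stbo  = 5.67d-8   ! Stefan-Boltzmann constant (W/m2/K4)
          real*8, parameter :: twopi = 6.283185307179586d0
        end module phys_const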

    saturated vapor pressure function
    It would be nice if a function qsat returning saturation vapor pressure (or mixing ratio) existed accessible to all subroutines, rather than having each subroutine declare their own favorite version of it.
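
    A sketch of one shared saturation function (the Magnus-type formula below is just an example; the point is that whichever formula is chosen lives in exactly one place):

        ! hypothetical shared saturation mixing ratio (kg/kg); t in K, p in mb
        real*8 function qsat(t, p)
          implicit none
          real*8, intent(in) :: t, p
          real*8 :: es                           ! saturation vapor pressure (mb)
          es   = 6.112d0*exp(17.67d0*(t-273.15d0)/(t-29.65d0))
          qsat = 0.622d0*es/(p - 0.378d0*es)
        end function qsat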

    array indexing
    In programming tracer advection I have found it necessary to optimize memory access. When performing operations upon a set of variables, the closer the variables are to one another in physical memory, the better. Consider the example of the QUS advection of a variable q(i,j,l), which in addition to the mean (zeroth order moment) carries 9 first and second order moments

    qx,qy,qz,qxx,qyy,qzz,qxy,qyz,qzx(i,j,l)

    The scheme was programmed so that the moments of a given gridbox are separated from one another in memory by im*jm*lm units. The separation grows proportionally worse when the moments for several variables such as different tracers are stored together in a common block. I reprogrammed the scheme so that the moments for a variable q are stored as:

    qmom( (/mx,my,mz,mxx,myy,mzz,mxy,myz,mzx/) ,i,j,l)

    where mx,my,mz etc. are integer indices identifying the moments. I have programmed the tracers such that the tracer index is the innermost rather than outermost index as it is currently. This leads to considerable speedup when the same operation is being carried out on all the tracers. The rearrangement generally leads to more compact code when a similar operation is being carried out upon all the moments (especially in FORTRAN 90).
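
    In declaration form, the reordering described above looks roughly like this (the resolution is illustrative; the moment names follow the text):

        ! sketch of the reordered moment storage: all nine moments of a gridbox
        ! are contiguous in memory instead of being separated by im*jm*lm units
        module somtq_reordered
          implicit none
          integer, parameter :: im=72, jm=46, lm=12          ! illustrative resolution
          integer, parameter :: mx=1, my=2, mz=3, mxx=4, myy=5, mzz=6, &
                                mxy=7, myz=8, mzx=9, nmom=9
          real*8 :: qmom(nmom,im,jm,lm)
        end module somtq_reordered

        ! an operation on every moment of box (i,j,l) now touches
        ! qmom(1:nmom,i,j,l), a single contiguous stretch of memory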

    A related issue, about which I have only an opinion: why are so many arrays (like odata,gdata,etc.) stored and referenced as

       do j=1,jm
         do i=1,im
    C
    C lots of stuff going on here...
    C and then:
           x11 = gdata(i,j,11)
           x16 = gdata(i,j,16)
    C and then do something with x11,x16
    C
    C Or, like in the convection/condensation routines:
           do l=1,lm
             cldij(l)=cloud(i,j,l)
           enddo
    C and then do something to cldij
    C and then store cldij back in cloud
           do l=1,lm
             cloud(i,j,l)=cldij(l)
           enddo
    C
         enddo
       enddo
    
    It seems to me, not knowing that much about how the cache operates on workstations, that this kind of code results in inefficient memory access and hinders parallelizations done over i,j. If gdata/cloud are always involved in loops where i,j are the outermost loop indices, then why aren't i,j the outermost indices of gdata/cloud? Most parameterizations in the gcm have to do with _vertical_ processes. I realize that it is often convenient to read or write a latitude-longitude "map" of a particular outermost index, particularly when the array in question is a diagnostic array. And the problem might be alleviated somewhat if gdata for example were not such a large all-purpose storage array for all manner of ground variables, only some of which are being accessed at a given time. But I think the benefits of parallelizing the vertical parameterizations over latitude-longitude outweigh any disadvantages of reordering the storage of certain arrays.
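
    For the cloud example above, the reordering would simply make the vertical index the fastest-varying dimension, so that each column is contiguous in memory and the i,j loops remain free for parallelization; a sketch, in the same hypothetical style as the loop shown earlier:

        C current:   cloud(im,jm,lm) -> cloud(i,j,1:lm) strides im*jm through memory
        C reordered: cloud(lm,im,jm) -> cloud(1:lm,i,j) is one contiguous column
           do j=1,jm
             do i=1,im
               do l=1,lm
                 cldij(l)=cloud(l,i,j)
               enddo
        C and then do something to cldij
        C and then store cldij back in cloud
               do l=1,lm
                 cloud(l,i,j)=cldij(l)
               enddo
             enddo
           enddo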

    pocean,poice,plice,pearth->ptype
    There is no reason why routines which do not change these numbers should have to recompute them all the time. It would be nice if an array ptype explicitly containing these numbers were created, and some global integer parameters like iocean,ioice,ilice,iearth were used to refer to the different surface types.
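
    A sketch of the suggested arrangement (the array shape and resolution are illustrative; the index names follow the text):

        ! hypothetical per-gridbox surface-type fractions, computed once per step
        module surface_types
          implicit none
          integer, parameter :: iocean=1, ioice=2, ilice=3, iearth=4, ntype=4
          integer, parameter :: im=72, jm=46                 ! illustrative resolution
          real*8 :: ptype(im,jm,ntype)                       ! fractions summing to 1
        end module surface_types

        ! SURFCE, the radiation code, etc. would then read, say, ptype(i,j,ioice)
        ! rather than recomputing the ocean/ice/land-ice/land fractions themselves.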

    NDYN is everywhere!
    Maybe this is just a matter of aesthetics, but why do all the routines have to follow the beat of NDYN? If NDYN*DT always has to equal an hour, why bother computing it? Couldn't there be a coupling timestep called DTGCM which could be an hour (or perhaps shorter as the resolution increases)? The individual routines (like SURFCE) could then choose their internal timesteps based on DTGCM, not NDYN*DT.

    MB/PB contain too many subroutines
    With different people working on different subroutines, it would be easier to incorporate their updates if they were each working on files that did not share unrelated subroutines. That way, automatic generation of rundecks would not have to employ artificial intelligence to decide the (perhaps impossible) order in which to link the object files to utilize the desired version of each subroutine. Manual inspection of rundecks to determine what version of each subroutine is being used would be somewhat faster. I don't know how many times I've looked at rundecks with about 4 different MB's and PB's, and had to go look at each update deck about 4 times to finally get straight what was used. Maybe that's just me. Similarly, makefiles (if adopted) would be somewhat more transparent.

    PRECIP/SURFCE/EARTH/GROUND
    Subroutine SURFCE needs to be split up; it is trying to do too many tasks on different levels, and with different people working on the different tasks, it has been a hassle for me to merge the changes happening in all the different tasks. Specifically, I'm talking about my hassles of having to merge various people's updates of the ice coding with updates to the PBL/turbulence scheme. This is a problem similar to MB/PB containing unrelated subroutines. SURFCE is trying to be both a high-level driver routine for the PBL and a low-level code which handles the gory details of implicit time differencing for sea/land ice etc. It is true that the PBL is closely coupled to the calculation of surface fluxes, and a short coupling timestep may be needed. But that doesn't mean everything needs to be in the same kitchen-sink subroutine. Couldn't some of the ocean/ice code in SURFCE be merged with that in PRECIP and/or GROUND?

    Meanwhile, subroutine EARTH also acts as a driver routine for the PBL code. Thus to maintain the PBL code, one has to keep track of two driver routines. It would be nice if PBL were a separate latitude-longitude subroutine, called perhaps from SURFCE, which stores latitude-longitude arrays needed to calculate surface fluxes over the different surface types. It would get its input data from, say, the gdata array. That way, one does not have to keep track of different driver code for different surface types (currently, one has to watch both SURFCE and EARTH, and if the ocean/ice code were split up, say, into OCEAN, SEAICE, LNDICE then there would be that much more driver code).

    static compilation
    I hear Jeff has worked at doing away with this necessity in the subroutines that he has parallelized.

    array bounds
    It would be nice if programmers did not refer to variables without using their names, except in very special cases when an entire common block is referred to at once with an equivalence and the common block contains too many variable names to conveniently enumerate.

    input/output
    It would be nice if, when new variables (such as tracers) are added to the reading/writing of restart files, they did not have to go on the same lines as variables having nothing to do with them. Each continuation line of the read/write statement could be given to a separate group of variables, as is done already to some extent. Also, I dislike the practice of equivalencing arrays to common blocks for succinctness of input/output code. There are very few common blocks whose variables would take more than a couple of lines to list. And I doubt we'll reach the limit of 99 continuation lines anytime soon.

    MODULE
    What is a MODULE? At the very simplest level, it is a FORTRAN 90 construct which improves upon and replaces INCLUDE files. When used only at this level, it is not an earth-shattering development in the history of programming languages. Implementation of it at this level would not change the look and feel of the GCM that much. Instead of INCLUDEing a file, a subroutine USEs a module. For example, INCLUDE 'BBxxx.COM' becomes USE BBxxx_COM. All the declarations within BBxxx.COM are now stored in a file BBxxx_COM.f90 which has been compiled into an object file BBxxx_COM.o. When a subroutine USEs the module BBxxx_COM, it has access to all the variables declared in it. In fact, any variable declared within BBxxx_COM no longer has to be stored in COMMON for its value to be preserved between subroutine calls; the module is a mega-COMMON block.

    In this sense, a module is just a glorified hybrid of INCLUDE files and COMMON blocks. So what would be the point of adopting it? At the current time, one could argue that perhaps there is no need, the same way that no one ever thought that early computer software would ever be used past the year 1999. But I think that as the GCM grows (modularly!) to encompass ever more processes, the communications between its component subroutines will grow in complexity to a point where the INCLUDE/COMMON combination will no longer be practical. Modules offer some additional features over the INCLUDE/COMMON combination which greatly facilitate the development and maintenance of code and foolproof it against those who often forget or don't know what they are doing, like me:

    • the USE statement:
      Often the communication between any two components of the GCM only involves a few key variables. Is there a way for subroutines to communicate without telling their whole life story? There may be a number of variables declared in a module you are USEing which you do not want your program to see, or perhaps you have variables with the same names and wish to avoid conflicts. You can specify exactly which variables you wish to access from a particular module, and the rest are not visible. On the other hand, if you try to USE a variable which doesn't exist in the module you are USEing, the compiler will tell you. This feature is really practical in conjunction with IMPLICIT NONE. Finally, if you wish to see variables from a particular module, but your subroutine refers to them by different names, you can easily map between them in the USE statement without employing EQUIVALENCEs. In summary, the USE statement is a robust way to pass specific information between subroutines without resorting to (long) argument lists which have to be enumerated by the calling programs as well. An additional feature of modules is that one can declare subroutines and functions within them, leading to a foolproofing capability in which variables in modules are either PUBLIC (accessible outside the module) or PRIVATE (accessible only to functions and subroutines within the module).
    • Nesting:
      One module USEs other module(s) to create a module customized for a particular purpose. For example, the quadratic upstream scheme INCLUDE file (SOMTQ.COM) declares arrays with dimensions IM,JM,LM, but does not itself define IM,JM,LM. The reason this works is that all the routines INCLUDEing SOMTQ.COM also have to INCLUDE BBxxx.COM on a previous line, even if they care only about IM,JM,LM and nothing else from BBxxx.COM. In fact, the actual quadratic upstream scheme advection code does not want to see anything in BBxxx.COM, so it has to go to the trouble of redeclaring IM,JM,LM itself. Now, if BBxxx.COM and SOMTQ.COM were made into modules BBxxx_COM and SOMTQ_COM, then SOMTQ_COM could USE BBxxx_COM to get the values of IM,JM,LM (only), and any routine which USEd SOMTQ_COM but not BBxxx_COM would still see the values of IM,JM,LM (see the sketch following this list).
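
    Building on the BBxxx_COM sketch above, the two features just described might look like this; the array and routine names are again only illustrative:

        module SOMTQ_COM
          ! nesting: pick up the resolution (and nothing else) from BBxxx_COM
          use BBxxx_COM, only : im, jm, lm
          implicit none
          real*8 :: tx(im,jm,lm), ty(im,jm,lm), tz(im,jm,lm)
        end module SOMTQ_COM

        subroutine qdynam
          ! the ONLY clause restricts what is visible; im,jm,lm are seen here
          ! through SOMTQ_COM even though BBxxx_COM itself is not USEd in full
          use SOMTQ_COM, only : im, jm, lm, tx, ty, tz
          ! the => syntax renames on import, avoiding a clash with the local q
          use BBxxx_COM, only : qmod => q
          implicit none
          real*8 :: q(im,jm,lm)            ! local q, distinct from the module's
          q  = qmod
          tx = 0d0 ; ty = 0d0 ; tz = 0d0
        end subroutine qdynam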

    There are only a few main issues that would arise if simple modules were used in the GISS model:

    • How to implement within the GISS update system. This should be pretty easy.
    • How to compile an executable when modules are used.
    • IMPLICIT REAL*8 declarations have to be made within each subroutine, since IMPLICIT statements in a module apply only inside that module and do not carry over to the routines that USE it.
    • U,V,T,P,Q are not in a named common block in the current BBxxx.COM, so one has to redeclare their dimensions when they are used as arguments to subroutines. Also, one has to delete the COMMON U,V,T,P,Q declarations. I guess these variables were not put into a named common block because then any routine using the main include file (BBxxx.COM) could not receive arguments with the names U,V,T,P,Q; with a module this conflict disappears, since a routine can simply choose not to import those names (see the sketch following this list).
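
    A sketch of how the last point is handled once BBxxx_COM is a module: the routine below USEs the module but does not import its U, so it is free to take a dummy argument with that name (the routine name and operation are hypothetical):

        subroutine avrx (u)
          use BBxxx_COM, only : im, jm, lm      ! the module's u is not imported
          implicit none
          real*8, intent(inout) :: u(im,jm,lm)  ! dummy argument may reuse the name
          u = 0.5d0*u                           ! placeholder operation
        end subroutine avrx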

    From adelgenio@giss.nasa.gov Fri May 19 09:43:08 2000
    Date: Thu, 18 May 2000 21:45:35 -0400
    To: Gavin Schmidt (by way of Sabrina Hosein)
    From: Anthony Del Genio
    Subject: Re: To all GCM users: GCM Wishlists - Initial response and reminder
    
    Gavin,
    
    One thing I didn't see on the list, but perhaps possible with a complete
    redesign:  Allow for a single code to accommodate a variety of resolutions
    (at least vertical resolutions, which may be somewhat easier than
    horizontal) with a single parameter change or at least a minimum of
    changes.  As a fallback position (for both horizontal and vertical
    resolution) include in the documentation an explicit series of instructions
    that tells the user what has to be changed where to run the model at a
    different resolution.
    
    By the way, "definitions of every variable" is a useful exercise only if
    the units are given.  Trying to figure out whether some GCM variable is in
    actual physical units, multiplied by air mass, area, etc., is one of the
    biggest time-wasters I know.  An even better idea would be to relate the
    GCM variable to the units one would encounter in the real world, for
    example, how does one go from northward transport of dry static energy,
    which actually has units of (m/s)**3, i.e., (J/kg)*(m/s), to the GCM
    diagnostic unit of watts/dsig?
    
    								Tony
    
    Anthony D. Del Genio
    NASA Goddard Institute for Space Studies
    2880 Broadway
    New York, NY  10025
    Phone:  (212)678-5588
    Fax:      (212)678-5552
    

    
    From rschmunk@giss.nasa.gov Fri May 19 17:06:06 2000
    Date: Fri, 19 May 2000 17:06:33 -0500
    To: gavin@giss.nasa.gov
    From: "Robert B. Schmunk" 
    Subject: Re: gcm wishlist
    Cc: mchandler@giss.nasa.gov
    
    Gavin,
    
    Regarding Chandler's wish
    
    >  Create a GISS-wide database that collects all pertinent info about
    >  simulations (with a short description). The last time we had such a
    >  thing was when Reto made everyone fill out a page in a notebook
    >  describing your run BEFORE you were assigned a run number. This was a
    >  minor inconvenience that kept things far more organized.
    
    This is something that I have discussed with him a number of times
    as an appropriate and, I think, relatively simple use of a database.
    I have database software installed on web1 (actually two different
    dbs!) and all that is needed is to strap a web interface around
    appropriate tables for what I have referred to as a "rundeck database",
    and then GISS staff could just fire up Netscape to access the info.
    
    This is something I could probably put together in a short timeframe;
    I only require information on what data about each rundeck/simulation
    would be useful to track. I'm not sure why Chandler and I have not
    already pursued this idea (the topic first came up at least a year
    ago).
    
    >thanks Rob. Have a look also at the ideas outlined by Ye Cheng. That is
    >a self-documenting system, apparently based on javadoc, to produce HTML
    >output. I'd like to know what you think of that also.
    
    Haven't used javadoc, but the idea sounds similar to perldoc.
    
    Let's make a couple of assumptions:
    
    a) A webserver has been installed on a machine which has access to the
        rundeck and the code.
    
    b) Ditto a dbserver.
    
    Using some nifty-neato web interface, the user could use his browser
    to get into the db and look up a rundeck/simulation ID, eg. B567M12.
    Upon clicking the appropriate link, button, whatever, the interface
    script could execute a command (gcmdoc.pl?) which extracts the comments
    and creates the necessary HTML page(s) with the info Ye Cheng describes.
    The amount of HTML generated could vary, depending on how dynamic you
    want the script and whether or not some amount of caching is going on.
    
    Assuming Perl was used to create gcmdoc, the comments that Ye Cheng
    describes as being included in the code need not contain embedded
    hyperlinks. The script could be made smart enough to figure much/most
    of that out itself.
    
    Without thinking about the topic for more than 15 seconds, the biggest
    problem with all the above that pops into my head is that since this
    all has to be on a machine which has access to the rundeck and the code,
    it has to be inside the firewall. This means anyone accessing GISS via
    the Net (as opposed to dialing up Baalbeck direct), including GISS-ters
    who opt to tele-commute, whether they be in Brooklyn or Wisconsin, would
    not be able to access the information. Perhaps some sort of very specific
    hole could be punched in the firewall for this, but I'm not sure that's
    a particularly good idea.
    
    rbs
    --
    Robert B. Schmunk: rschmunk@giss.nasa.gov
    Webmaster, NASA Goddard Institute, 2880 Broadway, New York, NY 10025 USA
    
    

    From acamh@kirk.giss.nasa.gov Mon May 22 11:28:20 2000
    Date: Mon, 22 May 2000 11:28:34 -0400
    From: acamh@kirk.giss.nasa.gov (Armando M. Howard)
    To: gavin@isis.giss.nasa.gov
    Subject: GCM code
    
    
            There was a lot of discussion about modularization of the sea ice
    some time back, but I'm not sure what the final conclusion was. Will your
    overhauled GCM program have a sea-ice routine that communicates with the
    atmosphere and ocean only by passing fluxes and surface quantities? If such
    a thing were possible I think it would be desirable.
    
            Also, after Hansen decided to eliminate partial grid boxes from the
    atmosphere as unnecessary, the issue arose that doing so makes the transfer
    of fluxes from the atmosphere to an ocean with a different grid less
    accurate. Is it possible to have, as an option, fractional ocean grid boxes
    whose sizes are adjustable, to deal with apportioning fluxes to possibly
    different ocean grids?
    
    -Armando