On the use of models in meteorology

by

Chuck Doswell


Created: 09 October 2000 Updated: 14 April 2005: fixed various outdated links

This represents what is for me a typical expression of my personal opinion. It is put out on the Web in hopes of stimulating discussion. If you want to send me an e-mail about any part of this diatribe, send it to: cdoswell@earthlink.net.


 

I. Introduction

This diatribe results from an accumulation of aggravations over a period of years. I've made some attempts to develop formal publications on this and they've languished for lack of hard results to back up my unsubstantiated assertions. Hence, this should be understood not in terms of a scholarly, objective look at the facts, but as my subjective impressions and opinions. Therefore, as usual, I'll welcome comments (see above e-mail address) and potential additions and revisions to this over time.

 

a. Historical summary

In this introduction, I want to review briefly the history of modeling in the atmospheric sciences. Prior to the dawn of the 20th century, meteorology was a bit of a scientific backwater. Basic physical concepts were understood reasonably well and the contemporary physics of fluid flow had been applied to the atmosphere. That is, the basic equations of atmospheric flow had already been formulated.

It was also clear that solutions of these equations were not going to be easy to come by. As every beginning dynamics student knows, it is possible to develop analytic solutions to only a very small class of problems in atmospheric science. Those problems are defined by numerous limiting assumptions that yield primarily linear equations. The range of problems for which analytic solutions of nonlinear equations exist is even more limited. Hence, pre-20th century meteorology was a sort of patchwork quilt of isolated theoretical problems that could be solved analytically and a collection of vague empirical results that provided little insight and no systematic basis for forecasting. [See Kutzbach (1979)]

The development of telegraphy brought about a revolution, because weather observations at more or less the same time could be collected and turned into weather maps at something resembling what we now call "real time". Weather systems could be identified and tracked as they moved and evolved, so weather forecasting was becoming a real possibility. This was an important reason for what we now see as the origination of modern meteorological science: the "Bergen School" in Norway.

One of the participants in the Bergen revolution was Vilhelm Bjerknes, who was primarily a geophysical hydrodynamicist. His work was distinct from that of many of his contemporaries in its focus on applying dynamical laws to the weather. By 1904, he had formulated the weather forecasting problem in terms of physical laws, in a form that we certainly would recognize today. He was not alone in doing this, as we now know that others had done similar work, but his is the most direct connection with the Bergen School.

The next big advance was associated with Lewis Fry Richardson, who envisioned a way to solve the equations numerically. His forecasting system was essentially what we now call a "primitive equation" model, one that even attempted to include the effect of decaying vegetation! Richardson actually solved the system of finite difference equations by hand for a real case, not realizing there were some serious problems with his approach. Recently, Lynch (1999) has reexamined the forecast that Richardson attempted and shown in considerable detail just what went wrong. Basically, Richardson knew nothing about numerical instabilities, and his results were egregiously in error for a number of reasons. His computational proposals were also pretty unrealistic, so it is unlikely that anyone would ever have tried to implement his ideas even if the forecasts had been successful. Perhaps it is of some interest to note that anyone attempting to publish such results today would almost surely fail to get them accepted for publication (negative results)! Anyway, Richardson's efforts represented a watershed in thinking about the problem of weather forecasting.

The next big breakthrough came with the development of digital computers and the entrance of John von Neumann into the field, along with a number of collaborators, including Jule Charney, Ragnar Fjørtoft, John Freeman, and others. This team chose to work with a filtered system of equations rather than the primitive equations, and they also were much more knowledgeable than Richardson when it came to numerical methods. Hence, they were able to develop some reasonable forecasting results. The model they developed was essentially a barotropic model (see Charney et al. 1950) and so was pretty simple in its concepts. However, the key demonstration was the ability to solve nonlinear forecast problems by using digital computers to solve finite approximations to a continuum mathematical model of atmospheric flow. It forms the backbone of all modern numerical weather prediction models and inspired a whole new subfield of meteorology: numerical simulations of atmospheric processes that involve the "solution" of mathematical models.

Since then, we have seen an explosion in the field. In fact, the very term "dynamic" meteorology has come to be virtually synonymous with the numerical solution of mathematical atmospheric models of all sorts. As the operational NWP models have improved, they have come to be the de facto standard by which operational forecasts are judged. In combination with statistical post-processing, the output from numerical models can be used to create forecasts that are indistinguishable in form (if not in content) from those produced by human weather forecasters. After this post-processing system had been in place for a while, Leonard Snellman (1977) expressed concern about what he called "meteorological cancer" in public weather forecasting: the passing on of objective forecast guidance from the models (via the intermediary of MOS) without human intervention. Snellman's concern echoes sentiments expressed as early as 1956 by Sverre Petterssen and others.

 

b. Some personal perspectives

I (and my colleagues, notably Dr. Harold Brooks) have voiced related concerns here and here and here and here and here and here and here. It is abundantly clear that modeling of atmospheric processes and the development of solutions to those models via computers is here to stay. No one in their right mind would want to return to the days when the only solutions were analytic ones. The issue is not one of a choice between using numerical models and not using them; rather, the issue is one of how best to use them (see here for some discussion of an interesting alternative for the use of model forecasts).

Since I am writing what amounts to a diatribe, it should be obvious that I am not happy with the way we as a profession are using numerical models, on the average. Naturally, exceptions to the norm exist. As is typical in my diatribes, if the shoe doesn't fit you, relax. Therefore, I will be trying to enunciate in what follows what I think is wrong about our use of numerical models (hereafter, "models" will refer exclusively to numerical models and their solutions, and when another sort of model is being discussed, it will be identified explicitly ... e.g., a statistical model, or a mathematical model, or whatever) and to suggest what I believe to be more appropriate uses of models, in both forecasting and in research.

A common thread runs through the community of model users: when pressed, they all agree that their model is not a source of absolute truth. They profess to acknowledge the distinction between their model results and the real atmosphere. Nevertheless, their behavior often suggests strongly to me that these are mere words and their real belief is that their model results are essentially equivalent to describing the true nature of the real atmosphere. Let me say at the conclusion of this introduction that I will not welcome comments to the effect that everyone knows that the model results are not equivalent to reality. Even though that may be true in some abstract sense, it is not at all clear to me that all modelers and model users behave as if it is the truth.

Modeling has become so entrenched in our professional culture that it has become common for reviewers of papers presenting observational studies to call for "validation" of the observational results through the running of a model!! This is a grotesque inversion of the scientific method; we should not ask that observations fit some model but, rather, that the model fit the observations. My experiences along these lines obviously are a partial source of the frustration that has prompted this diatribe.

 

II. In forecasting

Numerical weather prediction has come to dominate everyone's thinking in the National Weather Service. Post-processing of the model output, notably using the regression technique called Model Output Statistics (or MOS), has produced the capability to forecast virtually any weather element, not just the variables used by the model. Note that either a gridpoint or a spectral model describes the atmosphere in terms of the variables in the governing equations: typically pressure, temperature, humidity, and wind ... occasionally, the model will also carry condensed water explicitly in some form or another. This is not necessarily the "weather", so there is some need to "translate" the model output in order to forecast the full range of "weather" as people sense it (fog, thunderstorms, etc.).
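To make the idea concrete, here is a minimal sketch of an MOS-like regression (my own toy illustration, not the operational MOS system; the predictors, station values, and numbers are all hypothetical):

```python
import numpy as np

# Hypothetical training data: each row holds model-forecast predictors
# (say, model 2-m temperature, dew point, and wind speed) valid at a station,
# and y holds the observed element we want to forecast (say, max temperature).
X = np.array([[12.0, 8.0, 3.0],
              [15.0, 9.5, 5.0],
              [10.0, 7.0, 2.0],
              [18.0, 11.0, 6.5],
              [14.0, 9.0, 4.0]])
y = np.array([16.0, 20.0, 13.0, 24.0, 18.0])

# Fit regression coefficients by least squares (with an intercept term).
A = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

# "MOS-style" forecast: apply the same equation to the latest model output.
new_model_output = np.array([1.0, 13.0, 8.5, 3.5])  # leading 1.0 = intercept
print(new_model_output @ coeffs)
```

The point is simply that the forecast element need not be a model variable at all; the regression "translates" whatever the model does carry into the element of interest.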

The system that has developed based on the models makes it possible to realize fully Snellman's nightmare of meteorological cancer: forecasts produced by computer that go out essentially untouched by human hands. The new weather data processing system (AWIPS) that recently has gone on line is specifically tailored to allow forecasters a great deal of flexibility in manipulating model data (and post-processed products), but it is severely limited in its capabilities for manipulating observed data. This is pretty easy to interpret: forecasters are being subtly (?) encouraged to look at model forecasts (and their derivatives) and discouraged from being involved with observed data.

The system is now sliding slowly toward computer-worded forecasts based on MOS and the other model output. Forecasters have long been discouraged from departing from MOS by very much ... doing so carries with it the implied threat, "You'd better not miss it, if you depart from MOS!" Now, with the implementation of AWIPS, there is less and less capability provided even to consider alternative approaches. Young forecasters, with their typical BS diploma in hand and little or no meaningful training, know only the models.

I read and listen to forecast discussions now where the total focus is on the models.


Roger Edwards says:

Want an example? This is so common it is frightening. Chuck, it took me all of 10 seconds of random clicking on AFD locations in NWX to find one: [Note ... I have altered this slightly to prevent identification of the office being quoted.]

AREA FORECAST DISCUSSION

NATIONAL WEATHER SERVICE ccc ss

300 PM XDT THU OCT 26 2000

FORECAST IS ON TRACK...NOT MUCH IN THE WAY OF CHANGE FROM PREVIOUS PACKAGE. THE HIGHLY ADVERTISED UPPER LOW IS CURRENTLY DIGGING SOUTHWARD OVER THE CENTRAL CALIFORNIA COAST. 12Z MODEL RUNS SEEM TO BE IN BETTER AGREEMENT TODAY. AVN IS MORE PROGRESSIVE WITH EASTWARD MOVEMENT OF UPPER LOW AFTER 00Z SATURDAY. THE ETA SOLUTION IS VERY SIMILAR TO THE AVN...BUT ABOUT 6 HOURS SLOWER. WILL SIDE WITH THE ETA. BOTH MODELS SHOW THE LOW OPENING UP A BIT A KICKING OUT FAST TO THE EAST AFTER 12Z SATURDAY. IN FACT...UPPER TROUGH WILL BE NEGATIVELY TILTED AS IT APPROACHES AND PASSES OVER OUR CWA. ...rest deleted


The debate is mostly aimed at the absurd exercise of trying to decide which is the "model of the day". The fact is that there are essentially no systematic procedures for deciding reliably which model to believe on any given day, in spite of most forecasters spending an inordinate amount of time trying to do so. Rather than using their time to diagnose atmospheric processes, forecasters often waste hours of their precious time in a futile effort to choose a model in which to believe. An argument can be made that a simple consensus among the models (see Fritsch et al. 2000) is probably the best choice to make, in the long run. Even better would be a consensus weighted according to model performance in similar synoptic situations ... a concept that would be challenging to implement but probably would outperform any single model pretty consistently.
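Here is a minimal sketch of the weighted-consensus idea (my own illustration; the inverse-error weighting and all the numbers are assumptions, not an established NWS algorithm):

```python
import numpy as np

# Hypothetical recent mean absolute errors (say, 500-hPa height errors)
# for three models in situations judged similar to today's.
recent_mae = {"ModelA": 18.0, "ModelB": 25.0, "ModelC": 22.0}

# Today's forecasts of some field value from each model (hypothetical).
todays_forecast = {"ModelA": 5640.0, "ModelB": 5670.0, "ModelC": 5655.0}

# Weight each model inversely by its recent error, then normalize the weights.
weights = {m: 1.0 / e for m, e in recent_mae.items()}
total = sum(weights.values())
weights = {m: w / total for m, w in weights.items()}

# Weighted consensus forecast; equal weights would recover the simple consensus.
consensus = sum(weights[m] * todays_forecast[m] for m in todays_forecast)
print(weights, consensus)
```

Equal weights recover the simple consensus of Fritsch et al. (2000); the hard part, as noted above, is defining "similar synoptic situations" objectively.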

Young forecasters are not learning how to use observations to make forecasts ... instead, they see model output sliced and diced in a seemingly infinite variety. Even "observations" come to them via incorporation in model initial conditions (as with the RUC).

The "system" seems hell-bent to prevent forecasters from adding value to the model products (including model-derived products, like MOS). This not-very-subtle pressure is working! By and large, NWS forecasters are not adding much value to the model. There are several reasons for this:

  1. Lack of education: it is absurd to believe that a B.S. degree equips a forecaster to add much to model forecasts. There is so much more to modern meteorology than there was even 10 years ago that ~30 semester hours of meteorology during a 4-year program is just scratching the surface. As I've noted elsewhere, advanced degrees don't guarantee anything, but the lack of them does!
  2. Lack of training: see my Training Rant. We presently have virtually no idea what it takes to make a good forecaster. We don't know anything about the skills and abilities forecasters need to make consistent improvements on guidance, or about what a proper training program should focus on.
  3. Lack of encouragement: as already noted, the system is suggesting in pretty clear terms that forecasters shouldn't even try to depart from model "guidance" very much. The tools for doing a proper diagnosis using observational data may not even be available on the current operational workstations.
  4. Lack of meaningful verification: the NWS as an institution has no clue how to do a proper verification, and it has some serious integrity issues to address, as well. No program of forecast improvement is possible without meaningful verification.
  5. Lack of a substantive forecast improvement program (see item #11 here): the absence of training and the inability to do useful verification mean that the NWS as an organization has no commitment to the improvement of forecasts by humans (rather than by models). Forecast improvement means research, and NWS forecasters on the bench have few if any research tools provided for them. [NWS management has no clue about research.] Furthermore, most of them have little grasp of what it takes to accomplish true forecast improvement (see below). Without a systematic approach for using the science of meteorology to improve forecasts, humans have little chance (and that chance is diminishing with time) of adding value to the ever-improving models.


Note: Forecast technique development requires a lot more than just some case studies and a few tests of the proposed ideas. A thorough treatment of the contingency table:

                      Observed event   Observed nonevent     Sum

Forecast event             N11               N12             N1•

Forecast nonevent          N21               N22             N2•

Sum                        N•1               N•2             N••

is necessary. That is, the ability of a technique to discriminate between events and nonevents is the critical issue in demonstrating the value of any proposed idea. Showing a few examples where a forecast technique worked is simply inadequate for a worthwhile test of a new idea. Maybe that suffices to convince the proposer to proceed with a thorough scientific test, but that's about all such an unscientific test is worth. With small samples, it is very easy to convince yourself one way or another regarding some idea, but the actual value of some proposed technique needs a substantive look beyond a few case studies.
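For concreteness, here is a minimal sketch (my own illustration; the counts are hypothetical) of the standard discrimination measures one computes from the full table:

```python
# Minimal sketch (not from the original essay): discrimination statistics
# from the 2x2 contingency table above, using the N_ij counts.

def contingency_stats(n11, n12, n21, n22):
    """Standard measures of how well forecasts discriminate events from
    nonevents. n11 = hits, n12 = false alarms, n21 = misses,
    n22 = correct forecasts of nonevents."""
    pod = n11 / (n11 + n21)        # probability of detection (hit rate)
    pofd = n12 / (n12 + n22)       # probability of false detection
    far = n12 / (n11 + n12)        # false alarm ratio
    csi = n11 / (n11 + n12 + n21)  # critical success index (threat score)
    tss = pod - pofd               # true skill statistic: discrimination
    return {"POD": pod, "POFD": pofd, "FAR": far, "CSI": csi, "TSS": tss}

# Hypothetical sample of 1000 forecasts
print(contingency_stats(n11=50, n12=30, n21=20, n22=900))
```

Note that the discrimination measures require the nonevents as well as the events; a handful of successful cases, by itself, fills in only one cell of the table.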


It seems pretty clear to me what the message is: human forecasters are not long for this world, at least in the public National Weather Service. It's only a matter of time before humans become mere caretakers of the equipment, intervening only in situations where the hardware and software crash. The evolution of "modern" meteorology, as shown by the actions of NWS management (as opposed to their words), is apparently in the direction of a public forecasting system that's entirely automated ... even the warnings (via the radar algorithms). Until there is a clear and substantive commitment to reverse the foregoing reasons for the diminishing capacity of humans to add value to the model forecasts, the gap between what humans produce and what the models alone produce will continue to shrink.

We are fast approaching the point where someone will ask the tough question, "Why do we need human forecasters, when they clearly are not adding very much?" Once the question is asked in the political sphere, it is only a matter of time. Unless something dramatic happens soon to change the path we are going down, I just can't see any future for human forecasters in the NWS, by no later than 2050 ... probably sooner (and perhaps much sooner).

 

III. In research

That operational forecasters might have trouble with the use of models seems to follow from the problems I've enumerated. However, I also see many examples of problems with the use of models for research, as well.

Using a mathematical model to explain some physical process has deep roots in the physical sciences. There is a certain mystery associated with the connection between the manipulation of abstract symbols according to some set of rules and the physical world: namely, that such abstract symbol manipulation can actually yield insights into reality. There is a rich history, however, that confirms the validity of the process. I won't bore you with a thorough review of the history of physical science (although some readings along these lines might well be useful, especially to prospective new scientists, i.e., students). In the first place, I probably couldn't write that history with any confidence in my deep knowledge of it. Second, only the highlights concern me here.

The early history of the connection between mathematics and science is primarily associated with the development of conceptual systems that are simple enough to have analytic solutions. The mathematics of Newtonian physics may not seem simple to many people, but the problems of Newtonian physics can often be simplified to the point of becoming linear systems that have closed-form mathematical functions as solutions. This allows a very thorough treatment of those simplified systems ... the behavior of systems with analytic solutions can be explored exhaustively and completely. When such a solution is compared with reality, everyone knows precisely what simplifications were made to arrive at it, so differences between predicted and observed behavior can be attributed to those simplifying assumptions.

I remember the first time I encountered meteorological dynamics that began with, "Consider a flat, nonrotating earth." Specifically, I recall my naïve outrage over such extreme simplifications, little realizing at the time how valuable it can be to have closed form solutions to the equations describing a simplified version of some process.

There is great skill, of course, in developing simplified systems that admit analytic solutions. The trick is to know just when, in draining away the bathwater, the baby is on the edge of going down the drain, but is not gone just yet. Done properly, the simplifications retain an important aspect of the physics of the original problem, and ignore other aspects that might be important in some other context, but for the issue at hand are not critical. This can be a challenging tightrope and I admire theoreticians who have a knack for doing this well. It's easy to make models complex, but it's much harder to write simple models that nevertheless manage to illuminate the essential physics of interest, without being clouded with complications.

The advent of computers opened up the whole array of mathematical models to the possibility of a "solution". However, there is considerable risk associated with computer simulations done with either gridpoint or spectral models:

  1. The finite equations solved by the computer are only an approximation to the mathematical models they are purporting to solve. The level of sophistication in modern models is high, but the methods used to create a numerical "solution" need not converge to the same solution that the continuum mathematics contain. There is considerable art to designing computational approaches and the solution obtained may be quite dependent on which choices are made.
  2. Finite mathematics means that the boundary conditions are a potential problem. For example, unless the geometry of the problem is closed (as with global models), the lateral boundary conditions are a serious issue. Top and bottom boundary conditions are a struggle, even for global models. Depending on the nature of the problem at hand, the assumptions one makes about the boundaries can be critical in the development of a "solution".
  3. Another, related problem is initial conditions. Given our finite observational systems, there often can be no way to be sure about the initial conditions. In more cases than we'd like to admit, the character of the solution is sensitively dependent on the details of those initial conditions. Many will recognize this as characteristic of nonlinear dynamics, "chaos", and the "butterfly effect" (see the sketch following this list). We always need to be concerned about this problem, so long as the problem we're solving is nonlinear. If the problem is linear, after all, we are likely to be able to develop closed-form mathematical solutions. It's the nonlinearity that drives us to computer solutions in the first place. The minute we get into this territory, the specter of "chaos" becomes something we need to consider seriously.
  4. Yet another related problem is subgrid-scale phenomena (or their spectral equivalents). For Eulerian models, there is a constant "leakage" of information out of the grid. As scales contract, unresolved physical processes begin to act, mediating processes on the contracted scales. This is the so-called "Turbulence Problem" ... in reality, "turbulence" is a sort of code word for "subgrid-scale events" ... on the scale of global climate, extratropical cyclones are turbulence. One man's turbulence is another man's explicitly simulated flow. The way we in meteorology treat the subgrid "turbulence" has a potentially huge impact on the results, but we all know that the existing algorithms for dealing with this (parameterizations) are not actually getting it right. It's just that in order to close the system of equations, we need some sort of treatment of "turbulence" and "viscosity" to keep the numerical simulation under control. We know that all of the existing treatments are "wrong" in some sense.
  5. Another related issue is parameterizations of all sorts. An especially troubling problem arises for grids on which deep convection must be parameterized; as of this writing, our understanding is not sufficient to permit the development of a way to simulate accurately the effects of deep convection in a model that does not treat convection explicitly. Microphysical processes are also parameterized. For any given model, there may be any number of parameterized processes, all loosely described as "physics" packages. Basically, parameterization seeks a way to describe the physical impact of a subgrid-scale process in terms of a variable (or perhaps more than one) that is explicitly predicted by the model. As an example, convective parameterizations all treat convection in terms of something explicitly forecast on the grid, but there are many competing convective parameterizations precisely because none of the existing ones is completely satisfactory. Frustratingly, there is no reason to believe that even convection-resolving models are a completely acceptable "solution" to the problem of the impacts of deep convection on the atmosphere. They may offer hints and perhaps even deep insights, but unless it can be demonstrated that the model is quantitatively right, and right for the right reasons (see the next item), there is still room for doubt.
  6. Even when doing idealized simulations, the numerical simulation model might be sufficiently different from the "true" atmosphere that the simulation has little or no applicability. In effect, the results of an idealized simulation are an exploration of the implications of the particular set of assumptions and simplifications embodied in that particular model. The trick, however, is to connect the simulation to the real atmosphere. For instance, a simulation of the atmosphere of Mars might not be all that useful for practical applications here on Earth. This means that models need to be validated: it is necessary to show that the model simulation is indeed likely to be some reasonable representation of the process one wishes to model. This can and does get very tricky. A disturbing possibility is that the simulation can be "right" (according to some accuracy metric) for the wrong reasons! The implications about the physics of the simulated process may not be anything like the actual physical processes being simulated, and it may not be very obvious that a problem of this sort even exists! Getting a good result (by some metric) is often enough in some modelers' minds to convince them that the quantitative assessments from the model output are correct. Personally, I don't think verisimilitude (i.e., does the solution look like an observed case?) is sufficient for deciding the value of an idealized simulation. Molinari and Dudek (1992) have an excellent discussion of this topic; they conclude, "the more powerful the tool, the more care is required to interpret its results."
  7. Observations cannot be validated by model simulations. Models are often the tool of choice because of a perceived lack of data for a thorough treatment of the issue at hand via an observational study. Now, however, we have the absurd situation in which the authors of observational studies often are forced to include a modeling component to convince model-blind reviewers of the validity of their conclusions. As noted, this is an inversion of the scientific method; one forces the model to fit the observations, not the other way around. It is certainly true that the use of a numerical model can be helpful in determining the validity of some concept used to explain the observations, but there is no need to validate observations!
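To make item 3 above concrete, here is a minimal sketch (my own illustration, using the well-known Lorenz equations as a stand-in for a nonlinear forecast model) of two integrations that start from nearly identical initial conditions and diverge:

```python
import numpy as np

# Toy illustration of sensitivity to initial conditions: two runs of a simple
# nonlinear system (the Lorenz equations) starting from states that differ
# by only 1e-8 in one variable.

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return state + dt * np.array([dx, dy, dz])  # simple forward Euler step

a = np.array([1.0, 1.0, 1.0])           # "control" initial condition
b = a + np.array([1e-8, 0.0, 0.0])      # tiny perturbation to the "analysis"

for step in range(1, 3001):
    a, b = lorenz_step(a), lorenz_step(b)
    if step % 1000 == 0:
        # The separation grows by many orders of magnitude before saturating.
        print(step, np.linalg.norm(a - b))
```

Nothing about this toy system is specific to the atmosphere, but it displays exactly the behavior that makes uncertain initial conditions such a serious issue for nonlinear simulations.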

I have talked about my definition of science elsewhere: briefly, I believe science to be the formulation, testing, and revision of models (of various sorts) of the natural world. The testing is against observations, and the revision is to achieve a better, more reliable fit to observations. Whereas it is possible, in doing mathematical derivations, to prove that a conjecture is correct, in science it is never possible to prove that something is correct. Rather, we can show that some conjecture is not likely to be correct, but it is impossible to prove scientific concepts to the same level of certainty that is possible in mathematics. Science is inherently about empirical testing. Mathematical models need to be shown to have been derived correctly, using the rules of the relevant mathematics. Scientific models, on the other hand, can only be proven to be wrong, in the sense that they make less than perfect predictions about behavior in the real world. Thus, virtually all scientific models are wrong, at least until they have achieved perfection (which has not yet happened for any scientific model). Of course, some scientific models (like Einstein's relativity theories, or the 2nd Law of Thermodynamics) have never been shown to fail, but at least the logical possibility of a counterexample has to be admitted. If counterexamples have not yet been found, such hypotheses are elevated to a higher stature because so many tests have been performed, tests specifically designed to ferret out any hidden flaws. Thus, the hypothesis constituting the 2nd Law of Thermodynamics is called a "Law" and not just a hypothesis because no counterexamples have ever been found. Einstein's relativity theories are called "Theories" in honor of the detailed and stringent challenges they have overcome ... in science, a "Theory" is something special, very distant indeed from the colloquial use of the term "theory" (as in barroom arguments where someone asserts that they have a "theory" about something or other).

Numerical simulation models must be shown to be relevant to the questions posed by meteorologists before their results can be considered acceptable. The key is to use the model to make testable predictions about atmospheric behavior; that is, predictions for which data exist (or can be collected) to give the model a reasonable test. It is not always easy to go from a hypothesis to a test. As Karl Popper has suggested, we recognize the beauty of an experiment by its capacity to provide direct evidence refuting a particular hypothesis. We shouldn't be designing experiments to prove our conjectures. Rather, we should be designing clever ways to refute them! If our hypotheses survive a well-crafted critical experiment designed at the outset to refute them, then we can have some confidence in their validity. How much confidence?

Here's where statistics rears its seemingly ugly head. Many modelers are unwilling to learn more than a superficial amount of statistical testing. Robert Hooke (1963, in his Preface) offers the view that the choice is not whether or not to use statistics [in science]; the only choice is whether to use good or bad statistics. In spite of its huge significance to science, my experience is that many scientists have at best a rudimentary knowledge of statistics. There is an almost universal willingness among scientists to accept small samples as representative of what large samples would show. This naïveté is disturbing in those who should know better: specifically, scientists. Modelers seem altogether too willing to see success in their simulations, when a more self-skeptical attitude would be the proper stance.
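As a toy illustration of the small-sample problem (my own construction, not anything from the forecasting literature): suppose a proposed technique is really no better than a coin flip on a set of difficult cases. With only a handful of cases, it will look impressively good a disturbing fraction of the time:

```python
import random

# Toy simulation: a "technique" that is right on each difficult case with
# probability 0.5 (i.e., no real skill). How often does it look good
# (80% or more correct) in a sample of only 5 cases versus 50 cases?
random.seed(1)

def fraction_looking_good(n_cases, trials=100000):
    good = 0
    for _ in range(trials):
        correct = sum(random.random() < 0.5 for _ in range(n_cases))
        if correct / n_cases >= 0.8:
            good += 1
    return good / trials

print("5 cases: ", fraction_looking_good(5))    # roughly 0.19
print("50 cases:", fraction_looking_good(50))   # essentially zero
```

Roughly one time in five, a skill-free technique "succeeds" on a five-case test; with fifty cases, it essentially never does. That is the whole argument for demanding real samples rather than a few favorable case studies.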

Modelers in research tend to be uninterested in anything having to do with observations. The fact that observations have their own sets of problems seems to put many modelers off the path of using observations to help establish confidence in their simulations. Sure, there are many issues with observations: sampling deficiencies in space and time, measurement errors associated with the instruments, difficulties with handling diverse data archive sources, etc. The model seems much "cleaner" in the sense that resolution is more or less uniform, all variables are available at all points (more or less), the model is based on what is known about the laws of physics, etc. It's not hard to see why many students are attracted to modeling. The problem, as I've noted, is that many modelers, if not all of them, come to see their model as "the atmosphere" rather than a model. They over-generalize their results and always see the model output as accurate, even in cases where it is demonstrably in error ... they give the model the "benefit of the doubt" and choose to suspect any contradictory data rather than the model. This is profoundly unscientific, but is more and more often seen as acceptable behavior.

 

IV. Modeling and reproducibility in research

Although it almost certainly is not possible to define precisely what the "scientific method" requires of scientists, there is some considerable agreement (at least outside of meteorology) on the importance of being able to duplicate the results obtained in a scientific experiment or calculation. Being able to duplicate another's results is an important safeguard on the integrity of scientific results. At its core, science demands that its practitioners be skeptical about the results of their colleagues; such skepticism is not a character flaw. Rather, it is a requirement if science is to proceed. For scientific ideas to be accepted, they must be associated with a convincing argument. A key element of a convincing argument is an independent confirmation of the experiment or calculation. Whenever results of an experiment or a calculation are obtained and presented to scientific peers, it is altogether reasonable to receive pointed questions from those peers about the data and methods used to obtain those results. To be asked to assume the validity of the results on the basis of faith or the authority of the presenters is unacceptable in science. Any attempt to avoid such questions could be interpreted as intellectual dishonesty and can be used as justification to reject the validity of those results. The main point of asking such questions is to assess the validity of the methods used; if someone doubts the validity of the methods, then an independent confirmation becomes necessary to establish credibility of the work in question.

In present-day meteorology, numerical modeling has become a cornerstone of our science. Numerical solution of nonlinear systems of equations permits insights that were completely inaccessible prior to the advent of computers and numerical methods. However, at the same time that computer models have grown in sophistication, their complexity has grown to the point that the community of meteorologists is losing the capability to reproduce the experiments. Consider the following:

 

a. Initial data problems

As noted earlier in this essay, nonlinear models can be very sensitive to the initial data. Reproduction of results would necessitate the widespread availability of the actual values for all the variables used to initialize the model.

 

b. Tunable parameters in the "physics"

The parameterizations inevitably include various coefficients and parameters used to relate the process being parameterized to what the model explicitly carries among its variables. Developing "realistic" simulations often involves tinkering with these values to get the desired results. A complete disclosure of all values used in this process has to be available.

 

c. Details of the numerical methods

The numerical schemes of the model can be complicated and full of details that would not be likely to appear in a published paper. A common artifice in publications is to refer the reader to the references for such things. I have found, when tracking such things down, that the references may not always be very explicit about those details, so avoiding their disclosure by this artifice can be a form of intellectual dishonesty.

 

d. Display of the output

The choice of how to present the output from the model can be very misleading. Say that, during a simulation, the pattern doesn't look anything like the observations except for some isolated time period. Displaying only the "best" results of the simulation is an especially egregious form of intellectual dishonesty. Access to all of the results of a simulation should be available. The output from a numerical model can be enormous, so the choice of what fields to show, how often to show them, and at what resolution are all choices that can be used to mask results that contravene the interpretations of the presenter.

 

e. Being "right for the wrong reasons"

At times, a simulation can produce good results (verisimilitude) even though a careful diagnosis of the results would make it clear that this happened more or less by chance. Even a blind chicken gets a kernel of corn now and again. Molinari and Dudek (1992) discuss this in a specific context, but it clearly can arise in many ways. This can be another type of intellectual dishonesty.

 

f. No benchmark capability for model development

If models of a particular sort have some sort of benchmark result, such as a linear problem for which the true solution is well known, then it is possible to test the model's ability to reproduce the true solution. A proper simulation of the benchmark can be considered a validation of the model in question. Regrettably, for nonlinear problems, "truth" is pretty hard to come by. At the very least, the scientific community should establish benchmark results associated with a particular nonlinear problem against which other models of that type can compare their results. In fact, I'm surprised that such benchmarks apparently have not been sought to any noticeable extent! It seems to me that this would be a high priority among modelers.
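As an example of the kind of benchmark I have in mind (a toy problem of my own choosing, not an established community standard): the one-dimensional linear advection equation has a well-known exact solution, namely pure translation of the initial profile, so a scheme's output can be compared directly against the truth:

```python
import numpy as np

# Toy benchmark sketch: 1D linear advection du/dt + c du/dx = 0 on a periodic
# domain, solved with a first-order upwind scheme and compared against the
# exact solution (the initial profile translated by c*t).
nx, c, dt = 200, 1.0, 0.004
dx = 1.0 / nx                          # periodic domain of length 1
x = np.arange(nx) * dx
u = np.exp(-200.0 * (x - 0.3) ** 2)    # initial Gaussian bump

nsteps = 500                           # advect to t = 2.0 (two domain transits)
for _ in range(nsteps):
    u = u - c * dt / dx * (u - np.roll(u, 1))   # upwind difference (c > 0)

t = nsteps * dt
u_exact = np.exp(-200.0 * (((x - c * t) % 1.0) - 0.3) ** 2)

rmse = np.sqrt(np.mean((u - u_exact) ** 2))
print("RMS error vs. exact solution at t = %.1f: %.4f" % (t, rmse))
print("Peak amplitude: numerical %.3f vs. exact %.3f" % (u.max(), u_exact.max()))
```

The first-order scheme noticeably damps the peak; a benchmark of this kind makes such errors quantitative rather than a matter of impressions, which is precisely what is missing for most nonlinear problems.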

 

g. Impossibility of giving manuscripts a truly thorough review

Under the current system, modeling results are disseminated in the scientific literature. In most cases, a modeling paper cannot be given a truly thorough review, for all the reasons I have listed above, as well as some I may have overlooked in this diatribe. In the absence of the ability to duplicate the results cited in the manuscript, it is basically impossible to be as thorough as a referee might wish to be. Since duplication of the results of an existing paper generally is not considered worthy of publication (not so in other sciences, of course), we have no "cultural tradition" of really stringent tests of modeling results as published.

 

h. Deterioration of the "culture" - intellectual dishonesty

My conclusion, therefore, is that in meteorological research, modeling results are not being subjected to the sort of careful review that they need. In effect, a modeling paper is almost guaranteed to be publishable, since it is unlikely any referee has the ability to gainsay the results (for all of the foregoing reasons). Perhaps only when we reach some maturity in Web-sharing will it be normal for modelers to make available on the Web (or whatever the Web evolves into) their model code, as well as the initial and boundary conditions, thereby allowing any referee to replicate all aspects of their simulations. The situation, as it stands, encourages various forms of intellectual dishonesty that slow scientific progress and generally represent a stain on our scientific integrity. [Note: I believe that observational data and calculation code also should be made available on the Web, for purposes of replication of results.]

 

V. Discussion

The field of operational meteorology has been transformed by the numerical models. The fact that I am not pleased by the course of that transformation is pretty evident. In spite of many warnings that this might happen (see the foregoing), it is a seemingly unstoppable trend. Will the public decide that a forecast product untouched by human hands is notably inferior to one produced with humans still in the process? I don't know, but the difference between that fully automated product and what we are doing now is shrinking. By not working as hard as possible to add value to NWP model forecasts, forecasters are actually hastening the day that their jobs become obsolete. What about this are forecasters not understanding? Are they gambling that they can retire before the hammer falls?

In meteorological research, the burden of proof regarding the validity of a simulation should always be on the simulation, not the observations. Verisimilitude is at best only a weak form of argument. Model results say everything about the model and its assumptions, but they may have only negligible relevance to the process being simulated. The onus should be on the modeler to provide convincing evidence of the relationship between the model and the atmosphere. The fact that the model produces neat, professional-looking fields should not obscure the fact that all of its results need to be treated with skepticism. Messy real-data analyses may not be so attractive, but they may well contain more information about the real processes than the spiffy model output.

Numerical models definitely have earned a permanent role in meteorology. That has never been in doubt, in my mind. What is troubling is the current overemphasis on modeling and the lack of interest evident in data and diagnosis of observations. This situation represents an imbalance among the triad of components in a healthy science: (1) theory, (2) observations, and (3) modeling. It has been discussed elsewhere that our science makes its most rapid advances when all these components of a successful science are in balance.

As it stands, these elements are notably out of balance. Modeling has been elevated to a high pedestal, while observations have become merely pedestrian. This has had a substantially negative effect on forecasting, and I believe that our meteorological research is being poisoned as well by this misplaced "faith" in modeling and the associated decline of interest in observational meteorology.


Bibliographical References

Bjerknes, V., 1904: Das Problem der Wettervorhersage, betrachtet vom Standpunkte der Mechanik und der Physik. (Weather forecasting as a problem in mechanics and physics). Meteor. Z., 21, 1-7.

Charney, J., R. Fjørtoft and J. von Neumann, 1950: Numerical integration of the barotropic vorticity equation. Tellus, 2, 237-254.

Doswell, C.A. III, L.R. Lemon and R.A. Maddox, 1981: Forecaster training - A review and analysis. Bull. Amer. Meteor. Soc., 61, 983-988.

______, 1986: The human element in weather forecasting. Nat. Wea. Dig., 11, 6-18.

Fritsch, J.M., J. Hilliker, J. Ross and R.L. Vislocky, 2000: Model consensus. Wea. Forecasting, 15, 571-582.

Hooke, R., 1963: Introduction to Scientific Inference. Holden-Day, 101 pp.

______, 1983: How to Tell the Liars from the Statisticians. Marcel Dekker, New York, 173 pp.

Kutzbach, G., 1979: The Thermal Theory of Cyclones. Amer. Meteor. Soc., 255 pp.

Lynch, P., 1999: Richardson's marvelous forecast. The Life Cycles of Extratropical Cyclones (M. Shapiro and S. Grønås, Eds.), Amer. Meteor. Soc., 61-73.

Molinari, J., and M. Dudek, 1992: Parameterization of convective precipitation in mesoscale numerical models: A critical review. Mon. Wea. Rev., 120, 326-344.

Monin, A.S., 1969: Weather Forecasting as a Problem in Physics. Translation (1972), MIT Press, 199 pp.

Petterssen, S., 1956: Weather Analysis and Forecasting. Volume I: Motion and Motion Systems. McGraw-Hill, 428 pp.

Richardson, L.F., 1922: Weather Prediction by Numerical Process. Cambridge University Press, 236 pp.

Schwartz, G., 1980: Death of the NWS forecaster. Bull. Amer. Meteor. Soc., 61, 36-37.

Schwerdtfeger, W., 1981: Comments on Tor Bergeron's contributions to synoptic meteorology. Pure Appl. Geophys., 119, 501-509.

Shapiro, M., and S. Grønås (Eds.), 1999: The Life Cycles of Extratropical Cyclones. Amer. Meteor. Soc., 359 pp.

Snellman, L.W., 1977: Operational forecasting using automated guidance. Bull. Amer. Meteor. Soc., 58, 1036-1044.

Thompson, P.D., 1961: Numerical Weather Analysis and Prediction. Macmillan, 170 pp.