Saturday, September 21, 2013

Only 6 models?

I was amused by this headline in Science (and similar articles elsewhere):
 "Many researchers are excited about the nascent project, but others worry it could divert resources from existing collaborations."

We have about 6 climate models at this institute alone (MIROC 4, 4.5, and 5, NICAM, CFES, plus a couple of "next generation" models being developed on fancy grids), and no-one worries about that spreading resources too thin! That's mostly because no-one cares much about the outcome, of course. (BTW, the MIROC models are not just version changes, but full-blown divergences with separate developer teams and separate code bases.)

9 comments:

Steve Crook said...

"That's mostly because no-one cares much about the outcome, of course"

I've been reading your blog for a while and I know you're unhappy with science politics, but that sentence...

Particularly the "of course"


jules said...

As we saw the models diverging I suggested that this was perhaps not optimal, but was ignored. Since MIROCs 4, 4.5 and 5 emerged, I have asked on several occasions what plans are being made to bring them into a common code structure. Some seem to appreciate the problem, but I get the impression that fixing it is a fairly low priority, and without a concerted effort divergence will increase. The Japanese system tends towards one sensei being in charge of their own little tribe. Now that each sensei has their own version I see little hope of a reconvergence. Of course it is possible. Sometimes I have suggested things, thought I'd been ignored and nothing was happening, only to find, months or years later, that things had been quietly completed...

David Young said...

This is an issue we have in CFD too. There are lots of models and codes and more appear all the time. My own take on this is that more codes implementing essentially the same models and formulations are not that helpful. One can imagine that one could implement different subgrid models in the same code as well as different discretizations, etc. For example we have codes that have inviscid and viscous options.

The issue here that dominates management thinking is cost. One can argue that more implementations of essentially the same methods increase cost. That is true, but in an area where uncertainty is large, one can also argue that multiple sources of data help quantify the uncertainty, so it's a complex issue. I think a new model or code needs to be significantly different from existing ones to justify its creation.

Steve Crook said...

@jules

My experience is that there's never anyone around that's prepared to pay for software housekeeping. It's always going to be done tomorrow.

The end result is that gradually software fills up with cruft and year on year things get more complicated and poorly documented. Usually to the point where no-one trusts what comments there are and the original developers have left or moved on.

Changes become slow, risky and expensive and dominated by concerns over unexpected side effects of even the most trivial change. This leads to calls for a rewrite.

I've developed software for decades and I've seen it all too often.

David Young said...

Steve Crook, I agree with your observation about software. That need not be the case, however. The problem, I think, is the relentless push to "finish" something and move on to "something else." What people and managers need to recognize is that, thanks to nonlinearity, some things will never be "finished", and that they need to keep an expert staff who can maintain or rewrite code as needed for their critical applications. I keep hearing dishonest salesmanship of codes and interfaces that are supposed to make things so easy a newly hired undergraduate can get the same results as a much more highly paid specialist. A lot of IT people are ignorant of the underlying science but say these things anyway because it plays well with cost-conscious managers. It is annoying and destructive.

My own theory is that we are the victims of our own dishonest marketing. We constantly oversell our codes and models, often showing only good results while hiding the negative ones. People get a totally wrong impression. This happened in NASA and government generally with CFD in the 1990s. Now the truth is emerging, but a lot of the CFD research has been dismantled at NASA, so it's too late for this lost generation. In a lot of cases, we have only ourselves to blame. Being honest about the issues pays dividends, and rigorous education of executives is a critical part of that mission.

jules said...

These things have been well recognised in climate science for many years.

Those models that are made generally available to the wider scientific community are by necessity better organised in this way. Many models are fairly well supported with strict version control.

A few years ago GFDL shut down model development for a year or more and rewrote all its models into a single entity, so that it could gain control of its model development.

Then we have developments like the journal GMD in which incremental model development can be documented.

All this just makes it more distressing when we see our own laboratory setting out to make such basic mistakes, and that's probably why James sounds so despondent about it.

James Annan said...

To be honest, "despondent" implies a level of engagement that I no longer feel. "Resigned" would probably be more apposite...

David Young said...

Jules, yes, in weather modeling for example, the issues are very obvious and it's easy to make the case that the problem will continue to require attention. We are not as lucky.

I also share your frustration with the proliferation of codes that are quite similar in content. It is natural, I think, for Ph.D.s, who are usually smart and confident in their own capabilities, to want to write and control their own code. We have this problem too, and it takes strong leadership to keep people working within the same code framework.

Steve Crook said...

@Jules I'm pleased to hear that there are efforts to properly manage and incrementally develop models.

I'd often wondered if there was a common model programming language to make development faster and safer. A bit like R for stats programming. Or is it C/C++ and whiskery Fortran with roll-your-own routines?

@James. Resigned? It's probably better than despondent, but corrosive none the less. Chin up laddie :-)