Tuesday, January 31, 2006

Today I felt as if I was not in Japan, but in the Alabama of the 1950s. I've been made to feel less than human, like an animal

Japanese resident Steve McGowan yesterday, having just lost his case for racial discrimination after a shop-keeper refused him service for being black.
"Today I felt as if I was not in Japan, but in the Alabama of the 1950s. I've been made to feel less than human, like an animal," said McGowan, choking back tears. "This case was not just about me. With this ruling, the judge has given store owners the right to discriminate based on color."
Uniquely among the signatories to the United Nations Declaration on the Elimination of All Forms of Racial Discrimination, Japan has no laws prohibiting racial discrimination. The Government argues that such laws are not necessary, because victims can sue for damages in the civil courts. All it takes is a year or more of your life and a potentially bottomless pit of legal fees, and you might recoup your costs. But not, apparently, if the judge can find any loophole to excuse the discriminatory behaviour.

The reason Mr McGowan lost? He had claimed that the shop owner had turned him away for being black (using the term "kokujin" in Japanese, and there is apparently a tape recording in which the shop owner reiterates his views). The judge ruled that he hadn't provided sufficient evidence that the refusal was necessarily due to his being black; it might just have been because he was foreign. So, um, that's OK then.

More details and further similar horror stories can be found here.

"Stark warning over climate change"

The BBC News got itself into a tizzy recently over a "major report" which apparently said
Rising concentrations of greenhouse gases may have more serious impacts than previously believed.
and other stuff along those lines. But it turns out that this is merely the proceedings of a conference that took place a whole year ago, which itself was fairly dull in scientific terms but which had an exciting title ("Avoiding dangerous climate change") and garnered plenty of press attention due to the involvement of politicians. I didn't see anything particularly earth-shattering (or even mildly surprising, in fact) in what was presented either then or now. And in particular, the "worse than previously believed" angle is strongly misleading - the evidence seems to be that it is pretty much exactly as had been predicted some time ago. So really, there is not much of a story there.

Stoat draws exactly the same conclusions, and discusses a few more of the details wrt sea level. So that must prove we are right :-)

I was more amused by the tale of the "3 priceless Ming Vases" which were apparently broken in one fell swoop by a visitor to a museum who tripped over his shoelaces. Heard on Radio5:
Presenter: So these priceless Ming vases, are they really rare then?
Expert: Not really, in fact they are getting more common every year.
(not an exact quote, but you get the idea)

Monday, January 30, 2006

OK, don't eat it then!

Following on from this story, and the ongoing fuss over Japanese "research" whaling, I was amused by the wording in this story in the Japan Times.
Japan's inventory of whale meat, a byproduct of research whaling, has doubled in the past decade
(my italics). It goes on to say:
According to the report, the inventory was about 1,000 to 2,500 tons around 1995. It hit a low point of 673 tons in March 1998 but began to increase to reach 4,800 tons last August.
The annual catch is about 2,000 tonnes so this implies that about half the total take has remained uneaten in recent years. Given that IWC rules insist that the "byproduct of research whaling" has to be eaten, this could turn out to be an embarrassing quandary for Japan. If they try to push the consumption of whale meat, this will surely result in a lot of criticism both at home and abroad. They've tried putting it in school lunches in the past, which seems particularly crass. If they just build up the stockpile indefinitely, then that will surely look like a blatant breach of IWC rules (not that I know of any hard limit to the allowable size). But otherwise, they will have to scale back the hunt (sorry, "scientific research"), perhaps eventually to nothing. There may be too much pride and pork (of the fiscal variety) at stake to allow that to happen, though.

As I said before, whale meat simply isn't a visible product outside a few small areas of Japan. It amounts to something on the order of 0.1% of the total meat and fish consumed here.

Tuesday, January 24, 2006

Probability, prediction and verification IV: More on Bayesian vs frequentist uncertainty

Having received some correspondence relating to this post, I think it might be worth exploring the issues in a little more detail.

The question I was considering was: what does "70% chance of rain tomorrow" actually mean? Most people would probably expect that if this forecast was issued 100 times, rain would follow about 70 times. And indeed this is what the forecaster thinks (hopes). But on any particular such day, another forecaster might give a different prediction (say "90% chance of rain") and their forecasts might also work out to be accurate on average. Were they both right? What is the "correct" probability of rain?
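A small simulation may make the puzzle more concrete. This is only a sketch under invented assumptions: the two "weather regimes", their rain frequencies (50% and 90%) and the function name `simulate` are made up for illustration and do not come from any real forecast system. Forecaster A ignores the regime and always says 70%; forecaster B knows the regime and says 50% or 90% accordingly.

```python
import random

def simulate(n_days=100_000, seed=0):
    """Return the observed rain frequency for each distinct forecast issued."""
    rng = random.Random(seed)
    hits = {}    # (forecaster, forecast probability) -> number of rainy days
    counts = {}  # (forecaster, forecast probability) -> number of days issued
    for _ in range(n_days):
        # Hypothetical set-up: two equally likely regimes, with rain on
        # 50% or 90% of days respectively (so 70% of days overall).
        regime_p = rng.choice([0.5, 0.9])
        rained = rng.random() < regime_p
        # A always forecasts 70%; B forecasts the regime's own frequency.
        for key in (("A", 0.7), ("B", regime_p)):
            counts[key] = counts.get(key, 0) + 1
            hits[key] = hits.get(key, 0) + int(rained)
    return {key: hits[key] / counts[key] for key in counts}

frequencies = simulate()
for (name, prob), freq in sorted(frequencies.items()):
    print(f"forecaster {name} said {prob:.0%}: rain fell on {freq:.1%} of those days")
```

On a day in the wetter regime A says 70% and B says 90%, yet in the long run each forecaster's stated probabilities match the frequencies observed on the days they issued them.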

An analogy with number theory may be helpful. It has been shown that the number of primes less than x is approximately given by x/ln(x), where ln is the natural logarithm. Using this formula, we find there are about 390,000,000 primes between $10^9$ and $10^{10}$ (ie 10-digit numbers, of which there are $9 \times 10^9$). In other words, if we pick a 10-digit number uniformly at random, there's a 4.3% probability that it is prime. That's a perfectly good frequentist statement. If we exclude those numbers which are divisible by 2, 3 or 5 (for which there are trivial tests) the probability rises to 16.1%. But what about 1,234,567,897? Does it make sense to talk about this number being prime with probability 16.1%? I suspect that some, perhaps most, number theorists would be uneasy about subscribing to that statement. Any particular number is either prime, or not. This fact may be currently unknown to me and you, but it is not random in a frequentist sense. Testing a number will always give the same result, whether it be "prime" or "not prime" (I'll ignore tests which are themselves probabilistic here).
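For anyone who wants to check the arithmetic, here is a minimal sketch that reproduces the figures quoted above from the x/ln(x) approximation. The function and variable names are my own choices for illustration.

```python
import math

def approx_primes_below(x):
    """Prime number theorem estimate x / ln(x) for the number of primes below x."""
    return x / math.log(x)

lo, hi = 10**9, 10**10          # the 10-digit numbers are lo .. hi - 1
n_numbers = hi - lo             # 9 x 10^9 of them
n_primes = approx_primes_below(hi) - approx_primes_below(lo)
p_prime = n_primes / n_numbers

# Fraction of integers not divisible by 2, 3 or 5: (1/2)(2/3)(4/5) = 8/30.
# Every 10-digit prime survives this filter, so the probability scales up.
coprime_fraction = (1 - 1/2) * (1 - 1/3) * (1 - 1/5)
p_prime_filtered = p_prime / coprime_fraction

print(f"estimated number of 10-digit primes: {n_primes:,.0f}")
print(f"P(prime) = {p_prime:.1%}")
print(f"P(prime | not divisible by 2, 3 or 5) = {p_prime_filtered:.1%}")
```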

But does it make sense for someone to accept the validity of a probabilistic weather forecast, while rejecting the appropriateness of a probabilistic assessment about a particular number being prime? It should be clear that the answer to this is a very definite no. Both statements describe a fact which can be determined by a deterministic calculation (by digital computer or analogue atmosphere), but which is currently unknown to me. Granted, we don't know how to perform the atmosphere's calculation, but we don't need to, as it is going to do the job for us anyway. I will find out tomorrow whether rain fell or not, and I can work out whether 1,234,567,897 is prime. In fact, if the primality test takes a day to run, the analogy is a very close one indeed. It should also be clear that "70% chance of rain on Jan 25 2006" and "70% chance of rain on Jan 25 2003" are both in principle equally valid statements. I won't know whether rain fell on the 25th Jan 2006 for a couple of days, but it would probably take even longer to find out about 25th Jan 2003 (even assuming someone has kept a record for the location in question). All of these statements are Bayesian estimates of our (someone's) confidence in a particular proposition, and have no direct frequentist interpretation.

That hasn't helped us pin down what is the "correct" probability. In fact I hope that it has helped to show that ultimately there is no such thing. Just as a different forecaster might give a different probability of rain, so a different mathematician might argue that since 1,234,567,897 is at the low end of the range, a better estimate of the local density of primes is 1/ln(1,234,567,897) = 4.8%, or 18% when multiples of 2, 3, and 5 are excluded. Someone else might know how to check for divisibility by 11 (it isn't), increasing the probability still further. These more sophisticated methods will generate more skillful estimates for 10-digit numbers, in a way that can be quantified. However, someone who assigns a probability of 4.3% to all randomly-chosen 10-digit numbers in a frequentist experiment would also turn out right in the long run. That's a skill-free forecast, but a valid one nevertheless (a future post will expand on that). Essentially, it is "climatology" for 10-digit integers.
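The successive refinements can be sketched in a few lines. The variable names are mine, and the final step (scaling by 11/10 once divisibility by 11 is ruled out) is my reading of how the extra check would raise the estimate.

```python
import math

n = 1_234_567_897

# Local density of primes near n, from the prime number theorem.
p_local = 1 / math.log(n)

# Condition on n not being divisible by 2, 3 or 5 (8/30 of integers survive).
p_local_filtered = p_local / ((1 - 1/2) * (1 - 1/3) * (1 - 1/5))

# Divisibility by 11 is also cheap to check. Ruling it out removes a further
# 1/11 of the candidates, so the estimate is scaled up by 11/10.
divisible_by_11 = (n % 11 == 0)
p_after_11 = 0.0 if divisible_by_11 else p_local_filtered / (1 - 1/11)

print(f"local density:             {p_local:.1%}")
print(f"excluding 2, 3 and 5:      {p_local_filtered:.1%}")
print(f"divisible by 11?           {divisible_by_11}")
print(f"also excluding 11:         {p_after_11:.1%}")
```

Each estimate is a legitimate statement of confidence given the information used to produce it, which is the point of the analogy.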

Given that we are so used to making probabilistic statements which can only make sense with a Bayesian interpretation, it seems a little strange that people often find it difficult to accept and understand that they are doing this, instead appealing to some quasi-frequentist thought experiment. Almost every time that anyone uses an estimate of anything in the real world, it's a Bayesian one, whether it be the distance to the Sun, sensitivity of globally averaged surface temperature to a doubling of CO2, or the number of eggs in my fridge. The purely frequentist approach to probability dominates in all teaching of elementary theory, but it hardly exists in the real world.

Monday, January 23, 2006

Quote of the day

(An old article - I just happened to find it today). Having eaten my share of deep-fried pizzas during my childhood, I have to say the Mediterranean influence seems exceedingly tenuous. For any poor deprived souls who've not encountered this glorious foodstuff, look here and here. Wikipedia is disappointingly silent on the subject, only giving it a passing mention on the deep-fried Mars bar page :-) DFMB were never much more than a passing gimmick - but DFP have long been a staple of the Scottish diet.

Sunday, January 22, 2006

University of Tokyo professor fakes paper on human enzyme experiment

Wonder how much attention this fraud will get, coming hot on the heels of the Korean scandal. It seems like he's been under investigation for some time, following complaints about the reproducibility of his results in numerous papers dating back to 2003.

Yuki Daruma

Shortly after arriving in Japan, we found a postcard with a picture of our local temple (Zuisenji) in deep snow. We wondered what we had let ourselves in for, but since then in fact we've only had at most one snowy day per year. Yesterday was this year's turn, and this is what Zuisenji looked like this afternoon.

About 100 deaths have so far been attributed to snow in the western and northern regions (often people falling off their roofs or getting buried under snow while clearing them - others being crushed under the house itself when it collapses due to the weight). Here in the sunny east, conditions are nothing like as severe.

There's something about a flat expanse of snow that gives me an irresistible urge to build a snowman. In Japan, they are called yuki daruma (snow doll).

Saturday, January 21, 2006

Just eat it!

According to the BBC, the nation is gripped by the story of a lost whale stuck in the Thames. I can guess what would happen if it turned up in Japan...


Well, of course it died. Very sad and all that. I just hope they don't try to deal with it this way.

Friday, January 20, 2006

Peer Review

The Hwang stem cell thing has provoked some soul-searching about the peer review process and whether it could be improved. Kevin Vranes and John Fleck both suggest making reviewers' names public. While there is no harm in posing the question, I don't think the idea has got much going for it in general terms, and certainly cannot see how it could make a difference in this or similar cases.

For a start, it's best to be clear about what peer review does and does not (cannot) do. It does not provide any guarantee that a result is correct (this should be self-evident from the fact that new papers get published contradicting old ones). It does not even mean that anyone has checked that the work has actually been carried out as described - referees are not forensic scientists with the time and resources to do this, even if they wanted. It does however generally indicate that the described work and results seem credible and relevant. Peer review cuts out some poor research, both by actually rejecting stuff that is clearly wrong and (probably) dissuading larger numbers of people from even trying to publish work that is weak. In my experience, it also generates significant improvements to the quality of manuscripts that make it through to publication, both in terms of making them more accessible to as broad a readership as possible, and also correcting minor (even major) errors. Perhaps peer review is best thought of as a sort of "moderation" task analogous to moderators of usenet and other web-based discussion fora (eg censoring Lumo's silly comments on my blog), although it is of course performed at a much more detailed and careful level than that requires.

Reviewers are doing an unpaid job which gets very little reward - we do it mainly because we know we have to in order for the system to survive, and the possible perks (getting a peek at some results early, encouraging the authors to add a few citations of your own work) are pretty trivial in comparison. Rarely, the position of power it affords could be used to do down a rival, but given that the other referees might give glowing reviews and the Editor is likely to work out what is going on, that would be a rather risky course of action. I've had a (very) small handful of reviews that I consider unfair, but even so they cannot prevent publication, only slow it down if the author is persistent enough (ie keeps trying new journals).

Peer review can never be expected to catch deliberate premeditated fraud. That's not what it is for. And anyway, does it matter if occasionally fraudulent (and more often, flawed) papers get through? Not much, IMO. If they are wrong, and the science is important enough to matter, they will get caught out fairly soon (as in this recent case). It might be an unwelcome distraction but it's hardly the end of the world.

Removing the right to anonymity would have the single immediate effect of cutting the number of willing referees substantially. Few people would be eager to risk offending more important scientists who might be awarding them grants or jobs in the future - that's not to say that the author would necessarily (or even likely) retaliate, but why take the risk? Even when the author has no possible authority over the reviewer, there's nothing to be gained by making an enemy in the field. I've encountered a few reviewers who have signed their reviews, but I don't know if this is their general policy or if they only do it when saying nice things. I'm not sure that the latter is worth a great deal. I don't see how naming the reviewers in the Hwang case would achieve anything more than perhaps allowing us to make scapegoats out of a few people who are actually victims as much as the rest of us. They didn't ask to be sent a fraudulent paper, and it's not reasonable to expect them to have caught it.

On a slightly different tack, some EGU journals have an open discussion phase, where as well as a formal peer review, there is an interval during which the paper is put on the web and anyone can comment (HESS and CP). It's an interesting idea but doesn't seem to have caught on widely. And their system of using PDF documents makes it incredibly tedious to follow. The reviews there also seem to be published, and many are signed, but there is no compulsion (at least for the latter). There has been the suggestion that openly publishing reviews may generate a "me too" syndrome where later comments merely echo the first rather than providing an independent perspective. Perhaps there's room for some fine-tuning on that.

Another idea that's doing the rounds, that I am much more sympathetic to, is that authors should "detail their specific contributions to the research submitted" (Science's new policy). Check the accompanying example too. Some journals have encouraged this for some time, and I can see how it might act as some sort of an extra incentive to honesty if the fraudulent author has to specifically claim a particular bit of the work as their own rather than hope to deflect blame onto the whole group. I've never bothered with this procedure in the past - since I rarely have more than about 1 co-author there seems little point - but it might help to discourage one or two who seem overly cheeky about claiming co-authorship when they have at best a tenuous link to the work, and also result in those who deserve it being properly credited.

Wednesday, January 18, 2006

Prometheus: Myanna Lahsen's Latest Paper on Climate Models Archives

Roger Pielke Jnr has posted up excerpts from an interesting paper by Myanna Lahsen on climate modellers and the "trough of uncertainty". She paints a persuasive picture of modellers sometimes having an unhealthy level of belief in their models, and overselling their confidence in their results for a number of reasons. I'm sure there's some truth in that, at least as a counterweight to the existing paradigm that those closest to model-building are most aware of the warts (hence "trough", with the modellers and alienated critics being most sceptical, and the poor users being overly credulous).

There are also, of course, many users in the prediction end of the field for whom the models are explicitly considered as being merely an uncertain source of information about possible futures, and nowhere near to being a crystal ball (though we may still use "the ocean" as shorthand for "the modelled ocean"). Moreover, assessments like the IPCC necessarily spread the ball of uncertainty to include a wide range of perspectives (whether or not you think it should be even broader, there is no question that it is much wider than any one person's view). It is also worth mentioning one recent notable occasion when disagreement between models and data was essentially resolved in favour of the models. It would not, I believe, be at all reasonable to conclude from her work that all climate science is massively oversold (I can hear the septics sharpening their pens), but a healthy dose of rational scepticism is generally useful.

One thing Roger didn't feature is her comments about meteorologists, which may be interesting to those who have noticed the unreasonable level of hostility that certain American State Climatologists have shown towards climate modelling:
Synoptically trained empirical meteorologists have particular motivation to resent models. Their methods and lines of work were in important part replaced by global numerical models. The environmental concern about human-induced climate change, and the associated politics, also favored the GCMs and those working with them. The applied aspect of these meteorologists’ work was thus being taken over by numerical weather forecasting, pushing them in the direction of basic research. Their comments should be understood as potentially interested instances of boundary-work (Gieryn, 1995) whereby they, as a competing scientific group, seek to promote the authority of their own lines of research in competition with GCMs. This placed them at a competitive disadvantage when national funding priorities changed in favor of research with obvious social benefits, whereas GCM modeling seemed relevant to predicting future manifestations of human-induced climate change.
The emergence of numerical techniques also represented a loss in epistemic status as well as funding for the empirical meteorologists. So called ‘objective’ numerical methods resulted in the demotion and relabeling of their more qualitative approach as ‘subjective’, an unattractive label in the context of a cultural preference for ‘hard’ science within the scientific community.
Read the whole paper - with a sceptical mind :-)

Tuesday, January 17, 2006

Sexism in science

A debate has exploded at Cosmic Variance (here and here) recently over the issue of gender discrimination in science. I like to dip into their blog occasionally to remind myself of a carefree life back in the distant past when I was just doing some fun maths which had no relevance to the real world :-)

Anyway, on the one side are the idealists who deny that there can be any inherent differences between the sexes (better make that inherent differences in intellectual ability, I think even physicists will have noticed some differences), and on the other side there are the idiots who assert that women simply aren't clever enough and should take up sewing instead. As I've insulted both sides, the attentive reader will hopefully have concluded that I think there is probably some component of both innate aptitude differences and discrimination involved, but while there seems little support for the premise that the genders are necessarily indistinguishable in every detail mentally speaking, I definitely wouldn't want to downplay the drip-drip effect of continual minor differences in treatment, even when the most blatant and overt discrimination is eliminated. [That was really clumsily written. I hope its meaning is not too badly obscured.]

Working in the same lab, at the same level, as my wife for the past decade has given us a fair amount of anecdotal evidence for this. I offer here one case as an example. It's a fairly minor matter - no one would have been clumsy or nasty enough to do anything really blatant - but also rather typical.

Some time ago (back in the UK), we both had a minor spat with some guy from the admin department in Head Office (who we had never met) who used to send out emails as Word documents. All the scientists were unix users, and this was back in the days when reading such a message required us to (a) put the message on a file server (b) boot up the old dusty PC that no-one ever used (c) copy the file over (d) boot up Word and read it (e) swear loudly on discovering that the message was just some worthless admin-type notice that didn't need to be sent, let alone read. We had both previously tried to encourage said admin person to just send his messages in plain text, to no avail ("Word is the lab standard. You have a copy available"). Eventually we both sent rather aggressive messages - jules' was rather snarky (no-one who knows her will be surprised to hear) and mine was downright rude (ditto). Nothing more was heard about it - or so we thought.

At the end of the year, we got our annual performance appraisal forms. As the name suggests, this is the official judgement on our performance over the year, which determines our pay rise and promotion prospects. One of us (do I need to say who?) had a complaint about our behaviour noted on their form, citing this email exchange. We don't actually know whether the admin guy had only complained about jules, or whether my boss had laughed off the complaint while hers had taken it more seriously, but either way, it was clearly not an objectively fair judgement, especially since I'd definitely been ruder.

Of course, there have also been numerous occasions in meetings when she would say something, and get ignored, and I'd repeat it, only to have the room fall silent and everyone agree it was just what was required. Sometimes it's so blatant it's funny. I can try to put some of it down to my opinion being rightfully judged more valuable than hers, but a head can only swell so much.

Being in Japan changes everything. Here, we are foreigners first, second and third, and gender is probably a distant 4th or lower in determining how people treat us. In fact jules says she sometimes finds it a bit scary to be taken so seriously, and has to be careful about what she says in case everyone believes it :-) Things seem rather different for the natives, but that's not a battle we can play any part in.

Probability, prediction and verification III: A short note on verification criteria

Forecast verification is the act of checking the forecast against the observed reality, to see how good it was. The basic question we attempt to answer is "Was the forecast valid?" Note that this is a distinct question from "How skillful is the forecast?" I'm going to do a more general comment concerning the verification issue particularly in relation to climate prediction, but I'll start with a minor digression which has arisen through Roger Pielke Snr's recent comments on his blog.

Firstly, it turns out that when he talked about the models' "skill" (or lack thereof), he wasn't actually using the term in its standard form (a comparison with a null hypothesis). In fact what he was talking about seems more akin to the (related but distinct) validation issue. The questions he was addressing are along the lines of "does the model output lie within the observational uncertainty of the data?" The purpose of this note is to show why this is an invalid approach.

I'll denote the truth by t, the observations o and the model m = t+d where d is some (unknown, hopefully small) difference between model and reality. The mean square difference between model and observations is given by

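$$E\big((m-o)^2\big) = E\big((t+d-o)^2\big) = E\big((t-o)^2\big) + 2E\big((t-o)d\big) + E(d^2) = E\big((t-o)^2\big) + E(d^2)$$

where E() is the expectation operator (the average over many samples). The cross term 2E((t-o)d) drops out in the last step: it is zero because the measurement error is unbiased (zero mean) and is taken to be uncorrelated with the model error d.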

Now, the mean squared observational error is equal to $E((t-o)^2)$ by definition. But $E(d^2)$ in the above equation can never be negative. So we have shown that the RMS difference between model and observations is necessarily greater than or equal to the RMS error on the observations, with equality holding if and only if the model is perfect (d=0, m=t). In the real world, that means that this "test" always automatically rejects every model! That may be convenient for a sceptic, but it is hardly scientifically interesting. By this definition, "skill-free" is simply a pejorative synonym for "model". And this does not just apply to climate models, but to any model of any real system, anywhere, no matter how skillful or accurate it has proved itself to be in the conventional sense.
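The inequality is easy to check numerically. The following is a sketch with synthetic data only: the error standard deviations, the range of "true" values and the sample size are arbitrary assumptions chosen for illustration, and both errors are drawn independently with zero mean.

```python
import random

rng = random.Random(42)
n = 100_000
sigma_obs = 1.0    # assumed standard deviation of the measurement error
sigma_model = 0.5  # assumed standard deviation of the model error d

sum_mo = sum_to = sum_d = 0.0
for _ in range(n):
    t = rng.uniform(0.0, 30.0)          # the "truth"
    o = t + rng.gauss(0.0, sigma_obs)   # unbiased observation of the truth
    d = rng.gauss(0.0, sigma_model)     # model error, independent of obs error
    m = t + d                           # the model
    sum_mo += (m - o) ** 2
    sum_to += (t - o) ** 2
    sum_d += d ** 2

ms_model_obs = sum_mo / n     # estimate of E((m-o)^2)
ms_obs_error = sum_to / n     # estimate of E((t-o)^2)
ms_model_error = sum_d / n    # estimate of E(d^2)

print(f"E((m-o)^2)          = {ms_model_obs:.3f}")
print(f"E((t-o)^2)          = {ms_obs_error:.3f}")
print(f"E((t-o)^2) + E(d^2) = {ms_obs_error + ms_model_error:.3f}")
```

Even though this model is more accurate than the observations (its error is half the size), its mean square difference from the observations still exceeds the observational error, so it would fail the "test".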

In fact, the correct test is not whether the model output lies within the uncertainty range of the observations, but whether the observations lie within the uncertainty of the model forecast. Eg, given a temperature prediction of 12±2C, an observed value of 10.5C validates it, and that doesn't depend on whether your thermometer reads to a precision of 0.5C or 0.1C or anything else.

[Note to pedants: observational error can play a role in verification, if it is large enough compared to the forecast uncertainty. Eg given a forecast of 12±1C, an observation of 10C with an uncertainty of 2C does not invalidate the forecast, because the true temperature might well have been greater than 11C. But that's probably an unnecessary level of detail here.]
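As a rough sketch, the validation test and the pedants' refinement can be written as a single check. The function name and the treatment of each ± figure as a hard interval half-width are simplifying assumptions of mine; a real verification would work with probability distributions rather than sharp bounds.

```python
def forecast_validates(forecast, forecast_unc, observed, obs_unc=0.0):
    """True if the observation (allowing for its own uncertainty) is
    consistent with the forecast interval forecast +/- forecast_unc."""
    return abs(observed - forecast) <= forecast_unc + obs_unc

# Forecast 12 +/- 2C, observed 10.5C: validates whatever the thermometer precision.
first_case = forecast_validates(12.0, 2.0, 10.5)

# Forecast 12 +/- 1C, observed 10C with an uncertainty of 2C: not invalidated.
second_case = forecast_validates(12.0, 1.0, 10.0, obs_unc=2.0)

# The same forecast against a precise observation of 10C: invalidated.
third_case = forecast_validates(12.0, 1.0, 10.0)

print(first_case, second_case, third_case)
```

Note the asymmetry: improving the thermometer can only make the forecast easier to reject, never harder, which is as it should be.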

Lovelock in the Independent

Via Stoat, I find that Lovelock is getting himself some press coverage for his new book. His article is full of lots of alarmist nonsense, including the gem
before this century is over billions of us will die and the few breeding pairs of people that survive will be in the Arctic where the climate remains tolerable
Some quantitative estimates too:
as the century progresses, the temperature will rise 8 degrees centigrade in temperate regions and 5 degrees in the tropics
I don't know what planet he's living on, but these estimates are ridiculous. The globe will probably warm by about 2-3C in the next century, with oceans and tropical areas generally warming less than the average, and land and northern latitudes warming more. Something in the region of 8 degrees warming by the end of the century might be about right for the north pole, but not for the UK. 5C in the tropics is simply make-believe.

I hope that intelligent readers will see this for what it is - a plug for his new enviro-horror fantasy thriller, and not a scientifically meaningful comment any more than the execrable Crichton. It's a shame to see formerly-respected scientists "go emeritus" (see here for another) but his past achievements do not immunise him from criticism.

Saturday, January 14, 2006

Probability, prediction and verification II: Forecast skill

Forecast skill is generally defined as the performance of a particular forecast system in comparison to some other reference technique. For example, from the AMS:

skill—A statistical evaluation of the accuracy of forecasts or the effectiveness of detection techniques. Several simple formulations are commonly used in meteorology. The skill score (SS) is useful for evaluating predictions of temperatures, pressures, or the numerical values of other parameters. It compares a forecaster's root-mean-squared or mean-absolute prediction errors, Ef, over a period of time, with those of a reference technique, Erefr, such as forecasts based entirely on climatology or persistence, which involve no analysis of synoptic weather conditions:
SS = 1 - Ef/Erefr
If SS > 0, the forecaster or technique is deemed to possess some skill compared to the reference technique.
(The UK Met Office gives essentially the same definition.)

For short-term weather forecasting, persistence (tomorrow's weather will be like today's) will usually be a suitable reference technique. For seasonal forecasting, climatology (July will be like a typical July) would be more appropriate. Note that although skill is essentially a continuous comparative measure between two alternative forecast systems, there is also a discrete boundary between positive and negative skill which represents the point below which the output from the forecast system under consideration is worse than the reference technique. When the reference technique is some readily available null hypothesis such as the two examples suggested, it is common to say that a system with negative skill is skill-free. But as IRI states, it is also possible to compare two sophisticated forecast systems directly (eg the skill of the UKMO forecast using NCAR as a reference, and vice versa). In that case of course it would be unreasonable to describe the poorer as skill-free - it would merely be less skillful.
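Here is a minimal sketch of the skill score SS = 1 - Ef/Erefr using persistence as the reference technique. The temperatures and the "model" forecasts are invented numbers purely for illustration, and the function names are my own.

```python
import math

def rmse(predictions, observations):
    """Root-mean-squared difference between predictions and observations."""
    total = sum((p - o) ** 2 for p, o in zip(predictions, observations))
    return math.sqrt(total / len(observations))

def skill_score(forecast, reference, observations):
    """SS = 1 - Ef/Erefr; positive means the forecast beats the reference."""
    return 1.0 - rmse(forecast, observations) / rmse(reference, observations)

# Invented daily temperatures (C), for illustration only.
observed = [10.0, 12.0, 11.0, 14.0, 13.0, 15.0]

targets = observed[1:]        # the days being forecast
persistence = observed[:-1]   # "tomorrow will be like today"
model = [11.5, 11.5, 13.0, 13.5, 14.0]   # hypothetical model forecasts

ss_model = skill_score(model, persistence, targets)
ss_persistence = skill_score(persistence, persistence, targets)

print(f"skill of model relative to persistence: {ss_model:.2f}")
print(f"skill of persistence relative to itself: {ss_persistence:.2f}")
```

Both the forecast and the reference are available before the event, and both are scored against the same observations afterwards. Swapping in a different reference (climatology, or another forecast system) changes only the second argument.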

With that in mind, let's turn our attention to what Roger Pielke Snr has to say on the subject of forecast skill in climate modelling. Recently, he published a guest post from Hank Tennekes, about which more later. But first, his assertions regarding model skill in the comments caught my eye.

Roger repeated his claim that the climate models have no skill (an issue he's repeatedly raised in earlier posts). Eli Rabett asked him to explain what he meant by this statement. I found his reply rather unsatisfactory, and you can see our exchanges further down that page.

Incredibly, it turns out that Roger is claiming it is appropriate to use the data themselves as the reference technique! If the model fails to predict the trend, or variability, or nonlinear transitions, shown by the data (to within the uncertainty of the data themselves) then in his opinion it has no skill. Of course, given the above introduction, it will be clear why this is a wholly unsuitable reference technique. This data is not available at the time of making the forecast, and so cannot reasonably be considered a reference for determining the threshold between skillful and skill-free. In fact, there is no realistic way that any forecast system could ever be expected to match the data themselves in this way - certainly, one cannot expect a weather forecast to predict the future observations to such a high degree of precision. Will the predicted trend of 6C in temperature (from 3am to 3pm today) match the observations to within their own precision (perhaps 0.5C in my garden, but I could easily measure more accurately if I tried)? It's possible on any given day, but it doesn't usually happen - the error in the forecast is usually about 1-2C. Will the predicted trend in seasonal change from January to July be more accurate than the measurements which are yet to be made? Of course not. It's a ridiculous suggestion. According to his definition, virtually all models are skill-free, all the time. By using an established term from the meteorological community in such an idiosyncratic way, he is misleading his readers, and with his background, it's hard to believe that he doesn't realise this.

I've asked him several times to find any examples from the peer-reviewed literature where the future data are used as the reference technique for determining whether a forecast system has skill or not. There's been no substantive reply, of course.

For a climate forecast, I think that a sensible starting point would be to use persistence (the next 30 years will look like the last) as a reference. By that measure, I am confident that model forecasts have skill, at least for temperature on broad scales (I've not looked in detail at much else). And as you all know, I've been prepared to put money on it.


Well, I thought that Roger was coming round to realising his error, but in fact he's just put up another post re-iterating the same erroneous point of view. So I'll go over it in a bit more detail.

A skill test is a comparison of two competing hypotheses - usually a model forecast and some null hypothesis such as persistence - to see which has lower errors. Every source I have seen (AMS, UKMO, IRI) uses essentially the same definition, and Roger specifically cited the AMS version when asked what he meant by his use of the term. To those who disingenuously or naively say "Without a foundation in the real world (i.e. using observed data), the skill of a model forecast cannot be determined", I reply - of course, that is what "lower errors" in the above sentence means - the error is the difference between the hypothesis and observed reality! I'll write it out in more detail for any who still haven't got the point.

Repeating the formula above:

SS = 1 - Ef/Eref

where Ef = E(m-o) is the RMS difference between model forecast m and observations o, and Eref = E(r-o) is the RMS difference between reference hypothesis r and observations o. Do I really need to spell out why one cannot use the observations themselves as the reference hypothesis? Do I actually have to write down that in this case the formula becomes

SS = 1 - E(m-o)/E(r-o) = 1 - E(m-o)/E(o-o)?

I repeat again my challenge to Roger, or anyone else, to find any example from the peer-reviewed literature where the target data has been used as the reference hypothesis for the purposes of determining whether a forecast has skill. I am confident that most readers will draw the appropriate conclusion from the fact that no such example has been forthcoming, even if they've never encountered a divide-by-zero error.
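For anyone who would rather see the arithmetic run than read it, here is a minimal sketch of the skill score calculation. The numbers are made up purely for illustration (they are not real observations or model output), and the function names are my own. It compares a model against a persistence reference, and then tries the observations themselves as the reference.

```python
import math

def rms_error(forecast, observations):
    """Root-mean-square difference between a forecast and the observations."""
    return math.sqrt(
        sum((f - o) ** 2 for f, o in zip(forecast, observations)) / len(observations)
    )

def skill_score(model, reference, observations):
    """SS = 1 - E(m-o)/E(r-o); a positive value means the model beats the reference."""
    return 1.0 - rms_error(model, observations) / rms_error(reference, observations)

# Made-up illustrative numbers (NOT real data): temperature anomalies in degrees C.
observations = [0.30, 0.45, 0.50, 0.70]
model = [0.35, 0.40, 0.60, 0.65]
persistence = [0.10, 0.10, 0.10, 0.10]  # "the future will look like the past"

# Sensible reference: information that was available when the forecast was made.
ss_persistence = skill_score(model, persistence, observations)
print(f"Skill relative to persistence: {ss_persistence:.2f}")

# Using the observations as their own reference makes E(o-o) exactly zero.
try:
    skill_score(model, observations, observations)
    obs_as_reference = "defined"
except ZeroDivisionError:
    obs_as_reference = "undefined (division by zero)"
print(f"Skill relative to the observations themselves: {obs_as_reference}")
```

With a persistence reference the score is an ordinary number between 0 and 1 for this toy case. With the observations as the reference, the denominator vanishes and no forecast, however good, can be assigned a score at all.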

Friday, January 13, 2006

Probability, prediction and verification I: Uncertainty

I'm going to spend some time addressing some issues in probabilistic climate prediction which have been bouncing around a few other blogs lately. I'll start with a comment about uncertainty.

Uncertainty can be broadly split into two categories: aleatory, and epistemic. The former is the sort of inherent randomness that cannot be reduced by improved measurements, such as the outcome of a (mythical?) "fair coin toss", or the time to decay of a radioactive atom. The latter is the uncertainty that relates to our ignorance of the system, be it due to limited observations, a lack of understanding or approximations and errors in our models. We can reasonably hope that this uncertainty can be reduced by increasing our knowledge in various ways.

Almost all elementary probability theory is presented in terms of the frequentist approach - random coin tosses, or repeated samples from a well-defined distribution, or similar. In practical problems, however, almost all our uncertainty has an epistemic component. Even the fair coin toss could perhaps be predicted, if one observed the initial trajectory with sufficient precision. In fact it might not be too much of an exaggeration to say that aleatory uncertainty is limited to maths problems such as describing the pdf of the number of heads in 10 tosses of a fair coin. Epistemic uncertainty, in contrast, is near-ubiquitous. For instance, what is climate sensitivity? Obviously this is not an intrinsically random variable - merely an imperfectly known one.

The frequentist interpretation of probability only really applies to aleatory uncertainty. The pdf of the number of heads in 10 coin tosses can be estimated by repeated samples of 10 coin tosses, forming a histogram of the number of heads. Arbitrary precision can be achieved by increasing the number of trials (not the cleverest way of solving this particular problem but never mind about that). For climate prediction, we have one planet, and even though Monte Carlo methods and "perturbed physics ensembles" have a pseudo-frequentist appearance, we must not lose sight of the fact that the underlying answer to "what is climate sensitivity" is actually a single real number, not a distribution. The distribution is merely an artefact of our current ignorance, and we might hope that it will converge in the (near) future. So in practice climate scientists generally (universally?) adopt an explicitly Bayesian approach to estimation. Note also that the Aleatory Probability page on Wikipedia redirects to Frequency Probability, but Epistemic Probability leads to Bayesian Probability.
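The coin-toss example is easy to try out. The sketch below (my own illustration, with hypothetical function names and an arbitrary random seed) estimates the distribution of the number of heads in 10 tosses by brute-force repetition, and compares it with the exact binomial answer, which is the "cleverer" way alluded to above.

```python
import random
from math import comb

def exact_pmf(n_tosses=10):
    """Exact binomial probabilities of k heads in n_tosses fair tosses."""
    return [comb(n_tosses, k) / 2 ** n_tosses for k in range(n_tosses + 1)]

def sampled_pmf(n_trials, n_tosses=10, seed=0):
    """Frequentist estimate: histogram of head counts over repeated trials."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    counts = [0] * (n_tosses + 1)
    for _ in range(n_trials):
        heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
        counts[heads] += 1
    return [c / n_trials for c in counts]

exact = exact_pmf()
for n_trials in (100, 10_000, 100_000):
    estimate = sampled_pmf(n_trials)
    worst = max(abs(e - x) for e, x in zip(estimate, exact))
    print(f"{n_trials:>7} trials: largest error in any bin = {worst:.4f}")
```

The sampling error typically shrinks as the number of trials grows, which is exactly what is not available for climate sensitivity: there is no way to draw another planet from the hat.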

In weather prediction, a quasi-frequentist interpretation looks possible at a first glance. A forecast that says "70% chance of rain tomorrow" actually means "we think that of the N times we make this statement, roughly 0.7N times will have rain". However, there is nothing magical about this particular assignment of probabilities to these N days. A better forecast system might segregate the N days into 0.5N forecasts of "90% chance of rain" and 0.5N of "50% chance of rain". A badly calibrated forecasting system might say "50% chance of rain" for all of them :-) In each case, tomorrow's weather is actually a deterministic event, entirely determined by the current atmospheric state, and it's either going to rain or it isn't. The decision to assign a 70% probability to a particular forecast cannot be based on any fundamental randomness in the atmospheric system (because there is none), and the hypothetical "probability distribution of the current atmospheric state" from which a forecasting system attempts to predict the future does not exist as some physical reality, but only as a useful theoretical construct to describe our ignorance.
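One way to put numbers on "better" and "badly calibrated" here is the Brier score (mean squared difference between the forecast probability and the 0/1 outcome, lower being better). That is my choice of yardstick for this sketch rather than anything the argument above depends on. The sketch assumes that it really does rain on 70% of the N days, and that the 90%/50% system is itself well calibrated.

```python
def expected_brier(groups):
    """Expected Brier score for groups of (share_of_days, forecast_prob, true_rain_freq)."""
    total = 0.0
    for share, forecast, rain_freq in groups:
        # Rainy days contribute (forecast - 1)^2, dry days contribute (forecast - 0)^2.
        total += share * (rain_freq * (forecast - 1.0) ** 2
                          + (1.0 - rain_freq) * forecast ** 2)
    return total

# Assumption for illustration: rain falls on 70% of the N days overall.
flat_70 = expected_brier([(1.0, 0.7, 0.7)])                   # "70%" every day
sharper = expected_brier([(0.5, 0.9, 0.9), (0.5, 0.5, 0.5)])  # half "90%", half "50%"
flat_50 = expected_brier([(1.0, 0.5, 0.7)])                   # "50%" every day

print(f"70% every day:     {flat_70:.2f}")
print(f"90%/50% split:     {sharper:.2f}")
print(f"50% every day:     {flat_50:.2f}")
```

Under those assumptions the three systems score 0.21, 0.17 and 0.25 respectively. All three are describing the same N days of weather; the probabilities differ only because the forecasters know different amounts about them.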

Occasionally one finds childish rants on the web about how "Bayesian probability is not science", but it would probably be less wrong to say that science can only use Bayesian probability, and a purely frequentist approach is limited to mathematical problems in textbooks. I am sure that despite his comments, the author of the linked rant makes use of the weather forecast :-)

Methane emissions from plants

Time to raise the tone with a return to scientific matters. In particular, this recent paper has caused quite a stir: Methane emissions from terrestrial plants under aerobic conditions : Nature. Believe it or not, our institutional subs have lapsed entirely, because our admin department is not capable of anticipating the end of the year, so I've only seen bits of it. Nevertheless, the message seems clear: plants seem to generate significant quantities of methane which will, unless this research is overturned, require some adjustment to our understanding of the methane balance in the atmosphere (in particular, some increases to the natural sinks will need to be found in order to explain the observed/pre-industrial equilibrium). It's a significant effect but won't turn the climate modelling world upside down, as methane is only a modest contributor to the overall picture.

RC says it is surprising, and shies away from commenting on implications, other than pointing out that such a consensus-busting paper is proof that the scientific method is alive and well. Stoat tries to put an alarmist spin on it [update ok, I'll add half a smiley before I get lynched - see comments ;-)]: since deforestation means that this natural production has shrunk recently, we need to increase the estimated anthropogenic emissions to compensate, thus implying a greater anthropogenic effect (but...anthro effects include the deforestation anyway, so that argument is meaningless a priori..). Anyway, while this might make a very small difference to the details of historical attribution, I'd rather point to the way in which this new research might help to explain why the measured methane concentration in recent years has spectacularly failed to increase as all SRES scenarios and model results predicted (projected). As I've pointed out before (eg here1, here2 and here3) this is a growing discrepancy that IMO requires some re-evaluation of the scenarios. Now that there is a fundamentally "scientific" excuse to look again at methane (not just the embarrassing fact that the scenarios fail to match observed reality) I hope we can soon expect to see some improved predictions.

Tuesday, January 10, 2006


Or biting commentary on attitudes within Japanese society? Or maybe just a cheap laugh? You decide. This photo, entitled "A commuter checks out an ad at Jiyugaoka station" was featured as "Picture of the day" on a Japanese news site today, believe it or not (to be fair, nothing much else has happened, apart from a bit of snow).

For something more serious (or maybe not), here's a video from a popular Japanese TV comedian.

On beef and babies

A couple of climate-related articles in the news recently have caught my attention.

The first is an article in the Grauniad:
New research indicates that gas-guzzling cars are a much less important factor in climate change than the huge amounts of food devoured by carnivorous 'burger man'. Jonathon Porritt on the geopolitics of food
This is tenuous to say the least. The research in question looked into the GHG emissions associated with meat production in the USA - beef in particular being the worst offender. Of course, most beef in the USA is grain-fed, which means industrial energy-intensive agriculture to grow the grain, which then gets turned rather inefficiently into meat. So eating meat (in particular, eating beef in USA-style quantities) implies large GHG emissions. Therefore, the argument goes, giving up meat-eating would save those emissions, and swapping your conventional car for a hybrid would in fact save less (according to their calculation).

So what is wrong with this article? Well, for starters, the analysis is based on grain-fed beef, which is common in the USA (I saw an estimate of 80% on the web) but rare in the UK. Our beef is predominantly grass-fed, although the cattle may eat some grain in the winter along with silage. Permanent pasture is actually a very effective carbon sink, largely because it is rarely ploughed so carbon builds up in the soil over a number of years (here's a blast from my past about GHG emissions from agriculture!). Furthermore, even though the beef pasture is often potentially valuable as arable land, a significant proportion of the meat we eat is lamb (much more so than in the USA), which is farmed on much poorer pasture that has little other value. In energy and opportunity cost terms, this food is close to free. Moreover, the upland farming is responsible for the maintenance of one of the most beautiful maintained landscapes in the world - and I say that as someone who is no great advocate of farming, but who simply recognises that farmers are not all evil. Lastly, the comparison with merely downsizing from one car to a slightly more efficient one is hardly fair in contrast with the switch from a diet heavy in USA beef to complete vegetarianism. Swapping an SUV for a bicycle might be closer to the mark, and in that case, the cyclist certainly wins by a landslide. I'm disappointed that Jonathon Porritt has swallowed (and regurgitated) this stuff as if it was relevant in the UK (where the Guardian is published) rather than applying his critical faculties first.

Of course, its irrelevance didn't stop some hippy lentil-munchers from gloating over it, sorry, environmentally-aware commentators from drawing attention to it - I refer of course to none other than the esteemed William Connolley. So it is with some amusement that I will now point out his boss's comments on population growth. According to Chris Rapley (Director of the British Antarctic Survey), the Earth's population needs to be substantially cut - perhaps to 2-3 billion from the current 6.5 and growing. News coverage is here and his full article here.

Although reducing human emissions to the atmosphere is undoubtedly of critical importance, as are any and all measures to reduce the human environmental "footprint", the truth is that the contribution of each individual cannot be reduced to zero.

Only the lack of the individual can bring it down to nothing.

So if we believe that the size of the human "footprint" is a serious problem (and there is much evidence for this) then a rational view would be that along with a raft of measures to reduce the footprint per person, the issue of population management must be addressed.
This isn't just a matter of GHG emissions and climate change, but pressure on all sorts of natural resources - water, fish, farmland. I'll enjoy my guilt-free burger while William force-feeds lentils to his children :-)

Of course, Japan is already making a start on population decline - that story seems to be cropping up all over the place now, such as this article about coming-of-age day, which saw the 2nd-lowest number of celebrants ever (1.43m), and equal lowest in percentage-of-population terms (1.12%). The most surprising thing for me was the apparent fact that so many people believe in superstitions that even following the strong downward trend of the birth rate in recent years, this percentage only just matched the all-time low set back in 1987 which was caused by 1966 being considered an "unlucky" year in which to give birth!

Monday, January 09, 2006

Best Blonde Joke

For a change from the usual mix of climate-science-and-Japanese-trivia, I was going to blog about number theory, but I noticed this wonderful blonde joke via Cosmic Variance which is much more fun.

Sunday, January 08, 2006

More Japanese Gadgetry

'Smart cycles' that automatically adjust steering developed in Osaka

I have to wonder, what is wrong with a tricycle? They are easy for the elderly to get on and off, and those with balance problems can ride as slowly as they like without any risk of falling over.

Japan's science budget

We believe the path Japan should take is to provide a good research environment here for scientists from across the world rather than competing against other countries.
I have no idea how the promised ¥5 trillion per year compares to the existing spend, but the fact that they are boasting about it hopefully means it's good news (at least, not bad news in the context of general spending cuts). The financial situation of our lab has seemed a little shaky recently (in fact that's pretty much the story of my career - no sooner do I start work in a new job, than the lab budget leaks away). So I'll not get too excited yet. But our campus does seem to be an excellent example of a "good research environment here for scientists from around the world", with its numerous foreign staff, long-term visitors (eg) and hosting of various international meetings.

Saturday, January 07, 2006

Brian's still betting

Or at least, he's trying to.

The indefatigable Brian Schmidt keeps on trying to chase down those who claim scepticism over continued future global warming. Here's a recent thread on some denialist blog. In response to his challenge, there was nothing more than the usual evasion and excuses. But there is an anonymous comment on his own blog which appears to take up the challenge...I look forward to further developments.

Friday, January 06, 2006

Oops 2

Seems like I spoke too soon: someone just bought 2,000 shares instead of 2 on the Japanese stock market - with each share priced at about half a million Yen, that's a ¥1 billion ($10 million) blunder. What's worse, this was an employee trading on her own account, rather than a faceless corporation making a dent in its profits. At least she only paid a fair market price and might be able to sell them again without too ruinous a loss. It will, however, have made a hole in her end-of-year bonus!

Thursday, January 05, 2006

Documentary On Japanese Sushi

Ok, I know I said I'd be blogging about science again soon (and I really will have something to say very soon), but this Documentary On Japanese Sushi is too good to miss. Be sure to check it out before visiting.

(Hat tip to Lost in Japan)

Update 8/02

Google killed the (fansubbed, copyright-infringing) version linked to above, but you should find something working from here. It's on google video at the time of writing, but I'd guess that this link might last longer. Apparently there may be an official release some time.

Credit where it's due

I poked fun at Japanese banks for a monumental cock-up recently, so it's only fair that I point out a success story. According to this Japan Times story, the recent merger of computer systems to finalise the merging of two major financial companies into what is now the world's largest bank, passed off with hardly a hitch. Set against what could have happened, a mere 10 failed internet transactions is barely a molehill.

In other news, the JT is introducing a (free) registration system to access its archive, which is a bit of a pain. Why oh why oh why do they bother - are they trying to chase away readers to cut down the bandwidth costs? Unfortunately I've not found a good alternative Japanese English-language news source.

Wednesday, January 04, 2006

Eating an ecosystem

The Christmas/New Year blogapause seems to be ending around the world. Japan doesn't really do Christmas (although parents of young children increasingly give them some toys "from Santa-san"), with the holiday focussed instead on the start of January. We've just spent a few days on the local sub-tropical paradise of Oshima, cycling a little and eating a lot.

Although Oshima (also written Ohshima) is technically part of Tokyo City, we headed off in the opposite direction towards Atami in Shizuoka Prefecture, where the most convenient ferry runs from. The boat is some sort of hydrofoil thing that sounded and felt more like a plane.

We were staying in a posh hotel in a famous seafood area, so of course they pushed the boat out for our first dinner - and scoured the rock pools too, it seems. We counted 15 distinct species, more than half of which were shellfish. Each of us was wholly responsible for 8 deaths, and had a part share in many more. The piece of resistance was some abalone thing flipped on its back and cooked alive over a naked flame. Of course, having dropped a live lobster and bags of mussels into boiling water, I've no real excuse to be squeamish, but it wasn't the most appetising thing to see at the table. It didn't even taste of much (yes, we ate them). By the end of the meal, the table resembled nothing so much as a battlefield strewn with corpses.

I didn't take my camera to dinner that night, but here's what we ate on a subsequent evening.

Yes, that thing at the right hand edge is a whole fish, sliced up into sashimi which was neatly arranged on the remains of its body. It felt like we'd been served the set dinner for 12, and the breakfasts were similar (but thankfully a little smaller, with a higher proportion of it cooked).

The cycling was fun - the New Year holiday is a great time for a short cycling holiday in Japan (outside of the seriously snowy regions) because it's the one time that the roads are pretty much guaranteed to be clear. The downside is that many hotels and restaurants etc are closed, but with a bit of planning (and the foresight to always carry some food!) that need not be an insurmountable problem. On our first day, we climbed up and round the main volcano - Miharayama (which last erupted in 1986, causing a 1-month evacuation of the island) - and then on the following day circumnavigated the whole island (all 45km of it, but it's fairly hilly).

Lots of blue sky on this day, but it was windy (this was on the lee side of the island, hence the lack of waves) and rather cold!