Does Antimatter Fall UP?

Does matter repel antimatter? Or does gravity always suck? Physicists would like to know! And a recent arXiv.org article carries the title: Can the new Neutrino Telescopes reveal the gravitational properties of antimatter?

The abstract reads:
We argue that the Ice Cube, a neutrino telescope under construction at the South Pole, may test the hypothesis of the gravitational repulsion between matter and antimatter. If there is such a gravitational repulsion, the gravitational field, deep inside the horizon of a black hole, might create neutrino-antineutrino pairs from the quantum vacuum. While neutrinos must stay confined inside the horizon, the antineutrinos should be violently ejected. Hence, a black hole (made from matter) should behave as a point-like source of antineutrinos. Our simplified calculations suggest, that the antineutrinos emitted by supermassive black holes in the centre of the Milky Way and Andromeda Galaxy, could be detected at the Ice Cube.

I had never thought about this question. But it may be that in the far, far future (after all the stars die), our descendants (probably machines?) will orbit black holes to capture the energy of the ejected antineutrinos.

(By the way, where would the energy come from, the black hole or the "quantum vacuum" itself?)

The Borders of Privacy

In my previous post, I defined privacy piracy. My focus was on for-profit companies pirating personal information. Here I want to mention that governments are increasingly restricting our right to electronic privacy as well.

As an example, U.S. border officials can seize and search laptops, smartphones, and other electronic devices for any reason. The ACLU is suing, with the stated goal that "...the government has to have some shred of evidence they can point to that may turn up some evidence of wrongdoing" before such searches can be performed. The ACLU cites government figures and estimates that 6,500 people have had their electronic devices searched at the U.S. border since October 2008. No mention of how many terrorists were caught.

So what can the average computer geek do to protect his privacy?

One solution is encryption. TrueCrypt is free, open-source software that provides a way to encrypt files, partitions, and even a laptop's entire operating system. There are versions for both Windows and Linux, although complete OS encryption is available for Windows only. See the complete documentation here. I have used the software to encrypt Windows operating systems, partitions on Linux systems, and numerous files on both operating systems. TrueCrypt performed without error and did not seem to affect performance adversely. (I noticed my CPU utilization went up a bit, but my CPU had no difficulty keeping up with the hard disk. A CPU can encrypt/decrypt data faster than the hard drive can read/write it.) Just be sure to follow their password recommendations. IMHO, the algorithms used by TrueCrypt should be quite robust against even the most sophisticated decryption efforts that nefarious governments can mount.

It is possible to observe that a file has been encrypted -- all that completely random-looking data constituting the entire file. Recent U.S. case law suggests that government agents, during a laptop search, could notice an encrypted file and then compel one to divulge the password for it (Fifth Amendment protections notwithstanding). They could then use the password to decrypt and access the data in the file.

To deal with this privacy commandeering, TrueCrypt has a couple of plausible deniability tricks. One trick is to hide an encrypted volume within an encrypted volume, each having separate passwords. The inner volume is undetectable. Which volume is accessed depends on which password is used. This trick allows a person to reveal the password of the outer encrypted file but "forget" to mention the inner encrypted volume. Another trick is the ability to hide an entire operating system (Windows only) behind a decoy encrypted operating system.

However, like most people, although I like to rant against nefarious governments, my real concern is having my laptop stolen. A web search revealed inconsistent statistics, but I would guess anywhere from 100,000 to 500,000 laptops are stolen each year. So my bigger worry is having some thief get his hands on my private and financial data. This includes not only bank statements and brokerage account information, but also related data in my operating environment, such as cookies and the contents of my swap file.

Again, what to do?

I create a virtual machine that I use exclusively for my online financial transactions and private communications. I then store the virtual machine on a TrueCrypt volume on my laptop. Therefore, if my laptop is ever stolen, the thieves will be able to find out all about my web surfing habits, but not the truly sensitive or potentially damaging material I keep on the virtual machine.

BTW, I haven't overlooked that smartphones contain a lot of private data too. I'll address smartphone encryption in a later post.

Privacy Piracy

Define Internet privacy piracy as the unauthorized collection, analysis, and distribution of personal information by third parties for profit. My questions are: Are the pirates taking over the Internet? Are they changing the architecture of the Internet to favor themselves at our expense? Are they making government espionage easier?

In an article in the WSJ, it is claimed that ("don't be evil") Google has begun to aggressively cash in on its vast trove of data about Internet users. Google had feared a public backlash. "But the rapid emergence of scrappy rivals who track people's online activities and sell that data, along with Facebook Inc.'s growth, is forcing a shift." Also, according to Mr. Eric Schmidt, "If you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place." Not exactly comforting words. (Note: See this article for a defense of Google from Wired Magazine.)

Also, from an LATimes Technology Blog post:
Apple Inc. is now collecting the "precise," "real-time geographic location" of its users' iPhones, iPads and computers.
In an updated version of its privacy policy, the company added a paragraph noting that once users agree, Apple and unspecified "partners and licensees" may collect and store user location data.
When users attempt to download apps or media from the iTunes store, they are prompted to agree to the new terms and conditions. Until they agree, they cannot download anything through the store.
The large Internet firms could introduce technologies that would make it EASY for users to protect their privacy. But will these firms do so? For example, Microsoft had the chance to redesign Internet Explorer to make it more privacy friendly, but evidently did not.

In future posts, I would like to discuss approaches the average software geek can take to help protect online privacy.

A Note on the Quality of the Climate Model Software

In response to a comment to my last post, I wrote that:
IMHO, the predictive skill of the climate models has not been formally and empirically demonstrated (as in IV&V).

This is the same position I had back in March, when I posted a note on the current state of the climate model software.

Jon Pipitone has performed a study of the quality of software in climate modeling. I mention Pipitone's work because it was brought to my attention that Steve Easterbrook links to it in a statement he made in a blog post yesterday:
Our research shows that earth system models, the workhorses of climate science, appear to have very few bugs...
Does not a statement like this AMHO (affect my humble opinion)? Unless I take it out of context -- no. What is the context?

In a blog post describing Jon Pipitone's work, Easterbrook writes:
I think there are two key results of Jon’s work:

1. The initial results on defect density bear up. Although not quite as startlingly low as my back of the envelope calculation, Jon’s assessment of three major GCMs indicate they all fall in the range commonly regarded as good quality software by industry standards.

2. There are a whole bunch of reasons why result #1 may well be meaningless, because the metrics for measuring software quality don’t really apply well to large scale scientific simulation models. [Emphasis added.]

And in the Conclusion of his thesis Pipitone writes:
The results of our defect density analysis of three leading climate models shows that they each have a very low defect density, across several releases. A low defect density suggests that the models are of high software quality, but we have only looked at one of many possible quality metrics. As we have discussed, knowing which metrics are relevant to climate modelling software quality, and understanding precisely how they correspond the climate modellers notions of software quality (as well as our own) is the next challenge to take on in order to achieve a more thorough assessment of climate model software quality. [Emphasis added.]

We found a variety of code faults from our static analysis. The majority of faults common to each of the models are due to unused code or implicit type manipulation. From what we know of the construction of the models, there is good reason to believe that many of these faults are the result of acknowledged design choices -- most significantly are those that allow for the flexible configuration of the models. Of course, without additional study, it is not known whether the faults we have uncovered point to weaknesses in the code that result in operational failures, or generally, what the impact is of these faults on model development and use. [Emphasis added.]

And in describing possible threats to the validity of his thesis, Jon writes:
We do not yet understand enough about the different types of climate modelling organisations to hope to make any principled sampling of climate models that would have any power to generalize. [Emphasis added.] Nevertheless, since we used convenience and snowball sampling to find modelling centres to participate in our study we are particularly open to several biases [10]. For example:

* Modelling centres willing to participate in a study on software quality may be more concerned with software quality themselves;

* Modelling centres which openly publish their climate model code and project artifacts may also be more concerned with software quality;

In addition, our selection of comparator projects was equally undisciplined. We simply choose projects that were open-source, and that were large enough and well-known enough to provide an intuitive, but by no means rigorous, check against the analysis of the climate models. We have also chosen to include a model component, from centre C1, amongst the GCMs from the other centres we analysed. Even though this particular model component is developed as an independent project it is not clear to what extent it is comparable to a full GCM.

Our choice to use defect density and static analysis as quality indicators was made largely because we had existing publications to compare our results with, not because we felt these techniques are necessarily good indicators. Furthermore, whilst gauging software quality is known to be tricky and subjective, most sources suggest that it can only accurately be done by considering a wide range of quality indicators [21, 3, 1, 17]. Thus, at best, our study can only hope to present a very limited view of software quality. [Emphasis added.]

Thus, "there are a bunch of reasons" why Easterbrook's statement "may well be meaningless".

No Climate of Decorum

It seems the heat of the Global Warming issue is threatening decorum worldwide.

Andrew Revkin recently posted on his blog, Dot Earth, an article about misconduct, climate research, and trust. This drew what was, judging from what I had previously read by him, an uncharacteristically derogatory and profane response from Steve Easterbrook.

I made the following comment to Steve:
I have followed Andy Revkin’s Dot Earth blog for a long time and there is little doubt in my mind that he tries to express all viewpoints on the climate change problem in as coherent and rational a way as possible. That is, he tries to present all arguments, from denier to alarmist, in the best and strongest possible way. (That he is imperfect he explicitly notes.)

He also makes his own beliefs quite clear. In the blog post in which you suggest he shut up in the most vulgar possible way, he states: “Do I trust climate science? As a living body of intellectual inquiry exploring profoundly complex questions, yes. Do I trust all climate scientists, research institutions, funding sources, journals and others involved in this arena to convey the full context of findings and to avoid sometimes stepping beyond the data? I wouldn’t be a journalist if I answered yes.”

So I find your rant completely baffling. Revkin is a journalist. You should calm down and offer an apology.

Very few of us can spend a lot of time hanging out with climate scientists. And I have commented before on how much I appreciate your extensive, detailed posts on the various climate conferences you attend. And the thinking and attitudes you find at them.

But you should never tell anyone to shut up or that they are too ignorant to say anything but parrot the views of others. Much less Andy Revkin in such vulgar terms. Why? It hardens attitudes instead of changing them. Just what the status quo wants. And I don’t want the status quo.
Of course, I am referring above to the Principle of Charity. Dr. Easterbrook replied:
Expressing all viewpoints "in the best and strongest possible way" is downright irresponsible, when some of them are lies and smears. That's exactly the problem I'm calling out. It's called false balance, and it has to stop.
Steve then added:
George [said]: “But you should never tell anyone to shut up [...] in such vulgar terms. Why? It hardens attitudes instead of changing them. Just what the status quo wants. And I don’t want the status quo.”

Yeah, I’ve been thinking a lot about this. The status quo is completely unacceptable and has to change. But I disagree with you on what it will take. The most important thing that has to happen is that those who understand the big picture, who understand the risks of delaying strong action on mitigation policies, have to get a lot more passionate and a hell of a lot more “in-your-face”. We have to shift the Overton Window, and the only way to do that is to ramp up the action at the space beyond what is currently considered politically possible. We’re not going to get there by being polite and agreeable, when those who would delay action are busy using every machiavellian tactic they can think of.

Oh, and the swearing? If there are delicate souls out there who can’t cope with a few swear words, how the hell are they going to cope when the shit really hits the fan with climate disruptions? The genteel won’t survive the collapse.
To which my reply was:
Steve [said]: “The most important thing that has to happen is that those who understand the big picture, who understand the risks of delaying strong action on mitigation policies, have to get a lot more passionate and a hell of a lot more ‘in-your-face’.”

If a journalist or scientist were to make such a declaration, should not my opinion of their journalistic or scientific credibility and trustworthiness diminish? That is, would not the risk of bias increase?

Anecdotally, I have known a lot of passionate people and in my experience their sense of conviction has nothing to do with reality. (To think otherwise means we need to make our climate model software more passionate!?)

Emotion is a driving force, not a deciding force. A passionate call is a call to action. It is useless when trying to convince someone about scientific facts.

So criticizing Revkin for not considering it his job as a journalist to “shift the Overton Window” is being very harsh on Revkin. Andy is trying to be informative — helping people to decide for themselves. IMHO, that’s what journalists ought to do.

Is everyone who is not actively a strong mitigation policy advocate at risk of verbal abuse? Is decorum dead?

I would like to add here further anecdotal evidence that such "in-your-face" writing as Steve's is counterproductive when trying to change people's minds about an issue. See Anthony Watts's recent blog post about Quote of the [expletive deleted] week. IMHO, the post puts Easterbrook in a bad light. And none of the comments on the post are of a "since Steve is so impassioned about the issue, I'm changing my mind about Global Warming" flavor.

Android

Android is the name of an increasingly popular OS software stack for smartphones. (I define a smartphone as a hand-held computer that, oh by the way, can also make phone calls.) Android is closely associated with Google. The Android OS is Linux-based and, therefore, open source software (the current Oracle lawsuit notwithstanding).

It doesn't take much imagination to realize that if people the world over are carrying around hand-held computers full of useful apps, this can greatly enhance and even fundamentally change the way we communicate and interact with one another. Additionally, IMHO, Android (as an open source framework backed by a company with substantial resources) offers the possibility of weakening the innovation-stifling practices of the telecommunications service providers.

As a programmer, I recently decided to get some hands-on experience with the Android OS, so last May I purchased an HTC Droid Incredible. Programming for the Android turned out to be a somewhat frustrating experience for reasons that I plan on discussing in future posts.

Even so, I managed to create a simple application called KidLocator. (See the link to a page describing the program on the right. The page contains instructions for downloading the program as well.)

For those with a compatible phone (e.g., Android 2.1), please feel free to download and use the app.

Can Scientific/Engineering Code Be Validated? Part 2

This is a continuation of my previous post. There I noted that I interpret software validation more broadly than Roache: I believe it can be applied to embodied code as well as to documented theory. Here I continue to examine how Roache's interpretation of validation may differ somewhat from my own.

In Appendix B, Roache starts off with a commonly used definition for validation:
Validation: The process of determining the degree to which a model {and its associated data} is an accurate representation of the real world from the perspective of the intended uses of the model.
Roache sees three main issues with people properly interpreting this definition. One issue is with the phrase "intended uses." His recommendation is:
Intended use, at least in its specific sense, is not required for validation. The common validation definition could be salvaged by re-defining intended use to include very general intentions, but frankly this appears to be a hollow exercise. The fact is that a useful validation exercise does not necessarily require an intended use, specific or general.
This recommendation is developed using arguments such as:
Clearly, much of the confusion is the result of trying to use the same word for different needs. Project oriented engineers are more concerned with specific applications, and naturally tend to rank acceptability within validation (which term is used more often than accreditation or certification). Research engineers and scientists tend to take a broader view, and often would prefer to use validation to encompass only the assessment of accuracy level, rather than to make decisions about whether that level is adequate for unspecified future uses. It is also significant to recognize that these project-specific requirements on accuracy are often ephemeral, so it is difficult to see a rationale for a priori rigid specifications of validation requirements [5,11] when the criteria so often can be re-negotiated if the initial evaluation fails narrowly.
And:
The requirement for "intended use" sounds good at first, but it fails upon closer thought. Did D. C. Wilcox [13] need to have an "intended use" in mind when he evaluated the k-ω RANS turbulence models for adverse pressure gradient flows? He may very well have had uses in mind, but does a modeler need to have the same use in mind two decades later? If not, must the validation comparison be repeated? Certainly not.
But who would want to repeat it?

Validation is subjective. (As Roache puts it -- ephemeral.) So it logically must be performed from some perspective. Whose perspective? The software's stakeholders. But unless usage is predefined, not all of the software's potential stakeholders have been identified. How can their potentially differing priors be ignored?

Roache evidently believes validation can be made objective: that acceptability, accreditation, and certification can be separated out from validation, and that the degree to which a model is an accurate representation of the real world can be decided by some abstract, objective algorithm. No Bayesian priors required.

But I could not disagree more. So I highly recommend reading Roache for a viewpoint different from my own.

Can Scientific/Engineering Code Be Validated?

I am starting to read Patrick J. Roache's book, The Fundamentals of Verification and Validation. I thought I knew the fundamentals of IV&V for scientific and engineering software already, but reading Roache has me feeling a bit ignorant.

Be that as it may, I think I must disagree with the limited scope of Roache's definition of validation.

In Appendix B of the book, Validation -- What Does it Mean? Roache writes:
Before examining the definition of validation, we need to make a small distinction on what it is we are claiming to validate, i.e., between code and model. A model is incorporated into a code, and the same model (e.g. some RANS model) can exist in many codes. Strictly speaking, it is the model that is to be validated, whereas the codes need to be verified. But for a model to be validated, it must be embodied in a code before it can be run. It is thus common to speak loosely of "validating a code" when one means "validating the model in the code".

I think this distinction is too limiting. The embodied code must be validated too.

IMHO, Roache is using the word verification in the sense of formal verification. That's fine, except scientific and engineering software can rarely be heroically tested. Formal proof of such software's correctness is a practical impossibility. Does Roache really think verification is impossible?

Suppose I find a DVD case on the sidewalk and inside is a DVD with a label that says: "Models the Earth's Climate." I put the DVD in my computer and, sure enough, on the DVD is what appears to be a complete and sophisticated climate model. How would I go about verifying the software's outputs? What amount of testing would be sufficient? What verification processes would I choose to use?

On the other hand, suppose I obtain funding to develop, from scratch, new and complete Earth climate modeling software. What methodologies would I choose to develop and test the software? Would I think it important to verify the processes completed at various stages of the software's development?

And here is the rub. Suppose it turns out that both programs actually embody the same physics model. Nevertheless, would I need to validate that the different processes I used to verify the software on the DVD and to verify the software built from scratch were appropriate for each? Yes! The verification processes for the two programs would be different. These differences must be validated as appropriate and effective.

So if I feel a bit ignorant under a limited definition of validation, I now feel even more so under an expanded definition.

The I in IV&V is Important

It was pointed out in the last post that software verification and validation (V&V) are not purely exercises in deductive logic. A comment to the post explicitly noted that essential components are based on probabilistic reasoning. The basic point of the post was that the result of V&V is not a proof of certainty.

Rather, software V&V is a measure of the acceptability of the risk that the software may fail to perform properly and thus not provide the desired benefits -- that the consequences of using the software may even be negative. (Risk is defined as probability times consequence.)

And so here I make a quick note that the point of the previous post is not the only common and important philosophical misunderstanding about V&V. There is often a failure to realize that software V&V must be independent verification and validation (IV&V).

There is a general consensus that the process by which software is developed can add to or subtract from the quality of the final software product. But the degree to which this occurs is a subjective judgment. Different software stakeholders will have different opinions.

Also, the potential consequences of using software are different for different stakeholders. Just as the climate affects different groups of people differently, an error in the global climate models could potentially be misused to affect different people to differing degrees.

The bottom line is that the estimated risk associated with any software can vary greatly (even in sign) depending on which stakeholders are being used as the reference. Thus, software V&V must not be restricted to an activity that is performed by a single software stakeholder. That would not be fair. Software V&V must be IV&V such that all stakeholders are considered fairly.

You would think this concept would be obvious for all risk analyses (software IV&V or whatever) and far from a potential problem. Unfortunately, this is not the case. For example, how worried should we be about driving a Toyota? According to popular NYT blogger Robert Wright:
My back-of-the-envelope calculations (explained in a footnote below) suggest that if you drive one of the Toyotas recalled for acceleration problems and don’t bother to comply with the recall, your chances of being involved in a fatal accident over the next two years because of the unfixed problem are a bit worse than one in a million — 2.8 in a million, to be more exact. Meanwhile, your chances of being killed in a car accident during the next two years just by virtue of being an American are one in 5,244.

So driving one of these suspect Toyotas raises your chances of dying in a car crash over the next two years from .01907 percent (that’s 19 one-thousandths of 1 percent, when rounded off) to .01935 percent (also 19 one-thousandths of one percent).
Wright does not think these numbers are of much concern. But IMHO, he fails to understand that one stakeholder in the issue (Toyota) should not decide the risk for another (the public). For he writes:
But lots of Americans seem to disagree with me. Why? I think one reason is that not all deaths are created equal. A fatal brake failure is scary, but not as scary as your car seizing control of itself and taking you on a harrowing death ride. It’s almost as if the car is a living, malicious being.
IMHO, it's not that all deaths are not created equal -- it's that not all risk analyses are.

This was also noted in Chance News #62, where the following questions are asked about Wright's discussion of these numbers:
  1. People seem to make a distinction between risks that they place upon themselves (e.g., talking on a cell phone while driving) and risks that are imposed upon them by an outsider (e.g., accidents caused by faulty manufacturing). Is this fair?
  2. Contrast the absolute change in risk (.01935-.01907=.00028) with the relative change in risk (.01935/.01907=1.0147). Which way seems to better reflect the change in risk?
  3. Examine the assumptions that Robert Wright uses. Do these seem reasonable?
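As a quick sanity check of the arithmetic discussed in the list above (the only inputs are the 1-in-5,244 baseline and the 2.8-in-a-million added risk quoted from Wright), here are a few lines of Python:

baseline = 1 / 5244.0   # two-year fatality probability for any American driver
added = 2.8e-6          # Wright's estimated added probability from the unfixed recall
suspect = baseline + added

print("baseline:            %.5f %%" % (baseline * 100))                  # ~0.01907 %
print("with unfixed recall: %.5f %%" % (suspect * 100))                   # ~0.01935 %
print("absolute change:     %.5f points" % ((suspect - baseline) * 100))  # ~0.00028
print("relative change:     %.4f" % (suspect / baseline))                 # ~1.0147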

IV&V is not Impossible

There is a very important reason why I have devoted a couple of posts to the scientific method. The posts lay the groundwork for addressing an issue concerning the independent verification and validation (IV&V) of science and engineering software.

The very important issue? Many people feel IV&V is impossible.

In an article in the Feb. 4, 1994 issue of Science Magazine, Oreskes et al. make the following argument:
Verification and validation of numerical models of natural systems is impossible. This is because natural systems are never closed and because model results are always nonunique. Models can be confirmed by the demonstration of agreement between observation and prediction, but confirmation is inherently partial. Complete confirmation is logically precluded by the fallacy of affirming the consequent and by incomplete access to natural phenomena. Models can only be evaluated in relative terms, and their predictive value is always open to question. The primary value of models is heuristic.
This argument should be taken seriously. After all, Science is a peer reviewed publication that tries to represent the best of quality science. Additionally, there does not seem to be much in the way of direct, forceful rebuttal of this argument easily and freely available on the WWW. AFAIK, most of what is available is either dismissive of the argument or in basic agreement with it.

For example, Patrick J. Roache is rather dismissive and writes in a paper on the quantification of uncertainty in computational fluid dynamics:
In a widely quoted paper that has been recently described as brilliant in an otherwise excellent Scientific American article (Horgan 1995), Oreskes et al (1994) think that we can find the real meaning of a technical term by inquiring about its common meaning. They make much of supposed intrinsic meaning in the words verify and validate and, as in a Greek morality play, agonize over truth. They come to the remarkable conclusion that it is impossible to verify or validate a numerical model of a natural system. Now most of their concern is with groundwater flow codes, and indeed, in geophysics problems, validation is very difficult. But they extend this to all physical sciences. They clearly have no intuitive concept of error tolerance, or of range of applicability, or of common sense. My impression is that they, like most lay readers, actually think Newton’s law of gravity was proven wrong by Einstein, rather than that Einstein defined the limits of applicability of Newton. But Oreskes et al (1994) go much further, quoting with approval (in their footnote 36) various modern philosophers who question not only whether we can prove any hypothesis true, but also “whether we can in fact prove a hypothesis false.” They are talking about physical laws—not just codes but any physical law. Specifically, we can neither validate nor invalidate Newton’s Law of Gravity. (What shall we do? No hazardous waste disposals, no bridges, no airplanes, no ….) See also Konikow & Bredehoeft (1992) and a rebuttal discussion by Leijnse & Hassanizadeh (1994). Clearly, we are not interested in such worthless semantics and effete philosophizing, but in practical definitions, applied in the context of engineering and science accuracy.
Ahmed E. Hassan, on the other hand, seems in basic agreement with Oreskes and writes in a fairly recent review paper on the validation of numerical ground water models:
Many sites of ground water contamination rely heavily on complex numerical models of flow and transport to develop closure plans. This complexity has created a need for tools and approaches that can build confidence in model predictions and provide evidence that these predictions are sufficient for decision making. Confidence building is a long-term, iterative process and the author believes that this process should be termed model validation. Model validation is a process, not an end result. That is, the process of model validation cannot ensure acceptable prediction or quality of the model. Rather, it provides an important safeguard against faulty models or inadequately developed and tested models. If model results become the basis for decision making, then the validation process provides evidence that the model is valid for making decisions (not necessarily a true representation of reality). Validation, verification, and confirmation are concepts associated with ground water numerical models that not only do not represent established and generally accepted practices, but there is not even widespread agreement on the meaning of the terms as applied to models.
Let me also mention that the Oreskes article briefly and indirectly alludes to another logical fallacy, the appeal to authority:
In contrast to the term verification, the term validation does not necessarily denote an establishment of truth (although truth is not precluded). Rather, it denotes the establishment of legitimacy, typically given in terms of contracts, arguments, and methods (27).

There are a lot of things I think would be interesting to discuss about Oreskes' article. However, this post is already getting too long. So I will only state what I feel is the strongest counter-argument and fill in the details in later posts. I do not agree with Oreskes because the scientific method, of which IV&V is a part, is not an exercise in logic. As I have already pointed out in an earlier post:
Note that even this most bare form of the scientific method contains two logical fallacies. The first is the use of abduction (affirming the consequent). The second is the partial reliance on IV&V for error management (appeal to authority). The use of abduction eliminates logical certainty from the scientific method and introduces the possibility of error. The logical shortcoming of IV&V means that finding and eliminating error is never certain.
The basic problem with Oreskes' argument is that it runs counter to the very foundations of the scientific method. The scientific method does not require logical certainty in order for it to work. The value of models is not only that they can be heuristic, it is that they can be scientific. To be anti-model is to be anti-science. Good luck with that.

Modelers HATE Python!?

I recently ran across the following by a person involved in mesoscale weather modeling and graduating meteorology majors:
Fortran is the language of choice and the reason has nothing to do with legacy code. Nearly all modelers that I know are fluent not only in Fortran, but C, C++, and Perl as well. Fortran is the language used because it allows you to express the mathematics and physics in a very clear succinct fashion. The idea here is that [while] a craftsman has many tools in his tool chest, the amateur believes everything is a nail. The only common feature in terms of programming tools amongst modelers is a universal HATRED of object-oriented programming languages, particularly Python.
Object-oriented programming is the answer to a question that nobody has ever felt the need to ask. Programming in an object-oriented language is like kicking a dead whale down the beach.


I have no doubt that this is a sincere and knowledgeable comment. And I am not saying that just because this blog observes proper decorum and thus always assumes the Principle of Charity. I think I know why such an attitude may be prevalent. (But not universal. I have modeled things and I do not hate Python.) Let me explain by way of a toy example.

Principles of Planetary Climate is an online working draft of a book on the physics of climate. The author is Raymond T. Pierrehumbert, who does research and teaches at the University of Chicago. Dr. Pierrehumbert also frequently posts (as "raypierre") on the popular blog RealClimate.

There is a computational skills web page that accompanies the book. On this page is a tutorial with links to basic and advanced Python scripts for solving a simple example ordinary differential equation with one dependent variable. There is also a script that uses a numerical/graphical library called ClimateUtilities.

The basic Python script implements three different ODE integration methods (Euler, midpoint, and Runge-Kutta) for the example differential equation and then compares their error to the exact analytical solution. (Let me note, since this blog is very concerned about IV&V issues, that the only discussion of validation and verification is a brief reference to a numerical stability problem with the midpoint method. The opportunity to discuss convergence and performance issues is also missed, as is the practicality of using multiple algorithms to solve the same problem as a verification technique. Obviously the author felt such IV&V issues to be of less than fundamental importance.)

The advanced Python script implements the three methods using an object-oriented approach. IMHO, this script clearly demonstrates the unsuitability of an "everything is a nail" object-oriented approach to numerical programming, and it perfectly illustrates the point of the comment I quoted above. The "advanced" script has many more lines of code, is much more complex in design, and, I am certain, would execute much more slowly than the "basic" implementation.

But a forceful reply to the comment quoted above is that neither the "advanced" nor the "basic" approach is appropriate. Sure, pure Python is numerically slow, but Python comes with batteries included. Every Pythonista knows that a key feature of Python is its "extensive standard libraries and third party modules for virtually every task". Python is a great way to glue libraries and third-party modules together.

But here I found out that some libraries are better than others. I was unable to successfully install the ClimateUtilities library on my version of Ubuntu Linux (9.10). So I wrote a script that uses the SciPy library instead (as well as my own version of Runge-Kutta), as shown below. Note how short and straightforward the implementation is and, if you run it yourself, how much faster it is to use a numerical library. It is practically as fast as any compiled language implementation, Fortran or whatever. And don't forget, in Python, everything is an object. (E.g., do a dir(1) in Python. Even the number one is an object!)

(I ran into some interesting numerical features. See the dt value I used below for RK4. Maybe a subject for a later post?)


 '''
 Numerically solve an ODE using RK4 or scipy's odeint.
 
 See gmcrewsBlog post for details.
 
 ODE: dy/dt = slope(y, t)
 Where: slope(y, t) = -t * y
 And: y(0.) = 1.0
 Stopping at: y(5.)
 
 Note that analytical solution is: y(t) = y(0) * exp(-t**2 / 2)
 So error at y(5.) may be calculated.
 '''
 
 import math
 import time
 from scipy.integrate import odeint
 
 
 def slope(y, t):
     '''Function to use for testing the numerical methods.'''
     return -t * y
 
 
 # Parameters:
 t_start = 0.
 y_start = 1.
 t_end = 5.
 
 # Analytical solution:
 y_exact = y_start * math.exp(-t_end**2 / 2.)
 
 print "ODE: dy/dt = -t * y"
 print "Initial condition: y(%g) = %g" % (t_start, y_start)
 print "Analytical solution: y(t) = y(0) * exp(-t**2 / 2)"
 print "Analytical solution at y(%g) = %g" % (t_end, y_exact)
 print
 
 # Do a Runge-Kutta (RK4) march and see how good a job it does:
 
 dt = 0.000044 # chosen so that approx. same error as scipy
 # However: try dt = .04 which gives even lower error!
 dt = .04
 runtime_start = time.time() # keep track of computer's run time
 t = t_start
 y = y_start
 h = dt / 2.
 while t < t_end:
     k1 = dt * slope(y, t)
     th = t + h
     k2 = dt * slope(y + k1 / 2., th)
     k3 = dt * slope(y + k2 / 2., th)
     t = t + dt
     k4 = dt * slope(y + k3, t)
     y = y + (k1 + (2. * k2) + (2. * k3) + k4) / 6.
 runtime = time.time() - runtime_start
 
 err = (y - y_exact) / y_exact * 100. # percent
 err_rate = err / runtime # error accumulation rate over time
 
 print "RK4 Results:"
 print "dt = %g" % dt
 print "Runtime = %g seconds" % runtime
 print "Solution: y(%g) = %g" % (t_end, y)
 print "Error: %g %%" % err
 print
 
 # What does scipy's ode solver return?
 runtime_start = time.time() #keep track of computer's run time
 results = odeint(slope, y_start, [t_start, t_end])
 runtime = time.time() - runtime_start
 
 y = results[1][0]
 err = (y - y_exact) / y_exact * 100. # percent
 err_rate = err / runtime # error accumulation rate over time
 
 print "scipy.integrate.odeint Results:"
 print "Runtime = %g seconds" % runtime
 print "Solution: y(%g) = %s" % (t_end, y)
 print "Error: %g %%" % err
 print

My Opinion About Programming Languages

There are several computer languages that I have had significant experience with. Over time, I think programming languages have gotten much better.

It is possible and sometimes entertaining to analogize these most basic of programming tools by viewing them as personal weapons. In chronological order of experience, I have the following subjective opinion:

  1. Fortran == bow and arrow. An ancient weapon, I learned how to use it over 40 years ago. Yet, in its modern form, seems a perfectly usable weapon for certain specialized applications. And still fun to use.
  2. Assembly language == toothpick. Hard to use and I never quite believed I could actually kill anything substantial with it.
  3. Applesoft Basic and Turbo Pascal == decided these weren't actually weapons. More like a cocktail fork and a butter knife.
  4. C == Battle sword. Found out I could kill anything with it. But required considerable courage and expertise for large jobs. And oh, I often cut myself with the "pointy" end (pointers!). (My fellow programming warriors used to accidentally stab me with their weapons' pointy ends too, no matter how careful they tried to be.)
  5. C++ == Klingon Bat'leth. Looked like a very formidable weapon and knowledge about it was a formal requirement for the honor of being known as a true programming warrior. But somehow, I never did figure out how to use the thing exactly right. I couldn't ever kill any problem much better than just using C. At first, I tried to wield it like a battle sword. Then I tried to adopt various styles, but never really felt graceful. Now I mostly wield it like a battle sword again. Screw it. The problems get killed.
  6. Java == Catapult. Seemed like an "infernal contraption" and took a team to use it right. Most practical only for certain types of big jobs.
  7. PHP == Hammer. Nothing fancy, but a handy little tool for building sites.
  8. Javascript == Sledgehammer. Took a lot of effort for the problems to be solved and the results looked really messy. But did seem to get the job done.
  9. Python == Star Wars lightsaber. Currently my favorite. For an old programming Jedi like me, I feel like I can elegantly kill any problem with this tool.

Pseudo-Code


'''The scientific method expressed in Python syntax.

There is a study that suggests "programmers use pseudo-code and pen and paper to
reduce the cognitive complexity of the programming task." (Rachel Bellamy,
article behind ACM-paywall.) And if you do a Google search on "pseudo-code," you
will find a lot of hits that echo this sentiment.

I agree with this sentiment. In fact, as a generalist, with knowledge in many 
areas of math, science, engineering, and programming, I would like to have a 
"common language" that I can use to express myself in any technical area. Is 
this possible? 

If it is, IMHO, Python may come closest to fitting the bill. It is an expressive
language at multiple levels.

(Of course this would not be a true statement for someone unfamiliar with 
Python. They would have the added cognitive complexity of figuring out the 
language's tokens, grammar, syntax, and idiom. And what is the purpose of a
"common language" if nobody else can understand you? It seems I may have the
burden of helping to make Python popular for such a usage.)

There are other approaches, like MathCad, that try to preserve the
two-dimensional nature of usual mathematical notation and various common
symbology.
But I guess I am not tied to tradition just for its own sake.

In programming design, the key issue is not so much to reduce complexity -- but
to contain it. The ability of object-oriented languages to contain complexity
behind an interface IMHO explains the popularity of object oriented languages.
Python's object model is a very simple one and so would seem ideal to serve
as the basis of a general-purpose pseudo-code. 

Another issue almost as important is elegance. A pseudo-language has to be 
usable -- to allow complexity at a high and abstract level to be expressed in a
simple and efficient manner. 

Elegance can be styled by defining clear paradigm shifts at object interfaces.
Sometimes the pseudo-language itself has elegant ways of expressing commonly
encountered complexities. For example, NumPy's handling of arrays seems easy
to understand and simple to use. So once again Python suggests itself as a good
technical pseudo-language candidate. 

(BTW, these issues are the main reasons I have never found flow-charts or UML 
diagrams very useful for software design. Documenting the design maybe -- but 
not for creating it. Every UML document I have ever produced has always come
AFTER I have decided upon the software's design and fundamental algorithms. The
only other benefit to UML I have experienced is to brainstorm with peers at a 
whiteboard. And there I usually just start making up notation and being sloppy 
just to speed the creative process along.)

So as an example of using Python as a high-level, all-round technical pseudo-
code, consider the scientific method. My personal philosophical viewpoint is 
that the scientific method is not so much a search for objective truth about 
Nature as it is an iterative exercise in predictive computation about Nature. 
Can I express this very abstract notion simply and unambiguously in pseudo-code
using Python syntax?

'''

# Everything always has a context. Here we presuppose the current level of
# scientific knowledge.
from ScientificKnowledge import Theory, Experiment

# One would think that the scientific method would just be part of
# ScientificKnowledge. But let's pretend it's not.
def scientific_method(theory_id, lab_id):
    '''Perform the scientific method.
    
    theory_id = theory name or identifier
    lab_id = identifier of place and people performing the method.
    
    '''
    
    # Every "lab" has their own view of any particular scientific theory:
    my_theories = Theory(lab_id)
    
    # Each lab has their own experimental capabilities:
    my_experiments = Experiment(lab_id)
    
    # Iteratively perform the method as long as relevant to increasing our
    # overall state of scientific knowledge and practical.
    while my_theories.relevant(theory_id):
        
        # What was my belief in the theory before defining and performing
        # the experiment?
        prior_belief = my_theories.belief_intensity(theory_id)
        
        # What experiment will test the theory optimally?
        # What will be the predicted result?
        experiment, prediction = my_theories.generate(
                theory_id, my_experiments)
                
        if experiment is None:
            return # testing theory no longer practical
        
        # Perform the experiment.    
        result = experiment.perform()
        
        # Determine if the results of the experiment were significant.
        
        # Considering all possible theories, how plausible was this result?
        plausibility = my_theories.plausibility(experiment, result)
        
        # How likely was this result?
        likelihood = prediction.belief_intensity / plausibility
        
        # How does this experiment change my beliefs in this theory?
        posterior_belief = prior_belief * likelihood
        
        # Update my theories to reflect this new knowledge:
        my_theories.abduction(experiment, result, posterior_belief)
        
# Note how the shortcomings become glaring. There is no IV&V. (I guess this
# could be remedied with a try statement. Unlike most other languages, Python
# style is to use exceptions for non-ideal workflow as well as "extreme"
# exceptions.) Also, there is no mechanism for publishing experimental results
# to others. All this is good since how to improve the description is obvious.

Debating the Existence of Gravity

A few weeks ago, over at the Serendipity blog, the author wrote:
"If anyone wants to debate the existence or seriousness of anthropogenic climate change, I’d give the same response as I would if they wanted to debate the existence or strength of gravity."

The author views debating climate change as "pointless".

IMHO, the author is actually missing an important point. In a scientific debate, it is perfectly acceptable to remind the "settled science" opponent that a sense of conviction has nothing to do with reality. Trying to convince someone in a scientific debate is to miss the point of the debate. Scientific debates are not rhetorical debates. Why? Because reality doesn't care what anyone thinks.

Take gravity. Is gravity "settled science" and beyond debate? From the physics arXiv blog comes:

"Some physicists are convinced that the properties of information do not come from the behaviour of information carriers such as photons and electrons but the other way round. They think that information itself is the ghostly bedrock on which our universe is built.

Gravity has always been a fly in this ointment. But the growing realisation that information plays a fundamental role here too, could open the way to the kind of unification between the quantum mechanics and relativity that physicists have dreamed of."

Notice that it does not matter how convinced some people are. That is not the goal. So that would not be my goal in a debate on climate change. Just present the science behind the changes. Don't try to convince anyone. Almost Zen-like, the science will win the debate. (I seem to be on a Zen theme.)

Bayesian Scientific Method

The purpose of this post is to illustrate how scientific beliefs or truths change in a Bayesian manner when using the scientific method. It is really pretty simple, although the Bayesian viewpoint differs quite a bit from the notion of Popperian falsifiability. Here is the figure I will be using:


I have described this basic figure in a previous post. New to the figure are B(T), B(R|T), B(R), and B(T|R). These represent Bayesian beliefs. I know that it is more common to use the term Bayesian probabilities and the symbol P instead of B, but I want to avoid any possible confusion with frequency probabilities.

Prior to performing the next experiment, B(T) is my belief in theory/model 'T'. Like all Bayesian beliefs, it is a number between 0 and 1. Notice that this is my own subjective belief. But you will see that by using a Bayesian approach, my (changing) degree of belief will remain consistent with experiment over time.

B(R|T) is my belief that the experiment will yield observations/results 'R' assuming 'T' is true. This value will be deductively derivable from theory 'T'.

B(R) = B(R|T)B(T) + B(R|~T)B(~T), where B(T) + B(~T) = 1. Use this formula to calculate the degree to which some theory (T or ~T) could believably explain results 'R'.

Notice that theories can still overlap in predicting results and that a zero value for B(R) is possible if no prior theory could explain the results.

B(T|R) = B(R|T)B(T)/B(R). This formula conditionalizes B(T) and calculates my posterior degree of belief in theory 'T'.

Notice that if B(R) is zero then the formula is of the form 0/0 and a new theory must be developed that explains the experiment before the iterative process that is the scientific method can continue.

Notice also that if the new experiment is not independent of previous experiments, then B(R|T) = 1. (Prior theory was previously conditionalized on previous experiments.) This gives a formula of the form 1/1 and my belief will not be altered. So such an experiment is useless.

Some numerical examples should make the above clear. (Note: your definition of likely or unlikely may vary from mine.)

A Likely Theory Becomes Very Likely
Prior: B(T) = .95 (likely theory)
B(~T) = 1 - .95 = .05 (all other competing theory unlikely)
B(R|T) = .99 (results very likely)
B(R|~T) = .16 (results rather unlikely according to competing theory)
Posterior: B(T|R) = .99 * .95 / [ .99 * .95 + .16 * .05] = .99 (very likely)

A Likely Theory Becomes Neutral
Prior: B(T) = .95 (likely)
B(~T) = 1 - .95 = .05 (competing theory unlikely)
B(R|T) = .05 (unlikely results)
B(R|~T) = .99 (but strongly predicted by competing theory)
Posterior: B(T|R) = .05 * .95 / [ .05 * .95 + .99 * .05] = .49 (neutral)

An Unlikely Theory Becomes Neutral
Prior: B(T) = .05 (unlikely)
B(~T) = 1 - .05 = .95 (competing theory likely)
B(R|T) = .99 (but unlikely theory highly confirmed)
B(R|~T) = .05 (and competing theory did not predict result)
Posterior: B(T|R) = .05 * .99 / [ .05 * .99 + .05 * .95] = .51 (neutral)
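
The three examples are easy to reproduce. Here is a minimal Python sketch of the update; it uses only the formulas given above (the function name is mine):

def posterior(b_t, b_r_given_t, b_r_given_not_t):
    '''Conditionalize the prior belief B(T) on result R via Bayes' rule.'''
    b_not_t = 1.0 - b_t                                   # B(~T)
    b_r = b_r_given_t * b_t + b_r_given_not_t * b_not_t   # B(R)
    return b_r_given_t * b_t / b_r                        # B(T|R)

print(round(posterior(.95, .99, .16), 2))  # 0.99: a likely theory becomes very likely
print(round(posterior(.95, .05, .99), 2))  # 0.49: a likely theory becomes neutral
print(round(posterior(.05, .99, .05), 2))  # 0.51: an unlikely theory becomes neutral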

Zen Uncertainty

Over at the blog Various Consequences, jstultz has a post that takes note of the various levels of uncertainty possible in complex, physics-based computer models. He notes that: "At the bottom of the descent we find level infinity, Zen Uncertainty."

IMHO, Zen uncertainty is something quite different from infinite uncertainty. The definition referenced by jstultz is "Zen Uncertainty: Attempts to understand uncertainty are mere illusions; there is only suffering."

However, a completely Zen-equivalent definition would be:
Zen Uncertainty: Attempts to understand certainty are mere illusions; there is only happiness.


It is actually quite easy to understand Zen from a mathematical standpoint -- straight lines are merely large circles. That is, positive infinity is equal to negative infinity. Therefore, Zen uncertainty means that infinite uncertainty is equivalent to infinite certainty. Not exactly what jstultz had in mind!

Before dismissing this idea out-of-hand, consider two things. 1) There is no mathematical inconsistency inherent in believing this. 2) Scientific experiment shows this to be the case in reality.

Mathematically, note Robinson's non-standard analysis and his hyperreal numbers. There is nothing in mathematics, other than definitions, that prevents equating negative infinity with positive infinity.

For a physics example, consider negative Kelvin temperatures, specifically nuclear spin systems. The thermodynamic temperature profile of one such nuclear spin system experiment, as it cooled, was (see, IIRC, the Purcell and Pound reference in the article):

room-temperature ----> +4K ----> (0K) ----> -4K ----> (-infinity == +infinity) ----> room-temperature

A Note on the Climate Model Software

The scientific method requires that a theory make a documented prediction and that an experiment (or observation) then be performed to test the prediction. Bayesian inference rules are then used to condition belief in the theory based on the test's results.

Assuming the climate models are the embodiment of scientific theory, which of the climate models predicted the current 10-year (rather flat) temperature trend or the winter of 2010? The answer is none. But that's fine. The models predict climate and not weather, and since climate is usually defined as at least a 30-year record of weather, the climate models should not yet be used to "scientifically" condition our climate priors. (In either direction -- confirm or falsify.)

The alternative, assuming the science is settled and the climate models are engineering works, means that consensus software engineering quality assurance processes must be followed before the results can be used directly (without experiment) as evidence for Bayesian inference. IMHO, such SQA has not yet been adequately performed on the climate model software. This is a terrible shortcoming, since I think this alternative has the potential to allow us to rationally reach an earlier consensus.

A Point of Decorum

I commented on a blog recently, expressing a concern about the integrity of the scientific method as it is being applied by "the consensus" (IPCC) of climate scientists. I was informed by the blog's author that I was not being duly concerned, unduly concerned, or even obsessively concerned -- I was being "a little hysterical." Hysterical? LOL.

A point of decorum: politeness has its purpose. Making uncomplimentary statements about a person's emotional state and its effect on his ability to reason correctly can easily be interpreted as simply not wishing to discuss an issue on its merits, or as a sign that the purpose of the post is something other than technical or scientific.

Such statements as "being hysterical" are quite impossible to defend against. What evidence can you produce to change someone's opinion about such a thing? So such statements are never made with the intent of being proven right or wrong. This, again, calls into question an author's own motivations in making such statements. IMHO, a technically useless turn of events.

And so the author's response to my comment prevents any further comment by me. Technical discussion over. (At least the comment was posted. Kind of the author not to let my work go to waste.)

Let me reemphasize my main point. Impoliteness is scientifically/technically useless.

E.T. Jaynes once wrote: "In any field, the Establishment is seldom in pursuit of the truth, because it is composed of those who sincerely believe that they are already in possession of it."

I have never doubted the sincerity of the IPCC climate scientists' beliefs about the proper application of the scientific method. Nor do I doubt the sincerity of those climate scientists skeptical of the IPCC consensus' beliefs.

Why? Fortunately, for the scientific method ONLY, these sincere attitudes of climate scientists do not matter. The scientific method, when applied with integrity (regardless of one's prior attitude), is self-correcting.

As you can tell from my sig: "Politely Avoiding Sophistry," I believe decorum should be observed. But I guess some people simply do not understand why such a thing would be important.

The Rapidly Increasing Scope for Software Quality Improvement

The nature of a computer is fundamentally different from that of any other manufactured device. No other device has software. This is a game changer.

For example, the headline on msnbc today reads:

Cars of today are essentially computers on wheels — and that’s a problem. Locating the source of electronic glitches is like “looking for a needle in a haystack.” Full story.

But the story doesn't mention another important consequence. In his book The Inmates are Running the Asylum, Alan Cooper poses the following questions:

What do you get when you cross a computer with an airplane?
Answer: A computer.
What do you get when you cross a computer with a camera?
Answer: A computer.
What do you get when you cross a computer with an alarm clock?
Answer: A computer.
What do you get when you cross a computer with a car?
Answer: A computer.
What do you get when you cross a computer with a bank?
Answer: A computer.
What do you get when you cross a computer with a warship?
Answer: A computer.

The point Cooper is making is that software quality assurance (SQA) is becoming increasingly important. Just because Toyota knows how to make a quality car does not mean they also know how to make a quality computer. And their failure to fully appreciate that cars are now "computers on wheels" is costing them enormously -- perhaps even the existence of the company.

This enormous risk potential is there for almost every manufacturer. Worse(?), science and even mathematics are becoming more and more dependent on computer software. For example:

What do you get when you cross climate science with a computer?
Answer: A computer.

Our confidence in climate forecasts depends, in an increasingly fundamental way, on how well climate scientists do their climate model software quality assurance.

The Measure of Software Quality Improvement

It seems there are many definitions for the term "software quality". On the bright side, this can be liberating. I can add my own definition:

Software quality is the degree of belief that the code will actually execute its intended functions, in the intended operating environment, for the intended life-cycle, without significant error or failure.

Notice that quality is subjective. It is a degree of belief, a Bayesian probability. "A measure of a state of knowledge." This is important for IV&V since it means there is a big difference between "software quality" and "consensus software quality."

But that's another topic. The topic here is the proper measure of software quality improvement. (Engineers like quantitative measures and making things better.) So here I would like to note that improving software quality requires conditioning a Bayesian prior. But this, in turn, requires new evidence. The more new evidence presented and the stronger the evidence, the more software quality improves.

This means that the sole test of software quality is, not surprisingly, testing. Testing provides an objective foundation for quality that keeps quality's subjectivity from (hopefully) getting too far from reality.

But that's another topic too. Here I am noting that, IMHO, the strength, power, or intensity of new evidence is best measured in decibels. Bayesian probability ratios are often expressed in decibels. For example, see How Loud is the Evidence?

The decibel is a log scale that simplifies overall power gain/loss calculations and is convenient for ratios of numbers that differ by orders of magnitude. Additionally, log scales have been useful in describing the potential of certain physical phenomena (e.g., earthquakes) and human perceptions (e.g., hearing). Thus, log scales can be useful when performing risk assessments and other related quality assurance activities.

For more information on evidence measured in decibels, see Chapter 4 of Jaynes's Probability Theory: The Logic of Science.
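
As a rough illustration of the decibel convention Jaynes uses (evidence = 10 log10 of the odds, so independent pieces of evidence simply add), here is a minimal Python sketch; the helper names are mine:

import math

def to_decibels(belief):
    '''Convert a degree of belief (Bayesian probability) to evidence in dB.'''
    odds = belief / (1.0 - belief)
    return 10.0 * math.log10(odds)

def add_evidence(prior_db, likelihood_ratio):
    '''New, independent evidence adds 10 log10 of its likelihood ratio.'''
    return prior_db + 10.0 * math.log10(likelihood_ratio)

print(to_decibels(0.5))                        # 0 dB: no evidence either way
print(to_decibels(0.99))                       # ~20 dB
print(add_evidence(to_decibels(0.5), 100.0))   # a 100:1 likelihood ratio is worth 20 dB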

Finally, an analogy. If evidence of software quality is measured in decibels, it suggests software quality assurance can be thought of as singing the quality song about the software. Consensus software quality then is where we all sing the same song, or at least sing our part in a symphony. :-)

(One of my complaints about the state of quality of, say, climate science is that every participant feels she/he must carry the melody.)