Everyone Should Learn About Software

Can we trust climate models? This was the title of a recent blog post by Tamsin Edwards. She says:
Ever since I started scientific research, I’ve been fascinated by uncertainty. By the limits of our knowledge. It’s not something we’re exposed to much at school, where Science is often equated with Facts and not, as it should be, with Doubt and Questioning. My job as a climate scientist is to judge these limits, our confidence, in predictions of climate change and its impacts, and to communicate them clearly.
I liked this article. However, I do not think the key public issue with climate models is a lack of appreciation of the uncertainty inherent in science. Rather, it is ignorance about the difficulties and limitations inherent in computer modelling of natural phenomena, especially utterly complex systems like the climate. BTW, I think this ignorance about computer coding extends to some scientists as well.

This is an aspect of a more general problem: the need for everyone in modern society to understand what software is all about. And the fact that schools are not teaching this.

For example, there was an article in the Wall Street Journal over the weekend critical of the US education system, titled Sorry, College Grads, I Probably Won't Hire You. The author, Kirk McDonald, argues that "States should provide additional resources to train and employ teachers of science, technology, engineering and math, as well as increase access to the latest hardware and software for elementary and high-school students." His advice is that "If you want to survive in this economy, you'd be well-advised to learn how to speak computer code".

Many programmers don't think this is an issue. Typical Programmer says:
This is another article beating the "everyone must learn to code" drum. Employers can’t find enough people with programming skills, schools aren’t turning out enough engineers, jobs at these cool companies are left unfilled because American students are too lazy or short-sighted to spend a summer learning "basic computer language." If only it was that simple. I have some complaints about this "everyone must code" movement, and Mr. McDonald’s article gives me a starting point because he touched on so many of them.
I recommend reading the entire post. But, in the end, I am not convinced by Typical Programmer Greg Jorgensen's points. Simply put, the reason to learn a computer language is not to be able to program, but to have a foundation from which to understand the nature of software. As an engineer, I have taken a lot of courses in science and math as well as engineering. But I am not a scientist or mathematician. I am an engineer. However, it is necessary for me to understand science and math in order to be a competent engineer. By analogy, many people now think it takes an understanding of software to be a productive member of modern society. And I think that too. And the best way to learn is by doing.

Older and Wiser... Up to a Point

Having had a lot of fun, and having gained a lot of software knowledge and wisdom over the years, I do not look forward to the day when, like an old pro athlete, I have to retire from being a programmer because I physically can't do it anymore.

So, I was pleased to stumble across an IEEE Spectrum Tech Talk article that suggested I don't have much to worry about along that line for quite a while. Here is an excerpt:
Prejudice against older programmers is wrong, but new research suggests it's also inaccurate. A dandy natural experiment to test the technical chops of the old against the young has been conducted—or discovered—by two computer scientists at North Carolina State University, in Raleigh. Professor Emerson Murphy-Hill and Ph.D. student Patrick Morrison went to Stack Overflow, a Web site where programmers answer questions and get rated by the audience. It turned out that ratings rose with contributors' age, at least into the 40s (beyond that the data were sparse). The range of topics handled also rose with age (though, strangely, after dipping in the period from 15 to 30). Finally, the old were at least as well versed as the young in the newer technologies.

Of course, such a natural experiment can't account for all possible confounding factors.
This is in line with what I already knew about chess. I played a lot of chess when I was a kid. I was told then that a person's chess rating didn't drop all that much during the player's lifetime.

The article also addresses chess, but a bit ambiguously. Here is a figure the article presents:
[Figure from the article: chess performance versus age, with the vertical axis in SD Units]
It was hard for me to interpret the vertical axis. I am not sure if SD Units represent standard deviation or not. I remember that for I.Q., one standard deviation is about 15 points, but I don't know if there is an appropriate analogy that can be drawn here.

So I looked up Aging in Sports and Chess, which presents experimental results from a 2007 paper by Ray C. Fair. It says that if you are at 100% of your maximum rating at age 35, then you are at 99% at age 40, 97% at age 50, 95% at age 60, 93% at age 70, 90% at age 80, and 80% at age 90. So it looks like nothing more than a slow, steady decline until people get into their 80s.
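Just to put those percentages in concrete terms, here is a minimal sketch that applies Fair's decline factors to an assumed peak rating (the 2000-point peak is an arbitrary example, not a number from the paper):

```python
# Fair (2007) decline factors quoted above, applied to an assumed peak rating.
decline = {35: 1.00, 40: 0.99, 50: 0.97, 60: 0.95, 70: 0.93, 80: 0.90, 90: 0.80}

peak_rating = 2000  # arbitrary example peak, reached by age 35

for age, factor in decline.items():
    print(f"age {age}: rating ~ {peak_rating * factor:.0f}")
```

On this example scale the drop by age 80 is only about 200 points; the larger fall-off comes between 80 and 90.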

As with the Stack Overflow data, it is hard to judge the relationship of data about chess skill to programming skill. But all this data suggests I don't have too much to worry about for a long time.

What is Wrong with Risk Matrices? -- Part 2

This is a continuation of my last post.

Risk is the probability times the consequence of a given scenario. Wikipedia gives an example of a risk matrix:
[Wikipedia's example risk matrix: probability (rows) versus consequence (columns), with each cell rated Low, Moderate, High, or Extreme]

Risk has been granularized to reflect uncertainty, lack of data, and subjectivity. The vertical axis is probability; the horizontal axis is consequence. As can be seen from the figures in my previous post, this matrix is logically inconsistent with respect to iso-risk curves (lines of constant risk) whether the axis scales are linear or log-log. For example, if the scales are linear, it is mathematically impossible for any cell in the first row or first column to hold a value other than "Low"; if the axes are log-log, it is impossible for cells on the same diagonal to hold different values.
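Here is a minimal numeric sketch of the log-log case. The decade-wide bin values (probabilities from 1e-5 to 1e-1, consequences from $10 to $100,000) are assumed purely for illustration and are not taken from the Wikipedia example:

```python
# Assumed decade-wide bins for a 5x5 log-log risk matrix (illustrative only).
prob_bins = [10 ** (i - 5) for i in range(5)]   # 1e-5 .. 1e-1 (row index i)
cons_bins = [10 ** (j + 1) for j in range(5)]   # 1e1  .. 1e5  (column index j)

for i, p in enumerate(prob_bins):
    row = []
    for j, c in enumerate(cons_bins):
        risk = p * c        # risk = probability x consequence = 10**(i + j - 4)
        row.append(f"{risk:8.0e}")
    print(" ".join(row))

# Every cell whose row and column indices have the same sum carries exactly the
# same risk value, so those cells lie on one iso-risk line. A log-log matrix
# that rates two of them differently contradicts its own constant-risk lines.
# On linear scales the analogous problem hits the first row and column, whose
# cells contain points of arbitrarily small risk.
```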

However, IMHO, the above example does not illustrate what is actually wrong with risk matrices. The real problem is that the table values are being interpreted as risk values in the first place. Instead, risk matrix table values should be interpreted as risk categories. They measure our level of concern about a particular category of risk, not its actual value. The axes measure the "worst case scenario" values of probability and consequence for a given risk.

A risk value for any expected scenario at, say, a nuclear power plant could never be anything other than "Low". The very idea of an acceptable "Extreme" risk of "Certain" probability and "Catastrophic" consequences is ridiculous. No stakeholder would sign off on that. To be of practical utility, the table values cannot be interpreted as risk values.

A risk category is defined by the management process used to handle risks at a given level of concern. There should be stakeholder consensus on risk categories. Risk categories can be ordered by the amount of process grading and tailoring they allow, corresponding to the ordered levels of concern.

For example, for the most "Extreme" risk category, it may be required that each and every risk, regardless of its probability or consequence in a worst case scenario, be formally documented, analyzed to an extreme level of detail and rigor, prioritized, mitigated to an extremely low level, tracked throughout its lifetime, and signed off on periodically by all stakeholders. Risk categories of lower concern (High, Moderate, and Low) allow grading and tailoring of these Extreme requirements to lower levels.
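A minimal sketch of this idea follows; the category names come from the post, but the specific process steps assigned to each category below are assumed, illustrative examples only:

```python
# Hypothetical mapping from risk category (level of concern) to the process
# requirements it imposes. Lower-concern categories are graded and tailored
# subsets of the Extreme requirements.
EXTREME_PROCESS = {
    "formally documented",
    "analyzed in detail and rigor",
    "prioritized",
    "mitigated to a very low level",
    "tracked throughout its lifetime",
    "periodic stakeholder sign-off",
}

RISK_CATEGORIES = {
    "Extreme":  EXTREME_PROCESS,
    "High":     EXTREME_PROCESS - {"periodic stakeholder sign-off"},
    "Moderate": {"formally documented", "prioritized", "tracked throughout its lifetime"},
    "Low":      {"formally documented"},
}

def required_process(category):
    """Return the process steps a risk assigned to this category must follow."""
    return sorted(RISK_CATEGORIES[category])

print(required_process("Moderate"))
```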

What's Wrong with Risk Matrices?

Risk is equal to probability times consequence. The Wikipedia entry for risk matrix has a reference to an article by Tony Cox titled What's Wrong with Risk Matrices? Unfortunately, the article is behind a paywall. So as far as I am concerned, it doesn't exist. But I was able to google a blog posting about the article.

What was interesting was how Cox constructed a risk matrix and then drew lines of constant risk on it. Something like this:
The lines are hyperbolas with the axes as asymptotes.
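Here is a minimal matplotlib sketch of what those constant-risk curves look like on linear axes; the specific risk levels (each a doubling of the previous one) are arbitrary illustrative values:

```python
import numpy as np
import matplotlib.pyplot as plt

# Constant-risk curves: risk = p * c, so c = risk / p, a hyperbola with the
# axes as asymptotes. Probability and consequence are both normalized to [0, 1].
p = np.linspace(0.01, 1.0, 200)
for risk in [0.05, 0.1, 0.2, 0.4]:    # arbitrary doubling risk levels
    plt.plot(p, risk / p, label=f"risk = {risk}")

plt.xlim(0, 1)
plt.ylim(0, 1)
plt.xlabel("probability")
plt.ylabel("consequence (normalized)")
plt.legend()
plt.title("Iso-risk curves on linear axes")
plt.show()
```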

According to the blog posting referenced above: "Cox shows that the qualitative risk ranking provided by a risk matrix will agree with the quantitative risk ranking only if the matrix is constructed according to certain general principles."

Before examining these general principles (and let me be clear up front that I fundamentally disagree with them), first things first.

I have always viewed risk matrices as having log-log scales. (And I'm not the only one looking at risk matrices this way.) Something like this:
[Figure: a risk matrix on log-log axes, with straight iso-risk lines separated by an order of magnitude]
Notice that the constant-risk contours are all straight lines with a slope of minus one. Also note that the risk contours in this example are separated by an order of magnitude, not just a doubling of risk value as in the previous figure. This means a log-log scale makes it easier to represent a wide range of probability and consequence scenarios (measurable in dollar amounts).
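A minimal numeric check of that slope, under the same assumptions (taking logs of risk = probability × consequence gives log c = log(risk) − log p):

```python
import numpy as np

# On log-log axes, risk = p * c becomes log10(c) = log10(risk) - log10(p):
# a straight line in (log10(p), log10(c)) with slope -1.
log_p = np.linspace(-5, -1, 50)          # probabilities 1e-5 .. 1e-1 (assumed range)
for risk in [1e-2, 1e-1, 1e0, 1e1]:      # contours one order of magnitude apart
    log_c = np.log10(risk) - log_p
    slope = np.polyfit(log_p, log_c, 1)[0]
    print(f"risk = {risk:g}: fitted slope = {slope:+.2f}")   # prints -1.00 each time
```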

But the most important reason why I think it is better to use a log-log scale is that risk categorization is subjective. And I believe that, where possible, subjective judgments, like risk category, should be measured in decibels. As I have written in a previous post: The decibel is a log scale that simplifies overall power gain/loss calculations and is convenient for ratios of numbers that differ by orders of magnitude. Additionally, log scales have been useful in describing the potential of certain physical phenomena (e.g., earthquakes) and human perceptions (e.g., hearing). Thus, log scales can be useful when performing risk assessments and other related quality assurance activities.
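As a minimal sketch of what that might look like (the risk values and the reference risk below are arbitrary), a risk ratio can be expressed in decibels the same way a power ratio is:

```python
import math

def risk_db(risk, reference_risk):
    """Express a risk value in decibels relative to a reference risk."""
    return 10 * math.log10(risk / reference_risk)

# Arbitrary examples: a risk 1000 times the reference is +30 dB;
# a risk one tenth of the reference is -10 dB.
print(risk_db(1000, 1))   # 30.0
print(risk_db(0.1, 1))    # -10.0
```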

Next time, a rather loud (pun intended) criticism of Cox's general risk matrix principles. :-)