Everyone Should Learn About Software

Can we trust climate models? This was the title of a recent blog post by Tamsin Edwards. She says:
Ever since I started scientific research, I’ve been fascinated by uncertainty. By the limits of our knowledge. It’s not something we’re exposed to much at school, where Science is often equated with Facts and not, as it should be, with Doubt and Questioning. My job as a climate scientist is to judge these limits, our confidence, in predictions of climate change and its impacts, and to communicate them clearly.
I liked this article. However, I do not think the key public issue about the climate models is a lack of appreciation of the uncertainty inherent in science. Rather, it is ignorance about the difficulties and limitations inherent in computer modelling of natural phenomena, especially utterly complex systems like the climate. BTW, I think this ignorance about computer coding extends to some scientists as well.

This is an aspect of a more general problem: the need for everyone in modern society to understand what software is all about. And the fact that schools are not teaching this.

For example, there was an article in the Wall Street Journal over the weekend critical of the US education system titled Sorry, College Grads, I Probably Won't Hire You. The author, Kirk McDonald, argues that "States should provide additional resources to train and employ teachers of science, technology, engineering and math, as well as increase access to the latest hardware and software for elementary and high-school students." His advice is that "If you want to survive in this economy, you'd be well-advised to learn how to speak computer code."

Many programmers don't think this is an issue. Typical Programmer says:
This is another article beating the "everyone must learn to code" drum. Employers can’t find enough people with programming skills, schools aren’t turning out enough engineers, jobs at these cool companies are left unfilled because American students are too lazy or short-sighted to spend a summer learning "basic computer language." If only it was that simple. I have some complaints about this "everyone must code" movement, and Mr. McDonald’s article gives me a starting point because he touched on so many of them.
I recommend reading the entire post. But, in the end, I am not convinced by Typical Programmer Greg Jorgensen's points. Simply put, the reason to learn a computer language is not to be able to program, but to have a foundation from which to understand the nature of software. As an engineer, I have taken a lot of courses in science and math as well as engineering. But I am not a scientist or mathematician. I am an engineer. However, it is necessary for me to understand science and math in order to be a competent engineer. By analogy, many people now think it takes an understanding of software to be a productive member of modern society. And I think that too. And the best way to learn is by doing.

Older and Wiser... Up to a Point



Having had a lot of fun, and having gained a lot of software knowledge and wisdom over the years, I do not look forward to the day when, like an old pro athlete, I have to retire from being a programmer because I physically can't do it anymore.

So, I was pleased to stumble across an IEEE Spectrum Tech Talk article that suggested I don't have much to worry about along that line for quite a while. Here is an excerpt:
Prejudice against older programmers is wrong, but new research suggests it's also inaccurate. A dandy natural experiment to test the technical chops of the old against the young has been conducted—or discovered—by two computer scientists at North Carolina State University, in Raleigh. Professor Emerson Murphy-Hill and Ph.D. student Patrick Morrison went to Stack Overflow, a Web site where programmers answer questions and get rated by the audience. It turned out that ratings rose with contributors' age, at least into the 40s (beyond that the data were sparse). The range of topics handled also rose with age (though, strangely, after dipping in the period from 15 to 30). Finally, the old were at least as well versed as the young in the newer technologies.

Of course, such a natural experiment can't account for all possible confounding factors.
This is in line with what I already knew about chess. I played a lot of chess when I was a kid. I was told then that a person's chess rating didn't drop all that much during the player's lifetime.

The article also addresses chess, but a bit ambiguously. Here is a figure the article presents:

It was hard for me to interpret the vertical axis. I am not sure if SD Units represent standard deviation or not. I remember that for I.Q., one standard deviation is about 15 points, but I don't know if there is an appropriate analogy that can be drawn here.

So I looked up Aging in Sports and Chess, which presents some experimental results documented in a 2007 paper by Ray C. Fair. It says that if you are at 100% of your maximum rating at age 35, then you are at 99% at age 40, 97% at age 50, 95% at age 60, 93% at age 70, 90% at age 80, and 80% at age 90. So it doesn't look like more than a slow steady decline until people get into their 80s.

As with the Stack Overflow data, it is hard to judge the relationship of data about chess skill to programming skill. But all this data suggests I don't have too much to worry about for a long time.

What is Wrong with Risk Matrices? -- Part 2

This is a continuation of my last post.

Risk is probability times consequence of a given scenario. The Wikipedia gives an example of a risk matrix as:


Risk has been granularized to reflect uncertainty, lack of data, and subjectivity. The vertical axis is probability. The horizontal axis is consequence. As can be seen from the figures in my previous post, this matrix is logically inconsistent with respect to iso-risk curves (lines of constant risk) whether the axis scales are linear or log-log. That is, for example, it is mathematically impossible to have any value other than "Low" in the first row or first column if the scales are linear, or to have different values along any diagonal if the axes are log-log.
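The log-log half of that claim can be checked with a few lines of Python. This is only a sketch under my own assumptions: a hypothetical 5x5 matrix whose axes are spaced one decade per row and per column, with made-up exponent ranges, and with each cell represented by the base-10 logarithm of probability times consequence. Which diagonal direction counts as "the" diagonal depends on how the matrix is drawn; here it is the one running from the top right to the bottom left of the grid.

```python
# Hypothetical decade-spaced axes, stored as base-10 exponents so the
# arithmetic is exact: probability 1e-5..1e-1, consequence 1e1..1e5.
PROBABILITY_EXPONENTS = [-5, -4, -3, -2, -1]   # rows, rarest first
CONSEQUENCE_EXPONENTS = [1, 2, 3, 4, 5]        # columns, mildest first


def log_risk_grid(prob_exps, conseq_exps):
    """Return log10(probability * consequence) for every cell."""
    return [[p + c for c in conseq_exps] for p in prob_exps]


def anti_diagonal(grid, k):
    """Cells (i, j) of a square grid with i + j == k."""
    size = len(grid)
    return [grid[i][k - i] for i in range(size) if 0 <= k - i < size]


grid = log_risk_grid(PROBABILITY_EXPONENTS, CONSEQUENCE_EXPONENTS)
for k in range(2 * len(grid) - 1):
    # Every cell on one anti-diagonal has the same log-risk value.
    print(k, anti_diagonal(grid, k))
```

Under these assumptions every anti-diagonal holds a single value, so a matrix that assigns two different risk values along such a diagonal cannot be reading off probability times consequence.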

However, IMHO, the above example does not illustrate what is actually wrong with risk matrices. The real problem is that these table values are being interpreted as risk values in the first place. Instead, risk matrix table values should be interpreted as risk categories. They measure our level of concern about a particular category of risk, not its actual value. The axes measure the "worst case scenario" values of probability and consequence for a given risk.

A risk value for any expected scenario at, say, a nuclear power plant could never be anything other than "Low". The very idea of an acceptable "Extreme" risk of "Certain" probability and "Catastrophic" consequences is ridiculous. No stakeholder would sign off on that. To be of practical utility, the table values cannot be interpreted as risk values.

A risk category is defined as the management process used to manage risks that are at a certain level of concern. There should be stakeholder consensus on risk categories. Risk categories can be ordered by the amount of process grading and tailoring they allow for ordered levels of concern.

For example, for the most "Extreme" risk category, it may be required that each and every risk, regardless of its probability or consequence in a worst case scenario, be formally documented, analyzed to an extreme level of detail and rigor, prioritized, mitigated to an extremely low level, tracked throughout its lifetime, and signed off on periodically by all stakeholders. Risk categories of lower concern (High, Moderate, and Low) will allow grading and tailoring of these Extreme requirements to lower levels.

What's Wrong with Risk Matrices?

Risk is equal to probability times consequence. The Wikipedia entry for risk matrix has a reference to an article by Tony Cox titled What's Wrong with Risk Matrices? Unfortunately, the article is behind a paywall. So as far as I am concerned, it doesn't exist. But I was able to google a blog posting about the article.

Interesting was how Cox constructed a risk matrix and then drew lines of constant risk on it. Something like this:
The lines are hyperbolas with the axes as asymptotes.

According to the blog posting referenced above: "Cox shows that the qualitative risk ranking provided by a risk matrix will agree with the quantitative risk ranking only if the matrix is constructed according to certain general principles."

Before examining these general principles, and let me be clear that I fundamentally disagree with these principles, first things first.

I have always viewed risk matrices as having log-log scales. (And I'm not the only one looking at risk matrices this way.) Something like this:

Notice the constant values of risk are all straight lines with a slope of minus one. And note the risk contours in this example are separated by an order of magnitude, not just a doubling of risk value as in the previous figure. This means it is easier to represent a wider range of probabilities and consequence scenarios (measurable in dollar amounts) using a log-log scale.
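The slope of minus one follows directly from risk = probability times consequence: taking logs gives log(probability) = log(risk) - log(consequence). A minimal Python sketch of that arithmetic, with function names and sample numbers that are mine and purely illustrative:

```python
import math


def iso_risk_probability(risk, consequence):
    """Probability that yields the given risk at this consequence (p = R / c)."""
    return risk / consequence


def loglog_slope(risk, consequence_a, consequence_b):
    """Slope of one iso-risk line between two consequences on log-log axes."""
    p_a = iso_risk_probability(risk, consequence_a)
    p_b = iso_risk_probability(risk, consequence_b)
    rise = math.log10(p_b) - math.log10(p_a)
    run = math.log10(consequence_b) - math.log10(consequence_a)
    return rise / run


# Illustrative values only: a $100 risk contour between $1,000 and $1,000,000.
print(loglog_slope(100.0, 1e3, 1e6))
# On linear axes the same contour is the hyperbola p = 100 / c.
print([iso_risk_probability(100.0, c) for c in (1e3, 1e4, 1e5, 1e6)])
```

Any positive risk value and any pair of distinct positive consequences should give a slope of minus one, up to floating-point rounding.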

But the most important reason why I think it is better to use a log-log scale is that risk categorization is subjective. And I believe that where possible subjective judgments, like risk category, should be measured in decibels. As I have written about in a previous post: The decibel is a log scale that simplifies overall power gain/loss calculations and is convenient for ratios of numbers that differ by orders of magnitude. Additionally, log scales have been useful in describing the potential of certain physical phenomena (e.g., earthquakes) and human perceptions (e.g., hearing). Thus, log scales can be useful when performing risk assessments and other related quality assurance activities.
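For readers who have not worked with decibels, here is a small Python sketch. It assumes the usual power-ratio convention (ten times the base-10 log of a ratio); applying it to a ratio of risk values is my illustration, not an established risk-analysis standard.

```python
import math


def to_decibels(ratio):
    """Express a positive ratio in decibels (power-ratio convention)."""
    return 10.0 * math.log10(ratio)


def from_decibels(decibels):
    """Inverse of to_decibels."""
    return 10.0 ** (decibels / 10.0)


# An order-of-magnitude step between risk contours is 10 dB;
# a doubling of risk is only about 3 dB.
print(to_decibels(10.0))
print(to_decibels(2.0))

# Multiplying ratios becomes adding decibels, which is what makes
# chained gain/loss calculations simple.
print(to_decibels(100.0 * 0.5), to_decibels(100.0) + to_decibels(0.5))
```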

Next time, a rather loud (pun intended) criticism of Cox's general risk matrix principles. :-)

Failures During Runtime

Researchers recently published online a PDF entitled: A Characteristic Study on Failures of Production Distributed Data-Parallel Programs. The data they used was provided by Microsoft. The programs were all MapReduce-like in structure, composed of "declarative SQL-like queries and imperative C# user-defined functions."

Interesting to me was that the authors collected some actual statistics about the failures encountered during runtime. I love real-world numbers. However, I found their results a bit hard to follow. Here is what I got out of the paper.

They ignored operating system and hardware failures. Of the run-time errors considered: 15% were "logic" errors, 23% were "table" errors and 62% were "row" errors.

They gave examples of logic errors such as: cannot find DLLs or scripts for execution, accessing array elements with an out-of-range index, accessing dictionary items with a non-existent key, or other data-unrelated failures.

Table errors such as: accessing a column with an incorrect name or index, a mismatch between data schema and table schema, or other table-level failures.

And row errors such as: corrupt rows with exceptional data, using illegal arguments, dereferencing null column values, exceptions in user-defined functions, allowing out-of-memory errors, or other row-level failures.

Also interesting was that, according to the authors, 8% of runtime errors could not have been caught by programmers during testing using their current debugging toolset.

It seems possible to create check-in and checkout documentation processes for a development organization's SCM system that could automatically generate statistics similar to the above. I think this would have a positive effect on software quality. For example, the researchers suggest that many failures have a common fix pattern that could be automated. Whether the cost would be worth the effort--I don't know. But it does seem obvious that SCM should be a prime source of quality-related data.
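As a rough idea of what such automatic statistics might look like, here is a Python sketch. Everything about the convention is an assumption of mine: it imagines that every fix checked in to the SCM system carries a line such as "failure-category: row" in its check-in message, and it reuses the paper's three category names. No particular SCM tool or its API is implied.

```python
from collections import Counter

# Category names borrowed from the paper; the tagging scheme is hypothetical.
CATEGORIES = ("logic", "table", "row")


def tally_failures(checkin_messages):
    """Count 'failure-category: <name>' lines across check-in messages."""
    counts = Counter()
    for message in checkin_messages:
        for line in message.splitlines():
            key, separator, value = line.partition(":")
            if separator and key.strip().lower() == "failure-category":
                category = value.strip().lower()
                if category in CATEGORIES:
                    counts[category] += 1
    return counts


def percentages(counts):
    """Share of each category, rounded to whole percents."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100 * counts[name] / total) for name in CATEGORIES}


# Made-up check-in messages, for illustration only.
messages = [
    "Fix null column dereference\nfailure-category: row",
    "Correct column name in query\nfailure-category: table",
    "Guard array index\nfailure-category: logic",
    "Handle corrupt input row\nfailure-category: row",
    "Update README",
]
print(percentages(tally_failures(messages)))
```

Messages without the tag are simply ignored, so the scheme could be introduced gradually.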

    Programmers Must Consider Risk

    There is a thoughtful programmer-oriented blog called The Codist written by Andrew Wulf. In a recent posting he starts off:
    I need to step outside my usual persona of writing about programming to comment on the happenings of the past few days. 
    In Boston two brothers decided to blow up the Marathon, and an hour from my house half the city of West, Texas was blown to pieces in a massive explosion.
    However, he goes on to discuss these events in a way that I don't think is actually outside the realm of programming. Why? He talks about risk.

    As I have written about before, there is always the risk that software may not perform correctly. The general risks, both benefits and consequences, will be different for different stakeholders. (I wrote that software should be independently verified and validated.) The software development effort must understand, communicate and help deal with the risks associated with the software for all its stakeholders.

    In this post I simply note that software is an integral part of modern society. Thus, risk is an integral part of software engineering. In fact, risk is a fundamental concept to engineering in general.

    Software may have played an important part in both of the events Andrew mentions. For the Boston Marathon, I am pretty sure that data mining software such as face recognition algorithms was used in identifying the suspects. No details about the Texas event are publicly available yet, but with SCADA systems being common in plants nowadays, I can easily imagine software being important there too.


    Why are PC Sales Declining?

    On good data, PC sales are rapidly declining. Shocking! Everyone has an opinion about why this is happening. Mine? I think a lot of it has to do with a self-fulfilling belief currently held by many people influential in the PC industry. For example:
    “In a sense, these devices [smartphone, tablet, PC] are kind of blurring together,” Andreessen says. “A lot of the killer apps these days – and I would say this is true of Facebook, Twitter, Pinterest, and Gmail – you can use them on whatever device you want, or use them on all the devices at the same time.

    “I use the laptop at work, I use the phone when I am walking around – it’s the marrying of the smart device and the user interface back to the cloud that makes these things magical.”
    This has become a meme. A type of meme where believing in it makes it come true. And the PC business believes it. Look at the interface for Windows 8, formerly known as Metro. Look at the Unity interface for Ubuntu. Both obviously terrible interfaces for doing things only a PC is powerful enough to do. Yet the idea of one interface tied back to the cloud for all devices seems to be a truism among the PC leadership. So that's what it's going to be.

    The PC will be the equivalent of a big-screen TV.

    On the other hand, PCs can do so much more than smartphones and tablets can do. Is there anything else PCs should be doing?

    In a recent Time Magazine interview with Alan Kay, the interviewer (David Greelish) asked:
    "What do you think about the trend that these devices are becoming purely communication and social tools? What do you see as good or bad about that? Is current technology improving or harming the social skills of children and especially teens? How about adults?"
    To which Alan Kay replied:
    "Social thinking requires very exacting thresholds to be powerful. For example, we’ve had social thinking for 200,000 years and hardly anything happened that could be considered progress over most of that time. This is because what is most pervasive about social thinking is “how to get along and mutually cope.” Modern science was only invented 400 years ago, and it is a good example of what social thinking can do with a high threshold. Science requires a society because even people who are trying to be good thinkers love their own thoughts and theories — much of the debugging has to be done by others. But the whole system has to rise above our genetic approaches to being social to much more principled methods in order to make social thinking work.

    "By contrast, it is not a huge exaggeration to point out that electronic media over the last 100+ years have actually removed some of day to day needs for reading and writing, and have allowed much of the civilized world to lapse back into oral societal forms (and this is not a good thing at all for systems that require most of the citizenry to think in modern forms).

    "For most people, what is going on is quite harmful."
    Kay thinks PCs could be doing more. So why don't they? Do we lack the knowledge, wisdom, and skill to make it so?

    I've been working on making the PC a "personal and family web assistant". Software that does something to help us to, as Kay put it, "make social thinking work." A "device" that acts as our agent and family protector, working to optimize the relationship between our private lives and the WWW. The main component of the software can only run on a PC.

    The Solution is Not the Problem

    A great thing about agile software development is that it encourages problem decomposition. The decomposition may be functional, structural, or object oriented. Decomposition is my workhorse technique for handling software complexity. And complexity is my number one development problem.

    Since an agile goal is working code each iteration, decomposition is usually intended to produce real and useful (if incomplete) solutions. This means testing and data gathering are an integral part of the agile design process.

    Thus, agile development has some ideas that I think are appealing to every type of engineer. However, the agile programming methodology also has its hard-to-do parts.

    One weakness is that agile development tends to focus on the solution and not the problem. For example, people needing custom software will often submit their "requirements" in the form of a description of what they want their new GUI to look like. (This has happened to me many times.) In other words, they implicitly define the requirements by explicitly defining what they think the solution should be. Unfortunately, these kinds of customers tend to fit in well on an agile team. My experience is that such solutions tend to be mediocre at best. (I call these "Every Program Looks Like a Facebook Web Page" solutions.)

    Better is to define the requirements independent of what the eventual implementation might look like. As the British poet Edward Hodnett once said: “If you do not ask the right questions, you do not get the right answers.” The solution is not the problem. So keep them separate.

    But obviously, this is hard to do in an iterative development environment.

    Also, the software requirements are just information. And there is a big difference between information, knowledge, and wisdom. Becoming knowledgeable about the requirements takes time (as well as domain expertise and experience), and determining the wisest solution takes even more.

    Interesting aside. I just googled:
    • "software information" (> 9 million hits)
    • "software knowledge" (> 1 million hits)
    • "software wisdom" (< 5000 hits).
    I guess "software wisdom" basically doesn't exist.

    Agile methods can also sometimes be an impediment to using my other workhorse technique for attacking complex problems: paradigm shifts. I talked about paradigms in my previous post.

    And the reason why is -- innovation. Coming up with an innovative solution to a hard software problem depends on coming up with a new analogy or a new paradigm shift. Agile methods commit to paradigms way too early and at too low a level for much chance of real innovation happening. So using agile methods for such problems is difficult.

    Software Meta Development Note - Paradigms


    Implementing complex software solutions to user requirements is what makes programming so difficult. That is, programming is a lot harder than just implementing algorithms. There are two common ways of handling this complexity--decomposition into multiple interfaces and "paradigm shifts" at interfaces. (These interfaces include functional, class, and data-structural APIs. They include user interfaces.)

    First let me clear up what I mean by a paradigm shift at an interface. It's where a more complex internal problem solution procedure (the paradigm) is presented (at the interface) as something simpler and easier to understand to the user of the procedure. For example, Newton's Method can be used to find the square root of a number. But it is more practical to implement a special function called sqrt rather than expose the complexity of a function called newton_raphson to the programmer, and let her figure out how to get a square root from it. There is a paradigm shift from Newton's Method to square root at the sqrt interface.
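The sqrt example can be sketched in a few lines of Python. This is an illustration under my own assumptions rather than how any particular math library does it: newton_raphson is a generic root finder, and sqrt hides the choice of function, derivative, and starting guess behind a simpler interface. It is not hardened for extreme magnitudes.

```python
def newton_raphson(f, f_prime, x0, tolerance=1e-12, max_iterations=100):
    """Generic Newton's Method: find x with f(x) close to 0, starting at x0."""
    x = x0
    for _ in range(max_iterations):
        step = f(x) / f_prime(x)
        x -= step
        if abs(step) <= tolerance * max(1.0, abs(x)):
            return x
    raise ArithmeticError("Newton's Method did not converge")


def sqrt(number):
    """The simple interface: callers never see Newton's Method."""
    if number < 0:
        raise ValueError("sqrt is undefined for negative numbers")
    if number == 0:
        return 0.0
    # The square root of `number` is the positive root of x*x - number.
    return newton_raphson(
        f=lambda x: x * x - number,
        f_prime=lambda x: 2.0 * x,
        x0=max(number, 1.0),  # a guess at or above the root keeps x positive
    )


print(sqrt(2.0))
print(sqrt(144.0))
```

The caller of sqrt supplies one number and gets one number back. Everything that makes Newton's Method awkward to use directly (derivatives, starting guesses, convergence tests) stays on the other side of the interface.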

    The subject of this post is that there are three different kinds of simple paradigms that are useful to think about: natural, artificial, and what I call synthetic.

    A natural paradigm is encountered a lot in object oriented programming. We can have a logical car object, checkbook object, or screen object. Using such real-life analogies makes our complex problems easier to understand and work with. Even without further decomposition, we already understand these natural, complex things. I was able to implement a robot/facility message handling system at an automated factory once by simulating the US postal system. In the digital simulacrum, electronic messages became letters, junk mail, priority mail, etc. Every robot had its logical mail box complete with flag. So did the cranes and other facility equipment. Some computers turned into post offices. Servers became mail centers. Every message had its sender and return zip codes. Etc. We knew the design would work -- the mail does get delivered in real life! And I could even explain what we were doing to the project's managers. :-)
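To give a flavor of what a postal paradigm looks like in code, here is a toy Python sketch. It is not the factory system described above: the class names, fields, and the return-to-sender rule are all made up for illustration, and the mailbox flag is assumed to mean "mail is waiting".

```python
from dataclasses import dataclass, field


@dataclass
class Letter:
    to_zip: str                      # address of the receiving equipment
    return_zip: str                  # address of the sender
    body: str
    mail_class: str = "first-class"  # e.g. "priority" or "junk"


@dataclass
class Mailbox:
    letters: list = field(default_factory=list)
    flag_up: bool = False            # assumed meaning: mail is waiting

    def deliver(self, letter):
        self.letters.append(letter)
        self.flag_up = True

    def collect(self):
        collected, self.letters = self.letters, []
        self.flag_up = False
        return collected


class PostOffice:
    """Routes letters to mailboxes by zip code."""

    def __init__(self):
        self._mailboxes = {}

    def register(self, zip_code):
        return self._mailboxes.setdefault(zip_code, Mailbox())

    def send(self, letter):
        box = self._mailboxes.get(letter.to_zip)
        if box is None:
            # Undeliverable: return to sender, as real mail would be.
            box = self._mailboxes.get(letter.return_zip)
        if box is None:
            return False
        box.deliver(letter)
        return True


office = PostOffice()
robot_box = office.register("ROBOT-7")
crane_box = office.register("CRANE-2")
office.send(Letter("ROBOT-7", "CRANE-2", "Pallet ready", "priority"))
print(robot_box.flag_up, [letter.body for letter in robot_box.collect()])
```

Nobody reading this needs routing or queueing explained to them. They already know how mail works.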

    Natural paradigms have carried over to skeuomorphic graphical user interfaces (GUIs) that emulate objects in the physical world. Steve Jobs and Apple were famous for this. The concept of a file folder is a classic natural paradigm for users.

    An artificial paradigm is where the problem is expressed, at least in part, in terms of abstract, domain specific entities. A C.S. degree requires learning a lot of artificial computer science paradigms. Mathematicians, scientists, and engineers have their own artificial paradigms. An example of an artificial paradigm shift is mapping data from a hierarchical file structure to a relational database. In this case from an artificial paradigm to a different artificial paradigm.

    Perhaps the archetypal example of an artificial GUI paradigm is the QWERTY keyboard. Before the typewriter, no one would have a clue what a keyboard was for. After the typewriter--see next.

    A synthetic paradigm is where you take a common natural paradigm or concept and mix it with an artificial paradigm. A good example is a spreadsheet. Naturally, it represents a sheet of paper with rows and columns where you can write numbers. The concept is fundamental to all accounting. But artificially, we add the ability to put live mathematical formulas and scripts in the cells. Something we can't do in nature, but a snap in the abstract computer world. It changes everything about what spreadsheets can do. The way accountants did their jobs changed in a fundamental way with the invention of spreadsheets.
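The "live formula" idea is small enough to sketch in Python. This toy is my own illustration and not how any real spreadsheet is implemented: a cell holds either a plain number or a formula (here, a function of the sheet) that is re-evaluated each time the cell is read. There is no caching and no protection against circular references.

```python
class Sheet:
    """A toy sheet of named cells holding numbers or live formulas."""

    def __init__(self):
        self._cells = {}

    def set(self, name, content):
        """Store a number, or a formula given as a function of the sheet."""
        self._cells[name] = content

    def get(self, name):
        """Read a cell; formulas are evaluated at read time."""
        content = self._cells.get(name, 0)
        return content(self) if callable(content) else content


sheet = Sheet()
sheet.set("A1", 100)   # the "paper" part: numbers written in cells
sheet.set("A2", 250)
sheet.set("A3", lambda s: s.get("A1") + s.get("A2"))  # the "live" part

print(sheet.get("A3"))
sheet.set("A1", 150)   # change an input and the total follows
print(sheet.get("A3"))
```

The grid of numbers is the natural half of the paradigm. The formula in A3 that keeps itself up to date is the artificial half.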

    The most popular GUI oriented synthetic paradigm is the mouse. The mouse has artificial components such as multi-clickable buttons and a clickable, rotatable wheel. But it is also a natural extension of hand pointing.

    What's the takeaway from all this? If you want to write software that changes the way things are accomplished in a fundamental way, invent a new synthetic paradigm for a user domain and present it in an API or GUI.

    NASA Science


    Big science is making big announcements (for example, see here) about the Alpha Magnetic Spectrometer (AMS) device attached to the International Space Station.

    Another type of signature for dark matter has been found, in addition to the gravity signatures. There is now good data that suggests dark matter particles collide with each other and that they produce "ordinary" collision decay particles. So even though dark matter does not interact with light, these decay particles do and that is what we are seeing in the AMS.

    OK. So to continue to be skeptical of dark matter, I must now explain data involving a whole other type of phenomenon in addition to coming up with a better explanation of the dark matter gravity observations.

    Dark matter just became more likely to actually exist, I think. This makes the AMS device a great bit of science.

    But do I think the $2 billion spent on the device was worth that kind of information? Off the top of my head, that's about $5 for every person in the USA.

    Truthfully, I can't say. I think it doubtful that a consensus argument can be made that society as a whole (each person, on average) benefits from such an experiment to the tune of $2 billion ($5 each). How can the numbers possibly work out? How can we come to a consensus on a dollar amount to assign to the results of this experiment?

    And that applies to any dollar amount. What if the next dark matter experiment will cost $20 billion, $200 billion, ...? Is there any way we can avoid being arbitrary, capricious and whimsical in our spending on science? To have an engineering discipline in our spending on science?

    This may be a more difficult question than the one on dark matter.