Can Scientific/Engineering Code Be Validated? Part 2

This is a continuation of my previous post. There I noted that I interpret software validation more broadly than Roache does: I believe it can be applied to embodied code as well as to documented theory. Here I continue to examine how Roache's interpretation of validation may differ somewhat from my own.

In Appendix B, Roache starts off with a commonly used definition for validation:
Validation: The process of determining the degree to which a model {and its associated data} is an accurate representation of the real world from the perspective of the intended uses of the model.
Roache sees three main issues with how people interpret this definition. One issue is with the phrase "intended uses." His recommendation is:
Intended use, at least in its specific sense, is not required for validation. The common validation definition could be salvaged by re-defining intended use to include very general intentions, but frankly this appears to be a hollow exercise. The fact is that a useful validation exercise does not necessarily require an intended use, specific or general.
This recommendation is developed with arguments such as:
Clearly, much of the confusion is the result of trying to use the same word for different needs. Project oriented engineers are more concerned with specific applications, and naturally tend to rank acceptability within validation (which term is used more often than accreditation or certification). Research engineers and scientists tend to take a broader view, and often would prefer to use validation to encompass only the assessment of accuracy level, rather than to make decisions about whether that level is adequate for unspecified future uses. It is also significant to recognize that these project-specific requirements on accuracy are often ephemeral, so it is difficult to see a rationale for a priori rigid specifications of validation requirements [5,11] when the criteria so often can be re-negotiated if the initial evaluation fails narrowly.
And:
The requirement for "intended use" sounds good at first, but it fails upon closer thought. Did D. C. Wilcox [13] need to have an "intended use" in mind when he evaluated the k-w RANS turbulence models for adverse pressure gradient flows? He may very well have had uses in mind, but does a modeler need to have the same use in mind two decades later? If not, must the validation comparison be repeated? Certainly not.
But who would want to repeat it?

Validation is subjective. (As Roache puts it, ephemeral.) So it logically must be performed from some perspective. Whose perspective? That of the software's stakeholders. But unless usage is predefined, not all of the software's potential stakeholders have been identified. How can their potentially differing priors be ignored?

Roache evidently believes validation can be made objective. That acceptability, accreditation, and certification can be separated out from validation. That the degree to which a model is an accurate representation of the real world can be decided upon by some abstract, objective algorithm. No Bayesian priors required.

But I could not disagree more. So I highly recommend reading Roache for a viewpoint different from my own.
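
To make the role of priors concrete, here is a minimal sketch, entirely my own illustration rather than anything from Roache: the same validation evidence, pushed through Bayes' rule, supports different conclusions about adequacy depending on which stakeholder's prior you start from. The hypothesis grid, the tolerance-band "hit" counts, the two priors, and the 0.9 acceptance criterion are all invented for the example.

```python
# A toy Bayesian reading of a validation campaign. All numbers are
# hypothetical; the point is only that the verdict depends on the prior.

# Hypotheses: the model's true probability of predicting within tolerance.
hypotheses = [0.5, 0.7, 0.9, 0.99]

# Validation evidence: 18 of 20 comparisons with experiment fell inside
# the tolerance band (made-up counts).
n_trials, n_hits = 20, 18

def likelihood(p, hits, trials):
    """Binomial likelihood of the observed hit count, up to a constant."""
    return p**hits * (1.0 - p)**(trials - hits)

def posterior(prior):
    """Bayes' rule on the hypothesis grid: normalize prior * likelihood."""
    weights = [prior[p] * likelihood(p, n_hits, n_trials) for p in hypotheses]
    total = sum(weights)
    return {p: w / total for p, w in zip(hypotheses, weights)}

# Two hypothetical stakeholders with different priors: a fairly agnostic
# research engineer and a safety reviewer skeptical of high-accuracy claims.
priors = {
    "research engineer": {0.5: 0.25, 0.7: 0.25, 0.9: 0.25, 0.99: 0.25},
    "safety reviewer":   {0.5: 0.40, 0.7: 0.40, 0.9: 0.15, 0.99: 0.05},
}

for who, prior in priors.items():
    post = posterior(prior)
    # "Adequate" here is an arbitrary criterion: P(within-tolerance rate >= 0.9).
    adequate = sum(q for p, q in post.items() if p >= 0.9)
    print(f"{who}: P(rate >= 0.9 | data) = {adequate:.2f}")
```

Same evidence, different posteriors, and so potentially different adequacy decisions. That, in a nutshell, is why I do not think the judgment of "accurate representation of the real world" can be made prior-free.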

3 comments:

  1. That the degree to which a model is an accurate representation of the real world can be decided upon by some abstract, objective algorithm. No Bayesian priors required.
    If we do a good job documenting the likelihoods, then we can always bring our priors to the party after the fact.

  2. True, but then how do we decide on just how good a job of documenting the likelihoods we need to perform if we don't know the priors? Validation is self-referential. (A key reason it is so hard.)

    Not all model validations require the same rigor. Validating a model for use with nuclear power steam piping has one standard of rigor, validation for lawn sprinkler piping another.

    The concept of "grading" SQA requirements, including V&V requirements, is very common and a practical necessity. For example, consider DOE G 414.1-4. This safety software guide is of the opinion:

    From the foregoing [identification of issues impacting safety software] it can be seen that there are several interdependencies and tradeoffs that should be addressed when integrating software into safety systems. The necessity for robust software quality engineering processes is obvious when safety software applications are required. However, just ensuring that a "good" software engineering process or that V&V activities exist is not sufficient alone to produce safe and reliable software.

  3. I think I could be more direct. IMHO, scientific/engineering software validation is a process that should mimic the scientific method itself. It should be iterative, depend on real-world data, and be self-correcting. Its results are also Bayesian beliefs. And so the term validation is not a property of the software itself, but a description of people's beliefs about the software. (A rough sketch of what this might look like follows below.)

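Following up on this exchange, here is a minimal sketch, under assumptions of my own, of the two Bayesian points made above: if each validation campaign's likelihood is documented, any stakeholder can bring their prior to the party after the fact, and the whole exercise becomes an iterative, self-correcting update of belief as real-world data accumulates. The Beta-Binomial model, the campaign counts, and both priors are hypothetical.

```python
# Each campaign is documented only by its likelihood summary:
# (comparisons within tolerance, comparisons outside tolerance).
campaigns = [(8, 2), (9, 1), (17, 3)]   # hypothetical data

def update(alpha, beta, hits, misses):
    """Conjugate Beta-Binomial update of belief in the within-tolerance rate."""
    return alpha + hits, beta + misses

# Iterative path: start from one stakeholder's (hypothetical) prior and fold
# in each campaign as it arrives: the self-correcting loop.
alpha, beta = 2.0, 2.0
for hits, misses in campaigns:
    alpha, beta = update(alpha, beta, hits, misses)
    print(f"after campaign: posterior mean = {alpha / (alpha + beta):.3f}")

# After-the-fact path: a different stakeholder arrives years later with a
# different (optimistic, hypothetical) prior and applies it to the pooled,
# documented likelihoods. No validation comparison needs to be repeated.
total_hits = sum(h for h, _ in campaigns)
total_misses = sum(m for _, m in campaigns)
a2, b2 = update(10.0, 1.0, total_hits, total_misses)
print(f"late stakeholder: posterior mean = {a2 / (a2 + b2):.3f}")
```

Either way, "validation" here names a state of belief about the software given the evidence, not a property of the software itself.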