The limits of discoveries (Carlo Ghezzi)


The results of scientific discoveries are
seldom expressed as absolute truths. Absolute truths may be the result of a purely
theoretical research which proves a statement (that is, a theorem) in a given mathematical
setting. In practice, as we observed in earlier lectures,
research is an iterative process, through which we validate and refine the research
results, until they reach an acceptable degree of support from experiments. At each stage of this iterative process, research results are an approximation of the “real” results. In this lecture we discuss how approximation
can be assessed and controlled in the research process. We can frame this problem in a logical setting,
through the notions of soundness and completeness. When a researcher studies an unknown phenomenon,
he or she comes out with hypotheses, which constitute a theory that tries to explain
the phenomenon. The theory is sound if all the potential predictions
made by the theory are actually observable for the real phenomenon. As the picture shows, a sound theory only
allows truths to be deduced, but not necessarily all truths. A theory is complete if it can predict all
actually observable manifestations. As the picture shows, a complete theory allows
all truths to be deduced, but also possibly non-true things. Ideally, the theory should be both sound and
complete. The deducible truths from a theory that are
not real truths are called false positives. The real truths that are not deducible from
a theory are called false negatives. Of course deducible truths that are real truths
are true positives. For example, let us go back to the simple
culinary theory we defined in an earlier lecture “All Italians like pasta”. The theory is complete, since it can predict
all Italians who like pasta, but it is obviously unsound. For example, it can be applied to a newborn
Italian to deduce that the infant likes pasta. In this example, infants constitute false
positives. A refinement of the theory like “All Italians
more than 18 years old like pasta” is also unsound, since some existing adult Italians
(who are also false positives) may dislike pasta. Let us consider other kinds of research, different
from the examples we discussed so far, which were all trying to explain real-world phenomena. Suppose that the objective of our is to solve
a novel problem, or to find a better solution to an existing problem for which a solution
is already known. Typically, in Informatics the focus may be
on the invention of a new, clever algorithms for a new problem, or on a new algorithm that
improves efficiency of known algorithms in terms of computation time or consumed storage. The concepts of soundness and completeness
equally apply to problem solutions, and algorithms in particular. A sound algorithm never includes a wrong answer,
but it might miss a few right answers. A complete algorithm never misses a right
answer: it provides the complete set of right answers, but it might include a few wrong
answers. A sound algorithm is conservative (pessimistic). Its dogma is to remain on safe grounds to
get assurance on its results. On the opposite, a complete algorithm is liberal
and optimistic. It is open to more results than may turn out
to be invalid in an attempt to include all results. This is described by this figure: A false negative is a right answer that is
not recognized as such by the algorithm. Likewise, a false positive is an answer provided
by an algorithm that turns out to be wrong. An exact solution is both sound and complete. However, in many cases it may be impossible
or impractical di find an exact solution. We need to accept, and we may be perfectly
happy with, approximate solutions. Certain algorithms are neither complete nor
sound, and thus exhibit both false positives and false negative. The number of false positives and false negatives
is an indication of the inaccuracy of an algorithm. Approximate algorithms are very common in
Informatics. A very important reason is that there are
many problems that are undecidable. An undecidable problem is a decision problem
for which it is impossible to construct an algorithm that always leads to a correct (yes/no)
answer. Undecidable problems are ubiquitous. In particular, most problems concerning program
analysis are undecidable. Program analysis refers to analyzing computer
programs to verify certain properties that assure absence of bugs in the program, or
to exclude certain erroneous conditions. A very simple property may be: every statement
in the program is executable, i.e, it is reachable in some computation. A statement that is never executed is useless:
it constitutes dead code. Although well-written program should contain
dead code, per se it does not indicate an error in the algorithm. However, it is very often the symptom that
there is an error in the program. An algorithm that produces the list of all
unreachable statements would be quite useful in program debugging. Unfortunately, this problem is undecidable,
and an exact algorithm does not exist. Program analysis research has analyzed this
problem and produced approximate solutions. A sound solution produces a list of statement
that are guaranteed to be unreachable (although there can be more). A complete solution would produce a list of
statements that includes all unreachable statements, although they might instead be reachable. Notice that an algorithm that always produces
an empty set of statements would be sound, while an algorithm that always produces all
statements would be complete. However, in these two cases the number of
false negatives and false positives, respectively, would be unacceptable! How can we reason about the adequacy of an
approximate research result? Two notions are commonly used for this purpose:
precision and recall. Precision is defined as the ratio between
tp and tp plus fp. In the case of an approximate algorithm, it
is the fraction of correct answers among the computed answers. Ideally, it should be 1. Completeness is instead defined as the ratio
between tp and tp plus fn. In the case of an approximate algorithm, it is the fraction
of the computed answers over the total number of correct answers.

Leave a Reply

Your email address will not be published. Required fields are marked *