Scientific validity

Scientific validity is the applicability of a conclusion drawn in the context of a scientific experiment to the world at large. Science rarely attempts to answer questions that apply only in a laboratory setting, but the scientific method often requires reducing a problem to a tightly controlled laboratory study that is far removed from the larger real-world question being asked. An important question, then, is whether the conclusions and data drawn from a tightly controlled study can be extrapolated outside the lab. The degree to which such extrapolation is possible is the scientific validity of an experiment.

The freshman psychology undergrad problem
An easy way to see this is through one of the classic problems in psychology. At most research universities that conduct psychological experiments, the primary subject pool consists of undergraduate students who receive class credit for participating. This creates a very strong systemic bias towards one particular population in almost all psychological research across the world. If there is something "special" about this sub-population that makes it fundamentally different from the population at large, then the experimental results have no general validity and can only be applied to the sub-population of "college psychology undergraduates."

A similar problem arises in infant studies. A great deal of interesting work is being done on the mental, linguistic, musical, and sensory development of infants and toddlers. Much of this research uses modern imaging and recording devices such as PET scanners and EEGs. While these devices are mostly harmless, they often look forbidding. The parents who are aware of infant studies and willing to let their babies participate are, for the most part, graduate students (which is why universities love it when grad students breed). Again, if there is something special about infants whose parents are psychology graduate students, then the applicability of the results to the general population of human infants comes into question.
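The subject-pool problem can be illustrated with a small simulation. The sketch below is not drawn from any real study: the trait, the group means (60 for undergraduates, 50 for everyone else), and the 10% undergraduate share are all invented for illustration. It compares the mean of a convenience sample drawn only from undergraduates with the mean of a random sample drawn from the whole population.

```python
import random
import statistics


def simulate_sampling_bias(seed=0, population_size=100_000, sample_size=500):
    """Compare a convenience sample of undergraduates with a random sample."""
    rng = random.Random(seed)

    # Hypothetical population: 10% undergraduates whose scores average 60,
    # 90% everyone else whose scores average 50. All numbers are made up.
    population = []
    for _ in range(population_size):
        is_undergrad = rng.random() < 0.10
        group_mean = 60.0 if is_undergrad else 50.0
        population.append((is_undergrad, rng.gauss(group_mean, 10.0)))

    all_scores = [score for _, score in population]
    undergrad_scores = [score for is_ug, score in population if is_ug]

    return {
        "true_mean": statistics.fmean(all_scores),
        # What the lab measures if it only recruits from the subject pool.
        "convenience_mean": statistics.fmean(rng.sample(undergrad_scores, sample_size)),
        # What it would measure if it could sample the whole population.
        "random_mean": statistics.fmean(rng.sample(all_scores, sample_size)),
    }


result = simulate_sampling_bias()
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the convenience sample lands near the undergraduate mean rather than the population mean, however large the sample is. Collecting more undergraduates shrinks the error bars around the wrong number.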

There are several statistical techniques that help with certain validity issues, but most such issues can only be addressed through good experimental design. The various threats to scientific validity are carefully explored both in the philosophy of science and more directly by researchers themselves.
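One example of such a statistical technique is post-stratification weighting, in which each group's sample mean is weighted by that group's known share of the target population. The sketch below is a minimal illustration, not a recommendation for any particular study; the function name, the group labels, and the population shares are all hypothetical. It also only corrects for characteristics that were actually measured, which is why it cannot substitute for good design.

```python
import math


def poststratified_mean(sample, population_shares):
    """Weight each group's sample mean by its share of the target population.

    sample: list of (group, value) pairs.
    population_shares: dict mapping group -> share of the population (sums to 1).
    """
    if not math.isclose(sum(population_shares.values()), 1.0):
        raise ValueError("population shares must sum to 1")

    values_by_group = {}
    for group, value in sample:
        values_by_group.setdefault(group, []).append(value)

    missing = set(population_shares) - set(values_by_group)
    if missing:
        # No amount of weighting can recover a group that was never sampled.
        raise ValueError(f"no observations for groups: {sorted(missing)}")

    return sum(
        share * (sum(values_by_group[group]) / len(values_by_group[group]))
        for group, share in population_shares.items()
    )


# Hypothetical sample that over-represents undergraduates (8 of 10 subjects).
sample = [("undergrad", 60.0)] * 8 + [("other", 50.0)] * 2
shares = {"undergrad": 0.1, "other": 0.9}  # assumed known population shares

naive_mean = sum(value for _, value in sample) / len(sample)
weighted_mean = poststratified_mean(sample, shares)
print(naive_mean, weighted_mean)
```

In this toy data the unweighted mean is pulled towards the over-represented undergraduates, while the weighted mean sits close to the "other" group that makes up most of the assumed population.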

Threats to scientific validity
The threats to scientific validity were laid out in the 1960s in the seminal work Experimental and Quasi-Experimental Designs for Research by Campbell and Stanley. In science, two kinds of validity matter. Internal validity refers to the degree to which the data address the hypotheses originally being tested. External validity describes the generalizability of specific research findings to phenomena outside the research project.

The goal of science is to construct theories and gather data to support or falsify them in a manner that maximizes both internal and external validity. The scientific method attempts to limit and control the number of threats to validity faced by an experiment. Careful application of statistics and good experimental design can greatly increase validity. It is this meticulous attention to design, method, and analysis that gives science the validity that pseudoscience lacks. Proponents of pseudoscience, woo, and quackery either ignore threats to validity or exploit them to generate false data to back up their crazy claims. The eight most frequently cited threats to internal validity are:
 * 1) History - the specific events occurring between the first and second measurements, in addition to the experimental variable.
 * 2) Maturation - natural changes within the participants over time (not specific to particular events), e.g., growing older, hungrier, more tired, and so on.
 * 3) Testing - the effects of taking a test upon subsequent retests.
 * 4) Instrumentation - changes in calibration of a measurement tool or changes in the observers or scorers may produce changes in the obtained measurements.
 * 5) Statistical regression - regression toward the mean, which operates where groups have been selected on the basis of their extreme scores.
 * 6) Selection biases - findings resulting from differential selection of respondents for the comparison groups.
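 * 7) Experimental mortality - differential dropout of participants from the comparison groups.
 * 8) Selection-maturation interaction, etc. - interactions between selection and the other threats, which in certain multiple-group quasi-experimental designs can be mistaken for the effect of the experimental variable.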
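Statistical regression (threat 5) is easy to demonstrate numerically. The following sketch assumes a simple made-up model in which each test score is a stable "ability" plus independent measurement noise; the means, spreads, and cutoff are arbitrary. It selects the participants with extreme scores on a first test and retests them with no treatment at all.

```python
import random
import statistics


def regression_to_mean_demo(seed=1, n=20_000, cutoff=65.0):
    """Retest the top scorers from a noisy first test, with no intervention."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n):
        ability = rng.gauss(50.0, 10.0)                 # stable true score
        first.append(ability + rng.gauss(0.0, 10.0))    # test 1 = ability + noise
        second.append(ability + rng.gauss(0.0, 10.0))   # test 2 = ability + new noise

    # Select the group on the basis of extreme first-test scores.
    selected = [i for i, score in enumerate(first) if score >= cutoff]

    return {
        "overall_mean": statistics.fmean(first),
        "selected_first_mean": statistics.fmean(first[i] for i in selected),
        "selected_second_mean": statistics.fmean(second[i] for i in selected),
    }


demo = regression_to_mean_demo()
for name, value in demo.items():
    print(f"{name}: {value:.1f}")
```

The selected group scores lower on the retest even though nothing was done to it, because part of what made its first scores extreme was noise. Without a comparison group, the same drop could be mistaken for the effect of a treatment.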

Four factors jeopardizing external validity or representativeness are:
 * 1) Reactive or interaction effect of testing, in which a pretest might increase or decrease the participants' sensitivity or responsiveness to the experimental variable.
 * 2) Interaction effects of selection biases and the experimental variable.
 * 3) Reactive effects of experimental arrangements, which would preclude generalization about the effect of the experimental variable upon persons being exposed to it in non-experimental settings.
 * 4) Multiple-treatment interference, which can occur when several treatments are applied to the same participants, because the effects of earlier treatments are not usually erasable.