This past August, most of the major news outlets reported the seemingly sad news that Lucy fell from a tree and plunged 40 feet to her death. This was news because of a study conducted by researchers in Texas and reported in Nature. You see, Lucy lived more than 3 million years ago and is considered a forerunner of modern humans. The discovery of fragments of her skeleton in Ethiopia in 1974 by Donald Johanson, a paleoanthropologist, established that Lucy was a young adult, 3.5 feet tall and weighing 60 lb, and that she walked upright on the ground and also climbed trees; this was all before the development of a large brain. The study published in Nature by Kappelman et al used computed tomography scans to examine the bone fragments (about 40% of a whole skeleton), and the authors concluded that Lucy probably fell from a tall tree and sustained multiple fractures and damage to her internal organs. Moreover, they suggested that “she likely died quickly” and that they do not “think she suffered.”
Donald Johanson disagreed with their findings. He attributed the breaks and cracks in the bones to weathering and the fossilization process. He is quoted as saying, “My reticence about the [study] is that in some ways it is a narrative, a just-so story…something that you can’t verify and you can’t falsify and is therefore unprovable.” And that speaks to the real subject of this editorial, which is good science, bad science, and junk science.
Many adjectives have been used to describe science. For example, the words “good,” “bad,” “incomplete,” “pathological,” “dishonest,” “fraudulent,” “wrong,” “voodoo,” “fake,” “bogus,” and “pseudo” have all been linked with “science.” Each 2-word term is intended to describe a different situation regarding the qualities, good and bad, of the science and of the scientists. For convenience, I will limit this discussion to 3 categories of scientific inquiry.
Good science
The scientific method is fairly easy to describe, but studies can be time-consuming and difficult to accomplish depending on the nature and complexity of the question that is being investigated. The steps involved include (1) formulation of a question based on observations or previous explanations, (2) consulting the literature to determine what is and what is not already known, (3) constructing a hypothesis, (4) testing the hypothesis by designing and conducting an experiment, (5) analyzing the data, (6) interpreting the data and formulating conclusions, and (7) communicating or publishing detailed aspects of the study and its results via the appropriate medium.
Although this is a simple description of one scientific trial, it must be emphasized that a single trial generally means little; most research questions require many trials before a definite consensus can be reached. Each experimental trial gives direction to the next inquiry, and the conclusions of one study essentially serve as the starting point for the next. The goal of all this effort is to find and refine the truth by correcting old knowledge, acquiring new knowledge, and integrating both into the present body of knowledge. This, in a way, is a process that defines reality.
During the conduct of research, several important stipulations must be followed. For example, the whole process must be objective (free from bias), the experiment must be conducted carefully so that it can be repeated by other scientists, and the research must be submitted to, scrutinized by, and judged by peers. It may surprise you to know that not all orthodontic journals use a peer-review system. To emphasize…peer review is important to the scientist who seeks knowledge and to the clinician who applies knowledge, but most of all to the patients who benefit from knowledge. They deserve the best, most efficient, most effective, and safest treatments available now and in the future. They do not deserve the old trial-and-error approach.
Bad science
Bad science can be considered good science in which at least one step is done poorly or wrongly, albeit accidentally, unknowingly, and unintentionally. As a result, well-intentioned but incorrect, obsolete, incomplete, or simplistic scientific ideas are explored, but nothing productive results.
For example, the hypothesis that is formulated could be untestable, as Johanson pointed out above. Melvin Moss was skilled at offering up untestable theories as well, but rather than exploring his theories, I will provide an example that I made up. It is called the Behrents theory of fat. This theory suggests that fat is neither gained nor lost by the human race; it is simply redistributed among the humans of the world. So, if you are losing weight, according to this theory, you should feel bad because you are making someone else fat. Likewise, if you are gaining weight, you should feel good because someone else is losing weight. What makes this hypothesis impossible to test is that it could be verified only by weighing everyone in the world at the same moment in time and then doing the same thing again a few months later. That’s impossible.
In the orthodontic arena, I have seen papers (either submitted or published, I’ll not say more) in which the selection of the sample or the control group determined the outcome before the research was actually conducted. For example, in one study, a sample of adolescents was selected, and the treatment results were compared with some norm values; the authors concluded that their treatment produced more growth than would be expected. Unfortunately, this effect was guaranteed because the treated subjects were all boys, whereas the control sample was a mixture of the sexes.
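To see how decisive that flaw is, consider a minimal simulation, with entirely invented growth numbers, in which a “treatment” that does nothing at all is given to boys only and then judged against mixed-sex norms; the apparent effect appears anyway.

```python
import random
from statistics import mean

random.seed(1)

def annual_growth(sex):
    # Hypothetical growth increments (mm/year): adolescent boys are
    # assumed to grow more, on average, than adolescent girls.
    avg = 3.0 if sex == "M" else 2.0
    return random.gauss(avg, 0.5)

# Treated sample: 50 boys; the "treatment" adds nothing at all.
treated = [annual_growth("M") for _ in range(50)]

# "Norm" values: 25 untreated boys plus 25 untreated girls.
norms = [annual_growth("M") for _ in range(25)] + \
        [annual_growth("F") for _ in range(25)]

print(f"treated (all boys):    {mean(treated):.2f} mm/yr")
print(f"mixed-sex norm values: {mean(norms):.2f} mm/yr")
# The all-male treated group "beats" the norm by roughly 0.5 mm/yr
# purely because the norm includes girls, not because of treatment.
```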
More recently, an investigator wanted to show that his treatment produced better facial esthetics than other treatments. To do this, he selected a bunch of posttreatment facial photographs from his files and then selected an equal number of posttreatment pictures from articles by other authors in the literature. Guess what: when the pictures were shown to some “judges,” the faces selected from his practice won.
Various papers have also been submitted in which the subjects were selectively picked on the notion that it is important to study only those in whom the treatment actually worked. Those selected usually also grew very nicely, perhaps in concert with the treatment or perhaps in spite of it. Because many practitioners have thousands of patients to choose from, they can easily pick a sample of good outcomes for a case report or a presentation and thus offer anecdotal evidence for whatever message they wish to convey. On the other hand, if practitioners really want to know what is happening in the “typical” patient, then a proper scientific inquiry using a sample numbering in the fifties, hundreds, or even thousands could be useful and important.
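A minimal sketch, again with invented numbers, shows why hand-picked cases mislead: from a large pool of patients whose outcomes are pure chance, selecting the best few still yields an impressive-looking series.

```python
import random
from statistics import mean

random.seed(2)

# 2000 patients; outcomes are pure chance on an arbitrary scale
# (mean 0, SD 1), ie, the treatment itself contributes nothing.
pool = [random.gauss(0.0, 1.0) for _ in range(2000)]

# A clinician "curates" the 10 best results for a case report.
showcase = sorted(pool, reverse=True)[:10]

print(f"mean outcome, all patients:      {mean(pool):+.2f}")
print(f"mean outcome, hand-picked cases: {mean(showcase):+.2f}")
# The curated cases sit far above average even though the outcomes
# were random; only a properly drawn sample reveals the truth.
```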
The work of Dr Alexis Carrel (1873-1944) is pertinent to the general question of proper experimentation. Dr Carrel was a French surgeon who was awarded the Nobel Prize in Physiology or Medicine in 1912. He was interested in aging, and he claimed that cells continue to grow indefinitely: ie, they are immortal; this was the dominant view in the early 20th century. To test this theory, he placed fibroblasts from an embryonic chicken heart into a flask and provided serum to the cells on a regular basis for nourishment. He maintained the cells in this state for over 20 years, longer than the lifespan of a chicken. His results were widely disseminated and drew great attention. Unfortunately, no one was able to replicate his findings. The answer came in the 1960s, when Hayflick and Moorhead found that cells undergo a limited number of replications and then die. So what was wrong with Carrel’s work? Apparently, the serum used for nourishment itself contained embryonic cells, so fresh cells were introduced into the culture each time it was fed. The experiment was flawed; the theory that cells are immortal was clearly wrong and was later replaced by a competing theory that could be replicated.
As a final example of bad science, I point to a paper by Claus et al that appeared in the literature in 2012. The title of the work was “Dental x-rays and risk of meningioma.” This was a large study in which over a thousand people with meningioma, ages 20 to 79 years, were compared with over a thousand matched controls who did not have meningioma. All participants were asked to recall the details of the dental care they had received over their lifetimes (including orthodontics) and to report the number of times they had specific dental x-rays (bitewings, full-mouth series, panorex…and even cephalograms) during 4 time periods (<10, 10-19, 20-49, and >49 years of age). Of course, they found something…that bitewings are associated with a higher risk of meningioma; panorex x-rays were also cited as a risk (but not cephalograms). Well, there is a lot wrong with this study, but I will point out only 3 big weaknesses. First, do you remember when and what dental x-rays you have had over your lifetime? I don’t, and I suspect that you don’t either; this situation causes both underreporting and overreporting. Second, bitewings were found to be a risk factor, but a full-mouth series of x-rays was not; this makes no sense, since bitewing x-rays are usually part of a full-mouth series. Finally, epidemiologic studies such as this are intended to define relationships, not to prove cause and effect; unfortunately, such studies rarely highlight that limitation, and thus they are often poorly interpreted by the lay public and the media.
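To illustrate the first weakness, here is a minimal sketch, with entirely invented recall rates, of how differential recall alone can conjure an elevated odds ratio in a case-control design, even when cases and controls had identical true exposure.

```python
import random

random.seed(3)

TRUE_EXPOSURE = 0.50   # identical true bitewing history in both groups

def reports_exposure(is_case):
    exposed = random.random() < TRUE_EXPOSURE
    if is_case:
        # Cases search their memories harder: 10% of the truly
        # unexposed "remember" x-rays they never had.
        return exposed or random.random() < 0.10
    # Controls have no reason to ruminate: 10% of the truly
    # exposed forget their x-rays.
    return exposed and random.random() >= 0.10

n = 1400   # over a thousand per group, as in the study
cases_exposed = sum(reports_exposure(True) for _ in range(n))
controls_exposed = sum(reports_exposure(False) for _ in range(n))

odds_ratio = (cases_exposed / (n - cases_exposed)) / \
             (controls_exposed / (n - controls_exposed))
print(f"reported exposure: cases {cases_exposed/n:.0%}, "
      f"controls {controls_exposed/n:.0%}")
print(f"odds ratio from recall bias alone: {odds_ratio:.2f}")
# The odds ratio comes out well above 1.0 despite identical true
# exposure: recall bias, not radiation, creates the "risk factor."
```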