The slippery slope – Critical perspectives on in vitro research methodologies

Abstract

Objectives

This paper attempts to provide critical perspectives on common in vitro research methodologies, including shear bond testing, wear testing, and load-to-failure tests. Origins of interest in high-quality laboratory data are reviewed, in vitro data is categorized into property and simulation protocols, and two approaches are suggested for establishing clinical validity. It is hoped that these insights will encourage further progress toward development of in vitro tests that are validated against clinical performance and/or that reproduce clinically validated failure or damage mechanisms.

Materials and methods

Published shear and tensile bond data (macro and micro) is examined in light of published finite element analyses (FEA). This data is subjected to a Weibull scaling analysis to ascertain whether scaling is consistent with failure from the bonded interface or not. Wear test results are presented in light of the damage mechanism(s) operating. Quantitative wear data is re-examined as being dependent upon contact pressure. Load-to-failure test results are re-analyzed by calculating contact stresses at failure for 119 tests from 54 publications over more than 25 years.

Results

FEA analyses and reported failure modes (adhesive, mixed, cohesive) are consistent with failure not involving interfacial “shear stresses” as calculated in published work. Weibull scaling clearly suggests failure involving external surfaces of specimens, not interfacial origins. Contact stresses (pressures) are clearly an important variable in wear testing and are not well-controlled in published work. Load-to-failure tests create damage not seen clinically due to excessively high contact stresses. Most contact stresses in the 119 tests examined were calculated to be between 1000 MPa and 5000 MPa, whereas clinical contact stresses at wear facets have been measured not to exceed 40 MPa.

Conclusions

Our community can do a much better job of designing in vitro tests that more closely simulate clinical conditions, especially when contact is involved. Journals are encouraged to thoughtfully consider a ban on publishing papers using bond tests and load-to-failure methods that are seriously flawed and have no clinical relevance.

Introduction

In vitro tests can broadly be categorized into two main groups: (1) physical property measurements; and, (2) simulations of clinical behavior. Into the first group fall standardized tests of strength, fracture toughness, hardness, thermal expansion and the like using non-dental specimens. Simulations attempt to create conditions of physical and chemical challenge that are believed to be encountered orally including loading (monotonic or cyclic), thermal variations, and wear often using clinically realistic specimens. Clinicians are far more likely to encounter product comparisons based on in vitro testing than those based upon clinical studies. Further, many laboratory and clinical technique variables will only ever be studied using in vitro tests, such as comparisons among commercial restorative materials, cement choices, sandblasting, etching, chemical treatments and the thickness ratios of bilayered systems.

Origins of interest in quality in vitro data

In 1918 the U.S. Army awarded the National Bureau of Standards (NBS) a contract to examine the setting characteristics and physical properties (including setting expansion and compressive strength) of twelve samples of amalgam. These products all had numerous claims of superiority and endorsements printed on their cartons without any numerical data in support. In turn, the Army was given evidence-based recommendations by the NBS as to which amalgams should give good service. This project highlighted the need for a specification for dental amalgam and a study was initiated to investigate this, modeled after the American Medical Association’s Pharmacopeia specifications. Follow-on work included the publication of an extensive 32-page report on the physical properties of dental materials by which honest comparisons and choices could begin to be made. Needless to say, the “opinion leaders” of the time whose support was behind inferior products were not pleased: “Deans, prominent lecturers, dentists whose personal endorsements were ignored in the specifications were in a clouded area”.

On the other hand, the specification highlighted serious concern regarding the lack of standardization in testing methods and the incomparability of results among laboratories. Requests for lectures and clinics began to pile up, as did requests to extend the program to examine gold alloys, cements and accessory materials. One remarkable businessman/dentist, Dr. Louis J. Weinstein, stepped forward and sponsored six years of research (1922–1928) under the Bureau Research Associate plan whereby a qualified scientist could work within the NBS on problems of public interest. Dr. Weinstein’s request was simple: “I want to see data on dental gold alloys and accessories that will stand up when presented before schools and private groups and not be confused with data in present day texts and glaring advertisements”. It was under this sponsorship that precision investment casting was optimized, including the development of dedicated investments and the use of an asbestos-containing tape to line the casting ring. The first comprehensive report from this research was given at a Dallas meeting of the American Dental Association, touching off a furious effort by vested interests to squelch such activity: “These interests wanted to crush the research. They appealed to the Secretary of Commerce, insisting that the reports were creating confusion among dentists and in the schools and requested that the work be stopped”. The Director of the NBS backed the dental materials work as a program having health-saving importance and by 1927 the Bureau and the American Dental Association (ADA) had formalized a cooperative research program that continues to this day, leading to both the ADA Specification Program and later the International Organization for Standardization (ISO) Technical Committee 106 (Dentistry).

Basic requirements for clinical validity

In 1969, Björn Hedegård outlined the following goal: “With sound clinical research on a larger and more penetrating scale, data and information may be obtained, that will make it possible to set up more meaningful test procedures in the laboratory. And that is the goal: to be able to characterize the dental material in the laboratory and correctly predict its clinical performance”. How much closer are we to that goal today than in 1969? Certainly far more clinical studies are now performed, with greater interest in them and an emphasis on expanding the breadth and depth of our clinical database. As will be discussed, there is increasing opportunity to compare property data with clinical behavior. There are also many more discussions and presentations questioning the meaning of test methodologies. This paper aims to add to and inform that discussion.

Validation of in vitro results as having clinical meaning

Use of physical property measures and in vitro simulation outcomes in predicting clinical behavior would seem to require one of two basic “reality checks”: (1) reproducing clinically documented damage accumulation or failure mechanisms; and/or, (2) validation in predicting clinical behavior (e.g., rank ordering performance of commercial systems). As will be repeatedly reinforced, this rules out assumed but not validated failure and damage mechanisms, irrespective of how good they feel at the gut level. Producing clinically documented damage or failure requires creating an equivalent stress state acting on realistic flaws. This further implies that the physics of the situation (often a contact problem) are well understood. For example, if failure is due to shear stresses, test methods should produce shear and the magnitude of these stresses should be calculated correctly. Having realistic flaws implies that all processing-related laboratory or clinical defects should be contained within test specimens at appropriate sizes and distributions. Validation of clinical behavior means either (1) identifying or modeling clinical mechanism(s) of damage or failure or (2) correctly rank-ordering the clinical performance of a number of commercial systems from clinical trial data. As will be shown, it is likely not helpful to develop or utilize methodologies built upon assumed mechanisms – this is the road down the slippery slope.

Shear bond test methods

Three characteristics strongly compromise traditional “shear bond” testing methods: (1) shear stresses are not calculated correctly; (2) most failure does not originate from or involve the interface; and, (3) Weibull scaling analysis of macro-to-micro test data is more consistent with failure from external surfaces in tension than from interfacial surfaces in shear. Braga et al., building on previous work by DeHoff et al., clearly demonstrate that traditional shear protocols incorrectly calculate shear stresses at failure, vastly underestimating them and failing to consider their non-uniform distribution. Braga et al. provide additional compelling evidence for the ubiquity of non-interfacial failure. Scherrer et al. add to the evidence for non-interfacial failure as the majority mode and provide data that, when analyzed (as below), are consistent with surface failure rather than interfacial failure as tests are scaled down from macro-sized to micro-sized.

DeHoff et al. examined two common loading protocols by three-dimensional finite element analysis (3D FEA): (1) loading with a knife edge and (2) loading with a wire loop. In both cases simply dividing the failure load by the bonded area grossly underestimated the true interfacial stresses. Braga et al. investigated three loading protocols for composite-dentin bonding, again with 3D FEA: (1) knife edge; (2) wire loop; and, (3) flat rod. Where the nominal shear stress (load/area) was 16 MPa, the FEA calculated stresses ranged from 105 MPa (shear) to 159 MPa (tension). Both tensile and shear stresses were highly localized, most for knife edge loading and least for wire loop (Fig. 1). Most importantly, tensile stresses were higher than shear, suggesting that “shear” testing actually involves failure initiation due to tensile stresses. Even for FEA simulating pure tensile loading, tensile stresses were higher than “nominal” due to non-uniform distributions at the exterior of the interface.
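The gap between the nominal and FEA stresses quoted above can be made concrete with a short calculation. The sketch below is illustrative only: the stress values (16, 105 and 159 MPa) are those reported in the text, while the specimen radius and the helper name `nominal_shear_stress` are hypothetical choices made for the example, not values from the cited studies.

```python
import math


def nominal_shear_stress(load_n: float, radius_mm: float) -> float:
    """Nominal "shear bond strength" in MPa: failure load (N) / bonded area (mm^2)."""
    bonded_area_mm2 = math.pi * radius_mm ** 2
    return load_n / bonded_area_mm2


# Values quoted in the text (Braga et al. FEA results).
NOMINAL_MPA = 16.0       # load / area
FEA_SHEAR_MPA = 105.0    # peak shear stress from 3D FEA
FEA_TENSILE_MPA = 159.0  # peak tensile stress from 3D FEA

# How far the nominal value underestimates the local peak stresses.
shear_factor = FEA_SHEAR_MPA / NOMINAL_MPA
tensile_factor = FEA_TENSILE_MPA / NOMINAL_MPA

# Hypothetical specimen (assumption): radius chosen only for illustration.
radius_mm = 1.5
load_n = NOMINAL_MPA * math.pi * radius_mm ** 2  # load giving 16 MPa nominal

if __name__ == "__main__":
    print(f"failure load for 16 MPa nominal: {load_n:.1f} N")
    print(f"nominal stress check: {nominal_shear_stress(load_n, radius_mm):.1f} MPa")
    print(f"peak shear / nominal:   {shear_factor:.2f}x")
    print(f"peak tensile / nominal: {tensile_factor:.2f}x")
```

On these figures the conventional load/area calculation understates the peak shear stress by roughly a factor of 6.5 and the peak tensile stress by roughly a factor of 10, which is the sense in which the nominal value is said to grossly underestimate the true stresses.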

Fig. 1
3D FEA tensile and shear stresses in dentin under “shear” loading by knife edge (left), square bar (center) and wire loop (right).
From Braga et al. with permission.

Neither DeHoff et al. nor Braga et al. calculated surface stresses on the exterior of the composite near or beneath the point of applied load. The authors’ expectation is for failure to occur often from such an external site due to high contact stresses. Braga et al. reported on failure mode data from 37 studies published between 2007 and 2009 looking at either shear or tensile testing of either dentin or enamel bonds. In these studies “cohesive” or “mixed” failure modes were reported for 40% to 70% of specimens. Cohesive failure implies that neither the failure initiation site nor the propagating crack involved the interface. Other than showing that the parts stayed bonded, “cohesive” events should arguably not be counted as having investigated the interface in any meaningful fashion. Similar arguments can be made for “mixed” failure modes, where significant portions of the crack path have not involved the interface and the failure origin has not been fractographically established. Thus, the authors’ opinion is that “shear bond” testing is fundamentally too flawed to be considered meaningful from an interfacial fracture mechanics view.

Cases of “adhesive” failure also need careful examination since cracks initiating from surface tensile stresses away from the interface can run to and then along the interface for reasons not related to interfacial “bonding”. Complex crack behavior due to mixed stress intensity modes (K I peel and K II shear), changes in the lever arm during failure, the influence of residual stresses, and elastic/plastic properties across the interface may all cause the crack to follow or turn away from the interface for reasons unrelated to the “bond”, leading to meaningless attributions such as the percent “adhesive” or “cohesive” failure. The clinical message from all the above is reflected in the nearly complete lack of correlation between bond strength data and class V lesion retention rates. This message is further strengthened by the observation that glass ionomer cements clinically out-perform any other dentin bonding material in class V lesion retention, despite their lack of measurable “bond strength”.

It has been observed that composite-dentin “bond strengths” measured in micro tests are generally higher than for the same systems measured in macro tests. Since there is nothing inherently different in the resin-based composite at the specimen sizes used, an increase in measured “strength” with decreasing surface area or volume at risk likely follows Weibull scaling. Weibull scaling (defined below) results from the statistical probability of finding a flaw decreasing as the test specimen gets smaller. Weibull scaling explains, for example, why three-point bend bars have higher measured “strengths” than four-point bend bars. The availability of macro and micro data for the same bonding systems opens the possibility to semi-quantitatively examine whether shear test failures originate from the bonded interface or elsewhere. Since a considerable amount of data exists for both macro and micro tests, Weibull scaling can be applied to see whether measured “strength” increases are consistent with scaling down of the bonded surface area, the external surface or the specimen volume. This analysis assumes that failure can originate from flaws at one of three sites: (1) the bonded interface; (2) the external surface of the specimen; or, (3) a volume flaw. Assuming a standard test geometry involving a cylinder (5 mm in height) of resin-composite bonded to a flat dentin surface, the bonded surface area ( A ), external surface area ( S ) and volume ( V ) will be given by:

$$A = \pi r^{2}$$
$$S = 2 \pi r h$$
$$V = \pi r^{2} h$$

It can be seen that, for a fixed height h, the bonded surface area and the cylinder volume both scale with $r^{2}$ while the external surface scales with $r$ (where $r$ is the contact or specimen radius). Weibull scaling is a relationship that describes the decreased likelihood of finding a strength-limiting flaw, and hence the increase in measured strength, as either the surface area or volume under stress is decreased, expressed as a strength ratio:

$$\frac{\sigma_{\text{micro}}}{\sigma_{\text{macro}}} = \left( \frac{A_{\text{macro}}}{A_{\text{micro}}} \right)^{1/m}$$

where σ = failure stress, A = surface area or volume under tensile stress and m = Weibull modulus.

Using data for six different bonding systems from Table 1 in Scherrer et al., “strength” increases were calculated for micro-shear versus macro-shear and micro-tensile versus macro-tensile. These calculated strength increases are shown in Fig. 2. Mean increases in measured strengths were 1.77× for micro-shear/macro-shear and 3.07× for micro-tensile/macro-tensile. Fig. 3 evaluates the calculated Weibull scaling factor for bonded area, external surface and volume scaled by r for Weibull moduli varying from 3 to 10. Overlaid in Fig. 3 are lines designating the mean strength ratios for tensile and shear tests from Fig. 2. This analysis suggests a Weibull modulus of 5 for failure of these handmade composite specimens. Fig. 4 then investigates the Weibull strength ratio increases as the bonded area radius is scaled down from 3.0 mm to 0.25 mm, keeping the Weibull modulus at 5. Again, by overlaying the mean strength ratios for tensile and shear tests, it can be appreciated that shear testing is consistent with failure involving external surface flaws and does not scale appropriately with bonded area flaws. Such indirect evidence developed from published data strongly reinforces the arguments above that shear testing involves failure from surface stresses, likely tensile contact stress, on the loaded surface of cylinder specimens. On the other hand, tensile tests scale appropriately with either interface flaws or volume flaws. Reference to Fig. 6 in Braga et al. reveals that combinations of “cohesive” and “mixed” failures were the mode in approximately 70% of the 37 studies examined. Failure from either volume flaws or external surface flaws is consistent with both this report from Braga et al. and the Weibull scaling analysis above.
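The scaling argument can be reproduced numerically. The following is a minimal sketch, not the authors' actual analysis: it assumes the cylinder height stays at 5 mm while the radius shrinks from 3.0 mm to 0.25 mm (the range described for Fig. 4), uses a Weibull modulus of 5, and applies the strength-ratio relationship to each candidate flaw site. The function and variable names are illustrative.

```python
import math


def weibull_strength_ratio(size_macro: float, size_micro: float, m: float) -> float:
    """sigma_micro / sigma_macro = (size_macro / size_micro) ** (1 / m)."""
    return (size_macro / size_micro) ** (1.0 / m)


def cylinder_sizes(r_mm: float, h_mm: float = 5.0) -> dict:
    """Area or volume at risk for each candidate failure site of a bonded cylinder."""
    return {
        "bonded_area": math.pi * r_mm ** 2,           # A = pi r^2
        "external_surface": 2 * math.pi * r_mm * h_mm,  # S = 2 pi r h
        "volume": math.pi * r_mm ** 2 * h_mm,         # V = pi r^2 h
    }


def scaling_ratios(r_macro: float, r_micro: float, m: float, h_mm: float = 5.0) -> dict:
    """Predicted micro/macro strength ratio for each flaw site (height held fixed)."""
    macro = cylinder_sizes(r_macro, h_mm)
    micro = cylinder_sizes(r_micro, h_mm)
    return {site: weibull_strength_ratio(macro[site], micro[site], m) for site in macro}


# Radii and modulus as described for Fig. 4; fixed height is an assumption.
ratios = scaling_ratios(r_macro=3.0, r_micro=0.25, m=5.0)

# Mean observed ratios reported in the text (from Fig. 2).
OBSERVED = {"shear": 1.77, "tensile": 3.07}

if __name__ == "__main__":
    for site, ratio in ratios.items():
        print(f"{site:17s} predicted ratio = {ratio:.2f}")
    for test, observed in OBSERVED.items():
        nearest = min(ratios, key=lambda s: abs(ratios[s] - observed))
        print(f"{test:7s} observed {observed:.2f} -> nearest site: {nearest}")
```

Under these assumptions the external surface gives a predicted ratio of about 1.64 and the bonded area and volume both give about 2.70. The observed shear ratio (1.77) lies nearest the external-surface prediction and the observed tensile ratio (3.07) nearest the area/volume prediction, in line with the interpretation in the text. Because area and volume scale identically when the height is fixed, this calculation cannot distinguish interface flaws from volume flaws.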

Table 1
Examples of materials (and material systems) categorized by type and duration of function. Such a classification is proposed as an aid in assessing the likelihood that physical property and laboratory simulation data might correlate with clinical handling or clinical performance (highest for Class A, lowest for Class E).

| Class | Characteristics | Examples |
|---|---|---|
| A | Short duration | Impression materials |
| | Extra-oral (or predominantly extra-oral) | Bite registration materials |
| | Simple, specific function | Gypsum products |
| B | Moderate duration | Orthodontic brackets, wires and elastics |
| | Intra-oral | Endodontic instruments |
| | Simple function | Dental burs |
| | | Restorative armamentarium (wedges, bands, light curing equipment) |
| C | Long-term duration | Basing materials |
| | Intra-oral | Cements |
| | Simple function (esthetic and/or space obdurate only; limited or no structural function) | Endodontic sealers and canal filling materials |
| | | Simple intra-coronal restorations (amalgam or resin-based composite) |
| | | Veneers – minimal preparation (ceramic or resin-based composite) |
| D | Long-term duration | Stress-bearing intra-coronal/extra-coronal restorations (amalgam or resin-based composite) |
| | Intra-oral | Veneers – aggressive preparation (ceramic or resin-based composite) |
| | Complex function; Single material | Endodontic posts |
| | | Denture bases |
| | | Denture teeth |
| | | Maxillofacial prostheses |
| | | Implants |
| E | Long-term duration | Fixed partial dentures (metal–ceramic, ceramic–ceramic) |
| | Intra-oral | Implant with ceramic abutment |
| | Complex function; Multiple materials (material systems) | |