Articles are sought by word and index-term searches in bibliographic databases.
Misallocation of MeSH by indexers affects how articles can be found.
Omission of important descriptors in abstracts affects how articles can be found.
Poor quality of reporting makes articles difficult to index, find and understand.
Identification- and translation-bias contribute to research waste.
To review how articles are retrieved from bibliographic databases, what article identification and translation problems have affected research, and how these problems can contribute to research waste and affect clinical practice.
This literature review sought and appraised articles regarding identification- and translation-bias in the medical and dental literature, which limit the ability of users to find research articles and to use these in practice.
Articles can be retrieved from bibliographic databases by performing a word or index-term (for example, MeSH for MEDLINE) search. Identification of articles is challenging when it is not clear which words are most relevant, and which terms have been allocated to indexing fields. Poor reporting quality of abstracts and articles has been reported across the medical literature at large. Specifically in dentistry, research regarding time-to-event survival analyses found the allocation of MeSH terms to be inconsistent and inaccurate, important words were omitted from abstracts by authors, and the quality of reporting in the body of articles was generally poor. These shortcomings mean that articles will be difficult to identify, and difficult to understand if found. Use of specialized electronic search strategies can decrease identification bias, and use of tailored reporting guidelines can decrease translation bias. Research that cannot be found, or cannot be used results in research waste, and undermines clinical practice.
Identification- and translation-bias have been shown to affect time-to-event dental articles, are likely to affect other fields of research, and are largely unrecognized by authors and evidence seekers alike. By understanding that the problems exist, solutions can be sought to improve identification and translation of our research.
Evidence is used by different people, for different reasons. They could be seeking background information to assist in new research, be completing an assignment for university, writing a lecture for colleagues, seeking information to support clinical decisions or be involved in an in-depth analysis of published data. The evidence is only helpful to the users if it can be both identified, and then understood.
Identification bias occurs when relevant articles cannot be found, and translation bias occurs when those articles that are found cannot be understood. Together, these problems contribute to research waste: when research is ignored, cannot be found, cannot be used, or is unintentionally repeated.
Researchers write articles, and seek to find evidence from articles that others have written. There is the feeling that in this age of electronic libraries and search engines, the process of cataloging the research means that all those who seek it can retrieve it, easily. Unfortunately, and possibly surprisingly, this is not actually the case.
Identification and translation bias result in avoidable research waste. A commentary in 2009 estimated that approximately 85% of research was affected by avoidable waste, and likened this to financial waste running into billions of dollars. Increasing concern regarding research waste initiated a series of articles in the Lancet in early 2014 exploring the problems underpinning research waste, options available to improve the situation, and recommendations to consider for the future. In addition to fiscal disadvantages, other authors reported that many initially promising research results did not seem to be impacting health care research or clinical practice. For example, over 95% of articles in 2005 regarding cancer prognostic markers had reported at least one significant prognostic variable, but these potentially useful findings did not appear to be either known or impacting future research.
This article aims to review:
How articles are retrieved from bibliographic databases.
What article identification and translation problems have affected medical research.
What article identification and translation problems have affected a specific area of dental research, namely time-to-event survival analyses.
How identification-bias and translation-bias can contribute to research waste and affect clinical practice.
Indexing and identification: retrieval of articles from bibliographic databases
When writing abstracts, and when searching for evidence, it is important to understand how the databases will catalog the information, and how this can then be retrieved.
Specifically, finding data about dental outcomes is not necessarily straightforward. The articles are indexed in many databases, such as MEDLINE, Embase, CINAHL, PsycLit and the Cochrane Library. The databases contain the ‘data’ from the articles, and are separate entities to the search platforms that are used to retrieve these data. Common search platforms come from providers such as OVID, PubMed and SilverPlatter. Some databases, such as MEDLINE, can be searched via multiple search platforms, including OVID and PubMed. To conduct an effective search, the seeker needs to know both what they want, and how to find it.
Articles can be sought by a free text word or index-term search. In bibliographic databases such as MEDLINE, a word search looks for the given word or phrase as used by the authors, but in the title or abstract only. The search does not access the full text of the article and so, although a word search sounds like it would be effective, if the original authors did not describe their research adequately in their title and abstract, searchers will not necessarily find the evidence they seek.
A supplementary search method uses indexing terms. These terms are allocated to the articles by indexers who have read the entire article. Different databases use different indexing terms, such as Medical Subject Headings (MeSH). These terms are selected from a controlled vocabulary by indexers at the US National Library of Medicine and allocated to all articles in the MEDLINE database. These terms have also been adopted by other databases such as the Cochrane Library, CINAHL and PsycLit as a source of thesaurus terms, enabling index searches. The use of indexing terms means that similar types of articles should be allocated similar indexing terms. It also means that indexers working on articles that were not described completely in the abstract can combat this by allocating an appropriate indexing term after reading the full article. This means that some research that would be missed by a word search can be identified by an index-term search, helping to overcome identification bias.
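The difference between the two search methods can be illustrated with a minimal sketch. The `Record` structure, the three records and both search functions below are hypothetical simplifications written for this illustration; they do not reproduce how MEDLINE or any search platform is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A hypothetical, simplified bibliographic record."""
    title: str
    abstract: str
    index_terms: set = field(default_factory=set)  # terms allocated by an indexer


def word_search(records, phrase):
    """Free-text search: looks for the phrase in the title or abstract only."""
    phrase = phrase.lower()
    return [r for r in records if phrase in (r.title + " " + r.abstract).lower()]


def index_search(records, term):
    """Index-term search: looks only at the terms allocated by the indexer."""
    term = term.lower()
    return [r for r in records if term in {t.lower() for t in r.index_terms}]


# Invented records: the first names the statistic in its abstract but lacks
# the statistical index term; the second is the reverse.
records = [
    Record("Implant outcomes after sinus augmentation",
           "Kaplan-Meier survival of implants was estimated over 5 years.",
           {"Dental Restoration Failure"}),
    Record("Clinical performance of implants",
           "Implant loss was evaluated in a retrospective follow-up study.",
           {"Dental Restoration Failure", "Kaplan-Meier Estimate"}),
    Record("Patient satisfaction with dentures",
           "A questionnaire was completed by 120 patients.",
           {"Patient Satisfaction"}),
]

by_word = word_search(records, "kaplan-meier")
by_index = index_search(records, "Kaplan-Meier Estimate")
combined = {r.title for r in by_word} | {r.title for r in by_index}
```

In this toy example each method finds a different single record, and only the combination retrieves both time-to-event articles, which is why the two methods are treated as complementary.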
Articles can also be identified by electronic full-text searches. Many documents on the internet can be searched across the full text by Google, by full text searching of the output of individual publishers, and by full text in some databases such as the Cochrane Library and PubMed Central. Full text searching is becoming increasingly available, but it is not yet common.
Identification- and translation-bias in the medical literature
Early researchers using Medlars (a precursor to today’s MEDLINE) encountered problems identifying relevant dental articles. They found MeSH were restrictive, and were tailored toward medicine.
The MeSH library continues to increase, with the list of main heading MeSH rising from approximately 20,000 terms in 2000 to more than 27,000 in 2014. These main headings can be further qualified by 83 individual subheadings, and more than 200,000 supplementary concepts. MeSH are manually allocated to articles, and therefore indexing variation is expected and disparity is not necessarily considered inaccurate. However, omission or misallocation of important terms clearly undermines search performance. Studies evaluating their use found lower consistency associated with subheadings, methodology categories (E: analysis) and those categories whose definitions were less stable (N: health care).
Errors in indexing and problems with identifying articles are not new. A well-known example relates to inaccurate indexing of randomized controlled trials (RCTs). The problem was quantified by Dickersin and colleagues in 1994, resulting in the development of the Cochrane highly sensitive search strategy (CHSSS). It aimed to retrieve RCTs from MEDLINE and has been employed to aid systematic searches, and to identify and re-index RCTs in MEDLINE with the correct indexing terms.
Systematic reviewers use a combination of free text and index terms as part of their search methods to identify articles of interest in bibliographic databases. Although other search techniques including handsearching, assessment of reference lists, contact with experts and review of conference proceedings are also used, research from 2003 found that authors of systematic reviews of therapeutic interventions found most of the high quality articles by searching one of four bibliographic databases.
Identification of articles is challenging when it is not clear which free text words are most relevant, and which index terms have been allocated.
To this end, the abstracts play a critical role: they aid identification, help searchers decide whether the article may be relevant, and might be the only source of information for those who do not or cannot find the complete article.
Research undertaken by the National Library of Medicine (NLM) showed that between 1992 and 2005, the proportion of articles with structured abstracts indexed in MEDLINE had increased from 2.5% to 20.3%, and that this trend appeared to be continuing. The authors concluded that in addition to changes made to the abstract display in 2010 to improve utility, NLM would alter metadata allowing searching of individual sections of the abstract (such as the results, or objective) by end-users. It has also been suggested that indexing of MeSH should be weighted toward the objective and results section of abstracts rather than areas such as the “background”. Given these research objectives of the NLM, it is important that authors maximize the quality of their abstracts to allow these continued changes to aid accurate identification of their research.
Methodology researchers in medicine have recognized that problems exist in the writing and interpretation of abstracts, and those involved in designing controlled trials have provided guidance specifically regarding abstracts of controlled trials written for journals and conferences, which was incorporated into the updated CONSORT explanatory document in 2010.
However, problems with abstract quality and transparency are largely unrecognized in dental circles, and the burden of the problem is unknown across medicine at large.
The quality of reporting of articles specifically in relation to statistical methods has been the subject of much commentary and research. While these findings do not shed light directly on the quality of abstracts, problems with the quality of the articles themselves could be considered a surrogate marker for the quality of abstracts.
Research has found that reporting of statistics in studies has improved over the past 40 years, following concerted efforts by a range of people to highlight problems and reporting solutions. Bland postulated that these improvements related to many factors, including the promotion of evidence-based medicine, increased prevalence of systematic reviews, education of researchers and journals in employing proper statistical methods and reporting, use of statistical referees by journals, publication of reporting guidelines, and continued editorials on reporting quality. Studies reporting results with specific study designs, such as cohort studies and controlled trials have received guidance to improve quality of reporting, and assessments have found some improvements, but that these improvements do not apply across the literature as a whole.
Over 20 years ago, Altman and colleagues raised concerns regarding the reporting quality of cancer-related time-to-event outcomes in the medical literature from 1991. Continuing concerns were identified almost 10 years later, in 2002 during an assessment of a larger sample of literature; and again in 2011. A systematic review in 2012 of outcome measures used to evaluate dental implants highlighted that reporting quality continued to be problematic, and certainly extended beyond traditional medical fields. Each of the teams reporting research in this area recommended changes to reporting approaches, but recent research in relation to dental time-to-event outcomes has highlighted that the problems remain, suggesting that the message, at least in dentistry, is not reaching the target audience.
Authors have commented on the ethical implications of poor reporting quality, stating “it is a moral duty of researchers to publish as clearly as possible” and “inadequate reporting borders on unethical practice when biased results receive false credibility”.
The publication and retrieval of evidence is important across health care, and researchers have published guidelines aimed at improving reporting transparency and quality of specific study designs such as systematic reviews (PRISMA), randomized trials (CONSORT) and observational studies (STROBE); and general guidelines, such as “Standards of Quality Improvement Reporting Excellence” (SQUIRE). Additionally, specific resources are available from the EQUATOR network and a guideline for compiling reporting guidelines has been published.
These afore-mentioned efforts have focused on the reporting of specific study designs (such as randomized trials and observational studies), and identifying articles reporting research that used a particular technique brings additional challenges. Examples may include identifying studies using a specific statistical analysis such as dental “survival” time-to-event studies, those using a specific methodology, such as fractography in dental materials or those with an unexpected outcome, such as the reporting of off-label use of pharmaceutical drugs.
Cohort and controlled trials are recognized study designs, while methodology techniques are not specific to one study design, and could be employed across a range of these. Therefore, language to describe, indexing terms to categorize, and searchers’ understanding of these types of articles is probably less well established than is the case for classic study designs.
Identification- and translation-bias in time-to-event dental research
A research program assessing how time-to-event dental research was indexed, reported and identified has been undertaken. It provided additional insight into the magnitude of the problem of identification bias in this subset of the dental literature, and offered some solutions to reduce its impact.
Treatments provided to dental patients have a finite lifespan. The treatments experience complications, require maintenance, and eventually require full revision. Patients undergoing such treatments understandably wish to know how long it may be until such adverse events occur, when they are making decisions about committing time and resources to treatment. Patients seek assistance from their clinicians to explore these risks, and clinicians in turn seek assistance from their clinical experience and published data to answer their patients’ questions.
To do this, clinicians and researchers require time-to-event, “survival” data. Time-to-event analyses have three concepts in common. An event is monitored, over time, and a combined estimated cumulative proportion is calculated. For example, a group of people who have received an implant may be monitored for several years, and the occurrence and timing of an event such as implant failure is noted. The lifespan of the implant prior to failure, the lifespan of other implants within the group that are known not to have failed, and the lifespan of implants prior to becoming lost to follow up are analyzed, and the estimated cumulative survival is calculated. Other examples relating to dental care include the fracture of a crown, the debonding of orthodontic brackets, the decay of a restorative margin, the healing of endodontic lesions or the time taken until anesthesia is established.
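The calculation described above can be sketched with the Kaplan–Meier (product-limit) estimator, one of the time-to-event methods named in this article. The function and the six implant follow-up times below are invented for illustration only and are not data from any study discussed here.

```python
def kaplan_meier(durations, events):
    """Return [(time, estimated cumulative survival)] at each failure time.

    durations: follow-up time for each implant (for example, in years)
    events:    True if the implant failed at that time, False if it was
               censored (still in function, or lost to follow-up)
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all implants that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures:
            # Multiply by the proportion of at-risk implants surviving time t.
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        # Failed and censored implants both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical data: six implants, three failures, three censored.
durations = [1, 2, 2, 3, 4, 5]
events = [True, False, True, False, True, False]

curve = kaplan_meier(durations, events)
crude_proportion = 1 - sum(events) / len(events)
```

With these invented data the estimated cumulative survival falls to about 0.83 at year 1, 0.67 at year 2 and 0.33 at year 4, whereas the crude proportion of implants that did not fail is 0.50. The difference arises because censored implants contribute follow-up time only while they are under observation.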
“Events” are transitions from one state to another. The period of time taken for these events to occur can relate to a broad range of medical and non-medical topics such as time-to disease, time-to recurrence, time-to death, time-to recovery, time-to equipment failure, time-to an earthquake, or time-to a stock market crash.
When time-to-event analyses are reported they can appear similar to a proportion, and care should be taken by researchers to ensure that they clearly report their statistical method, and care should be taken by readers to seek out this information in the articles. Therefore, when writing the abstracts and full text for those studies, authors should report the event, the time and the statistical method. Equally, when such articles are cataloged in bibliographic databases, indexers should allocate relevant indexing terms that indicate that an event was monitored, over time, and analyzed with a time-to-event statistic.
Writing an article abstract is not necessarily straightforward. Authors must convey a large amount of information in a concise manner, within a given word limit. Additionally, language complexity inevitably means that many different words can be used to describe similar concepts. This can lead to confusion regarding the type of research undertaken, and errors in identification for those seeking evidence. For example, to describe time-to-event research authors could state that they completed a survival, life table or Kaplan Meier analysis. Alternatively, they could use less specific descriptors such as success, clinical performance, clinical outcome, longevity, lifespan, failure rate, incidence of aftercare, complication rate or follow-up. These terms have different meanings in different contexts, and could describe many aspects of clinical outcomes unrelated to time-to-event “survival”. Therefore, use of these terms as search words will not have high specificity, meaning that they will retrieve many irrelevant articles.
Despite being a controlled vocabulary, variation in allocation of MeSH indexing terms to the same types of articles can also occur. Indexers can choose from a variety of terms, including survival, survival rate, survival analysis, Kaplan–Meier estimate, actuarial analysis and life tables. The first two terms relate to the meaning or “subject” of survival, and the next four terms relate to a methodology. This variation can propagate identification bias.
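Because any one of these six terms may have been allocated, a searcher relying on index terms would need to combine them. The sketch below builds such a combination as a query string, assuming PubMed-style field tags in square brackets; the helper function is hypothetical and the resulting string is an illustration only, not the validated search strategy developed in the research described in this article.

```python
# The six MeSH named in the text, grouped as described there.
SUBJECT_MESH = ["Survival", "Survival Rate"]
METHOD_MESH = ["Survival Analysis", "Kaplan-Meier Estimate",
               "Actuarial Analysis", "Life Tables"]


def or_block(terms, field_tag):
    """Join terms with OR, each quoted and followed by a field tag (hypothetical helper)."""
    return "(" + " OR ".join(f'"{t}"[{field_tag}]' for t in terms) + ")"


# Assumes a platform that accepts the tag [MeSH Terms] for index-term searches.
mesh_query = or_block(SUBJECT_MESH + METHOD_MESH, "MeSH Terms")
print(mesh_query)
```

A query built this way is sensitive to the indexing variation described above, but it will still miss articles to which no statistical MeSH was allocated at all.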
Furthermore, the dental application of “survival” and the medical application of “survival” differ. The first two MeSH exemplify this difference. In medicine, survival and survival rate relate intimately to “life or existence” and are not transferable to inanimate objects such as dental prostheses. It is likely that such differences in interpretation of dental vocabulary also impact search yields. Specifically, if the understanding of “survival” differs between MeSH indexers and those searching, relevant studies will remain elusive, undermining the provision of evidence-based dental care. In other words, such articles risk becoming “lost in translation”.
To investigate the magnitude of identification bias amongst time-to-event dental articles, a gold standard cohort of articles was identified from 50 dental journals published in 2008 that had the highest impact factor for that year. In total, nearly 7000 articles were reviewed and those reporting survival of dental prostheses in humans with time-to-event statistics (n = 95, ‘case’) and without time-to-event statistics (n = 91, the active controls, likely false positives in searches), as well as all other articles that did not report studies of the survival of dental prostheses (n = 6796, the passive controls) were identified. Thereafter, details contained within these 7000 articles were used to inform how the research was reported and indexed.
As discussed above, time-to-event research concerns time-dependent outcomes and should include information about the outcome, time and time-related statistic.
Those indexing articles use a standardized vocabulary, such as the medical subject heading terms (MeSH) to classify details about the research. The words chosen indicate who and what was studied, and where and how it was studied. This means that MeSH indicating participants in the research, the outcomes studied, the timeline involved, the study technique used and how data were analyzed are attached to the bibliographic electronic record, and allow searchers to identify articles in a standard manner.
With this in mind, stage 1 of the research reviewed whether MeSH regarding “what” (outcome), “when” (timeline), and “how” (time-to-event statistic) had been allocated to the time-to-event ‘case’ articles, and whether these terms had also been used for any of the other approximately 7000 articles. It was found that the allocation of MeSH to time-to-event dental articles in MEDLINE was inaccurate and inconsistent. Regarding one of the three groups of indexing terms studied, an important finding was that statistical MeSH were omitted from 30% of the time-to-event ‘case’ articles and incorrectly allocated to 15% of active controls (the false positives). Such errors will adversely impact search accuracy, resulting in identification bias.
The next stage in the research focused on the abstracts. It reviewed what words had been used by authors in the titles and abstracts of the time-to-event ‘case’ articles, and whether these words were used in any of the other approximately 7000 articles. Words describing the “what” (outcome), “when” (timeline), and “how” (time-to-event statistic) were sought.
It was found that there was great variation in the words used by authors to describe dental time-to-event outcomes. Such variation in reporting adversely impacts the ability of free text “word” searches to identify relevant articles. This makes it difficult for those searching to identify the research they seek. Specifically, two-thirds of the time-to-event ‘case’ articles did not use words in the title or abstract highlighting “how” (time-to-event statistic) the research was conducted, while other articles in the cohort at times did use these words. Therefore, many articles that should have used these words did not; and many articles that should not have used these words, did. Again, such errors will adversely impact search accuracy, resulting in identification bias.
The third stage in the research reviewed how the time-to-event methods and results were reported in the body of the articles. It found that the time-to-event articles regularly used life tables or survival curves to delineate findings, and also described the results with survival statistical terms. However, important details were regularly omitted from both statistical descriptions and survival figures, making the overall quality of reporting poor. It was likely this would make it unnecessarily difficult for such articles to be indexed correctly in databases, found by people seeking them, and understood if they were identified.
Overall, this body of research found that there was a significant reporting burden affecting time-to-event dental survival analyses, which would adversely impact on accurate article identification, and that many of the articles that are found would be difficult to understand. Therefore, the level of waste could be high for dental research using time-to-event analyses.
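The kind of check involved can be sketched as a simple screen of a title or abstract for words covering each of the three concepts. The word lists below are short illustrative assumptions drawn from examples mentioned in this article, not the word lists used in the research, and the abstract sentence is invented.

```python
# Illustrative (assumed) word lists for the three concepts.
CONCEPT_WORDS = {
    "what": ["outcome", "failure", "failed", "complication", "success"],
    "when": ["follow-up", "follow up", "prospective", "retrospective",
             "year", "month"],
    "how": ["kaplan-meier", "kaplan meier", "life table",
            "survival analysis", "cumulative survival"],
}


def concepts_reported(text):
    """Return, for each concept, whether any of its words appears in the text."""
    text = text.lower()
    return {concept: any(word in text for word in words)
            for concept, words in CONCEPT_WORDS.items()}


# Invented abstract sentence that reports an outcome and a timeline,
# but does not name the time-to-event statistic.
abstract = ("Outcomes such as implant loss were evaluated in a retrospective "
            "follow-up study over a mean observation period of 1.6 years.")

reported = concepts_reported(abstract)
missing = [concept for concept, found in reported.items() if not found]
```

For the invented sentence, the “what” and “when” concepts are detected and “how” is missing, so a free-text search for a statistical word would not retrieve it.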
The categorization of 7000 dental articles for this research provided valuable data, which could be used to develop a search strategy to help find articles that had been published but might be “lost” in the literature. It also provided an evidence base for guidance to assist future authors to report their research clearly, and improve its usefulness when found.
The performance of the search strategy was tested in an independent validation dataset, which was derived from an additional handsearch of articles published in 2012 in the 50 dental journals with the highest impact factor for that year. The 6514 articles identified were classified as reporting dental treatments in humans with time-to-event statistics (n = 148) and all other articles (n = 6366). Validated search protocols with conservative sensitivities up to 92%, precisions up to 93% and number needed to read (NNR) to identify relevant records lower than two articles were constructed. Their use will improve the ability of researchers to identify time-to-event dental articles, help to overcome the burden of identification bias and reduce research waste.
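The three performance measures are related by simple arithmetic, sketched below under the usual definitions: sensitivity is the proportion of relevant articles that are retrieved, precision is the proportion of retrieved articles that are relevant, and NNR is the reciprocal of precision. The number of relevant articles (148) is taken from the validation dataset described above; the retrieved and correctly retrieved counts are invented for illustration and are not results of the research.

```python
def search_performance(true_positives, retrieved, relevant):
    """Return (sensitivity, precision, number needed to read) for a search."""
    sensitivity = true_positives / relevant   # relevant articles that were found
    precision = true_positives / retrieved    # retrieved articles that were relevant
    nnr = 1 / precision                       # records read per relevant record found
    return sensitivity, precision, nnr


RELEVANT = 148        # time-to-event articles in the validation dataset (from the text)
retrieved = 150       # hypothetical number of records returned by a search
true_positives = 135  # hypothetical number of those records that were relevant

sensitivity, precision, nnr = search_performance(true_positives, retrieved, RELEVANT)
print(f"sensitivity={sensitivity:.3f} precision={precision:.3f} NNR={nnr:.2f}")
```

An NNR close to one means that nearly every record retrieved is relevant, so little reading time is spent on false positives.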
Information gathered from the assessments of the quality of reporting was used to draft guidance for the reporting of time-to-event dental outcomes, and feedback on this was sought from a group of experts known to have knowledge about time-to-event reporting in dentistry. From the 78 experts contacted, 46 from 14 countries provided feedback. They endorsed the importance of improving reporting quality of time-to-event survival studies, and indicated that the draft guidance document was pertinent, timely and useful. The guidance document was amended based on the feedback. A range of participants wished to commend the document to authors submitting manuscripts, reviewers of those manuscripts, and dentists in specialty training programs, and work is ongoing to prepare the guidance recommendations for publication.
Implications for clinical practice
The implications of this research relate to two key challenges: identification-bias and translation-bias. Such bias results in a failure to identify relevant material, and a failure to use it.
Identification problems affect those seeking evidence. Searchers might be seeking primary research, or synthesized research that relies on the finding of that primary research.
When a large volume of evidence is available, there is an argument for synthesized research to seek only the highest quality and largest studies, rather than including all studies reporting outcomes on a specific topic. This may be possible in some medical fields, such as the investigation of outcomes of drug therapies in common conditions, where several high quality, large randomized trials might have been done. However, in dentistry, many studies enroll small numbers of participants, rarely follow controlled trial designs, and are often retrospective in nature. Therefore, every effort to identify all data remains extremely important to ensure findings are as balanced as possible.
Primary clinicians and policy makers who require results from systematic reviews to guide decisions will be particularly affected by poor reporting study quality, if this makes those systematic reviews unreliable. In addition, even if individual studies are identified, it will be unnecessarily difficult to understand their results, or to extract data for systematic reviews.
This may manifest itself in various ways. Primary clinicians might not find data to guide treatment decisions, such as whether to replace a missing tooth with a single implant crown or a three unit fixed dental prosthesis. Decision makers might not find data to guide funding decisions, such as whether community elders derive sufficient value with acceptable maintenance burdens from conversion of a complete mandibular denture to an implant retained overdenture to justify funding at taxpayer’s expense. Additionally, dental researchers may believe that there are gaps in a particular knowledge base, and unnecessarily duplicate research that had already been completed, but was not able to be found in the literature.
For example, a systematic review by Nkenke and Stelzle in 2009 regarding outcomes of implants placed in the posterior maxilla, with and without sinus augmentation appeared to miss at least two potentially relevant articles. The articles were not included in their paper’s final assessment, and were not listed in their exclusion bibliography. Nkenke and Stelzle had searched MEDLINE and Embase for 1 January 1966 to 31 December 2008, and completed a supplemental handsearch of a range of dental journals.
The first of the two missing articles was published in April 2008 by Schlegel and colleagues in the International Journal of Oral & Maxillofacial Implants. The article was added to the PreMEDLINE database on 14 June 2008, and its indexing for MEDLINE was completed on 12 August 2008. Therefore, when Nkenke and Stelzle undertook their search, the complete electronic record was available, and the article was printed in one of the journals that they handsearched. The article was allocated 21 MeSH, and these included two outcome MeSH (dental restoration failure, treatment outcome), one time MeSH (retrospective studies) and no statistical MeSH. In the title and abstract, the authors used no survival words, a number of general outcome words, and also indicated that the research was conducted over time. Based on the specific words the authors used, their abstract can be paraphrased to be: ‘used no specific statistical techniques, to compare and evaluate outcomes such as implant loss and implant stability in a retrospective, follow up study over a mean functional observation period of 1.6 years.’
This article reported implant survivals using Kaplan Meier statistics, and this was noted in the methods and results, but not in the abstract.
The second article was published in October 2008 by Bornstein and colleagues in Clinical Oral Implants Research. It was added to the PreMEDLINE database on 10 October 2008, but was not indexed in MEDLINE until 7 February 2009. Although the electronic record of the article was not available on MEDLINE when Nkenke and Stelzle completed the electronic search, it had been printed in one of the journals that they handsearched. Across their title and abstract, the authors used a variety of time and outcome related words, as well as a general statistical term, “survival rate”. Based on the specific words the authors used, their abstract can be paraphrased to be: ‘Clinical and radiographic findings of the stability and performance, reporting success rates with no known statistical techniques over a follow up period during a prospective study also reporting lost to follow ups and drops outs when recalled at 12 and 60 months.’ In addition, a life table and survival statistical details were provided in the body of the article, but not mentioned in the abstract. Thirty-six indexing terms were allocated to this article, including two outcome MeSH (dental restoration failure, treatment outcome), two time MeSH (retrospective studies, follow-up studies) and no statistical MeSH.
The indexing of these two articles was limited at best, and arguably incorrect, and the writing of the abstract was incomplete. They do not appear to have been identified by Nkenke and Stelzle, by electronic or hand searching, and it is possible that the omission from their review was related to poor reporting quality.
An additional example relates to the reporting of outcomes of porcelain veneers, which is an area of research for one of the authors of this article (DL). The paper reporting survival outcomes over a 16-year period was allocated two outcome MeSH (dental restoration failure, treatment outcome), one time MeSH (follow up studies) and no statistical MeSH. The title and abstract included words relating to outcomes (outcomes, failed, complications), time (prospective, up to 16 years, long term) and statistics (Kaplan Meier, survival estimation, cumulative survival and failure rate). The body of the article included both a life table and survival curve, and the survival statistics were detailed in the methods and results. This article was not quoted during two comprehensive international lectures on porcelain veneers (Academy of Prosthodontics 2009, Australian Dental Association 2009), with neither lecturer acknowledging that the article was found during their literature search. Additionally, the article was not quoted in three research articles and one review concerning porcelain veneers. This may have been because it was difficult to find when sought. The abstract included a range of descriptive words that would have aided identification, and therefore difficulty identifying the article was probably mostly related to the limited indexing. This identification problem, however, was resolved when the author wrote a subsequent systematic review, where the “lost” article was included and referenced. The article has since been cited, and is no longer “lost” in translation.
However, not all authors check whether their research has been identified, and not all authors can resolve the problem by highlighting the missing paper in their own systematic review. Use of the proposed search tool will help identify many of these “lost” articles, and use of the proposed guidance document will improve reporting quality such that the articles may not become “lost” in the first place.
These specific research tools are applicable to the time-to-event dental field, but the concepts could be applied to other areas of research. Future tools could be developed to address identification bias and translation bias, reducing research waste, in other fields, including dental material sciences.
Clearly, the ramifications of articles that cannot be identified or used could affect everyday clinical treatment, and impact health care policy decisions at a societal level. Although this discussion has related to research based on time-to-event reporting, the implications are that these errors are likely widespread across the literature and largely unrecognized by authors and evidence seekers. However, understanding and accepting that the problems exist will help to identify and implement solutions to improve identification and translation of our research.