Automatic categorization of self-acknowledged limitations in randomized controlled trial publications

Lan, Mengfei; Cheng, Mandy; Hoang, Linh; ter Riet, Gerben; Kilicoglu, Halil

Abstract

Objective: Acknowledging study limitations in a scientific publication is crucial to scientific transparency and progress. However, limitation reporting is often inadequate. Natural language processing (NLP) methods could support automated reporting checks, improving research transparency. In this study, our objective was to develop a dataset and NLP methods to detect and categorize self-acknowledged limitations (e.g., sample size, blinding) reported in randomized controlled trial (RCT) publications.

Methods: We created a data model of limitation types in RCT studies and annotated a corpus of 200 full-text RCT publications using this data model. We fine-tuned BERT-based sentence classification models to recognize limitation sentences and their types. To address the small size of the annotated corpus, we experimented with data augmentation approaches, including Easy Data Augmentation (EDA) and Prompt-Based Data Augmentation (PromDA). We applied the best-performing model to a set of about 12K RCT publications to characterize self-acknowledged limitations at a larger scale.

Results: Our data model consists of 15 categories and 24 sub-categories (e.g., Population and its sub-category DiagnosticCriteria). We annotated 1090 instances of limitation types in 952 sentences (4.8 limitation sentences and 5.5 limitation types per article). A fine-tuned PubMedBERT model for limitation sentence classification improved upon our earlier model by about 1.5 absolute percentage points in F1 score (0.821 vs. 0.8), with statistical significance. Our best-performing limitation type classification model, PubMedBERT fine-tuning with PromDA (Output View), achieved an F1 score of 0.7, improving upon the vanilla PubMedBERT model by 2.7 percentage points, with statistical significance.

Conclusion: The model could support automated screening tools that journals can use to draw authors' attention to reporting issues. Automatic extraction of limitations from RCT publications could benefit peer review and evidence synthesis, and support advanced methods to search and aggregate the evidence from the clinical trial literature.
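The abstract names Easy Data Augmentation (EDA) as one way to enlarge a small annotated corpus. As a rough illustration only, the sketch below implements two of the operations generally associated with EDA, random swap and random deletion, on a labeled sentence. It is not the authors' implementation: the function names, the probabilities, the example sentence, and the label `SampleSize` are all assumptions made for this example.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of `words` with `n_swaps` random pairs of positions swapped."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word with probability `p`, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]


def augment(sentence, label, n_aug=4, p_delete=0.1, seed=0):
    """Create `n_aug` perturbed copies of a labeled sentence (hypothetical helper).

    The label is carried over unchanged, on the assumption that small
    perturbations do not change the limitation type of the sentence.
    """
    rng = random.Random(seed)
    words = sentence.split()
    n_swaps = max(1, int(0.1 * len(words)))
    variants = []
    for k in range(n_aug):
        # Alternate between the two operations.
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, p_delete, rng)
        variants.append((" ".join(new_words), label))
    return variants


if __name__ == "__main__":
    # Illustrative sentence and label, not taken from the annotated corpus.
    for text, lbl in augment("the small sample size limits generalizability", "SampleSize"):
        print(lbl, "|", text)
```

The augmented pairs would be added to the training set only, with the evaluation data left untouched, so that reported scores reflect original sentences.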


Theme	General

File/Link	Access
File 1	Open access

Organization	Hogeschool van Amsterdam

Published in	Journal of Biomedical Informatics Vol. 152
Date	2024-04
Type	Article
DOI	10.1016/j.jbi.2024.104628
Language	English
