NLP able to audit radiology reports, ID crucial information

Michael Walter | June 21, 2019 | Radiology Business | Imaging Informatics

Cheryl Petersilge, MD, MBA, with the department of regional radiology at the Cleveland Clinic, examined enterprise imaging and how radiologists must integrate and collaborate with other departments. Her clinical perspective was published online in the October issue of the American Journal of Roentgenology.

Natural language processing (NLP) can provide significant value by auditing all communications related to critical findings, according to new findings published in the Journal of the American College of Radiology.

“In most radiology practices, critical findings communications are achieved through a verbal communication at the time of image interpretation,” wrote Marta E. Heilbrun, MD, department of radiology and imaging sciences at Emory University School of Medicine in Atlanta, and colleagues. “Documenting the communication process and keeping track of the communications and their timeliness are requirements of the Joint Commission rule. This documentation is typically a human resource–intensive and manual process.”

Heilbrun et al. examined the performance of a rule-based NLP system to see if it could audit such communications, saving providers time so they can focus more on providing high-quality patient care. The team adapted an existing NLP algorithm for their research.

More than 800 radiology reports related to chest imaging were randomly chosen for the study. All reports came from the same academic medical hospital from October to December 2013. Reports were divided into 185 development cases and 666 test cases.

“The task was to identify any of 18 critical findings relevant to the chest, either because they were findings specifically in the chest or could be clinically associated causes of chest symptoms: aneurysm, aortic dissection, cancer, ectasia, epiglottitis, fracture, free air, infarct, inflammation, mediastinal emphysema, pneumonia, pneumothorax, pulmonary embolism, retropharyngeal abscess, ruptured aneurysm, splenic infarct, tension pneumothorax, and thrombosis,” the authors wrote.
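To illustrate what a rule-based approach to this task can look like, here is a minimal Python sketch. It is not the algorithm the researchers adapted, whose rules are not described here. It assumes that a keyword lookup over the 18 finding names, combined with a small and hypothetical list of negation cues, is enough to flag a sentence.

```python
import re

# The 18 critical findings named by the authors.
CRITICAL_FINDINGS = [
    "aneurysm", "aortic dissection", "cancer", "ectasia", "epiglottitis",
    "fracture", "free air", "infarct", "inflammation", "mediastinal emphysema",
    "pneumonia", "pneumothorax", "pulmonary embolism", "retropharyngeal abscess",
    "ruptured aneurysm", "splenic infarct", "tension pneumothorax", "thrombosis",
]

# Hypothetical negation cues, chosen for illustration only (an assumption,
# not taken from the study).
NEGATION_CUES = re.compile(r"\b(no|without|negative for)\b")


def find_critical_findings(report):
    """Return the sorted finding names mentioned without a preceding negation."""
    found = set()
    # Naive sentence split on periods and newlines.
    for sentence in re.split(r"[.\n]+", report):
        lowered = sentence.lower()
        for term in CRITICAL_FINDINGS:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, lowered):
                # Skip a mention if a negation cue appears earlier in the sentence.
                if NEGATION_CUES.search(lowered[:match.start()]):
                    continue
                found.add(term)
                break
    return sorted(found)


sample = "No pneumothorax. Acute pulmonary embolism in the right lower lobe."
print(find_critical_findings(sample))  # ['pulmonary embolism']
```

Overlapping names are reported separately in this sketch, so "tension pneumothorax" also counts as "pneumothorax". A production system would need far richer rules for negation, uncertainty and report structure.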

In the test set, 75 reports included critical findings. The NLP algorithm performed “remarkably well” and correctly flagged 69 of those 75 reports. There were also eight false positives. This gave the algorithm a final sensitivity of 0.92 and specificity of 0.99.
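The reported figures can be checked with a short calculation. The Python sketch below assumes that counts are per report and that each of the eight false positives is a distinct report among the 591 test reports without a critical finding. The article does not spell out either point.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true critical reports that were caught
    specificity = tn / (tn + fp)  # share of non-critical reports left unflagged
    return sensitivity, specificity


total_reports = 666   # test cases
positives = 75        # reports with a critical finding
tp = 69               # critical reports the algorithm identified
fn = positives - tp   # critical reports it missed
fp = 8                # assumed: eight distinct non-critical reports flagged
tn = total_reports - positives - fp

sens, spec = sensitivity_specificity(tp, fn, fp, tn)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
# sensitivity = 0.92, specificity = 0.99
```

Under these assumptions, the six missed reports and eight false alarms reproduce the published values after rounding.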

“Retrospective auditing of critical findings reporting is a human labor- and time-intensive process that could produce a higher yield of information when NLP tools are deployed,” the authors wrote.

The team did note that its research had certain limitations. For instance, the study was based on a single three-month sample from one institution. Also, trainees served as the “human annotators” against whom the algorithm’s accuracy was measured, and it is unclear whether this affected the results.

Michael Walter, Managing Editor

Michael has more than 16 years of experience as a professional writer and editor. He has written at length about cardiology, radiology, artificial intelligence and other key healthcare topics.
