AI helps researchers classify unstructured radiology reports

A recurrent neural network (RNN) can be trained to automatically classify important findings in unstructured radiology reports, according to new research published in the American Journal of Roentgenology. This could help researchers analyze massive amounts of data much faster than ever before, the authors noted.

“Electronic medical records (EMRs) contain significant amounts of unformatted text that pose a challenge to their secondary use as a research data source,” wrote Changhwan Lee, department of biomedical engineering at Hanyang University in Seoul, South Korea, and colleagues. “For more efficient use, EMR text data such as physician notes and radiology reports must be converted to outcome labels that contain specific information including type or extent of disease. However, categorizing EMR text with key annotations is difficult because it contains ambiguous words and narrative sentences.”

The authors collected data from musculoskeletal x-ray examinations performed from Jan. 1 to Dec. 31, 2016, at a single facility. An orthopedic surgeon manually selected more than 3,000 sentences from the reports; 28 percent of those sentences expressed that a fracture was present, while 72 percent expressed that no fracture was present. The team then built a natural language processing system that uses RNNs to differentiate between fracture cases and nonfracture cases, using 75 percent of the data to train the system and the remaining 25 percent to test it. The training and testing sets, the authors noted, contained the same percentages of fracture and nonfracture cases. The team then tested its system with various numbers of layers to see when it achieved its best performance.
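A split that keeps the same class balance in both sets is commonly called a stratified split. The following is a minimal sketch of that idea, not the authors' code: the function name `stratified_split`, the label strings and the toy data are assumptions made for illustration, and only the 75/25 ratio and the 28/72 class balance come from the study as reported.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) so each label keeps roughly the same share in both sets."""
    rng = random.Random(seed)

    # Group sentence indices by their label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_label.values():
        # Shuffle within each class, then carve off the test share.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels mirroring the reported 28/72 balance (illustrative, not the study's data).
labels = ["fracture"] * 28 + ["no_fracture"] * 72
train_idx, test_idx = stratified_split(labels)

train_share = sum(labels[i] == "fracture" for i in train_idx) / len(train_idx)
test_share = sum(labels[i] == "fracture" for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_share, test_share)
```

With these toy labels, both sets end up with the same 28 percent share of fracture sentences, which is the property the authors described for their training and testing sets.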

Overall, the three-layer version of the team’s system produced the best results, including the highest precision (0.967), recall (0.967), accuracy (0.982) and F1 score (0.967). In addition, the word error rate was 1.03 percent.
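For readers unfamiliar with these measures, the sketch below shows how precision, recall, accuracy and F1 score are conventionally computed for a two-class problem. It uses the standard definitions rather than anything specific to the study; the function name `classification_metrics` and the ten toy predictions are assumptions for illustration and do not reproduce the reported results.

```python
def classification_metrics(y_true, y_pred, positive="fracture"):
    """Compute precision, recall, accuracy and F1 for a binary labeling task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}


# Toy example (illustrative only): 3 true positives, 1 false negative,
# 1 false positive and 5 true negatives.
y_true = ["fracture"] * 4 + ["no_fracture"] * 6
y_pred = ["fracture"] * 3 + ["no_fracture"] + ["fracture"] + ["no_fracture"] * 5
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because F1 is the harmonic mean of precision and recall, it equals both whenever the two are equal, which is consistent with the matching 0.967 values reported for the three-layer model.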

“Our results indicate that the RNN-based system can classify important findings in musculoskeletal radiography reports with a high F1 score,” the authors wrote. “It is established from many studies that an artificial neural network (ANN) has excellent performance in extracting features from unformatted data. An RNN, which is a kind of ANN, is suitable for processing narrative and sequential data such as radiology reports so that our system shows favorable results.”

The team noted that these findings show artificial intelligence can help researchers quickly process large amounts of data, which could lead to improved patient care.

“Our approach could be valuable for managing imaging utilization in radiology clinical decision support,” the authors wrote. “For example, understanding the rate of examinations that produce normal findings could serve as an indirect marker of utilization appropriateness. Our system might also help screen candidates for clinical trials.”