Natural language processing (NLP) could help radiology providers anticipate fluctuations in demand and provide better patient care, according to a new study published in the Journal of the American College of Radiology.
Long-term surveillance for hepatocellular carcinoma (HCC) is often recommended for patients with liver cirrhosis, and that surveillance can lead to a high utilization of imaging resources. With this in mind, the study’s authors wondered whether an NLP approach that extracts data from free-text radiology reports could help providers anticipate when demand for imaging resources may be higher than normal as a result of HCC surveillance.
The study included data from more than 2,500 free-text radiology reports associated with HCC surveillance from Jan. 1, 2010, to Oct. 31, 2017. The authors used NLP and machine learning to explore two representations of the free-text data: a bag-of-words model and a term frequency-inverse document frequency (TF-IDF) model. Each representation was then combined with three machine learning algorithms: logistic regression (LR), support vector machine (SVM) and random forest (RF).
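To make the setup concrete, here is a minimal Python sketch of a bag-of-words representation and of the six representation-plus-classifier pairings described above. It is not the authors' code: the three report snippets, the `tokenize` helper and all variable names are invented for illustration, and the tokenizer is deliberately simplistic.

```python
from collections import Counter
from itertools import product

# Hypothetical report snippets, invented for illustration only.
reports = [
    "no focal liver lesion",
    "new liver lesion recommend mri",
    "stable liver cyst no lesion",
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

# Shared vocabulary: every distinct word across all reports, sorted.
vocabulary = sorted({word for report in reports for word in tokenize(report)})

def bag_of_words(text):
    """Return one count per vocabulary word, ignoring word order."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

bow_vectors = [bag_of_words(report) for report in reports]

# The six models in the study: two representations x three classifiers.
representations = ["bag-of-words", "TF-IDF"]
classifiers = ["LR", "SVM", "RF"]
model_grid = [f"{rep} + {clf}" for rep, clf in product(representations, classifiers)]
```

Each report becomes a fixed-length vector of word counts, which is the kind of numeric input a classifier such as LR, SVM or RF can be trained on.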
Overall, the authors noted the bag-of-words models were “slightly inferior” to TF-IDF models. TF-IDF + SVM outperformed the other five models, with an accuracy of 92.2 percent, sensitivity of 82.6 percent, specificity of 95.7 percent and area under the curve (AUC) of 0.971. TF-IDF + LR had an accuracy of 91.4 percent, sensitivity of 79 percent, specificity of 95.9 percent and AUC of 0.969. TF-IDF + RF, meanwhile, had an accuracy of 90.6 percent, sensitivity of 79 percent, specificity of 94.8 percent and AUC of 0.965.
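For readers unfamiliar with these measures, the sketch below shows how accuracy, sensitivity and specificity are derived from a confusion matrix. The counts are hypothetical and are not the study's data, and the function name `classification_metrics` is an invented helper. It assumes a "positive" is a report flagged as likely to lead to further imaging, which is an interpretation rather than something stated here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive reports classified correctly/incorrectly.
    tn/fp: negative reports classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all reports classified correctly
        "sensitivity": tp / (tp + fn),      # share of true positives that were caught
        "specificity": tn / (tn + fp),      # share of true negatives that were caught
    }

# Hypothetical counts for illustration, NOT taken from the study.
metrics = classification_metrics(tp=80, fp=10, tn=190, fn=20)
```

With these made-up counts, accuracy works out to 270 of 300 reports, sensitivity to 80 of 100 positives and specificity to 190 of 200 negatives.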
“Our NLP approach demonstrates the ability to predict subsequent radiology resource utilization from the imaging results of HCC surveillance examinations with a high degree of accuracy,” wrote authors A.D. Brown, MD, MBA, and J.R. Kachura, MD, of the department of medical imaging at Toronto General Hospital in Toronto, Ontario, Canada. “These findings suggest that an algorithmic approach to text analysis could be used as a tool to help radiology administrators better predict changes in demand and proactively institute capacity management strategies to address fluctuations in demand.”
Brown and Kachura noted that “few methods have been found to significantly outperform TF-IDF” when it comes to NLP.
“The advantage of TF-IDF is its intuitive nature: the more reports that a specific term occurs in, the less discriminating the term is between reports and, consequently, the less useful it will be in categorizing them,” they wrote.
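That intuition can be illustrated with a short Python sketch using one common textbook form of inverse document frequency, log(N / df). The study may have used a different variant or smoothing; the toy reports and the helper name `inverse_document_frequency` are assumptions made for illustration.

```python
import math

# Hypothetical tokenized reports, invented for illustration only.
reports = [
    ["liver", "no", "lesion"],
    ["liver", "new", "lesion"],
    ["liver", "stable", "cyst"],
    ["liver", "nodule", "enlarging"],
]

def inverse_document_frequency(term, documents):
    """log(N / df): N reports in total, df reports containing the term.

    Assumes the term appears in at least one report (df > 0).
    """
    df = sum(1 for doc in documents if term in doc)
    return math.log(len(documents) / df)

idf_liver = inverse_document_frequency("liver", reports)    # in all 4 reports
idf_lesion = inverse_document_frequency("lesion", reports)  # in 2 of 4 reports
idf_nodule = inverse_document_frequency("nodule", reports)  # in 1 of 4 reports
```

A word such as "liver" that appears in every report gets a weight of zero, while rarer words receive progressively larger weights and so contribute more to telling reports apart.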
And why did the TF-IDF + SVM model outperform the others? The authors noted that “one reason for this could be the fact that SVM models are tolerant to class imbalances in the training set.”
Looking forward, Brown and Kachura explained how these models, and others, could someday be used to improve patient care.
“In the future, these techniques could be used to create systems that analyze finalized radiology reports in real time and provide administrators with updated predictions of radiology resource demand,” the authors wrote. “To provide robust forecasts of radiology demand, the NLP approaches used here could be combined with time series forecasting methods to provide forecasts for radiology demand across many different clinical indications and imaging modalities.”