How the FDA assesses AI-based imaging algorithms

The FDA is responsible for regulating AI solutions designed and developed to provide care for patients, a task that leads to certain unique challenges.

For instance, according to a new analysis published in the American Journal of Roentgenology, the “inherent variability” of a deep learning algorithm makes it more difficult to know just how effective it may be. Likewise, these algorithms are often “trained” with large datasets, leading them to evolve at an almost unpredictable rate.  

“A technology that learns on its own has an explainability problem—that is, we do not know how it arrived at the rules it derived from the data,” wrote Ajay Kohli, MD, department of radiology at Drexel University College of Medicine in Philadelphia, and colleagues. “The explainability problem makes it difficult to benchmark AI.”

The 3 primary criteria

So how does the FDA assess AI-based algorithms? As with other new solutions, it comes down to three primary criteria: the risk to patient safety, the presence of a predicate algorithm and the amount of input from a human user.

Risk to patient safety

In all instances, the assessment process begins with a benefit-to-risk ratio, the authors explained.

“The benefit is assessed in terms of the type of benefit, its magnitude, its likelihood, and its duration,” they wrote. “The risks are measured in terms of the severity, likelihood, and duration of harm caused by false-positive or false-negative findings.”

The device is then placed in one of three categories: class I (low risk), class II (intermediate risk) or class III (high risk).

“Most AI algorithms are categorized as class I devices or are excluded from being designated as a device as outlined by the recent 21st Century Cures Act updated draft guidelines,” Kohli et al. wrote. “The FDA has only just begun to develop procedures to evaluate the safety of AI-based medical imaging algorithms. Because of the absence of an established framework, the benefit-to-risk ratio of AI algorithms must be defined on a case-by-case basis.”

The presence of a predicate algorithm

Evolutionary devices are those that have a predicate device or predecessor, the researchers explained. Such devices are “incremental innovations” and can be compared to previous “ancestral” technologies. This means they will typically be cleared through 510(k) premarket submissions. A revolutionary device, however, is brand new, with no predicate for comparison.

“Quantification of coronary calcium or detection of a lung nodule with the use of machine learning techniques would be considered evolutionary because these tasks have already been performed by software using rule-based automatic and semiautomatic methods,” Kohli and colleagues wrote. “An example of an AI algorithm that would be considered a revolutionary technology is an algorithm to determine whether there is arterial occlusion in patients with suspected stroke. The important point is that the use of machine learning for detection, as revolutionary as this technique might be, is not necessarily revolutionary in terms of the FDA's framework.”

Human clinician input

While all AI-based applications associated with radiology are technically viewed as clinical decision support (CDS) software, they are also split into two categories: computer-aided detection (CAD) and computer-aided diagnosis (CADx). Knowing the difference is key to understanding the FDA’s assessment process.

“CAD systems flag abnormalities for review by radiologists but do not assist in diagnostic or clinical decision making,” the authors wrote. “They focus on the detection of abnormalities rather than their characterization.”

CADx systems, meanwhile, include more analytics—they identify a finding and assess it.

Approval pathways

Another key point of the team’s analysis was that vendors face an evolving set of approval pathways for AI-based medical devices. The premarket approval process has long been common in the United States, but the 21st Century Cures Act changed the landscape by exempting CDS software that meets four criteria. Many machine learning algorithms meet all four criteria and “can avoid the formal FDA approval process.”

“More recently, the FDA developed the Digital Health Software Precertification (Pre-Cert) Program,” the authors added. “This program is based on the assumption that because medical software evolves so rapidly, every iteration of a particular technology cannot realistically be reviewed by the FDA. This approach specifically regulates software by primarily evaluating the developer of the product rather than the product itself, thus deviating from the traditional approval processes that directly evaluated a particular product.”

Know your options

When looking to market an AI algorithm related to medical imaging in the United States, the researchers emphasized that vendors should know their strategic options. While some vendors seek approval in the United States first, others start by going for a CE mark in Europe. The third option is to seek both at once. Developing a strategy in advance is crucial, and it is something venture capitalists keep an eye on when considering an investment.

“In general, release in the United States requires a higher capital investment but gives a company access to the widest market, better intellectual property protection, and less foreign competition. International release is cheaper, but the market size is also limited,” Kohli et al. wrote. “The European Union is the second largest consumer of medical devices in the world, representing 30% of the global market share, followed by Japan, which represents approximately 10% of the global medical device market.”