A machine-learning algorithm that uses a technique known as reinforcement learning can dramatically cut toxic chemotherapy and radiotherapy by optimizing treatment plans and drug dosages for glioblastoma patients, according to research out of the Massachusetts Institute of Technology.
Pratik Shah, PhD, a principal investigator in the study and a research scientist with MIT’s Media Lab, said in a release from the university that the goal of the novel artificial intelligence (AI) technique was to improve the quality of life for patients with glioblastoma, whose prognosis is typically no more than five years and who often turn to maxed-out drug doses and radiation therapy for relief. While physicians won’t push toxicity limits, maximum doses of some pharmaceuticals still result in debilitating side effects.
“We kept the goal, where we have to help patients by reducing tumor sizes, but, at the same time, we want to make sure the quality of life—the dosing toxicity—doesn’t lead to overwhelming sickness and harmful side effects,” Shah said.
Shah and his team at MIT employed an AI method known as reinforcement learning, a technique inspired by behavioral psychology, to circumvent that problem. With reinforcement learning, AI models are taught to favor behaviors that lead to desired outcomes, optimizing their courses of action based on a series of rewards and penalties.
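The reward-and-penalty idea can be illustrated with a minimal sketch. This is not the MIT team’s model: the action names, the reward values and the learning rate below are all hypothetical, chosen only to show how repeated feedback shifts a model’s estimate of each behavior.

```python
def update_value(old_value, reward, learning_rate=0.1):
    """Move the running estimate a small step toward the observed reward."""
    return old_value + learning_rate * (reward - old_value)


def learn_preferences(rounds=50):
    """Return learned action values after repeated rewards and penalties."""
    # Hypothetical feedback: +1 is a reward, -1 is a penalty.
    feedback = {"helpful_action": 1.0, "harmful_action": -1.0}
    values = {action: 0.0 for action in feedback}
    for _ in range(rounds):
        for action, reward in feedback.items():
            values[action] = update_value(values[action], reward)
    return values


values = learn_preferences()
# The action with the highest learned value is the one the model favors.
preferred = max(values, key=values.get)
print(preferred, values)
```

After enough rounds the estimate for the rewarded action approaches +1 and the penalized one approaches -1, so the model comes to favor the rewarded behavior.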
Shah said the team’s work required an “unorthodox” reinforcement learning model, because traditional models are taught to work toward a single outcome, like winning a game, and adjust toward that goal along the way. For glioblastoma patients, working toward a single outcome, such as tumor reduction alone, would mean the algorithm opts for maximum drug doses in just about every case.
“If all we want to do is reduce the mean tumor diameter and let it take whatever actions it wants, it will administer drugs irresponsibly,” he said. “Instead, we said, ‘We need to reduce the harmful actions it takes to get to that outcome.’”
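To make that trade-off concrete, here is a toy sketch of a reward that subtracts a dosing penalty from the tumor reduction. The dose-response curve, the candidate dose fractions and the penalty weights are assumptions made up for illustration; the study’s actual reward function is not described in this article.

```python
import math

# Hypothetical choices for one dosing interval: skip, or give part of a full dose.
DOSE_FRACTIONS = [0.0, 0.25, 0.5, 0.75, 1.0]


def toy_tumor_reduction(dose_fraction):
    """Made-up saturating response: more drug helps, with diminishing returns."""
    return 2.0 * (1.0 - math.exp(-3.0 * dose_fraction))


def reward(dose_fraction, dose_penalty):
    """Tumor reduction minus a penalty proportional to the dose given."""
    return toy_tumor_reduction(dose_fraction) - dose_penalty * dose_fraction


def best_dose(dose_penalty):
    """Pick the dose fraction with the highest penalized reward."""
    return max(DOSE_FRACTIONS, key=lambda d: reward(d, dose_penalty))


for penalty in (0.0, 1.0, 3.0, 10.0):
    print(f"penalty={penalty}: best dose fraction={best_dose(penalty)}")
```

In this toy setup, a penalty of zero always selects the full dose, while larger penalties push the choice toward a half dose, a quarter dose or skipping the dose entirely.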
The team’s algorithm—built to treat glioblastoma with temozolomide, procarbazine, lomustine and vincristine—was taught to optimize tumor reduction while balancing a patient’s drug dose. The model first combs through a database of traditionally administered regimens drawn from decades of animal studies, observational trials and clinical practice, then decides whether to initiate or withhold a dose during each planned dosing interval. If the model decides to administer a dose, it then determines whether the full dose, or just a portion, is necessary for optimal treatment.
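The two-step decision described above (give or withhold a dose, then choose how much) can be sketched as a small set of possible actions per dosing interval. The class name, the dose fractions and the example schedule are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DosingAction:
    administer: bool   # step 1: initiate or withhold the dose
    fraction: float    # step 2: share of the full dose (0.0 when withheld)


def action_space(fractions=(0.25, 0.5, 0.75, 1.0)):
    """List every choice available at one dosing interval."""
    actions = [DosingAction(administer=False, fraction=0.0)]
    actions += [DosingAction(administer=True, fraction=f) for f in fractions]
    return actions


def total_dose(schedule):
    """Sum the dose fractions actually administered over a schedule."""
    return sum(a.fraction for a in schedule if a.administer)


# Hypothetical four-interval regimen: full dose, skip, half dose, quarter dose.
example = [
    DosingAction(True, 1.0),
    DosingAction(False, 0.0),
    DosingAction(True, 0.5),
    DosingAction(True, 0.25),
]
print(len(action_space()), total_dose(example))
```

Representing “withhold” as its own action keeps the skip decision explicit, which mirrors the way the article describes the model’s choices.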
“We said [to the model], ‘Do you have to administer the same dose for all the patients?’ And it said, ‘No. I can give a quarter dose to this person, half to this person and maybe we skip a dose for this person,’” Shah said. “That was the most exciting part of this work, where we are able to generate precision medicine-based treatments by conducting one-person trials using unorthodox machine-learning structures.”
Shah and his team tested the algorithm in 50 simulated patients, whose characteristics were borrowed from previous glioblastoma patients who’d undergone traditional treatment. The algorithm generated around 20,000 trial-and-error test runs for each patient, and, when given no dosage penalty, the model designed regimens nearly identical to those of human experts. With small and large dosing penalties, it cut dose frequency and potency “substantially,” by around a quarter or half, while reducing tumor size.