William Hersh, M.D., who has taught generations of medical and clinical informatics students at Oregon Health & Science University, found himself intrigued by the growing influence of artificial intelligence. He wondered how AI would perform in his own class.
So, he decided to try an experiment.
He tested six generative, large-language AI models — such as ChatGPT — on a web-based version of his popular introductory course in biomedical and health informatics to see how they performed compared with living, thinking students. A study published in the journal npj Digital Medicine revealed the answer: better than as many as three-quarters of his human students.
“This does raise concern about cheating, but there’s a bigger issue here,” Hersh said. “How will we know that our students are actually learning and mastering the knowledge and skills they need for their future professional work?”
As a professor of medical informatics and clinical epidemiology in the OHSU School of Medicine, Hersh is particularly attuned to new technologies. The role of technology in education is nothing new, Hersh said, recalling his own experience as a high school student in the 1970s during the transition from slide rules to calculators.
Yet, the shift to generative AI represents an exponential step forward.
“Clearly, everyone must have some kind of foundation of knowledge in their field,” Hersh said. “What’s the foundation of knowledge you expect people to have so they’re able to think critically?”
Large-language models
Hersh and co-author Kate Fultz Hollis, an OHSU informatician, pulled the knowledge assessment scores of 139 students who took the introductory course in biomedical and health informatics in 2023. They prompted six generative AI large language models with student assessment materials from the course. Depending on the model, AI scored in the top 50th to 75th percentile on multiple-choice questions used in quizzes and a final exam that required short written responses to questions.
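As a rough illustration of the kind of comparison the study describes, the sketch below shows how one might prompt a chat-style model with a multiple-choice quiz item and score its answer against a key. The API client, model name, sample question, and answer key here are all illustrative assumptions, not the study’s actual materials or methods.

```python
# Illustrative sketch only: prompt a chat-style model with multiple-choice
# questions and score the answers against a key. The client, model name,
# question, and key are hypothetical, not the study's actual materials.
from openai import OpenAI  # assumes the openai Python SDK (v1+) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical quiz item in the style of an intro informatics course.
quiz = [
    {
        "question": "Which of the following standards is designed for "
                    "exchanging electronic health care data?",
        "choices": {"A": "DICOM", "B": "HL7 FHIR", "C": "IEEE 802.11", "D": "SNMP"},
        "answer": "B",
    },
]

correct = 0
for item in quiz:
    choices = "\n".join(f"{k}. {v}" for k, v in item["choices"].items())
    prompt = (
        f"{item['question']}\n{choices}\n"
        "Reply with the single letter of the best answer."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the study compared six different models
        messages=[{"role": "user", "content": prompt}],
    )
    reply = response.choices[0].message.content.strip().upper()
    if reply.startswith(item["answer"]):
        correct += 1

print(f"Model score: {correct}/{len(quiz)}")
# A model's total could then be placed against the distribution of the
# 139 student scores to yield the percentile comparison the study reports.
```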
“The results of this study raise significant questions for the future of student assessment in most, if not all, academic disciplines,” the authors write.
The study is the first to compare large-language models with students across a full academic course in the biomedical field. Hersh and Fultz Hollis noted that a knowledge-based course such as this one may be especially ripe for generative, large-language models, in contrast to more participatory academic courses that help students develop more complex skills and abilities.
Hersh remembers his experience in medical school.
“When I was a medical student, one of my attending physicians told me I needed to have all the knowledge in my head,” he said. “Even in the 1980s, that was a stretch. The knowledge base of medicine has long surpassed the capacity of the human brain to memorize it all.”
Maintaining the human touch
Yet, he believes there’s a fine line between making sensible use of technical resources to advance learning and over-reliance to the point that it inhibits learning. Ultimately, the goal of an academic health center like OHSU is to train health care professionals capable of caring for patients and optimizing the use of data and information about them in the real world.
In that sense, he said, medicine will always require the human touch.
“There are a lot of things that health care professionals do that are pretty straightforward, but there are those instances where it gets more complicated and you have to make judgment calls,” he said. “That’s when it helps to have that broader perspective, without necessarily needing to have every last fact in your brain.”
With fall classes starting soon, Hersh said he isn’t worried about cheating.
“I update the course every year,” he said. “In any scientific field, there are new advancements all the time, and large-language models aren’t necessarily up to date on all of it. This just means we’ll have to look at newer or more nuanced tests where you won’t get the answer out of ChatGPT.”