Savvy medtech developers and manufacturers looking to give their medical device software and Software as a Medical Device (SaMD) a competitive edge explore ways to incorporate technologies from the wider tech industry — and no technology is hotter than generative AI.1
Since the release of ChatGPT in November 2022, generative AI (Gen AI) has become the dominant trend across almost every sector of business. Organizations are rushing to find ways to use Gen AI in their products and operations, with the goal of transforming their business before their competitors do.
While the medtech industry is starting to get value out of using Gen AI, discussions on whether live generative AI algorithms could be put inside medical devices have been more subdued. Unlike other industries, where regulations are looser (or nonexistent), medtech must ensure that its software and devices comply with FDA regulations and are built under design controls per ISO 13485 and IEC 62304. In the past, we’ve found safe and compliant ways to incorporate emerging technologies into our products, but Gen AI presents unique challenges to industry adoption.
It’s Orthogonal’s position — and the consensus among many of our colleagues — that barring specific use cases, Gen AI is not yet ready to be placed inside medical devices. In this editorial, we take a look at what makes Gen AI different from other technologies, how it may someday be used in our industry, and the concerns its use raises.
Why Gen AI Isn’t Ready for Medical Devices
Like others in medtech, Orthogonal has adopted technologies like Bluetooth Low Energy and cloud computing into our medical devices.2 With the help of our colleagues and industry standards organizations like AAMI, we’ve been able to safely mitigate these technologies’ risks so that the benefit to patients outweighs the uncertainties.
Generative AI is unlike any new technology that has come before it. It’s essentially a black box with an effectively infinite number of inputs and outputs, designed to try to answer any question, which makes it impossible to guarantee an outcome through repeated testing. In medtech, if we can’t test it, we can’t validate it. If we can’t validate it, we can’t guarantee its safety when put into a medical device.
At the moment, it’s unclear whether it would be practical for a device manufacturer to incorporate Gen AI into a device, given the astronomical cost of proving out new methods of validation. Outside of a major health emergency in which Gen AI is the only solution (similar to how the development of relatively untested mRNA vaccines was accelerated to combat COVID-19), it’s unlikely that we’ll see Gen AI in the functioning of medical devices in the near future.
Isn’t AI Already in Medical Devices?
Artificial intelligence (AI)/machine learning (ML) algorithms, including algorithms with generative abilities, have increasingly become part of medicine over the last decade. An article in the Journal of Medical Internet Research, “Generative AI in Medical Practice: In-Depth Exploration of Privacy and Security Challenges,” features examples of current generative AI/ML algorithms in healthcare (see Table 1).3
AI/ML in the context of medical devices works on wholly different principles than ChatGPT and similar large language models (LLMs). ChatGPT is a general-purpose algorithm that attempts to answer any question posed to it, while medical device AI/ML is focused on providing answers within a tightly bounded set of parameters. ChatGPT could tell you how to make a PB&J sandwich, but an AI/ML algorithm trained to detect wrist fractures could not.
The FDA sets those tight parameters on AI/ML algorithms so that they can be tested, verified, and validated and their potential risks to patients mitigated. Because Gen AI has no such boundaries, the current regulatory guidance for AI/ML cannot be applied to it.
Though we’re far off from an official Gen AI policy, the FDA is opening the floor to discussion on the topic at a public meeting scheduled for November 2024. In the meantime, we can look to the FDA’s draft guidance on Predetermined Change Control Plans (PCCPs) as an indication of its approach to updating AI algorithms.4
Potential Applications of Gen AI in Medical Device Software Development
While implementation in medical devices is off the table for now, we anticipate that Gen AI will soon work its way into the production of medical device software and the operations of medical device manufacturers. In a recent webinar, Orthogonal and industry expert Clay Anselmo shared some potential applications:
- As a software coding assistant, an LLM could point out where developers are missing specific items in code, as well as potentially write first-draft code (a brief sketch of this follows the list).
- As a text producer, it could generate a first draft of a 10,000+ page regulatory submission based on the existing body of highly templated regulatory documentation.
- As an image generator, it could derive additional training data from a fixed data set.
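To make the first application concrete, here is a minimal sketch of what an LLM-based code-review step might look like, assuming the OpenAI Python SDK. The model choice, prompts, and the reviewed function are illustrative only; this is not a validated tool or an endorsed workflow.

```python
# Hypothetical sketch: asking a general-purpose LLM to flag gaps in
# medical device code. The output is a first-pass draft for a human
# reviewer, not a substitute for verification under IEC 62304.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SOURCE_UNDER_REVIEW = '''
def dose_ml(weight_kg, mg_per_kg, concentration_mg_per_ml):
    return weight_kg * mg_per_kg / concentration_mg_per_ml
'''

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {
            "role": "system",
            "content": (
                "You review Python code for medical device software. "
                "List missing input validation, unhandled edge cases "
                "(zero or negative values, division by zero), and "
                "absent range checks. Do not rewrite the code."
            ),
        },
        {"role": "user", "content": SOURCE_UNDER_REVIEW},
    ],
)

# Every finding still goes to a qualified human reviewer.
print(response.choices[0].message.content)
```

The value in a sketch like this is a faster first pass over routine review items; as the next section argues, a human still owns quality control of everything the model produces.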
Why We Need to Be Careful
The potential applications of Gen AI come with a massive caveat: the need for robust quality control and assurance methods from human developers to catch problems in the output. Until the day comes when we can peer into that black box and understand exactly how algorithms like ChatGPT work, medtech needs to approach this technology with even more caution than is typical for a new computing technology entering a medical device.
Bias in AI
Algorithms are not neutral. They extrapolate outputs based solely on what they are fed, and if their inputs are heavily skewed or biased in some way, the outputs will be too. For example, if an algorithm for detecting heart disease is trained solely on images of cisgender men, it will be biased against finding signs of heart disease in cisgender women.
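That mechanism is easy to demonstrate. Below is a minimal, self-contained sketch using synthetic data and scikit-learn; the cohorts, feature, and effect sizes are all invented for illustration and stand in for any underrepresented group:

```python
# Illustrative only: synthetic data showing how a skewed training set
# produces a biased detector. All numbers are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)

def make_cohort(n, signal_shift):
    """Simulate a biomarker whose disease signal differs between cohorts."""
    y = rng.integers(0, 2, n)          # 1 = heart disease present
    x = rng.normal(0.0, 1.0, (n, 1))   # baseline biomarker reading
    x[y == 1] += signal_shift          # disease raises the marker
    return x, y

# Disease presents strongly in cohort A (the only training data) and
# more weakly in cohort B, which the model never sees during training.
x_train, y_train = make_cohort(5000, signal_shift=2.0)
x_a, y_a = make_cohort(2000, signal_shift=2.0)
x_b, y_b = make_cohort(2000, signal_shift=0.7)

model = LogisticRegression().fit(x_train, y_train)

# Sensitivity (recall) drops sharply for the cohort absent from training:
print("recall, cohort A:", recall_score(y_a, model.predict(x_a)))
print("recall, cohort B:", recall_score(y_b, model.predict(x_b)))
```

The detector performs well on the population it was trained on and systematically misses disease in the one it never saw, exactly the failure mode described above.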
The issue of bias affects both AI/ML and Gen AI, but it is heightened in the latter. To answer any and all questions, popular Gen AI LLMs like ChatGPT were trained on undocumented, unstructured data indiscriminately scooped up from the Internet. Training an algorithm this way may make it powerful, but it gives the manufacturer no ability to control quality or to see what biases are being baked into the model, biases that will inevitably be reproduced in its outputs.
Hallucinations: A Serious Risk
Gen AI is prone to making things up, or hallucinating. The blog AI Weirdness highlights writer Janelle Shane’s attempts to get consistent, reliable information out of ChatGPT.5 She asked ChatGPT to draw an illustrated graphic of basic geometric shapes. The results speak for themselves (see Figure 1).
Harmless hallucinations can be humorous, but in the context of medical device software, where decisions are being made about patients’ health, they are inexcusable. We cannot legally or ethically allow our products to give incorrect information to users or simply “wing it.” Even if Gen AI were limited to lower-risk activities, like guiding a user through device setup, there is still potential to harm patients by giving them unhelpful or misleading answers.
Environmental Impact of Gen AI
All computing technologies need energy to function, but Gen AI algorithms use an extraordinary amount; by one estimate, ChatGPT consumes as much energy as 33,000 U.S. households. Computers running Gen AI also require massive quantities of fresh water to cool processors and generate electricity.6
Environmental conditions are a social determinant of health. As our industry moves toward making our products and processes more sustainable, it’s worth weighing Gen AI’s potential to improve health outcomes against its possible negative impact on energy use, natural resources, human health, and the environment.7
Conclusion
Gen AI is an impressive technology with seemingly limitless applications. We’ve barely begun to scratch the surface of what it can do. But it’s this very limitlessness that makes it a poor fit for use inside a medical device. Gen AI won’t be ready to be used in the functioning of medical device software or SaMD until medtech manufacturers and developers can confidently demonstrate its safety and effectiveness. We owe it to our stakeholders, including patients, providers, and payers, to take a methodical and thoughtful approach to addressing these barriers so that Gen AI can one day be used to enhance the safety and effectiveness of medical devices.
References
- Bernhard Kappe, “Software as a Medical Device (SaMD): What It Is & Why It Matters,” Orthogonal, Feb. 7, 2024.
- Bernhard Kappe, “AAMI’s Groundbreaking Consensus on Cloud Technology in Medical Devices,” Orthogonal, Sept. 28, 2021.
- Y. Chen, et al., “Generative AI in Medical Practice: In-Depth Exploration of Privacy and Security Challenges,” Journal of Medical Internet Research, Vol. 26, 2024.
- “Marketing Submission Recommendations for a Predetermined Change Control Plan for Artificial Intelligence/Machine Learning (AI/ML)-Enabled Device Software Functions,” Draft Guidance, FDA, April 2023.
- Janelle Shane, “An Exercise in Frustration,” AI Weirdness: The Strange Side of Machine Learning.
- Kate Crawford, “Generative AI’s environmental costs are soaring — and mostly secret,” Nature, Feb. 20, 2024.
- “Social Determinants of Health (SDOH),” CDC, Jan. 17, 2024.
This editorial was written by Randy Horton, Chief Solutions Officer, Orthogonal, Chicago, IL. For more information, e-mail him at