https://www.youtube.com/watch?v=mqCxE1tz1lM (autotranscript at YouTube)
Summary
#
The webinar opens with moderator Xander van Wijk from the ADLM Innovation and Technology Division explaining that the session will focus on the intersection of artificial intelligence and laboratory-developed tests (LDTs) in clinical diagnostics, a topic many labs are anxious about post–LDT final rule and lawsuit. He introduces the two speakers: regulatory attorney Christine P. Bump, who has worked on LDT issues since 2004 and advises labs, academic medical centers, and companies on FDA strategy, and Shannon Bennett, a regulatory affairs leader and Mayo Clinic assistant professor with deep experience in clinical test development, verification, validation, and quality systems. The stated learning objectives are to define the boundaries of FDA oversight for LDTs versus regulated devices like software as a medical device (SAMD), to identify when AI/ML software becomes a medical device, and to describe risk-mitigation strategies for introducing AI into LDT and broader lab workflows.
Christine begins by “level setting” what an LDT is and, just as important, what it is not, especially in light of the recent litigation. She reviews that since around 1992 FDA asserted that LDTs were medical devices but said it would exercise “enforcement discretion”—claiming authority but mostly choosing not to regulate. This culminated in the 2024 LDT final rule, which explicitly declared that LDTs are devices by modifying the regulatory definition of in vitro diagnostic (IVD) devices. That rule was challenged by ACLA and AMP, and in March 2025 Judge Jordan in the Eastern District of Texas sided with them, ruling that LDTs are not “devices” under the Food, Drug, and Cosmetic Act. The crux of his reasoning was the statutory device definition: a device must be a tangible, physical product introduced into interstate commerce. He found LDTs to be categorically different—services performed within a single lab rather than tangible goods shipped as kits.
Christine explains that, in the decision, the judge effectively gives a functional definition of an LDT: a single laboratory develops its own method and process, receives a specimen on a physician’s order, tests the specimen using that internally developed protocol, and issues a result. The protocol and method never leave the lab. That’s what makes it a laboratory service rather than a device or kit; only the test result leaves the lab. This fits with FDA’s own historical distinction between IVD kits—where a manufacturer can put reagents, protocol, and components in a box and ship it—and LDTs, where no tangible product is shipped. Although FDA had argued LDTs were also devices, the court rejected that. After Judge Jordan vacated the LDT final rule, FDA had until late May to appeal and chose not to. Then, in September, the agency formally rescinded the rule, deleting the ten words that had added “including when the manufacturer of these products is a laboratory” to the IVD definition. That put the regulatory text back to the pre–May 2024 IVD definition in 21 CFR 809.3.
Christine stresses that while many are rightly celebrating that FDA cannot independently regulate LDTs as devices, this does not mean FDA has lost all leverage over labs. The agency retains its full general device authority. She walks through three main regulatory “hooks” that still touch LDTs. First, specimen collection kits are separately regulated devices, often Class I or Class II with 510(k) requirements. If a lab promotes its LDT in a way that bundles or implies use of a particular collection kit, FDA can act against the kit, effectively reaching the LDT indirectly. Second, many LDTs use components that are themselves regulated devices: analyte-specific reagents (ASRs) have their own device classification because FDA determined in the 1990s that they independently meet the device definition, and RUO reagents are subject to strict rules on permitted uses and promotional claims. A lab that misuses or mis-promotes ASRs or RUOs can draw FDA enforcement even if the overarching LDT is outside the device framework. Third, software used in testing can be either a component of a test or a standalone medical device. Christine notes that device law defines a device broadly enough to include software, and FDA has also been given explicit authority over certain software functions by the 21st Century Cures Act. The Texas LDT decision cannot be generalized to argue that software is not a device, because FDA’s software authority does not rest solely on the LDT/device argument.
Shannon then takes over to define software as a medical device and distinguish it from software embedded in instruments. Software as a medical device (SAMD) is software that, by its intended use, is meant for diagnosis, prevention, monitoring, treatment, or alleviation of disease—or to assist with diagnosis, screening, prognosis, or prediction—by analyzing data rather than specimens. A critical concept is “manufacturer”: whoever designs and controls the software has the regulatory responsibility. If a lab builds its own AI pipeline, the lab becomes the manufacturer and owns the regulatory burden; if the lab purchases a commercial tool, the vendor typically carries that burden unless the lab modifies or co-develops it. Shannon distinguishes SAMD from “software in a medical device” (SiMD), which is software that only functions in the context of a specific instrument, like the software embedded in a flow cytometer or PCR platform. In those cases the instrument manufacturer is usually responsible for both hardware and software as a single regulated device.
Shannon explains that the 21st Century Cures Act carved out certain categories of software that are not considered SAMD. These include administrative support software for billing, scheduling, and business analytics; “healthy lifestyle” apps that track steps, diet, or fitness without diagnostic claims (though blood pressure tracking is becoming more scrutinized); electronic records systems like LIS and EHRs that merely store and display data without independent interpretation; and basic data management tools. There is also a special category, clinical decision support (CDS), where software could conceptually be a device but FDA is deprioritizing enforcement if it meets four criteria. To qualify as CDS, the software must not acquire, process, or analyze medical images or instrument signals (a hurdle most lab AI fails because they analyze slides or signals); must not directly display or manipulate detailed patient medical information in a diagnostic way; must provide recommendations rather than dictating decisions; and must be independently reviewable, with transparent logic rather than a black box. If the software fails any of these, it falls outside CDS and is more likely treated as SAMD.
To make this concrete, Shannon offers practical lab examples. Software to document PTO requests is clearly administrative. Simple Excel spreadsheets used to track reagent lots are just data storage. Software that drives a flow cytometer and performs analysis is software in a medical device, regulated as part of the instrument. On the other hand, next-generation sequencing (NGS) bioinformatics pipelines that process signals from multiple instruments or tests and generate interpreted reports are strong candidates for SAMD, and FDA explicitly refers to them in CDS guidance as typically not qualifying for CDS. Software that automatically orders antibiotics based on susceptibility results with no human review, or a black-box algorithm producing colon cancer risk scores without explainability, are also likely to be treated as SAMD because they directly influence treatment and lack transparency.
Christine returns to explain how FDA applies its existing risk-based device framework to software and AI. The agency continues to use three device classes: Class I (low risk, often exempt from premarket review but subject to quality system requirements); Class II (moderate risk, generally requiring 510(k) review); and Class III (high risk requiring PMA). Low-risk software includes general wellness apps that do not reference disease; moderate-risk examples include tools that monitor parameters like heart rate and feed information into clinical decision-making; high-risk examples include AI systems used for cancer detection or HIV-related decisions. She points out that the classical device framework assumes relatively static products—once cleared or approved, substantive changes trigger new submissions. That model simply doesn’t work for learning systems; AI and ML are designed to change as they ingest new data, and many of those changes could affect safety and effectiveness.
Recognizing this, FDA has shifted toward a “total product life cycle” approach for AI-enabled software, built out largely through guidance documents. Christine describes three main components. First, predetermined change control plans (PCCPs), for which FDA has issued final guidance for AI-enabled device functions. When a sponsor seeks clearance or approval, they also submit a plan describing what aspects of the model may change over time and how those changes will be validated and controlled, effectively pre-approving a bounded evolution of the algorithm. Second, life cycle management guidance (currently in draft form) addresses how user interfaces, risk profiles, and data handling strategies should be managed as software evolves. Third, FDA is placing heavy emphasis on real-world evidence (RWE), seeking to understand how AI-enabled devices perform post-market and even proposing structured ways to measure and evaluate AI tools in real-world settings; the agency has recently requested public comment on such frameworks.
Christine then outlines how FDA assesses software risk more generally. Key dimensions include the intended use (what exactly the software does; whether it is adjunctive or determinative; whether it targets high-risk diseases like cancer or HIV), the level of control over care decisions (is it supporting or replacing human judgment), who uses it (patients vs clinicians, inside or outside controlled settings), the nature of the functionality (just organizing data vs analyzing and recommending treatment), and how it is integrated (standalone or deeply embedded in complex systems). For AI and ML specifically, she notes that FDA has repeatedly highlighted concerns about lack of transparency and explainability (black-box models that clinicians cannot interpret), data drift (performance degradation over time as real-world data shifts), data bias (training data not representing all relevant populations), cybersecurity vulnerabilities (especially for network-connected tools handling medical data), and integration effects (a model that is safe in one system may behave differently when integrated into another environment, creating new safety issues).
In the final substantive section, Shannon focuses on what all this means practically for laboratories that are developing or co-developing software. His central message is that everything starts with the intended use statement, and that intended use must be objective, “boring,” and match what the lab can actually prove. He contrasts a fictional, over-the-top intended use—an AI that “automatically diagnoses all major diseases with 99.9% accuracy,” replaces doctors, and ushers in the “world of individualized medicine”—with a more realistic example: software that assists lab professionals and clinicians by organizing and displaying test results and highlighting values outside reference ranges, while leaving final decisions to professional judgment. The former invites impossible evidentiary burdens and high-risk classification; the latter frames the tool as a supportive aid and is more defensible. Shannon cautions against marketing language in intended use, overly precise performance claims, assertions about replacing physicians, and generic references to “life-threatening conditions” unless those are clearly justified.
Shannon urges labs that have developed or co-developed software to build robust documentation proactively rather than trying to retrofit an FDA-ready file under pressure. He describes three layers of documentation. Foundational documents include a clear intended use statement and a device description mapping data sources, interfaces, and algorithms. Ongoing documentation covers training data (what populations were used, and where biases or exclusions might lie), verification and validation work (expected vs actual outputs and associated performance metrics), and systematic change management. On change control, he stresses the risks of uncontrolled “tinkering” by enthusiastic developers and recommends formal versioning (bundling changes into releases), scoping validation appropriately (more for algorithmic changes, less for cosmetic ones), and centralizing all records—akin to a master file—so they are easy to retrieve. Standard operating procedures should cover how the software is used, how issues are handled, and how routine changes (like adding a new gene to a pipeline or extending to a new patient group) will be designed, validated, and implemented.
In the Q&A, Xander and the audience raise practical questions. One asks how NGS pipelines differ from mass spec or PCR analysis software. Shannon explains that instrument-tied software embedded in a mass spectrometer or qPCR platform is usually “software in a medical device,” with regulatory responsibility resting on the instrument manufacturer, whereas bioinformatics pipelines that aggregate signals from multiple platforms and tests and deliver interpreted outputs are more likely to be treated as SAMD—especially since FDA singles them out in its CDS guidance as typically not qualifying as CDS. Both Shannon and Christine emphasize there is a lot of gray and that labs should behave as if they might be regulated, documenting and validating accordingly. Another question asks what the actual “chance” is that FDA will consider a specific pipeline as SAMD; Christine declines to give a numeric probability, noting that from the lab’s perspective, if you are the one selected for scrutiny the risk is effectively 100%, and that FDA staff routinely review marketing, attend conferences, and respond to perceived safety or effectiveness concerns.
Questions also touch on how IQ/OQ/PQ concepts apply to AI (conceptually similar to wet-lab tests: define what the system should do and confirm that it behaves accordingly), whether model calibration metrics like Brier score should be reported (any robust evidence of performance and calibration is helpful), and whether an AI that generates SOPs and validation documents would be a device (generally no, because it supports documentation and does not directly drive diagnosis or treatment). A hypothetical AI that flags IV-contaminated chemistry specimens might or might not be SAMD depending on how it ingests data (raw instrument signals versus processed numeric results) and how tightly it is integrated into decision-making.
In response to a final question about whether new regulations are likely as the administration promotes AI and machine learning, Christine is clear that this space will continue to be very active: FDA has already issued an unusually large volume of guidance for software and AI compared to other areas, and more guidance—and eventually more detailed regulations under the existing device authority—are expected. The session concludes with Xander thanking the speakers, reminding attendees about continuing education credit, and noting that the recording will be posted on the ADLM Artery site.
#
Detailed Outline
1. Context and Goals of the Webinar
Moderator Xander van Wijk opens the session, hosted by the ADLM Innovation and Technology Division and focused on the intersection of artificial intelligence (AI) and laboratory-developed tests (LDTs) in clinical diagnostics.
Two speakers:
- Christine P. Bump – Regulatory attorney with ~20 years of FDA experience, especially in diagnostics, genomics, digital health, and LDT issues since 2004.
- Shannon Bennett – Regulatory affairs leader and Mayo Clinic assistant professor with long experience in lab test development, verification, validation, and quality systems.
Learning objectives:
- Define regulatory boundaries of FDA oversight for LDTs and for regulated software devices (e.g., software as a medical device, SAMD).
- Identify criteria that classify AI/ML software as a medical device.
- Apply risk-mitigation strategies to AI integration into LDT and lab workflows.
2. Christine: What Is (and Is Not) an LDT, Post-Lawsuit
2.1 Historical FDA Position and the 2024 LDT Final Rule
- Since ~1992, FDA claimed LDTs were medical devices, but under “enforcement discretion” (they said they could regulate them but usually chose not to).
- In 2024, FDA issued the LDT final rule, explicitly stating that LDTs are devices, by modifying the regulatory definition of in vitro diagnostic (IVD).
2.2 ACLA/AMP Lawsuit and the Texas Decision
- ACLA and AMP sued; in March 2025, Judge Jordan (E.D. Texas) ruled that LDTs are not devices under the Federal Food, Drug, and Cosmetic Act.
- The ruling hinged on the statutory definition of a device:
  - A device must be a tangible, physical product introduced into interstate commerce.
- Judge Jordan held that LDTs are categorically different:
  - An LDT is a service performed in a single lab using the lab’s own method and process.
  - No tangible product (like a kit) is shipped in interstate commerce.
  - Only results leave the lab; the protocol never leaves.
He effectively gave a working definition of an LDT:
- A single lab:
  - Develops its own method and process.
  - Receives a specimen on a physician order.
  - Tests that specimen using its own protocol.
  - Releases a result.
- Crucially, each lab’s protocol is unique, created with its own knowledge, and does not leave that lab. That makes it a service, not a shipped device/kit.
2.3 Rescission of the LDT Final Rule
- Judge Jordan’s decision vacated the LDT final rule. FDA had until the end of May to appeal and chose not to, so the rule could not be enforced.
- In September, FDA formally rescinded the rule by deleting the 10 added words from the IVD definition:
  - They had added: “including when the manufacturer of these products is a laboratory.”
  - Removing those words restored the IVD definition to its pre–May 2024 form under 21 CFR 809.3.
Bottom line:
FDA cannot independently regulate LDTs as devices. But all other FDA authorities remain intact.
3. Christine: FDA Still Has Hooks into Labs via Components and Software
Christine emphasizes: labs cannot be complacent.
Even though LDTs themselves are not devices, FDA still has authority over:
- Specimen collection kits
  - These are devices on their own (often Class I or II; Class II usually needs 510(k)).
  - If a lab markets an LDT in a way that implicitly or explicitly includes a collection kit, FDA can go after the kit (and effectively reach the LDT via that route).
- Components like ASRs and RUO reagents
  - ASRs (analyte-specific reagents):
    - Are independently regulated devices with their own device classification.
    - FDA in the 1990s decided ASRs independently meet the definition of a device, based on risk seen in LDTs.
  - RUO reagents:
    - Are subject to strict rules on permitted uses and promotional claims.
  - LDTs often use these components, so labs must stay within FDA’s rules on ASRs and RUOs.
- Software used in testing
3.1 Device vs Component – Definitions
Normally, components are not independently regulated; they’re part of the finished device that gets a single clearance/approval.
But if a component itself meets the full device definition, FDA can regulate it as its own device (e.g., ASRs, some software).
3.2 Why the LDT Decision Doesn’t Save Software
Software can independently meet the device definition, and the 21st Century Cures Act gave FDA explicit authority over certain software functions; the Texas decision rests on LDTs being laboratory services, not on any argument about software. So we cannot extrapolate the LDT ruling to argue that software shouldn’t be a device. FDA’s framework for software remains intact.
4. Shannon: Software as a Medical Device (SAMD) vs Software in a Medical Device
4.1 Definition and the Role of the “Manufacturer”
Software as a Medical Device (SAMD):
- Intended to diagnose, prevent, monitor, treat, or alleviate disease, or to support diagnosis, screening, prognosis, prediction, etc.
- Key idea: analyzes data, not specimens directly.
Manufacturer = the entity responsible for design and regulatory compliance:
- If your lab develops the software, your lab is the manufacturer and carries the regulatory burden.
- If you buy software, typically the vendor is the manufacturer (unless you modify/co-develop).
- Co-development creates gray zones; who the manufacturer is must be clearly agreed upon.
4.2 SAMD vs “Software in a Medical Device”
Software in a medical device (SiMD) only functions in the context of a specific instrument, like the software embedded in a flow cytometer or PCR platform. In those cases the instrument manufacturer is usually responsible for both hardware and software as a single regulated device.
5. Shannon: Software Exemptions under the 21st Century Cures Act
Certain software functions are explicitly carved out and not considered SAMD:
- Administrative support software (billing, scheduling, business analytics).
- Healthy lifestyle / wellness apps
  - Track diet, workouts, steps, sometimes heart rate, etc., without a diagnostic claim.
  - Note: blood pressure tracking is becoming a gray area as FDA scrutiny increases.
- Electronic patient records / LIS / EHR (store and display data without independent interpretation).
- Data management / databases (basic data storage and retrieval).
5.1 Clinical Decision Support (CDS) – “Deprioritized” but Not Totally Outside FDA
Some software could be SAMD but is treated as lower regulatory priority if it qualifies as CDS. To qualify, it must pass four tests:
- Does not acquire/process/analyze medical images or instrument signals.
- Does not display/analyze/print detailed patient medical information beyond helping a clinician review data.
- Provides recommendations, not directives.
- Is independently reviewable (transparent logic, not a black box).
If it fails any step, it does not qualify as CDS and is more likely to be treated as SAMD.
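To apply the four CDS questions consistently across candidate tools, a lab could encode them as an internal triage checklist. The sketch below is illustrative only: the class, function, and field names are hypothetical (not an FDA tool or official terminology), and answering each question still requires regulatory judgment.

```python
from dataclasses import dataclass

@dataclass
class SoftwareProfile:
    """Hypothetical internal record describing a candidate software function."""
    name: str
    analyzes_images_or_signals: bool      # e.g., slides, flow/NGS/MS instrument signals
    displays_detailed_patient_data: bool  # detailed patient data used diagnostically
    provides_recommendations_only: bool   # supports, rather than dictates, decisions
    basis_independently_reviewable: bool  # transparent logic, not a black box

def cds_screen(p: SoftwareProfile) -> list[str]:
    """Return reasons the function fails the four CDS questions (empty list = passes this rough screen)."""
    reasons = []
    if p.analyzes_images_or_signals:
        reasons.append("acquires/processes/analyzes medical images or instrument signals")
    if p.displays_detailed_patient_data:
        reasons.append("displays or manipulates detailed patient data in a diagnostic way")
    if not p.provides_recommendations_only:
        reasons.append("dictates decisions rather than providing recommendations")
    if not p.basis_independently_reviewable:
        reasons.append("basis for recommendations is not independently reviewable")
    return reasons

# Example: an NGS bioinformatics pipeline typically fails the first question outright.
pipeline = SoftwareProfile(
    name="ngs_pipeline",  # hypothetical
    analyzes_images_or_signals=True,
    displays_detailed_patient_data=True,
    provides_recommendations_only=True,
    basis_independently_reviewable=False,
)
print(cds_screen(pipeline))  # any non-empty output suggests likely SAMD; review further
```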
6. Shannon: Practical Examples in the Lab
Not SAMD (generally):
- PTO tracking software – administrative support.
- Simple Excel sheets tracking reagent lots – basic data storage only.
- Flow cytometer software that drives the instrument and performs analysis – software in a medical device; regulated as part of the instrument.
Likely SAMD (higher concern):
- NGS/bioinformatics pipelines:
  - Often process signals from multiple tests and instruments, interpret results, and may be used across many assays.
  - Explicitly called out by FDA in CDS guidance as often not qualifying as CDS → more likely SAMD.
- Software that automatically orders antibiotics based on susceptibility results with no human review.
- Black-box risk-scoring algorithms (e.g., a colon cancer risk score) with no explainability and no human in the loop.
7. Christine: FDA’s Risk-Based Framework for Software & AI
7.1 Classic Device Classes Applied to Software
FDA uses the same three device classes:
- Class I – low risk; often exempt from premarket review but still subject to quality system requirements (e.g., general wellness software that does not reference disease).
- Class II – moderate risk; generally requires 510(k) clearance (e.g., tools that monitor parameters like heart rate and feed into clinical decision-making).
- Class III – high risk; requires premarket approval (PMA) (e.g., AI used in cancer detection or HIV-related decisions).
If the software references high-risk indications (e.g., cancer, HIV), FDA will tend to treat it as higher risk.
7.2 Why Classic Device Rules Don’t Fit AI/ML
Traditional device frameworks assume:
- You get a clearance/approval.
- If you change the device in a way that may affect safety/effectiveness, you file a new 510(k) or PMA supplement.
For AI/ML:
- The model is designed to change as it ingests new data, and many of those changes could affect safety and effectiveness, so the static-product assumption breaks down.
So FDA has shifted to a Total Product Life Cycle (TPLC) model, heavily supported by guidance documents since 2019.
8. Christine: Total Product Life Cycle – Three Buckets
- Predetermined Change Control Plans (PCCPs)
  - Final guidance (Aug 18) for AI-enabled device software functions.
  - When you first seek clearance/approval, you also propose:
    - What might change as the model learns.
    - How you will manage and validate those changes (protocols, performance standards, validation plans).
  - This allows some pre-authorized evolution of the software without new submissions for every minor change (an illustrative sketch of one pre-specified change follows this list).
- Life Cycle Management
  - Draft guidance on managing user interfaces, risk profiles, and data-handling strategies as the software evolves.
Real-World Evidence (RWE)
-
FDA increasingly relies on post-market data to monitor AI tools.
-
Might require Phase 4/post-market studies.
-
In Sept, FDA requested public comment on methods to measure and evaluate AI-enabled devices in the real world (comment period open through Dec 1).
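To make the “pre-specified, bounded change” idea behind a PCCP concrete, here is a minimal sketch of how a lab or sponsor might record one anticipated modification, its governing protocol, and its acceptance criteria. The field names and thresholds are hypothetical illustrations, not FDA’s required submission format.

```python
# Illustrative only: one pre-specified modification with its governing protocol and
# acceptance criteria. Field names and thresholds are hypothetical, not FDA's format.
planned_modification = {
    "change_description": "periodic retraining on newly accumulated, labeled local cases",
    "modification_protocol": {
        "data_management": "new cases curated and labeled under the same SOP as the original training set",
        "retraining": "same architecture and hyperparameter ranges; training code under version control",
        "performance_evaluation": "evaluate on a locked, held-out test set never used for training",
        "update_procedure": "staged rollout with a rollback plan; users notified of the new version",
    },
    "acceptance_criteria": {
        "sensitivity_min": 0.93,  # hypothetical threshold
        "specificity_min": 0.90,  # hypothetical threshold
        "subgroup_check": "no pre-defined subgroup degrades by more than 2 percentage points",
    },
}
print(planned_modification["change_description"])
```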
9. Christine: FDA’s Risk Assessment Dimensions for Software & AI
FDA evaluates software risk using five broad dimensions:
- Intended use
  - What does the software actually do?
  - Is it adjunctive or does it directly drive diagnosis/treatment?
  - Is the indication high-risk (e.g., cancer, HIV)?
- Risk level
  - How much control does the software have over care decisions: is it supporting human judgment or replacing it?
- Functionality
  - Is it just organizing data, or analyzing patient data?
  - Is it making direct treatment recommendations?
- User
  - Who uses it (lab staff vs physicians vs patients), and is it used inside or outside a controlled clinical setting?
- Integration
  - Is it standalone or deeply embedded in complex systems, where integration itself can change behavior and risk?
9.1 AI/ML-Specific Risks FDA Cares About
FDA has highlighted several unique AI/ML risks:
- Lack of transparency and explainability (black-box models that clinicians cannot interpret).
- Data drift (performance degradation over time as real-world data shifts).
- Data bias (training data that does not represent all relevant populations).
- Cybersecurity vulnerabilities (especially for network-connected tools handling medical data).
- Integration effects (a model that is safe in one system may behave differently when integrated into another environment, creating new safety issues).
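For data drift specifically, a common mitigation is to monitor whether the inputs the model sees in production still resemble the training data. A minimal sketch, assuming numeric feature arrays are available and using a per-feature two-sample Kolmogorov–Smirnov test; the feature names, threshold, and data here are made up for illustration.

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_report(train: np.ndarray, live: np.ndarray, feature_names, alpha: float = 0.01):
    """Flag features whose live distribution differs from the training distribution.

    Runs a two-sample Kolmogorov-Smirnov test per feature column; small p-values
    flag the feature for human review. The alpha threshold is illustrative only.
    """
    flagged = []
    for j, name in enumerate(feature_names):
        result = ks_2samp(train[:, j], live[:, j])
        if result.pvalue < alpha:
            flagged.append((name, round(result.statistic, 3), result.pvalue))
    return flagged

# Toy example with simulated data: the second feature has drifted upward.
rng = np.random.default_rng(0)
train = np.column_stack([rng.normal(0, 1, 5000), rng.normal(5, 1, 5000)])
live = np.column_stack([rng.normal(0, 1, 800), rng.normal(5.6, 1, 800)])
print(drift_report(train, live, ["sodium", "creatinine"]))
```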
10. Shannon: What Labs Should Do – Practical Risk Mitigation & Documentation
Shannon shifts to “what it means for your lab,” especially if you:
- Have developed, or co-developed with a vendor, software used in testing, making your lab the (or a) manufacturer.
Goal: Be prepared so that if FDA ever appears (or if you choose to submit), you’re not scrambling.
10.1 Intended Use: The Central Pillar
Everything flows from the intended use statement: it should be objective, “boring,” and match what the lab can actually prove.
He contrasts a terrible intended use with a good one.
Bad example:
“The best AI software ever is a revolutionary AI platform that automatically diagnoses all major diseases with 99.9% accuracy using any laboratory or imaging data. It replaces the need for physician interpretation by instantly identifying life-threatening conditions and providing precise treatment pathways for each patient. The world of true, individualized medicine is upon us.”
What’s wrong:
- Marketing language (“revolutionary,” “world of individualized medicine”).
- Extremely specific and high accuracy claim (99.9%) – would require enormous proof and creates problems if performance drifts even slightly.
- Claims to replace physician interpretation → high risk, no human in the loop.
- “Life-threatening conditions” language invites regulatory scrutiny.
Shannon’s advice:
- If you must reference performance, use “greater than X%” rather than an overly precise fixed number (e.g., “>95% accuracy”) to allow some margin.
- Avoid hyperbole, especially about life-threatening conditions unless absolutely justified.
Better example:
“Pretty good AI software is designed to assist lab professionals and clinicians by organizing and displaying patient test results and highlighting values that differ from reference ranges. Software allows users to review underlying data and apply their professional judgment in clinical decision making.”
Key characteristics:
- Boring, objective, factual.
- Emphasizes assistive role and human professional judgment.
- Makes no extreme claims.
10.2 Foundational Documentation
Labs should proactively maintain:
- Intended use statement (final, clean, boring).
- Device description:
  - How data flows through the system.
  - Interfaces – where data comes from and goes to.
  - What algorithms are used and how they transform data.
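One lightweight way to keep the device description current is to maintain a small machine-readable stub alongside the narrative document, capturing data sources, interfaces, and processing steps. The sketch below is purely illustrative; every name, interface, and field is hypothetical.

```python
# Illustrative only: a minimal machine-readable "device description" stub kept alongside
# the narrative documentation. Every name, interface, and field below is hypothetical.
device_description = {
    "software_name": "example_result_review_helper",
    "intended_use": (
        "Assists laboratory professionals and clinicians by organizing test results "
        "and highlighting values outside reference ranges for human review."
    ),
    "data_sources": [   # interfaces: where data comes from
        {"name": "analyzer_result_feed", "format": "HL7 v2", "transport": "interface engine"},
        {"name": "reference_range_table", "format": "CSV", "transport": "version-controlled file"},
    ],
    "outputs": [        # interfaces: where data goes
        {"name": "flagged_result_worklist", "format": "JSON", "destination": "LIS review queue"},
    ],
    "processing_steps": [   # algorithms and how they transform data
        "normalize units against the reference-range table (version pinned)",
        "apply rule-based flagging of out-of-range values; no autonomous reporting",
    ],
    "version": "1.3.0",
}
print(device_description["software_name"])
```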
10.3 Ongoing Documentation
This is not one-and-done; it’s continuous:
- Training data documentation
  - What datasets were used? From which populations?
  - What populations are excluded (e.g., no pediatrics, certain racial/ethnic groups)?
  - How potential biases map to the intended use.
- Verification and validation
  - Define expected inputs and expected outputs.
  - Run the validation protocol and document actual outputs.
  - Use results to calculate familiar metrics (sensitivity, specificity, PPV, NPV, etc.); see the sketch after this list.
- Change management
  - Avoid uncontrolled “tinkering” by enthusiastic developers; bundle changes into formal, versioned releases.
  - Scope validation to the change: more for algorithmic changes, less for cosmetic ones.
- Centralized documentation storage
  - Keep validation reports, training data documentation, change logs, and intended use in one “master file” so you’re not hunting across desks and servers.
- SOPs
  - Cover how the software is used, how issues are handled, and how routine changes (like adding a new gene to a pipeline or extending to a new patient group) will be designed, validated, and implemented.
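For the verification and validation metrics referenced above, the arithmetic mirrors any qualitative assay validation: tally expected-versus-actual calls into a 2×2 table and derive the usual statistics. A minimal sketch with made-up counts:

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard 2x2 performance metrics from expected-vs-actual validation calls."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical validation run: 92 concordant positives, 3 false positives,
# 150 concordant negatives, 5 false negatives.
print(diagnostic_metrics(tp=92, fp=3, tn=150, fn=5))
```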
11. Q&A Highlights
Several questions are discussed near the end:
- NGS pipeline vs mass spec or PCR software
  - Mass spec and PCR software often come as part of the instrument → typically software in a medical device; the instrument manufacturer bears the burden.
  - NGS bioinformatics pipelines:
    - Often ingest data from multiple instruments/tests, sometimes used for many assays.
    - FDA singles them out in CDS guidance as not typically CDS, so more likely SAMD.
  - There is substantial gray area; labs should assume they may be regulated and prepare accordingly.
- Likelihood that FDA will actually enforce on specific software
  - Answer: there’s no precise percentage.
  - FDA reviewers attend conferences, read marketing materials, and may act if they perceive safety/effectiveness concerns or risk-raising claims.
  - For the lab, if you get targeted, your risk is effectively 100%, so build your risk assessment and documentation as if you might be.
- Validation and IQ/OQ/PQ concepts for AI
  - Conceptually similar to wet-lab tests: define what the system is supposed to do, then confirm through testing that it behaves accordingly.
- Model calibration and metrics like Brier score
  - Any robust evidence of performance and calibration is helpful to present (a short worked example follows this list).
- AI that generates SOPs and validation documents – SAMD?
  - Likely not SAMD, as it falls under administrative/quality documentation support rather than diagnosis/treatment.
  - As always, details matter, but this would usually fall under the Cures Act exemptions.
- AI to catch IV-fluid–contaminated specimens
  - Whether it’s SAMD may depend on how the tool ingests data (raw instrument signals versus processed numeric results) and how tightly it is integrated into decision-making.
  - If it processes signals directly, it is more likely SAMD; if it just flags unusual result patterns from numeric outputs, there may be more room to argue CDS or lower-risk use—but it depends on the full context.
- Future of AI/ML regulation
  - Christine: expect more guidance and eventually more detailed regulations.
  - The current administration is very focused on AI/ML; FDA has already released many guidance documents and is working continuously to adjust as technology evolves.
  - Regulations must still operate within the existing device framework, but we may see sub-frameworks or clarifying regulations specifically for AI/ML.
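On the calibration question raised in the Q&A, the Brier score is just the mean squared difference between predicted probabilities and the observed 0/1 outcomes (lower is better). A minimal worked sketch with made-up numbers:

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probabilities and observed 0/1 outcomes."""
    pairs = list(zip(predicted_probs, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

# Hypothetical example: four cases with a predicted risk and the observed outcome.
print(brier_score([0.9, 0.2, 0.7, 0.1], [1, 0, 1, 0]))  # 0.0375; lower is better
```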
12. Final Takeaways
- LDTs are not devices and cannot be directly regulated as such—but FDA retains hooks into labs through specimen collection kits, ASRs/RUO reagents, and software.
- For lab-developed software and AI used with LDTs:
  - The line between “just part of an LDT” and “its own regulated device” is case-by-case and gray.
  - The intended use statement is the critical anchor for risk classification and regulatory exposure.
  - Labs should act as if their software might be regulated and build strong, proactive documentation, validation, and change-control frameworks.
- FDA is actively evolving its guidance and TPLC approach to cope with AI/ML, focusing on transparency, bias, data drift, cybersecurity, and integration risks.