Fault Tree Analysis for Medical Device Risk Management: FTA vs FMEA

Guide to Fault Tree Analysis under ISO 14971, including FTA vs FMEA, AND/OR gates, probability calculations, examples, and when top-down risk analysis works best.

Why Most Medical Device Teams Overlook FTA — and When That Is a Mistake

Most medical device manufacturers default to FMEA for risk analysis. It is systematic, familiar to engineers, and maps neatly to components and process steps. But FMEA has a fundamental limitation: it examines single faults in isolation. It cannot model combinations of events, logical dependencies, or cascading failure sequences — precisely the scenarios that cause catastrophic device failures.

Fault Tree Analysis (FTA) is a top-down, deductive method that starts with an undesirable event (a harm or hazardous situation) and works backward to identify all the combinations of causes that could produce it. ISO/TR 24971:2020 Annex B lists FTA as one of the complementary risk analysis techniques alongside FMEA, PHA, and HAZOP. FDA's own risk management training materials reference FTA as a tool for analyzing hazards identified during development.

Despite this, few medical device manufacturers use FTA systematically. This guide explains when FTA is the right tool, how to perform it, how it differs from FMEA, and how to combine both for comprehensive risk analysis under ISO 14971.

FTA vs FMEA: Fundamental Differences

Direction of Analysis

Dimension	FTA	FMEA
Direction	Top-down (deductive): starts with the harm, traces back to causes	Bottom-up (inductive): starts with components, traces forward to effects
Starting point	Known undesirable event (top event)	Known components or process steps
Question asked	"What could cause this specific harm?"	"What could fail in this component, and what happens?"
Logical relationships	Models AND/OR gates — combinations and sequences of events	Models single faults in isolation
Probability	Can be quantitative (calculates top event probability from base event probabilities)	Semi-quantitative (RPN based on subjective severity/occurrence/detection ratings)
Scope	System-level, event-focused	Component-level, part-focused
Best for	Complex causal chains, safety-critical systems, rare catastrophic events	Systematic component review, manufacturing processes, single-fault analysis
Representation	Tree diagram with logic gates	Tabular format (failure mode, effect, severity, occurrence, detection)

The Critical Limitation of FMEA for Medical Devices

FMEA evaluates each failure mode independently. For a heart-lung machine, FMEA might identify "pump motor fails" and "backup battery fails" as separate failure modes, each rated individually. But FMEA does not evaluate the probability of both happening at the same time, which is the exact combination that could lead to patient death.

FTA, using an AND gate, models exactly this: "Patient death from pump stoppage" requires both "pump motor fails" AND "backup battery fails." If the two failures are independent, the probability of the top event is the product of the individual probabilities. That product may be orders of magnitude lower than either individual failure probability, but it is critically important to calculate.

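ISO 14971:2019 explicitly requires the manufacturer to consider reasonably foreseeable sequences or combinations of events that can result in a hazardous situation (Clause 5.4). FMEA alone, which treats failure modes one at a time, does not satisfy this requirement. FTA models such combinations directly.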

When to Use FTA Instead of FMEA

Use FTA When

Analyzing safety-critical failures — any scenario where the top event is patient death, serious injury, or life-threatening harm
Investigating combinations of events — situations where multiple independent or dependent failures must coincide to produce harm
Root cause investigation of complaints and adverse events — starting from the known harm and tracing backward through the causal chain
Estimating quantitative probabilities — when you need to calculate the probability of a hazardous situation using component failure rate data
Validating risk control effectiveness — modeling how the addition of a risk control (e.g., alarm, redundant sensor) reduces the top event probability
Complex system architecture with redundancy — where AND gates model the probability that all redundant paths fail simultaneously

Use FMEA When

Systematically reviewing all components — when you need to ensure every part, subsystem, and process step has been analyzed
Analyzing manufacturing processes — process FMEA (pFMEA) is the standard tool for manufacturing risk analysis
Evaluating single faults — when individual component failures are the primary concern
Software failure modes — analyzing individual software functions for potential failure behaviors
Early-stage design — when the design is not yet mature enough to define specific top events for FTA

Use Both Together When

Most medical devices benefit from using both techniques. ISO/TR 24971:2020 states: "These techniques are complementary, and it can be necessary to use more than one of them in order to support a thorough and complete risk analysis."

The recommended approach:

Start with Preliminary Hazard Analysis (PHA) early in development when few design details are known
Use FTA for critical hazard scenarios — model the causal chains for each high-severity harm
Use FMEA for comprehensive component-level analysis — ensure every subsystem and component has been systematically reviewed
Transfer FTA outputs to the FMEA — the fault tree's causal chains feed into the FMEA's failure mode columns, providing a richer analysis than either method alone

How to Perform Fault Tree Analysis

Step 1: Define the Top Event

The top event is the undesirable outcome you want to analyze. It must be specific and measurable. Examples for medical devices:

"Patient receives air embolism from infusion pump"
"Ventilator delivers incorrect tidal volume >20% above set value"
"Defibrillator fails to deliver shock when triggered"
"Insulin pump delivers incorrect dose >2x programmed amount"
"Surgical robot arm moves to unintended position during procedure"

The top event should correspond to a hazardous situation identified in your ISO 14971 risk analysis, specifically one with high severity (serious injury or death).

Step 2: Identify Immediate Causes (First-Level Events)

For the top event, ask: "What immediate conditions could cause this to happen?" Each answer becomes a node in the fault tree.

Example — Infusion Pump Air Embolism:

"Air enters the IV line" OR "Air present in the pump cassette"
Connected with an OR gate (either condition alone can cause the top event)

Step 3: Decompose Each Cause into Sub-Causes

Continue decomposing each event into its contributing causes, using AND/OR gates:

AND gate: ALL input events must occur together for the output event to occur. For independent input events, the probability is the product of the individual probabilities: P = P1 × P2 × ... × Pn
OR gate: ANY single input event can cause the output event. For independent input events, the probability is: P = 1 - (1-P1) × (1-P2) × ... × (1-Pn)
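
As a minimal sketch of the two gate formulas above, the Python below assumes the input events are statistically independent. The function names and_gate and or_gate and the input values are illustrative assumptions, not part of any standard or tool.

```python
from math import prod


def and_gate(probs):
    """Probability that all independent input events occur."""
    return prod(probs)


def or_gate(probs):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical input probabilities, chosen only for illustration.
inputs = [1e-3, 2e-3]
p_and = and_gate(inputs)
p_or = or_gate(inputs)

print(f"AND gate: {p_and:.3e}")
print(f"OR gate:  {p_or:.3e}")
```

For small probabilities the OR result is close to the simple sum of the inputs, which is why hand calculations often use the sum as a quick check.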

Step 4: Continue Until Base Events Are Reached

Base events (also called "basic events" or "leaf nodes") are failures or conditions that cannot be decomposed further — typically component failures, human errors, or environmental conditions for which you have (or can estimate) failure probability data.

Step 5: Assign Probabilities and Calculate

For quantitative FTA, assign probability estimates to each base event using:

Component failure rate data from manufacturer specifications
Historical field data from similar devices
Published reliability prediction handbooks (MIL-HDBK-217, IEC TR 62380)
Expert judgment (with documented rationale)

Step 6: Identify Minimal Cut Sets

A cut set is a combination of base events that, if they all occur, will cause the top event. A minimal cut set is a cut set with no redundant members: if any one event is removed from it, the remaining events are no longer sufficient to cause the top event.

Cut set analysis reveals:

Single points of failure: Minimal cut sets with only one event (highest risk)
Most vulnerable paths: Cut sets with the highest combined probability
Risk control priorities: A risk control that prevents any one event in a minimal cut set blocks that failure path
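
The cut set idea can be sketched in a few lines of Python. This is a simple top-down expansion suitable for small trees, not the algorithm of any particular FTA tool. The tree encoding (nested tuples), the function names, and the example tree (two redundant sensors that share a power supply) are all hypothetical.

```python
from itertools import product

# A node is either a base-event name (str) or a tuple (gate, [children]).


def cut_sets(node):
    """Return all cut sets of a node as a set of frozensets of base events."""
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child is a cut set of the OR gate.
        return set().union(*child_sets)
    if gate == "AND":
        # Combine one cut set from every child.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")


def minimal_cut_sets(node):
    """Keep only cut sets that do not strictly contain another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}


# Hypothetical tree: detection is lost only if both sensor channels fail,
# and each channel fails if its sensor fails or the shared power supply fails.
tree = ("AND", [
    ("OR", ["sensor_a_fails", "power_supply_fails"]),
    ("OR", ["sensor_b_fails", "power_supply_fails"]),
])

mcs = minimal_cut_sets(tree)
single_points = [s for s in mcs if len(s) == 1]

for s in sorted(mcs, key=lambda s: (len(s), sorted(s))):
    print(sorted(s))
```

In this sketch the shared power supply shows up as a one-event minimal cut set, which is how cut set analysis exposes a single point of failure hidden behind apparent redundancy.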

Step 7: Evaluate Risk Control Effectiveness

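Model risk controls as additional events or barriers in the fault tree. For example, adding an air detection alarm creates an AND gate: "Air embolism reaches patient" requires both "Air enters IV line" AND "Air detection alarm fails." If the two events are independent, the probability of air reaching the patient is multiplied by the alarm's failure probability, so a reliable alarm lowers it by orders of magnitude.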

This quantitative comparison of "before" and "after" risk control effectiveness is one of FTA's most powerful capabilities — and one that FMEA's RPN system cannot provide.
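
A before-and-after comparison of this kind might be sketched as follows. The recursive evaluate function, the tuple encoding of the tree, and both probability values are assumptions made for illustration. The evaluation is only valid when base events are independent and each appears once in the tree.

```python
from math import prod


def evaluate(node, p):
    """Probability of a node, assuming independent base events that each appear once."""
    if isinstance(node, str):
        return p[node]
    gate, children = node
    values = [evaluate(child, p) for child in children]
    if gate == "AND":
        return prod(values)
    if gate == "OR":
        return 1.0 - prod(1.0 - v for v in values)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical probabilities, not taken from any real device.
p = {"air_enters_iv_line": 1e-3, "air_alarm_fails": 1e-2}

# Before the risk control: air in the line reaches the patient directly.
tree_before = "air_enters_iv_line"
# After the risk control: the alarm must also fail.
tree_after = ("AND", ["air_enters_iv_line", "air_alarm_fails"])

p_before = evaluate(tree_before, p)
p_after = evaluate(tree_after, p)
reduction = p_before / p_after

print(f"before: {p_before:.1e}  after: {p_after:.1e}  reduction factor: {reduction:.0f}")
```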

FTA Symbols and Notation

FTA uses a standard set of symbols defined in IEC 61025:

Symbol	Name	Meaning
Rectangle	Event (intermediate or top)	A fault or condition that results from lower-level causes
Circle	Basic event	A component failure or error that cannot be decomposed further
Diamond	Undeveloped event	An event that could be analyzed further but is outside the scope of this analysis
AND gate	All inputs required	Output occurs only if ALL input events occur
OR gate	Any input sufficient	Output occurs if ANY input event occurs
Triangle	Transfer symbol	Connects to another part of the fault tree (for large trees)
House	House event	An event that is known to occur (probability = 1) or known not to occur (probability = 0)

Worked Example: Infusion Pump Air Embolism

Top Event

"Patient receives air embolism from infusion pump"

Fault Tree Structure

Patient receives air embolism
├── [OR] Air reaches patient bloodstream
│   ├── [AND] Air enters IV line AND air-in-line detector fails
│   │   ├── [OR] Air enters IV line
│   │   │   ├── IV bag runs empty (operator does not respond)
│   │   │   ├── Cassette loading error introduces air
│   │   │   └── Disconnection at IV connector
│   │   └── [OR] Air-in-line detector fails
│   │       ├── Sensor contaminated
│   │       ├── Software alarm logic error
│   │       └── Alarm disabled by user
│   └── [AND] Air present in pump mechanism AND pressure monitoring fails
│       ├── Air introduced during priming
│       └── [OR] Pressure monitoring fails
│           ├── Pressure sensor out of calibration
│           └── Software occlusion detection error

Quantitative Analysis (Illustrative)

Base Event	Estimated Probability
IV bag runs empty (operator does not respond)	1 × 10⁻³
Cassette loading error	5 × 10⁻⁴
Disconnection at IV connector	2 × 10⁻⁴
Sensor contaminated	1 × 10⁻⁴
Software alarm logic error	1 × 10⁻⁵
Alarm disabled by user	5 × 10⁻³
Air introduced during priming	3 × 10⁻³
Pressure sensor out of calibration	5 × 10⁻⁵
Software occlusion detection error	1 × 10⁻⁵

Path 1: Air enters IV line AND air-in-line detector fails

P(air enters IV line) = 1 - (1-0.001)(1-0.0005)(1-0.0002) ≈ 1.7 × 10⁻³
P(detector fails) = 1 - (1-0.0001)(1-0.00001)(1-0.005) ≈ 5.1 × 10⁻³
P(Path 1) = 1.7 × 10⁻³ × 5.1 × 10⁻³ ≈ 8.7 × 10⁻⁶

Path 2: Air in pump mechanism AND pressure monitoring fails

P(air during priming) = 3 × 10⁻³
P(pressure monitoring fails) = 1 - (1-0.00005)(1-0.00001) ≈ 6 × 10⁻⁵
P(Path 2) = 3 × 10⁻³ × 6 × 10⁻⁵ ≈ 1.8 × 10⁻⁷

Total top event probability (OR of both paths)

P(top event) ≈ 1 - (1-8.7×10⁻⁶)(1-1.8×10⁻⁷) ≈ 8.9 × 10⁻⁶

If each base event probability is read as a per-session value and the events are independent, this means approximately 1 in 112,000 pumping sessions could result in an air embolism reaching the patient. That figure can now be evaluated against the risk acceptability criteria defined in the risk management plan.
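
The hand calculation above can be reproduced with a short script. This is a sketch that assumes independent base events and uses the illustrative probabilities from the table. The variable and function names are chosen here for readability.

```python
from math import prod


def or_gate(probs):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


# Path 1: air enters the IV line AND the air-in-line detector fails.
p_air_enters_line = or_gate([1e-3, 5e-4, 2e-4])   # bag empty, cassette error, disconnection
p_detector_fails = or_gate([1e-4, 1e-5, 5e-3])    # sensor, alarm logic, alarm disabled
p_path1 = p_air_enters_line * p_detector_fails

# Path 2: air introduced during priming AND pressure monitoring fails.
p_air_in_priming = 3e-3
p_pressure_monitoring_fails = or_gate([5e-5, 1e-5])  # calibration, occlusion software
p_path2 = p_air_in_priming * p_pressure_monitoring_fails

# Top event: either path leads to air reaching the patient.
p_top = or_gate([p_path1, p_path2])

print(f"Path 1:    {p_path1:.2e}")
print(f"Path 2:    {p_path2:.2e}")
print(f"Top event: {p_top:.2e} (about 1 in {1 / p_top:,.0f})")
```

Because the script carries full precision instead of rounding each intermediate result, its last digits can differ slightly from the rounded hand calculation.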

Risk Control Effectiveness

If the manufacturer adds a second independent air detection method (ultrasonic sensor) and a mandatory priming verification step, the updated fault tree introduces additional AND gates:

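P(revised Path 1) drops by approximately 2 orders of magnitude, provided the second detector fails independently of the first with a probability on the order of 10⁻²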
The quantitative comparison directly demonstrates risk control effectiveness — evidence that auditors and notified bodies accept
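
To make the effect on Path 1 concrete, the sketch below adds a second detector as one more AND input. The failure probability of the second detector (1 × 10⁻²) is an assumed value for illustration only, as is the assumption that it fails independently of the first detector.

```python
from math import log10, prod


def or_gate(probs):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


# Illustrative values from the base event table.
p_air_enters_line = or_gate([1e-3, 5e-4, 2e-4])
p_detector_fails = or_gate([1e-4, 1e-5, 5e-3])

# ASSUMPTION: hypothetical failure probability of the added ultrasonic detector.
p_second_detector_fails = 1e-2

p_path1_before = p_air_enters_line * p_detector_fails
# The revised path needs BOTH detectors to fail, so one more AND input.
p_path1_after = p_path1_before * p_second_detector_fails

orders_of_magnitude = log10(p_path1_before / p_path1_after)

print(f"Path 1 before: {p_path1_before:.2e}")
print(f"Path 1 after:  {p_path1_after:.2e}")
print(f"Reduction: {orders_of_magnitude:.1f} orders of magnitude")
```

The result holds only while the independence assumption does. If one user action or one shared component can defeat both detectors, that shared event belongs in the tree as a common cause.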

FTA in the ISO 14971 Risk Management Process

Where FTA Fits in the Process

ISO 14971 Phase	FTA Application
Risk analysis (Clause 5)	Identify causal chains for hazardous situations; estimate probabilities for risk estimation
Risk evaluation (Clause 6)	Quantitative probability estimates support risk evaluation against acceptability criteria
Risk control (Clause 7)	Model risk controls in the fault tree to quantify effectiveness before implementation
Overall residual risk (Clause 8)	Calculate overall risk by combining top event probabilities across all fault trees
Post-production (Clause 10)	Use FTA to investigate complaints — start from the reported harm and trace back to root causes

ISO/TR 24971:2020 Annex B Reference

ISO/TR 24971, the companion guidance to ISO 14971, lists FTA in Annex B as one of the techniques that support risk analysis. The document explicitly recommends using multiple complementary techniques: "These techniques are complementary, and it can be necessary to use more than one of them in order to support a thorough and complete risk analysis."

The listed techniques include:

Preliminary Hazard Analysis (PHA)
Fault Tree Analysis (FTA)
Failure Mode and Effects Analysis (FMEA)
Hazard and Operability Study (HAZOP)
Use-Related Risk Analysis

FDA Recognition of FTA

FDA's risk management training materials for CDRH staff (publicly available) identify FTA as one of three core risk analysis techniques alongside PHA and FMEA. The training describes FTA as "a top-down method to estimate fault probability and identify single faults that result in hazardous situations."

Transferring FTA Results to FMEA

A practical approach used by experienced risk managers:

Complete the FTA for critical hazard scenarios
Transfer each path from the FTA into the FMEA as failure modes
Reference the FTA diagram as an attachment to the FMEA row
Retain the FTA's causal chain detail in the risk management file — the FMEA table captures the overall risk and mitigations, while the FTA preserves the logical relationships

This approach addresses FMEA's weakness (no combination modeling) while maintaining FMEA's strength (systematic tracking of mitigations and verification). The FTA document is maintained as a reference artifact in the risk management file, available for auditors to review alongside the FMEA tables.

FTA for Complaint Investigation and Post-Production

FTA is particularly valuable in the post-production phase for analyzing adverse events and complaints:

Start from the reported harm (the top event is already known)
Trace backward through the device's fault tree to identify which causal path was activated
Identify root causes that may not be apparent from direct investigation
Update the risk file with actual field data, replacing estimated probabilities with observed rates
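
Replacing an estimate with an observed rate can be as simple as the sketch below. It assumes each session is an independent trial with the same event probability. The field counts are hypothetical, and the design estimate is the illustrative "Cassette loading error" value from the worked example. The zero-event bound is a standard binomial result, included because many base events will have no recorded occurrences.

```python
def observed_rate(events, opportunities):
    """Point estimate of a per-session probability from field counts."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return events / opportunities


def zero_event_upper_bound(opportunities, confidence=0.95):
    """One-sided upper bound on the probability when no events were observed."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / opportunities)


estimated = 5e-4          # illustrative design-phase estimate
field_events = 4          # hypothetical complaint count
field_sessions = 20_000   # hypothetical number of sessions in the field

updated = observed_rate(field_events, field_sessions)
bound = zero_event_upper_bound(100_000)  # hypothetical: 0 events in 100,000 sessions

print(f"estimated {estimated:.1e} -> observed {updated:.1e}")
print(f"95% upper bound with zero events in 100,000 sessions: {bound:.2e}")
```

The zero-event bound is close to 3 divided by the number of sessions, the familiar "rule of three".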

The Johner Institute recommends using FTA alongside PHA and FMEA during both development and post-production, noting that FTA is especially useful when investigating the logical chain of events that led to a complaint.

Practical Tips for Implementing FTA

Tooling

Dedicated FTA software tools (such as Item Software, ReliaSoft, or SAP FTA) handle complex trees and automated probability calculations. For smaller trees, Microsoft Visio or draw.io with custom shapes works adequately. The fault tree diagram itself is the primary deliverable — the tool is secondary.

Common Mistakes to Avoid

Defining the top event too broadly — "Device fails" is not actionable; "Infusion pump delivers 2x the programmed dose" is specific and analyzable
Mixing AND and OR logic incorrectly — this is the most common error; review each gate to confirm the logic is correct
Ignoring common-cause failures — if two redundant components share a power supply, a single power failure can defeat both; model this with a shared basic event
Using FTA to replace FMEA entirely — FTA targets specific scenarios; FMEA provides comprehensive coverage. Use both
Treating probability estimates as precise — probability data is often uncertain; use sensitivity analysis to understand which base events dominate the top event probability
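
One simple form of the sensitivity analysis mentioned in the last item is a one-at-a-time perturbation: raise each base event probability by a fixed factor and see how much the top event changes. The sketch below applies this to the worked infusion pump tree with its illustrative probabilities. The factor of 10, the key names, and the independence assumption are choices made here for illustration.

```python
from math import prod


def or_gate(probs):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def top_event(p):
    """Top event probability for the worked infusion pump tree."""
    air_enters_line = or_gate([p["bag_empty"], p["cassette_error"], p["disconnection"]])
    detector_fails = or_gate([p["sensor_contaminated"], p["alarm_logic_error"], p["alarm_disabled"]])
    pressure_fails = or_gate([p["pressure_sensor_cal"], p["occlusion_sw_error"]])
    path1 = air_enters_line * detector_fails
    path2 = p["priming_air"] * pressure_fails
    return or_gate([path1, path2])


baseline = {
    "bag_empty": 1e-3, "cassette_error": 5e-4, "disconnection": 2e-4,
    "sensor_contaminated": 1e-4, "alarm_logic_error": 1e-5, "alarm_disabled": 5e-3,
    "priming_air": 3e-3, "pressure_sensor_cal": 5e-5, "occlusion_sw_error": 1e-5,
}

FACTOR = 10.0  # assumed perturbation size
p0 = top_event(baseline)

sensitivity = {}
for name in baseline:
    perturbed = dict(baseline)
    perturbed[name] = min(1.0, baseline[name] * FACTOR)
    # Ratio of perturbed to baseline top event probability.
    sensitivity[name] = top_event(perturbed) / p0

dominant = max(sensitivity, key=sensitivity.get)

for name, ratio in sorted(sensitivity.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:22s} x{ratio:.2f}")
```

Base events with a ratio near the perturbation factor dominate the result and deserve the best data. Those with a ratio near 1 can tolerate rough estimates.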

How Deep to Go

For most medical device risk analyses, 3–5 levels of decomposition are sufficient. Deeper trees may be needed for Class III devices with complex safety architectures. The stopping criterion is practical: stop when you reach a base event for which you can assign a meaningful probability estimate or implement a risk control.

Key Takeaways

FTA is a top-down, deductive technique that starts with a known undesirable event and traces back to all possible causes using AND/OR logic gates.
FTA excels where FMEA cannot: modeling combinations of events, logical dependencies, and quantitative probability calculations for safety-critical scenarios.
ISO 14971 requires analysis of combinations and sequences of events — FTA satisfies this requirement directly; FMEA alone does not.
ISO/TR 24971:2020 Annex B lists FTA as a complementary technique and recommends using multiple techniques for thorough risk analysis.
FDA recognizes FTA as a core risk analysis technique alongside PHA and FMEA.
Use FTA and FMEA together: FTA for critical hazard scenarios and causal chain modeling, FMEA for comprehensive component-level coverage.
FTA is valuable for complaint investigation: start from the reported harm, trace backward through the fault tree to identify root causes and update risk files with actual field data.
Quantitative FTA enables risk control effectiveness demonstration: calculate the probability reduction from adding a risk control — evidence that auditors and notified bodies accept.