FTA + FMEA + ETA — when to use which
Fault Tree Analysis (FTA), Failure Mode and Effects Analysis (FMEA) and Event Tree Analysis (ETA) are complements, not alternatives. They answer three different questions, in three different directions, with three different outputs. Most safety standards demand at least two of them; the larger, more demanding ones (ARP 4761, ISO 26262, IEC 61511) call for all three, explicitly or in effect, with traceability between them. Picking one as your "preferred" technique is usually a sign you don't yet understand what each delivers.
Why three techniques exist at all
A safety analyst looking at a system needs to answer three different questions, and no single technique answers all three:
- "Given an undesired outcome, what combinations of failures cause it?" — the question FTA answers. Top-down, deductive, Boolean.
- "For each component in my system, what can go wrong, and what does it affect?" — the question FMEA answers. Bottom-up, inductive, tabular.
- "Given that something has gone wrong, what consequences can develop?" — the question ETA answers. Forward, inductive, branching.
Each direction catches things the others miss. FTA, working back from a known hazard, finds the cut sets that cause it — but only for hazards you already know to ask about. FMEA, working forward from each component, finds the failure modes you might not have thought to include in your tree — completeness insurance. ETA, working forward from an initiating event, finds the downstream consequence chain — what happens between "fault occurred" and "person harmed", which is invisible to FTA stopping at the top event.
The three together form a closed loop: FMEA's component failure modes become FTA's basic events; FTA's top event becomes ETA's initiating event; ETA's barrier failures and end-states feed back into FMEA-driven mitigation review. A safety case using only one of the three has either a coverage gap or a defensibility gap, often both.
Step 1: What each technique is, in one card each
One paragraph each, calibrated so that anyone who has done one of the three can recognise the others by family resemblance.
FTA
Start from a single undesired top event. Decompose deductively through AND / OR gates until each leaf is a basic event with a quantifiable failure rate or per-demand probability. Compute minimal cut sets and the top-event probability. Output is a tree (graphical) plus a cut-set table plus P(TOP). Best at: defending a quantitative claim (P(hazard) < target) for a known, named hazard. Not designed for: discovering hazards you haven't already identified.
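As a minimal sketch of the mechanics described in this card, the snippet below expands a toy tree into minimal cut sets and quantifies the top event. The tree (`TOP = A OR (B AND C)`), the event names, and the probabilities are all hypothetical and chosen only for illustration. Quantification uses the rare-event approximation and the min-cut upper bound, both of which assume independent basic events.

```python
from itertools import product

# Hypothetical toy tree: TOP = A OR (B AND C). Not taken from any real system.
TREE = {
    "TOP": ("OR", ["A", "G1"]),
    "G1": ("AND", ["B", "C"]),
}
# Hypothetical basic-event probabilities.
P = {"A": 1e-4, "B": 1e-2, "C": 2e-2}


def cut_sets(node):
    """Return the cut sets of `node` as a set of frozensets of basic events."""
    if node not in TREE:  # a leaf is a basic event
        return {frozenset([node])}
    gate, children = TREE[node]
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":  # any child's cut set causes the gate
        return set().union(*child_sets)
    # AND: take one cut set from each child and merge them
    return {frozenset().union(*combo) for combo in product(*child_sets)}


def minimise(sets):
    """Drop every cut set that strictly contains another cut set."""
    return {s for s in sets if not any(other < s for other in sets)}


def cut_prob(cut_set):
    """Probability of one cut set, assuming independent basic events."""
    p = 1.0
    for event in cut_set:
        p *= P[event]
    return p


mcs = minimise(cut_sets("TOP"))

# Rare-event approximation: sum of cut-set probabilities.
p_top_rare_event = sum(cut_prob(cs) for cs in mcs)

# Min-cut upper bound: 1 - product of (1 - P(cut set)).
survive = 1.0
for cs in mcs:
    survive *= 1.0 - cut_prob(cs)
p_top_mcub = 1.0 - survive

for cs in sorted(mcs, key=len):
    print(f"order {len(cs)}: {sorted(cs)}  P = {cut_prob(cs):.2e}")
print(f"P(TOP) rare-event = {p_top_rare_event:.3e}, MCUB = {p_top_mcub:.3e}")
```

The order-1 cut set `{A}` is the single point of failure; the order-2 cut set `{B, C}` is where the AND gate provides redundancy.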
FMEA
Walk every component (or every function in FMECA / DFMEA / SFMEA variants), enumerate its failure modes, and for each mode record the local effect, the system effect, the detection mechanism, and a Severity / Occurrence / Detection trio that yields a Risk Priority Number (RPN = S × O × D). Output is a wide table, often hundreds of rows. Best at: completeness — making sure no component-level failure mode escaped the safety case. Not designed for: quantifying combinations of failures or producing a single probability for a system hazard.
ETA
Start from an initiating event (something has just gone wrong). At each subsequent barrier or response, branch into "barrier worked" vs "barrier failed". Continue until every path terminates in a labelled end-state. Output is a left-to-right tree with end-state probabilities computed by multiplying along each path. Best at: modelling consequence-chain progression — the sequence of barriers between a fault and an outcome. Not designed for: discovering the cause of the initiating event itself (that's FTA's job).
Two structural points worth flagging now. First, FTA and ETA are both tree analyses but they read in opposite directions: FTA reads root-at-top with leaves at the bottom (failures consolidate upward to a single hazard), ETA reads initiating-event-at-the-left with end-states at the right (one event diverges into many consequences). Drawing them side by side makes the complementarity literal. Second, FMEA isn't a tree at all — it's a structured table — which is why it sits awkwardly in any "tree-of-trees" mental model and why people sometimes mistakenly leave it out of a "we did the trees" claim.
Step 2: Side by side, what each looks like in practice
The differences are easier to see in a table than a description. Eight rows that consistently distinguish the three:
| Dimension | FTA | FMEA | ETA |
|---|---|---|---|
| Starts from | One named top event | A list of components or functions | One initiating event |
| Direction | Top-down (cause-finding) | Bottom-up (mode-finding) | Forward (consequence-finding) |
| Inference style | Deductive (Boolean) | Inductive (enumerative) | Inductive (sequential) |
| Primary output | Minimal cut sets + P(TOP) | RPN-ranked failure-mode table | End-state probabilities |
| Quantitative basis | Boolean algebra over basic-event probabilities | RPN = S × O × D, dimensionless | Path products of barrier-success probabilities |
| Coverage strength | Combinations of failures behind a known hazard | Component-level completeness | Consequence-chain progression |
| Coverage gap | Hazards you didn't think to model | Combinations of failures across components | Causes upstream of the initiating event |
| Effort to maintain | Medium — re-quantification per data update | High — every component change ripples through | Low–medium — barrier set is small |
How to read each output
Each technique has its own set of "what does the reviewer actually look at" conventions. Three thirty-second mental models:
- FTA: read the cut sets first, then importance. Order-1 cut sets are single points of failure; they're the conversation to have first. Order-2 (and higher) cut sets, which arise under AND gates, are where redundancy genuinely buys defence. Once cut sets are in hand, the Fussell-Vesely (F-V) importance ranking tells the design team which events are actually driving P(TOP). The top-event probability is the result; the cut-set rank is the action item.
- FMEA: sort by RPN, but argue at S = 10. RPN is a dimensionless product of three subjective ordinal scales, so its absolute magnitude is meaningless and its rank order is fragile to scale-definition choices. The defensible reading is "show me the rows where Severity is at the top of its scale that have no mitigation in place": the catastrophic-but-uncontrolled rows. The "Current Detection" column is where most of the value lives; an undetected high-severity failure mode is an FMEA finding regardless of what its RPN happens to compute to.
- ETA: read left-to-right, focus on the high-severity terminals. The end-states ranked by probability × severity are the safety-case-relevant outcomes. The branches between the initiating event and those terminals encode the barriers, which is where mitigation effort goes. ETA's quantification depends critically on the assumption that barriers are conditionally independent; reviewers will ask about it, so document any common-cause links between barriers explicitly.
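The FTA reading order above (cut sets first, then importance) can be sketched in a few lines. The cut sets and probabilities below are hypothetical. Fussell-Vesely importance is computed under the rare-event approximation, as the share of P(TOP) carried by the cut sets that contain a given event.

```python
# Hypothetical minimal cut sets and basic-event probabilities (illustration only).
P = {"A": 1e-4, "B": 1e-2, "C": 2e-2, "D": 5e-3}
MCS = [{"A"}, {"B", "C"}, {"B", "D"}]


def cut_prob(cut_set):
    """Probability of one cut set, assuming independent basic events."""
    p = 1.0
    for event in cut_set:
        p *= P[event]
    return p


# Rare-event approximation of the top-event probability.
p_top = sum(cut_prob(cs) for cs in MCS)


def fussell_vesely(event):
    """Share of P(TOP) contributed by cut sets that contain `event`."""
    return sum(cut_prob(cs) for cs in MCS if event in cs) / p_top


# Step 1: single points of failure (order-1 cut sets).
single_points = [next(iter(cs)) for cs in MCS if len(cs) == 1]

# Step 2: which events actually drive P(TOP).
ranking = sorted(P, key=fussell_vesely, reverse=True)

print("single points of failure:", single_points)
for event in ranking:
    print(f"{event}: F-V = {fussell_vesely(event):.3f}")
```

With these made-up numbers the single point of failure (`A`) is not the largest contributor; `B`, which appears in two order-2 cut sets, is. That is the kind of result that makes it worth reading both the cut-set order and the importance ranking.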
How the three connect — the dataflow loop
Done properly, the three techniques produce inputs and outputs that flow into each other:
FMEA component failure modes ──→ FTA basic events
FTA top event                ──→ ETA initiating event
ETA barrier failures         ──→ FMEA mitigation-review rows
                                  │
                                  └──→ next iteration
FMEA enumerates every failure mode of every component; the ones with non-negligible severity become FTA basic events. FTA combines them into a top-event probability, and that top event, when it has consequences worth tracking, becomes the initiating event of an ETA. The ETA's barriers (track circuit, signaller intervention, ATP back-up, etc.) introduce new components and failure modes that the original FMEA may not have covered, which feeds back into the next FMEA pass. Roughly three iterations of this loop are what a competent ARP 4761 System Safety Assessment (SSA) looks like.
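The FMEA-to-FTA leg of this loop is easiest to keep honest with an explicit mapping check. The sketch below is one possible shape for it, not a prescribed format: the row IDs (`FM-01` and so on), the `BE-002` identifier, and the justification text are hypothetical, and only `BE-001` is a name used elsewhere in this series.

```python
# Hypothetical FMEA rows:
# (row id, failure mode, mapped FTA basic event or None, out-of-scope justification or None)
fmea_rows = [
    ("FM-01", "Lamp wrong-side colour shift", "BE-001", None),
    ("FM-02", "Lamp open circuit", None, "Dark signal is treated as most restrictive"),
    ("FM-03", "Lamp intermittent / flickering", None, None),
]

# Basic events present in the fault tree (BE-002 is a made-up placeholder).
fta_basic_events = {"BE-001", "BE-002"}


def coverage_gaps(rows, basic_events):
    """FMEA rows that map to no known basic event and carry no justification."""
    gaps = []
    for row_id, _mode, basic_event, justification in rows:
        mapped = basic_event is not None and basic_event in basic_events
        if not mapped and not justification:
            gaps.append(row_id)
    return gaps


def unsupported_basic_events(rows, basic_events):
    """Basic events that no FMEA row points at."""
    used = {basic_event for _, _, basic_event, _ in rows if basic_event}
    return sorted(basic_events - used)


print("FMEA rows with no FTA mapping or justification:", coverage_gaps(fmea_rows, fta_basic_events))
print("FTA basic events with no FMEA support:", unsupported_basic_events(fmea_rows, fta_basic_events))
```

Running the check in both directions matters: unmapped FMEA rows are coverage gaps, while basic events with no FMEA row behind them are quantified claims with no component-level evidence.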
Step 3: The same SPAD scenario through all three lenses
Words about complementarity become concrete when you see the same scenario produce three different deliverables, each addressing a different question. We'll re-use the railway SPAD example from Article 1 — same components, same probabilities — and look at what each technique would output.
FTA — what we already have from Article 1
Article 1's FTA decomposed the SPAD top event into eight minimal cut sets (three signalling singletons, four ATP×brake pairs, one driver double-miss), and the wrong-side-corrected quantification gave P(SPAD) ≈ 4.65×10⁻³ per train per year, with BE-001 (lamp wrong-side) accounting for 94% by F-V importance. That answers the cause-finding question: given that we don't want SPADs, what combinations cause them?
Two things FTA didn't tell us, and now wants the other two techniques to fill in.
FMEA — completeness around the dominant component
BE-001 is "signal lamp / LED unit failure" lumped as a single basic event. An FMEA on the lamp pulls that lump apart — every distinct failure mode, its local and system effect, current detection mechanism, and an S/O/D scoring. A typical four-row excerpt looks like:
| Failure mode | Local effect | System effect | S | O | D | RPN |
|---|---|---|---|---|---|---|
| Bulb / LED open circuit | Dark signal | Driver treats as most restrictive — train stops, no SPAD | 10 | 5 | 2 | 100 |
| Wrong-side colour shift | Green/yellow displayed when red commanded | Driver and ATP both authorise — direct SPAD pathway | 10 | 2 | 8 | 160 |
| Intermittent / flickering | Aspect transitions unpredictably | Driver may misread; possible SPAD pathway | 8 | 3 | 7 | 168 |
| Lens contamination / dim | Aspect not readable at sighting distance | Driver applies brakes precautionarily — no SPAD | 4 | 6 | 4 | 96 |
What the FMEA adds beyond the FTA: it disaggregates BE-001 into the failure modes that actually matter (wrong-side and intermittent) versus the ones that don't propagate through the operating-rule mitigation (open circuit, dim). The wrong-side fraction we used in Article 1 (1% of all lamp failures) is justified at this level — the FMEA's "Occurrence" column is where that 1% comes from.
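To make the two ways of reading the table concrete, here is a small sketch that computes RPN for the four rows of the excerpt and compares an RPN-sorted view with a severity-first filter. The S/O/D values are copied from the table; the `D_WEAK` threshold used to mean "weak detection" is an assumption made for this example, not a value from any standard.

```python
# (failure mode, Severity, Occurrence, Detection), copied from the excerpt
rows = [
    ("Bulb / LED open circuit", 10, 5, 2),
    ("Wrong-side colour shift", 10, 2, 8),
    ("Intermittent / flickering", 8, 3, 7),
    ("Lens contamination / dim", 4, 6, 4),
]


def rpn(row):
    """Risk Priority Number: S x O x D."""
    _mode, severity, occurrence, detection = row
    return severity * occurrence * detection


# View 1: plain RPN ranking.
by_rpn = sorted(rows, key=rpn, reverse=True)

# View 2: severity-first review. Top-of-scale severity combined with weak detection.
# D_WEAK = 7 is an illustrative threshold, chosen for this sketch only.
D_WEAK = 7
severity_first = [row for row in rows if row[1] == 10 and row[3] >= D_WEAK]

for row in by_rpn:
    print(f"RPN {rpn(row):>3}  {row[0]}")
print("severity-first findings:", [row[0] for row in severity_first])
```

The two views disagree on what comes first: RPN puts the intermittent mode on top, while the severity-first filter isolates the wrong-side colour shift. Both are worth a conversation, for different reasons.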
Two structural observations from the FMEA that the FTA alone wouldn't surface:
- Intermittent failure has the highest RPN (168) and is currently undetected. The FTA didn't flag it because BE-001's λ already includes intermittent failures — but the FTA's basic-event probability is for "lamp wrong-side", which intermittent failures don't cleanly fit. This is a coverage gap in the FTA model that the FMEA exposes: intermittent failures may need their own basic event with separate quantification.
- Detection is the design lever, not occurrence. The wrong-side mode has D = 8 because the lamp-proving circuit doesn't catch colour-shift faults. Improving detection from D = 8 to D = 4 (e.g. by adding wavelength monitoring) is taken here to halve the undetected wrong-side fraction, from 1% of lamp failures to about 0.5%. That feeds back into the FTA as a 2× reduction in BE-001's contribution and, because BE-001 carries 94% of P(TOP), a reduction in P(TOP) of roughly 47%, a factor of about 1.9. FMEA's S/O/D ranking and the FTA's importance ranking both point at the same intervention, but FMEA tells you which dimension to push on.
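The feedback from the FMEA into the FTA can be checked with one line of arithmetic. The sketch below assumes the rare-event approximation, under which the F-V importance of BE-001 is simply the share of P(TOP) carried by cut sets containing it, and it assumes BE-001 appears at most once in each cut set. The function name is hypothetical; the inputs (4.65×10⁻³ and 94%) are the figures quoted from Article 1.

```python
P_TOP = 4.65e-3   # P(SPAD) per train per year, from the FTA
FV_BE001 = 0.94   # Fussell-Vesely importance of BE-001


def p_top_after_scaling(p_top, fv, k):
    """P(TOP) after scaling one basic event's probability by factor k.

    Rare-event approximation: cut sets containing the event scale by k,
    all other cut sets are unchanged.
    """
    return p_top * (1.0 - fv * (1.0 - k))


# Halving the wrong-side probability (k = 0.5).
new_p_top = p_top_after_scaling(P_TOP, FV_BE001, 0.5)
reduction_factor = P_TOP / new_p_top

print(f"P(TOP) before: {P_TOP:.3e}")
print(f"P(TOP) after:  {new_p_top:.3e}")
print(f"reduction factor: {reduction_factor:.2f}")
```

Halving BE-001 brings P(TOP) down by a factor close to 1.9. The remaining 6% of P(TOP) comes from cut sets that do not involve the lamp and is untouched by the intervention.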
ETA — what happens once a SPAD occurs
FTA stops at the top event. ETA picks up there: given a SPAD has occurred, what consequence develops? Three barriers stand between SPAD and collision:
SPAD (P = 4.65×10⁻³/yr)
│
├── Track circuit detects unauthorised occupation
│   └── ✓ (P = 0.95) ──────────→ No consequence (signal sequence trips)
│   └── ✗ (P = 0.05)
│       │
│       ├── Signaller intervenes (radio / emergency stop)
│       │   └── ✓ (P = 0.50) ─→ No consequence
│       │   └── ✗ (P = 0.50)
│       │       │
│       │       ├── Approach speed allows stop in available distance
│       │       │   └── ✓ (P = 0.70) ─→ Near-miss / minor incident
│       │       │   └── ✗ (P = 0.30) ─→ Collision
End-state probabilities, per SPAD and per train-year:
| End state | P per SPAD | P per train-year |
|---|---|---|
| No consequence | 0.95 + 0.05×0.5 = 0.9750 | 4.53×10⁻³ |
| Near miss / minor incident | 0.05×0.5×0.7 = 0.0175 | 8.1×10⁻⁵ |
| Collision | 0.05×0.5×0.3 = 0.0075 | 3.5×10⁻⁵ |
That last row is the number that matters for the safety case. A collision rate of 3.5×10⁻⁵ per train per year is what gets compared against tolerable risk targets (TfL, ORR, IRSE all publish these), not the 4.65×10⁻³ per-train-year SPAD rate from FTA. The FTA tells us how often the dangerous fault occurs; the ETA tells us how often the dangerous fault produces the dangerous consequence. Same example, very different numbers, very different conversations with the regulator.
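The end-state table can be reproduced with a short path-product calculation. The sketch below encodes the three barriers with the success probabilities from the event tree and treats them as conditionally independent, which is the same assumption the table makes. The function and variable names are illustrative.

```python
P_SPAD = 4.65e-3  # initiating-event frequency per train per year, from the FTA

# (barrier, P(barrier works), end state reached if it works)
BARRIERS = [
    ("Track circuit detects unauthorised occupation", 0.95, "No consequence"),
    ("Signaller intervenes", 0.50, "No consequence"),
    ("Approach speed allows stop", 0.70, "Near miss / minor incident"),
]
ALL_BARRIERS_FAILED = "Collision"


def end_state_probabilities(barriers, final_state):
    """Multiply along each path and sum the paths that share an end state."""
    totals = {}
    p_reached = 1.0  # probability that every earlier barrier has failed
    for _name, p_works, state in barriers:
        totals[state] = totals.get(state, 0.0) + p_reached * p_works
        p_reached *= 1.0 - p_works
    totals[final_state] = totals.get(final_state, 0.0) + p_reached
    return totals


per_spad = end_state_probabilities(BARRIERS, ALL_BARRIERS_FAILED)
per_train_year = {state: p * P_SPAD for state, p in per_spad.items()}

for state, p in per_spad.items():
    print(f"{state:<28} {p:.4f} per SPAD   {per_train_year[state]:.2e} per train-year")
```

The per-SPAD probabilities sum to 1, which is a quick sanity check that no path has been dropped from the tree.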
Step 4: Which standards demand which combination
Mapping the three techniques onto the standards reveals which are explicit, which are implicit-but-required-in-practice, and which leave the choice to the analyst.
| Standard | FMEA | FTA | ETA | How they're orchestrated |
|---|---|---|---|---|
| ARP 4761 (aerospace) | Required (PSSA) | Required (FHA → SSA) | Required (CCA / particular risk) | Explicit triple. FHA identifies hazards; FMEA at component level; FTA decomposes each hazard top-down; CCA / ETA models common-cause and particular risks. Traceability between them is itself an audit item. |
| ISO 26262 (automotive) | Required at HW (HAZOP-FMEA) | Required at safety-goal level for ASIL C/D | Implicit (residual-risk modelling) | FMEA at hardware part level (random failures), FTA at violation-of-safety-goal level. ETA isn't named explicitly, but residual-risk and end-to-end fault propagation arguments are ETA in effect. |
| IEC 61508 / 61511 (functional safety / SIS) | Required for SIF design | Required for SIL verification | Required (LOPA = simplified ETA) | LOPA — Layer of Protection Analysis — is structurally an ETA reduced to single-branch arithmetic per layer. SIL verification is FTA. FMEA underpins both. |
| EN 50126 (rail RAMS) | Required at component / subsystem | Required at SIL 3 / 4 | Required (CSM-RA scenarios) | Common Safety Method on Risk Assessment (CSM-RA, EU Reg 402/2013) requires explicit consequence-chain analysis. FMEA at the component level, FTA at the function, ETA at the operational scenario. |
| MIL-STD-882E (defence) | Required (where mishap risk warrants) | Required for Catastrophic / Critical hazards | Recommended for catastrophic chains | The standard mandates a system safety program with techniques selected for the hazard severity. Catastrophic and Critical categories typically get all three; lower categories may get FMEA only. |
If you're operating outside these and the standard doesn't pin down a combination, the FMEA + FTA + LOPA triple (component completeness + cause quantification + simplified consequence chain) is the pragmatic minimum — that's what most regulators end up asking for once they read past whatever the standard literally says.
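For the LOPA leg of that triple, the arithmetic is a single path product: the initiating-event frequency multiplied by the probability of failure on demand (PFD) of each independent protection layer. The sketch below uses hypothetical frequencies, PFDs, and a hypothetical tolerable-frequency target; none of the numbers come from a standard or from the SPAD example.

```python
from math import prod


def mitigated_frequency(initiating_frequency, layer_pfds):
    """Frequency of the consequence after all independent layers have failed."""
    return initiating_frequency * prod(layer_pfds)


def required_pfd(target_frequency, initiating_frequency, existing_layer_pfds):
    """PFD an additional layer would need to meet the target (capped at 1.0)."""
    current = mitigated_frequency(initiating_frequency, existing_layer_pfds)
    return min(1.0, target_frequency / current)


# Hypothetical scenario: 0.1 initiating events per year, two existing layers.
f_initiating = 0.1
existing_layers = [0.1, 0.01]
f_current = mitigated_frequency(f_initiating, existing_layers)

# Hypothetical tolerable frequency for the consequence.
f_target = 1e-5
pfd_needed = required_pfd(f_target, f_initiating, existing_layers)

print(f"current mitigated frequency: {f_current:.1e} per year")
print(f"PFD needed from an extra layer: {pfd_needed:.2f}")
```

The multiplication is only valid if the layers are genuinely independent of each other and of the initiating event, which is the same caveat that applies to the full event tree.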
Operational pitfalls — three that show up in audit
- FMEA without traceability back to FTA basic events. The FMEA finds 400 failure modes; the FTA uses 50 basic events. Of the 400, how many are represented in the 50? The mapping has to exist and be defensible. Failure-mode rows that don't map to either an FTA basic event or an "out of scope, justified" category are coverage gaps the auditor will find.
- FTA without ETA when the top event has consequence variation. SPAD doesn't always cause collision; loss-of-thrust doesn't always cause crash; pressure-vessel rupture doesn't always cause release. If your top event is a fault and the regulator is interested in the harm, ETA is what bridges them. Quantifying the fault rate and calling it the harm rate is a common arithmetic error that ETA prevents.
- ETA without modelling barrier dependence. ETA's path-product math assumes barrier-success probabilities are conditionally independent given the initiating event. They often aren't — fatigue degrades both driver and signaller; a power outage takes down both ATP and emergency lighting. Common-cause failure between barriers is the same problem β-factor models address in FTA, and it has to be modelled explicitly in the ETA too. Otherwise the ETA's collision-rate number is optimistic by an unknown factor.
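The size of the barrier-dependence error can be illustrated with a simple mixture model: a shared degrading condition (fatigue, say) is present with some probability, and each barrier's failure probability is higher when it is. Barriers are independent only once that condition is known. This is one possible way to represent common cause, not the β-factor model itself, and every number below is hypothetical.

```python
Q_COMMON = 0.10  # P(shared degrading condition is present), hypothetical

# Failure probability of each barrier: (condition present, condition absent)
BARRIER_FAIL = {
    "signaller": (0.90, 0.10),
    "driver": (0.80, 0.05),
}


def marginal_failure(p_with, p_without, q):
    """Unconditional failure probability of one barrier."""
    return q * p_with + (1.0 - q) * p_without


def joint_failure_naive(barriers, q):
    """Product of marginals: what plain path-product arithmetic computes."""
    p = 1.0
    for p_with, p_without in barriers.values():
        p *= marginal_failure(p_with, p_without, q)
    return p


def joint_failure_dependent(barriers, q):
    """Joint failure probability with the shared condition modelled explicitly."""
    all_fail_with = 1.0
    all_fail_without = 1.0
    for p_with, p_without in barriers.values():
        all_fail_with *= p_with
        all_fail_without *= p_without
    return q * all_fail_with + (1.0 - q) * all_fail_without


naive = joint_failure_naive(BARRIER_FAIL, Q_COMMON)
dependent = joint_failure_dependent(BARRIER_FAIL, Q_COMMON)

print(f"naive (independent) joint failure: {naive:.4f}")
print(f"with shared condition modelled:    {dependent:.4f}")
print(f"optimism factor: {dependent / naive:.1f}")
```

With these made-up inputs the independent calculation understates the joint failure probability by a factor of about 3.4, even though each barrier's individual failure probability is identical in both calculations.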
Where to go next
- For the FTA leg of the workflow, Article 1 walks the SPAD decomposition end-to-end; Article 2 covers the importance-measure rankings; Article 3 covers the algorithm choice.
- For FMEA, FTA Studio's FMEA cross-reference feature (Enterprise) lets you tag basic events with their corresponding FMEA failure modes and exports a unified table — closing the coverage-traceability gap mentioned above.
- For ETA, FTA Studio's Event Tree editor (Enterprise) supports the same Boolean-quantification engine as the FTA side, so the conditional probabilities propagate consistently and barrier-CCF can be modelled via shared basic events.
- For the standards-driven workflow, the ARP 4761 reference page lays out the canonical "all three" sequence and the traceability artefacts the FAA and EASA expect to see.