What Is FMEA?

Failure Mode and Effects Analysis (FMEA) is a systematic, proactive method for identifying where and how a product or process might fail, assessing the consequences, and prioritizing actions to reduce the most serious risks before they reach the customer. Rather than waiting for failures and reacting, FMEA anticipates them during design and development, when prevention is cheapest. It originated in aerospace and the military and is now a backbone of quality engineering, especially in automotive (under the AIAG-VDA FMEA Handbook).

The Building Blocks

For each function being analyzed, the team works through a chain:

  • Failure mode — the manner in which the item could fail (e.g., "weld cracks").
  • Effect — the consequence of that failure on the customer or system (e.g., "joint separates, loss of function").
  • Cause — the mechanism or reason the failure mode occurs (e.g., "insufficient weld penetration").
  • Controls — current methods that prevent the cause or detect the failure (e.g., "weld inspection").
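
As a minimal sketch, the chain above can be held in a small record type. The class name, field names, and the "Join bracket to frame" function are hypothetical illustrations, not part of any FMEA standard; the other values reuse the weld example from the list.

```python
from dataclasses import dataclass, field


@dataclass
class FailureChain:
    """One analysis row for a single function (field names are illustrative)."""
    function: str       # what the item is supposed to do
    failure_mode: str   # the manner in which the item could fail
    effect: str         # consequence for the customer or system
    cause: str          # mechanism or reason the failure mode occurs
    controls: list[str] = field(default_factory=list)  # current prevention/detection methods


weld = FailureChain(
    function="Join bracket to frame",  # hypothetical function, not from the text
    failure_mode="weld cracks",
    effect="joint separates, loss of function",
    cause="insufficient weld penetration",
    controls=["weld inspection"],
)
```

Keeping one record per function, failure mode, and cause makes it straightforward to attach ratings and actions to the same row later.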

Severity, Occurrence, and Detection

Each failure mode is rated on three dimensions, each scored 1–10:

Rating         | Meaning                    | 1 (best)             | 10 (worst)
Severity (S)   | How serious is the effect? | No noticeable effect | Hazardous, safety/regulatory
Occurrence (O) | How likely is the cause?   | Extremely unlikely   | Almost certain
Detection (D)  | How poorly is it detected? | Certain to be caught | Cannot be detected

Note the direction of Detection: a high detection rating is bad — it means controls are unlikely to catch the failure before it escapes.
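
One way to keep the three scales honest in a worksheet tool is to validate them on entry. The sketch below assumes integer ratings and uses a hypothetical `Ratings` class; real scales are defined by the team's agreed rating tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ratings:
    """Severity, Occurrence, and Detection, each an integer from 1 to 10."""
    severity: int
    occurrence: int
    detection: int  # higher means harder to detect, so higher is worse

    def __post_init__(self):
        # Reject anything outside the 1-10 scale used for all three ratings.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not isinstance(value, int) or not 1 <= value <= 10:
                raise ValueError(f"{name} must be an integer from 1 to 10, got {value!r}")


ok = Ratings(severity=7, occurrence=3, detection=8)  # hypothetical example values
```

Constructing `Ratings(severity=0, occurrence=5, detection=5)` raises `ValueError`, which catches out-of-scale entries before they distort any downstream ranking.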

The Risk Priority Number

The classic prioritization metric is the Risk Priority Number:

RPN = Severity × Occurrence × Detection

Because each factor ranges 1–10, RPN ranges from 1 to 1000. Higher RPN signals higher risk and higher priority for action. Teams traditionally set an RPN threshold above which action is required, act on any high-Severity item regardless of its RPN, and then work to reduce O and D through design or process changes. Recalculating the RPN afterward confirms that the risk actually dropped.
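
The calculation and the threshold rule can be sketched in a few lines. The threshold of 100 and the Severity floor of 9 are hypothetical values chosen for illustration; each team sets its own.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: the product of the three 1-10 ratings."""
    for value in (severity, occurrence, detection):
        if not 1 <= value <= 10:
            raise ValueError("each rating must be between 1 and 10")
    return severity * occurrence * detection


def needs_action(severity, occurrence, detection, rpn_threshold=100, severity_floor=9):
    """Flag an item if its Severity is high or its RPN meets the threshold.

    rpn_threshold and severity_floor are hypothetical, team-chosen values.
    """
    return severity >= severity_floor or rpn(severity, occurrence, detection) >= rpn_threshold


# Hypothetical item before and after an action that lowers O and D.
before = rpn(7, 5, 6)   # 7 * 5 * 6 = 210
after = rpn(7, 2, 3)    # 7 * 2 * 3 = 42
```

Note that `needs_action(9, 1, 1)` is true even though the RPN is only 9, which reflects the rule that high Severity is acted on regardless of RPN.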

Limitations of RPN and the Action Priority Method

RPN has a well-known flaw: multiplication treats the three factors as interchangeable, so very different risk profiles can share the same number or rank in the wrong order. A trivial nuisance (S=2, O=9, D=9, RPN=162) outranks a near-miss safety hazard (S=9, O=3, D=3, RPN=81), even though the hazard deserves attention first. To fix this, the 2019 AIAG-VDA FMEA Handbook replaced RPN with the Action Priority (AP) method. AP uses a structured logic table that weighs Severity first, then Occurrence, then Detection, assigning each combination a priority of High, Medium, or Low. This ensures high-severity items are never buried by a low RPN.
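
The ranking inversion is easy to reproduce with the two profiles above. In this sketch, the severity-first sort is only a simplified stand-in for the idea behind AP; it is not the AP logic table itself.

```python
items = [
    {"name": "trivial nuisance", "S": 2, "O": 9, "D": 9},
    {"name": "near-miss safety hazard", "S": 9, "O": 3, "D": 3},
]


def rpn(item):
    """Classic RPN for one worksheet item."""
    return item["S"] * item["O"] * item["D"]


# Ranking by RPN puts the nuisance (162) ahead of the hazard (81).
by_rpn = sorted(items, key=rpn, reverse=True)

# Ranking by Severity, then Occurrence, then Detection puts the hazard first.
# This lexicographic order is an illustrative simplification, not the AP table.
by_severity_first = sorted(items, key=lambda i: (i["S"], i["O"], i["D"]), reverse=True)
```

Here `by_rpn[0]` is the trivial nuisance, while `by_severity_first[0]` is the near-miss safety hazard.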

Severity | Occurrence | Detection    | Action Priority
9–10     | High       | Any          | High
9–10     | Low        | Low–Moderate | Medium
4–6      | Moderate   | Moderate     | Medium
2–3      | Low        | Good         | Low

(Illustrative — the full AP table is defined in the AIAG-VDA handbook.)
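
A table-driven lookup is one plausible way to encode this kind of logic. The sketch below transcribes only the four illustrative rows above, keeps Occurrence and Detection as the same text labels, and returns `None` for anything those rows do not cover. The names `ILLUSTRATIVE_RULES` and `action_priority` are hypothetical, and a real tool would need the complete table from the handbook.

```python
# Rules copied from the illustrative rows only; NOT the full AIAG-VDA AP table.
# Each rule: (severity range, occurrence label, detection labels or None for "Any", priority)
ILLUSTRATIVE_RULES = [
    (range(9, 11), "High", None, "High"),
    (range(9, 11), "Low", {"Low", "Moderate"}, "Medium"),
    (range(4, 7), "Moderate", {"Moderate"}, "Medium"),
    (range(2, 4), "Low", {"Good"}, "Low"),
]


def action_priority(severity: int, occurrence: str, detection: str):
    """Return the priority of the first matching rule, or None if no rule applies."""
    for sev_range, occ_label, det_labels, priority in ILLUSTRATIVE_RULES:
        if (
            severity in sev_range
            and occurrence == occ_label
            and (det_labels is None or detection in det_labels)
        ):
            return priority
    return None  # combination not covered by the illustrative rows
```

Returning `None` rather than guessing a default makes gaps in the rule set visible, which matters here because the four rows leave most combinations undefined.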

Design FMEA vs. Process FMEA

Aspect               | Design FMEA (DFMEA)           | Process FMEA (PFMEA)
Focus                | The product design            | The manufacturing/assembly process
Failure mode example | "Bracket fatigues under load" | "Bolt under-torqued at station 4"
Cause example        | "Wall thickness too thin"     | "Torque gun mis-calibrated"
Owner                | Design engineering            | Manufacturing / process engineering

A related variant, FMECA (Failure Mode, Effects, and Criticality Analysis), adds a quantitative criticality analysis. FMEA also pairs naturally with reliability engineering: the Occurrence rating reflects failure likelihood, which connects to failure-rate and MTBF (mean time between failures) analysis.

The FMEA Process

  1. Scope and assemble the team — cross-functional, with design, process, quality, and field knowledge.
  2. Identify functions — what the item or step is supposed to do.
  3. Identify failure modes, effects, and causes for each function.
  4. Rate S, O, and D using agreed scales.
  5. Prioritize using RPN or Action Priority.
  6. Define and assign actions to reduce risk, with owners and dates.
  7. Re-evaluate after actions are implemented and keep the FMEA as a living document.
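
Steps 6 and 7 can be sketched as a worksheet row that carries its action and is re-rated once the action is done. All class names, ratings, the action text, and the due date below are hypothetical; only the failure mode is borrowed from the PFMEA example earlier.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Action:
    description: str
    owner: str
    due: date
    done: bool = False


@dataclass
class WorksheetRow:
    failure_mode: str
    severity: int
    occurrence: int
    detection: int
    action: Optional[Action] = None

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

    def re_evaluate(self, occurrence: int, detection: int) -> int:
        """Re-rate O and D after the action is implemented; return the drop in RPN."""
        if self.action is None or not self.action.done:
            raise ValueError("re-evaluate only after the action has been implemented")
        old = self.rpn
        self.occurrence, self.detection = occurrence, detection
        return old - self.rpn


# Hypothetical ratings, owner, and date for illustration only.
row = WorksheetRow("Bolt under-torqued at station 4", severity=8, occurrence=5, detection=6)
row.action = Action("Add torque-gun calibration check", owner="Process engineering", due=date(2025, 1, 31))
row.action.done = True
drop = row.re_evaluate(occurrence=2, detection=3)  # RPN goes from 240 to 48
```

Severity is left unchanged by `re_evaluate` in this sketch, on the assumption that the action targets the cause or the controls rather than the effect itself.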

Best Practices

  • Start early — the value of FMEA is preventing failures while changes are still cheap.
  • Act on severity first — never let a low RPN excuse a high-severity risk.
  • Attack occurrence over detection — preventing the cause is more robust than catching the defect.
  • Keep it living — update the FMEA when the design, process, or field data changes.