# Failure Analysis Metrics

Failure analysis metrics identify where errors are concentrated and whether error confidence creates operational risk.

## Why This Matters

Two models with similar error rates can have very different risk profiles.
A model that is highly confident when wrong is usually harder to monitor and safer to block early.

## When to Use

- when incident cost is high for false confidence
- when investigating model behavior beyond aggregate accuracy
- when selecting between candidates with close top-line metrics

## Inputs and Assumptions

- `y_true`: ground-truth labels
- `y_pred`: predicted labels
- `y_prob`: predicted probabilities

## Output and Interpretation

Key outputs include:

- **Misclassification summary**: class-wise error context
- **Confidence gap**: separation between confidence on correct versus incorrect predictions
- **High-confidence error patterns**: useful for targeted inspection

A narrow or negative confidence gap is a warning signal in most operational contexts.

## Limitations and Caveats

- confidence signals depend on quality of upstream probability calibration
- low error counts can make distribution-level interpretation noisy

## API Reference

```{eval-rst}
.. automodule:: trustlens.metrics.failure
   :members:
   :show-inheritance:
```

## Related Pages

- [Features and Modules](../features.md)
- [Trust Score Explained](../trust_score_explained.md)
- [Known Limitations](../known_limitations.md)