Failure Analysis Metrics
Failure analysis metrics identify where errors are concentrated and whether error confidence creates operational risk.
Why This Matters
Two models with similar error rates can have very different risk profiles. A model that is highly confident when it is wrong is usually harder to monitor, because confidence-based alerts will not flag its errors, so it is often safer to block such a candidate early.
When to Use
when false confidence carries a high incident cost
when investigating model behavior beyond aggregate accuracy
when selecting between candidates with close top-line metrics
Inputs and Assumptions
y_true: ground-truth labels
y_pred: predicted labels
y_prob: predicted probabilities
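As a minimal sketch of how these three arrays relate, assuming a multi-class classifier whose probability rows sum to 1 (the values below are made up for illustration and are not taken from any real model):

```python
import numpy as np

# Illustrative values only: 4 samples, 3 classes.
y_true = np.array([0, 2, 1, 2])
y_prob = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.2, 0.2, 0.6],
])

# Predicted label = index of the highest-probability class.
y_pred = y_prob.argmax(axis=1)

# Basic sanity checks worth running before any failure metric.
assert y_true.shape == y_pred.shape == (y_prob.shape[0],)
assert np.allclose(y_prob.sum(axis=1), 1.0)
```

Here the third sample is misclassified (true class 1, predicted class 0) with a top probability of 0.5.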
Output and Interpretation
Key outputs include:
Misclassification summary: class-wise error context
Confidence gap: separation between confidence on correct versus incorrect predictions
High-confidence error patterns: useful for targeted inspection
A narrow or negative confidence gap is a warning signal in most operational contexts.
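The confidence gap and the high-confidence error pattern can be sketched with plain NumPy. This is an illustration, not the library's implementation: the helper names, the 0.9 threshold, and the convention that a 1-D y_prob holds the probability of class 1 are all assumptions.

```python
import numpy as np

def top_confidence(y_prob):
    """Confidence assigned to the predicted class (assumed convention)."""
    y_prob = np.asarray(y_prob, dtype=float)
    if y_prob.ndim == 1:
        # Assumed binary case: y_prob is P(class 1).
        return np.maximum(y_prob, 1.0 - y_prob)
    return y_prob.max(axis=1)

def simple_confidence_gap(y_true, y_pred, y_prob):
    """Mean confidence on correct predictions minus mean on incorrect ones."""
    correct = np.asarray(y_true) == np.asarray(y_pred)
    conf = top_confidence(y_prob)
    if correct.all() or not correct.any():
        return float("nan")  # gap is undefined without both groups
    return float(conf[correct].mean() - conf[~correct].mean())

def high_confidence_errors(y_true, y_pred, y_prob, threshold=0.9):
    """Indices of wrong predictions made with confidence >= threshold."""
    correct = np.asarray(y_true) == np.asarray(y_pred)
    return np.flatnonzero(~correct & (top_confidence(y_prob) >= threshold))

# Illustrative binary data: both errors are made with high confidence.
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])
y_prob = np.array([0.9, 0.2, 0.05, 0.8, 0.95])

gap = simple_confidence_gap(y_true, y_pred, y_prob)
flagged = high_confidence_errors(y_true, y_pred, y_prob)
```

In this toy data the gap is negative: the model is more confident on its two mistakes than on its three correct predictions, which is the warning pattern described above.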
Limitations and Caveats
confidence signals depend on quality of upstream probability calibration
low error counts can make distribution-level interpretation noisy
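One way to see how noisy the gap becomes with few errors is a percentile bootstrap. The sketch below is a generic resampling technique, not part of the documented API; the function name, the 1000 resamples, and the fixed seed are illustrative choices.

```python
import numpy as np

def bootstrap_gap_interval(correct, conf, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(conf | correct) - mean(conf | wrong)."""
    correct = np.asarray(correct, dtype=bool)
    conf = np.asarray(conf, dtype=float)
    rng = np.random.default_rng(seed)
    n = correct.size
    gaps = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample with replacement
        c, k = correct[idx], conf[idx]
        if c.all() or not c.any():
            continue  # gap undefined when a resample lacks one group
        gaps.append(k[c].mean() - k[~c].mean())
    if not gaps:
        return float("nan"), float("nan")
    lo, hi = np.quantile(gaps, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Illustrative data: 20 predictions, only 2 of them wrong.
correct = np.array([True] * 18 + [False] * 2)
conf = np.linspace(0.55, 0.99, 20)
lo, hi = bootstrap_gap_interval(correct, conf)
```

A wide interval, or one that spans zero, suggests the observed gap should not drive a decision on its own.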
API Reference
trustlens.metrics.failure
Failure-mode analysis: where and how does a model fail?
Metrics implemented
misclassification_summary: per-class error rates and high-confidence mistakes.
confidence_gap: distribution of confidence for correct vs. incorrect predictions.
- trustlens.metrics.failure.misclassification_summary(y_true: ndarray, y_pred: ndarray, y_prob: ndarray) -> dict
Build a comprehensive misclassification summary.
For each class, reports:
* total support (ground-truth count)
* number of misclassified samples
* error rate
* average confidence of misclassified samples (overconfident mistakes)
* indices of the most confident misclassifications
- Parameters:
y_true (np.ndarray) – Ground-truth labels, shape (n_samples,).
y_pred (np.ndarray) – Model predictions, shape (n_samples,).
y_prob (np.ndarray) – Predicted probabilities, shape (n_samples,) for binary or (n_samples, n_classes) for multi-class.
- Returns:
Nested dictionary keyed by class label.
- Return type:
dict
Examples
>>> from trustlens.metrics.failure import misclassification_summary
>>> summary = misclassification_summary(y_true, y_pred, y_prob)
>>> print(summary[1]["error_rate"])  # error rate for class 1
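To make the per-class fields concrete, here is a standalone NumPy sketch that computes the same kinds of quantities. It is not the trustlens implementation: only the "error_rate" key appears in the example above, and every other key name, the top_k parameter, and the function name are hypothetical.

```python
import numpy as np

def per_class_error_summary(y_true, y_pred, conf, top_k=3):
    """Per-class support, error count, error rate, and most confident errors.

    conf is the confidence assigned to each prediction, shape (n_samples,).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    conf = np.asarray(conf, dtype=float)
    summary = {}
    for label in np.unique(y_true):
        in_class = np.flatnonzero(y_true == label)
        wrong = in_class[y_pred[in_class] != label]
        # Order the misclassified indices by descending confidence.
        order = np.argsort(-conf[wrong])
        summary[label.item()] = {
            "support": int(in_class.size),
            "n_misclassified": int(wrong.size),
            "error_rate": wrong.size / in_class.size,
            "mean_error_confidence": (
                float(conf[wrong].mean()) if wrong.size else float("nan")
            ),
            "top_confident_errors": wrong[order][:top_k].tolist(),
        }
    return summary

# Illustrative data: class 1 is misclassified twice, once very confidently.
y_true = np.array([0, 0, 0, 1, 1, 1])
y_pred = np.array([0, 1, 0, 1, 0, 0])
conf = np.array([0.9, 0.6, 0.8, 0.7, 0.95, 0.55])
summary = per_class_error_summary(y_true, y_pred, conf)
```

Sorting errors by confidence puts the sample at index 4 first for class 1, which is the one most worth inspecting by hand.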
- trustlens.metrics.failure.confidence_gap(y_true: ndarray, y_pred: ndarray, y_prob: ndarray, n_bins: int = 20) -> dict
Measure the confidence gap — how much more confident is the model on correct predictions than on incorrect ones?
- Returns:
Dictionary with the following keys:
correct_confidence: confidence distribution for correct predictions
incorrect_confidence: confidence distribution for incorrect predictions
gap: mean(correct_conf) - mean(incorrect_conf)
histogram_bins: bin edges for the confidence histogram
correct_hist: histogram counts for correct predictions
incorrect_hist: histogram counts for incorrect predictions
- Return type:
dict
Examples
>>> from trustlens.metrics.failure import confidence_gap
>>> gap_data = confidence_gap(y_true, y_pred, y_prob)
>>> print(f"Confidence gap: {gap_data['gap']:.3f}")
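The histogram outputs can be illustrated with a short NumPy sketch that bins both groups on shared edges. This is an assumption-laden illustration rather than the library's code: the function name is hypothetical, and the choice of equal-width bins over [0, 1] is a guess at one reasonable convention.

```python
import numpy as np

def confidence_histograms(correct, conf, n_bins=20):
    """Histogram confidence separately for correct and incorrect predictions."""
    correct = np.asarray(correct, dtype=bool)
    conf = np.asarray(conf, dtype=float)
    # Shared edges keep the two histograms directly comparable bin by bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    correct_hist, _ = np.histogram(conf[correct], bins=edges)
    incorrect_hist, _ = np.histogram(conf[~correct], bins=edges)
    return edges, correct_hist, incorrect_hist

# Illustrative data: 3 correct and 2 incorrect predictions.
correct = np.array([True, True, False, True, False])
conf = np.array([0.92, 0.81, 0.97, 0.66, 0.58])
edges, correct_hist, incorrect_hist = confidence_histograms(correct, conf)
```

Incorrect-prediction counts piling up in the rightmost bins are the visual counterpart of a narrow or negative gap.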