Overview
TrustLens is a reliability-focused evaluation layer for classification models. It helps teams move from metric reporting to deployment decisions backed by evidence.
Why This Matters
Accuracy alone is often insufficient for production decisions. A model can score high on accuracy while still being unsafe to deploy because of:
overconfident errors
subgroup performance disparity
weak probability calibration
TrustLens addresses these risks by combining diagnostic modules with explicit decision logic.
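As a rough illustration of the first risk, the sketch below compares two made-up sets of predicted probabilities that reach the same accuracy but differ in how confidently they are wrong. The function names, the 0.9 confidence cutoff, and the data are assumptions chosen for illustration; they are not part of TrustLens.

```python
def predict(p, threshold=0.5):
    """Turn a positive-class probability into a hard 0/1 label."""
    return 1 if p >= threshold else 0


def accuracy(y_true, y_prob):
    """Share of samples whose thresholded prediction matches the label."""
    hits = sum(predict(p) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def high_confidence_error_rate(y_true, y_prob, min_confidence=0.9):
    """Share of all samples that are wrong AND predicted with high confidence."""
    errors = 0
    for y, p in zip(y_true, y_prob):
        label = predict(p)
        # Confidence is the probability assigned to the predicted class.
        confidence = p if label == 1 else 1.0 - p
        if label != y and confidence >= min_confidence:
            errors += 1
    return errors / len(y_true)


# Illustrative data only: both probability sets make the same two mistakes.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
cautious_prob = [0.8, 0.7, 0.9, 0.6, 0.45, 0.2, 0.3, 0.1, 0.4, 0.55]
overconfident_prob = [0.95, 0.9, 0.99, 0.9, 0.05, 0.05, 0.1, 0.02, 0.1, 0.97]

for name, probs in [("cautious", cautious_prob), ("overconfident", overconfident_prob)]:
    acc = accuracy(y_true, probs)
    hce = high_confidence_error_rate(y_true, probs)
    print(f"{name}: accuracy={acc:.2f}, high-confidence error rate={hce:.2f}")
```

Both sets score the same accuracy, yet only the second makes its mistakes at 0.95 confidence or higher, which is the kind of difference an accuracy-only report hides.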
What TrustLens Evaluates
TrustLens evaluates models across four dimensions:
Calibration: are predicted probabilities aligned with real outcomes?
Failure behavior: are errors concentrated in high-confidence regions?
Bias and fairness: do important subgroups see uneven performance?
Representation quality: when embeddings are provided, are they well separated?
These diagnostics are combined into a Trust Score, with penalties and blocker rules applied for high-risk conditions.
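To make the calibration question concrete, one common way to quantify it is expected calibration error (ECE): group predictions into confidence bins and average the gap between each bin's accuracy and its mean confidence. The following is a minimal pure-Python sketch of that standard binned estimate. It is not necessarily the metric TrustLens computes, and the function name, bin count, and data are assumptions.

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binned ECE for binary labels and positive-class probabilities."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        label = 1 if p >= 0.5 else 0
        # Confidence is the probability assigned to the predicted class.
        confidence = p if label == 1 else 1.0 - p
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, label == y))

    total = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(c for c, _ in members) / len(members)
        bin_accuracy = sum(hit for _, hit in members) / len(members)
        # Weight each bin's accuracy/confidence gap by its share of samples.
        ece += (len(members) / total) * abs(bin_accuracy - mean_confidence)
    return ece


# Illustrative data only: 8 of 10 predictions are correct, all made with
# confidence of 0.9 or higher.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.95, 0.9, 0.99, 0.9, 0.05, 0.05, 0.1, 0.02, 0.1, 0.97]

ece = expected_calibration_error(y_true, y_prob, n_bins=5)
print(f"ECE: {ece:.3f}")
```

A value near zero means stated confidence tracks observed accuracy. Here the predictions claim roughly 94% confidence on average while being right 80% of the time, so the gap is clearly non-zero.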
Typical Workflow
Run analyze(model, X_val, y_val, y_prob=...).
Inspect the returned TrustReport.
Review score, blockers, and dimension-level outputs.
Export artifacts for CI, governance, or comparison.
What You Get
A TrustLens run produces:
module-level metrics
a composite Trust Score with grade and verdict
narrative insights and detected risk patterns
saveable artifacts for downstream workflows
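The idea of a composite score with a grade, a verdict, and blocker rules can be sketched as follows. This is a hypothetical illustration only: the weights, thresholds, penalty size, grade cutoffs, field names, and the `trust_summary` function are all assumptions, and none of them reflect TrustLens's actual scoring formula or the structure of its TrustReport.

```python
import json

# Hypothetical weights in percentage points; dimension scores lie in [0, 1].
WEIGHTS = {"calibration": 30, "failure": 30, "fairness": 25, "representation": 15}


def trust_summary(scores, high_conf_error_rate, max_subgroup_gap):
    """Combine dimension scores into a score, grade, and verdict (illustrative)."""
    base = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

    # Assumed penalty: subtract points when overconfident errors are common.
    penalty = 10.0 if high_conf_error_rate > 0.05 else 0.0

    # Assumed blocker rule: a large subgroup gap blocks deployment outright,
    # no matter how high the weighted score is.
    blockers = []
    if max_subgroup_gap > 0.15:
        blockers.append("subgroup_gap")

    score = round(max(0.0, base - penalty), 1)
    if score >= 90:
        grade = "A"
    elif score >= 80:
        grade = "B"
    elif score >= 70:
        grade = "C"
    else:
        grade = "D"

    if blockers:
        verdict = "block"
    elif score >= 80:
        verdict = "deploy"
    else:
        verdict = "review"

    return {"score": score, "grade": grade, "verdict": verdict, "blockers": blockers}


# Illustrative inputs only.
summary = trust_summary(
    {"calibration": 0.9, "failure": 0.8, "fairness": 0.95, "representation": 0.7},
    high_conf_error_rate=0.2,
    max_subgroup_gap=0.05,
)

# Serializing the summary gives a CI job or review process something to consume.
artifact = json.dumps(summary, indent=2)
print(artifact)
```

Keeping blockers separate from the numeric score is the design point worth noting: a single high-risk condition can veto deployment even when the weighted average looks healthy.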