Overview

TrustLens is a reliability-focused evaluation layer for classification models. It helps teams move from metric reporting to deployment decisions backed by evidence.

Why This Matters

Accuracy alone is often insufficient for production decisions. A model can score high on accuracy while still being unsafe to deploy because of:

overconfident errors
subgroup performance disparity
weak probability calibration

TrustLens addresses this by combining diagnostic modules and explicit decision logic.
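To make the point concrete, here is a minimal sketch on made-up data. It does not use TrustLens; the records, the 0.9 confidence cutoff, and the subgroup names are all illustrative assumptions. It shows a model with 90% accuracy whose errors are all high-confidence and all fall in one subgroup.

```python
# Toy validation results: (true_label, predicted_label, confidence, subgroup).
# All values are invented for illustration.
records = (
    [(1, 1, 0.90, "A")] * 50
    + [(0, 0, 0.85, "A")] * 30
    + [(1, 1, 0.80, "B")] * 10
    + [(0, 1, 0.97, "B")] * 10  # confident but wrong, all in subgroup B
)


def accuracy(rows):
    """Fraction of rows where the predicted label matches the true label."""
    return sum(t == p for t, p, _, _ in rows) / len(rows)


overall = accuracy(records)

# Share of errors made with confidence at or above an assumed 0.9 cutoff.
errors = [r for r in records if r[0] != r[1]]
high_conf_error_share = sum(r[2] >= 0.9 for r in errors) / len(errors)

# Accuracy per subgroup.
by_group = {}
for row in records:
    by_group.setdefault(row[3], []).append(row)
group_accuracy = {g: accuracy(rows) for g, rows in by_group.items()}

print(f"overall accuracy: {overall:.2f}")
print(f"share of errors with confidence >= 0.9: {high_conf_error_share:.2f}")
print(f"accuracy by subgroup: {group_accuracy}")
```

The headline accuracy looks healthy, yet subgroup B is right only half the time and every error is a confident one.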

What TrustLens Evaluates

TrustLens evaluates models across four dimensions:

Calibration: are predicted probabilities aligned with real outcomes?
Failure behavior: are errors concentrated in high-confidence regions?
Bias and fairness: do important subgroups see uneven performance?
Representation quality: when embeddings are provided, are they well separated?
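As an illustration of the calibration question, the sketch below computes a standard binned expected calibration error (ECE). This is a common textbook metric, shown here under the assumption that something similar underlies a calibration check; the page does not say which calibration metric TrustLens actually uses, and the function name and sample data are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean gap between confidence and accuracy per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        acc = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / n) * abs(avg_conf - acc)
    return ece


# Made-up example: the model reports 0.95 confidence on ten samples
# but is correct on only six of them.
confidences = [0.95] * 10
correct = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
ece = expected_calibration_error(confidences, correct)
print(f"ECE: {ece:.2f}")
```

A value near zero means predicted probabilities track observed accuracy; here the gap between 0.95 confidence and 0.60 accuracy dominates the result.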

These diagnostics are combined into a Trust Score, with penalties and blocker rules applied for high-risk conditions.
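The page does not give the scoring formula, so the following is only a hypothetical sketch of how dimension scores, penalties, and blocker rules could be combined. The function name, weights, the 0.8 pass threshold, and the verdict labels are all assumptions made for illustration, not the TrustLens implementation.

```python
def composite_score(dimension_scores, weights, penalties, blockers):
    """Hypothetical composite: weighted mean of scores, minus penalties.

    dimension_scores / weights: dicts keyed by dimension name, values in [0, 1].
    penalties: list of floats subtracted from the weighted mean.
    blockers: list of reasons; any blocker overrides the numeric verdict.
    """
    total_weight = sum(weights[name] for name in dimension_scores)
    base = sum(
        dimension_scores[name] * weights[name] for name in dimension_scores
    ) / total_weight
    score = max(0.0, base - sum(penalties))

    if blockers:
        verdict = "blocked"  # a blocker wins regardless of the score
    elif score >= 0.8:       # assumed pass threshold
        verdict = "pass"
    else:
        verdict = "review"
    return round(100 * score, 1), verdict


# Invented inputs for the four dimensions named above.
scores = {"calibration": 0.9, "failure": 0.8, "fairness": 0.5, "representation": 0.8}
weights = {name: 1.0 for name in scores}
score, verdict = composite_score(
    scores,
    weights,
    penalties=[0.05],
    blockers=["subgroup accuracy gap above threshold"],
)
print(score, verdict)
```

The design point this illustrates is that a blocker is not averaged away: a model with an acceptable numeric score can still be held back by a single high-risk condition.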

Typical Workflow

Run analyze(model, X_val, y_val, y_prob=...).
Inspect the returned TrustReport.
Review score, blockers, and dimension-level outputs.
Export artifacts for CI, governance, or comparison.
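The first step of the workflow can be sketched as follows. Only the call shape analyze(model, X_val, y_val, y_prob=...) comes from this page. The helper name run_trust_evaluation is hypothetical, the classifier is assumed to expose a scikit-learn-style predict_proba, and analyze is passed in as an argument so the sketch does not assume an import path.

```python
def run_trust_evaluation(analyze, model, X_val, y_val):
    """Compute validation probabilities, then call analyze.

    `analyze` is injected by the caller; this sketch makes no claim about
    where it is imported from or what the returned report exposes.
    """
    # Assumption: the model follows the scikit-learn predict_proba convention.
    y_prob = model.predict_proba(X_val)
    # Call shape taken from the workflow steps above.
    report = analyze(model, X_val, y_val, y_prob=y_prob)
    return report
```

The returned object is the TrustReport from step 2. How to read its score, blockers, and dimension-level outputs, and how to export artifacts, should be taken from the API reference rather than from this sketch.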

What You Get

A TrustLens run produces:

module-level metrics
a composite Trust Score with grade and verdict
narrative insights and detected risk patterns
saveable artifacts for downstream workflows
