# System Architecture TrustLens is designed as a layered pipeline so diagnostic computation, scoring logic, and report interpretation remain decoupled and maintainable. ## Why This Design Matters The architecture aims to solve three practical needs: - stable and testable metric computation - explicit decision logic with traceable rules - clear extension points for contributors ## High-Level Data Flow This diagram shows how inputs move from orchestration through framework-specific resolvers to the agnostic metrics pipeline. ```mermaid graph TD Input[Input: model, X, y_true, y_prob?, sensitive_features?] --> API[api.analyze Orchestration] API --> Registry[Backend Registry] subgraph "Prediction Resolution Layer" Registry --> Detect[Framework Detection] Detect --> Resolver[Framework Resolver: Sklearn/XGBoost] Resolver --> Bundle[PredictionBundle: y_pred, y_prob, metadata] end Bundle --> Pipeline[Core Pipeline: Agnostic] subgraph "Agnostic Metrics Engine" Pipeline --> Calib[Calibration: Brier, ECE] Pipeline --> Fail[Failure: Confidence Gap] Pipeline --> Bias[Bias: Subgroup, Equalized Odds] Pipeline --> Rep[Representation: Silhouette] end Calib --> Results[Results Dict] Fail --> Results Bias --> Results Rep --> Results Results --> Scoring["trust_score.py: base_score vs Penalties"] Scoring --> Report[TrustReport: Narrative & Interpretation] Report --> Output[Console / Plots / Saved Files] ``` **Implementation note**: `analyze()` uses the `Backend Registry` to automatically detect the framework and resolve predictions into a standardized `PredictionBundle`. This ensures the core metrics pipeline remains 100% framework-agnostic. ## Component Interactions The system is decoupled into four primary layers. ```mermaid graph LR API["api.py: Orchestration"] -- "model" --> Backends["backends/: Framework Resolvers"] Backends -- "PredictionBundle" --> Pipeline["core/pipeline.py: Execution Engine"] Pipeline -- "dict: results" --> Report["report.py: TrustReport"] Report -- "dict: results" --> Score["trust_score.py: Scoring Engine"] Score -- "TrustScoreResult" --> Report ``` ### Data Contracts 1. **API to Backends**: Model reference, feature matrix `X`, and optional manual overrides (`y_pred`, `y_prob`). 2. **Backends to Pipeline**: `PredictionBundle` containing standardized numpy arrays, framework identifier, and audit metadata. 3. **Pipeline to Metrics**: Normalized arrays (`y_true`, `y_pred`, `y_prob`) and optional metadata. 4. **Pipeline to Report**: Consolidated `results` plus audit provenance (framework version, resolver details). 5. **Report to Scorer**: Score computation from `results` including penalties and blockers. ## Execution Sequence This sequence shows one `analyze()` call from input to `TrustReport`. ```mermaid sequenceDiagram participant User participant API as api.analyze participant Registry as Backend Registry participant Backend as backends/ participant Pipeline as core.pipeline participant Report as TrustReport User->>API: model, X, y_true API->>Registry: get_resolver(model) Registry-->>API: Resolver function API->>Backend: resolve(model, X) Backend-->>API: PredictionBundle API->>Pipeline: _run_analysis_pipeline(Bundle) Pipeline->>Pipeline: Dispatch Agnostic Modules Pipeline->>Report: Initialize(results, Bundle.metadata) Report-->>User: TrustReport Object ``` ## Layer Responsibilities ### Orchestration Layer (`api.py`) - Coordinates framework detection and prediction resolution. - Validates high-level input shapes and data consistency. - Delegates to the framework-agnostic core pipeline. ### Framework Resolvers (`backends/`) - Handles framework-specific prediction logic (e.g., `predict_proba`). - Normalizes output shapes (e.g., binary `(n,)` → `(n, 2)`). - Captures audit metadata (framework version, model type). - Blocks unsupported tasks (e.g., regression or ranking). ### Core Pipeline (`trustlens/core/pipeline.py`) - **Framework-agnostic** execution engine. - Dispatches enabled metrics modules. - Manages progress tracking and resource isolation. ### Metrics Layer (`trustlens/metrics/`) - Pure math/stat functions for diagnostics. - Independent of models; operates entirely on labels and probabilities. ### Reporting Layer (`trustlens/report.py`) - Packages diagnostic outputs with full backend provenance. - Generates textual and visual summaries. - Supports unified JSON serialization for experiment trackers. ### Comparison Layer (`comparison.py`) - compares candidate `TrustReport` objects - filters blocked candidates - provides deployment recommendation rationale ## Extensibility Path To add a new metric module: 1. implement the metric under `trustlens/metrics/` 2. wire dispatch logic in `analyze()` 3. ensure return format follows existing `results` conventions 4. add tests and documentation 5. optionally integrate into trust score weighting ## Architectural Constraints - optimized for classification evaluation - fairness checks are data- and task-dependent - representation diagnostics require embeddings - some threshold logic is currently heuristic by design ## Related Pages - [Features and Modules](features.md) - [Trust Score Explained](trust_score_explained.md) - [Known Limitations](known_limitations.md) - [Experimental Modules](EXPERIMENTAL.md)