# Implementation plan: Keras

This document is for **contributors** and **maintainers** integrating the **Keras API** (`keras.Model`, `Sequential`, `predict`) with TrustLens so classification models produce NumPy `y_pred` and `y_prob` for the existing analysis pipeline.

**Related plans:** [XGBoost](IMPLEMENTATION_PLAN_XGBoost.md) (tabular backend) · [TensorFlow](IMPLEMENTATION_PLAN_TensorFlow.md) (TensorFlow package, `tf.keras`, SavedModel, CI)

**Status target:** **Experimental** until [`EXPERIMENTAL.md`](../EXPERIMENTAL.md) promotion criteria are met. Do **not** require Keras (or TensorFlow) for `pip install trustlens`.

---

## Scope: “Keras” vs “TensorFlow” in this repo

| Topic | This document (Keras) | [TensorFlow plan](IMPLEMENTATION_PLAN_TensorFlow.md) |
|--------|-------------------------|------------------------------------------------------|
| `model.predict` output shapes, binary vs multiclass | **Primary** | References this plan |
| `keras` PyPI package, `keras.Model` type checks | **Primary** | N/A |
| `tensorflow` PyPI extra, lazy `import tensorflow` | Cross-reference | **Primary** |
| `tf.keras` as deployment vehicle | Note: same API patterns | **Primary** for import/version/CI |
| SavedModel, serving, GPU runtime | Out of scope (v1) | **Primary** |

Many users only use **`tf.keras`**. Implementation may ship **`analyze_keras`** in `trustlens.experimental.keras` using `tf.keras.Model` first; this Keras plan still defines **API semantics** and tests that apply to any Keras-compatible `predict` output.

---

## Executive summary

| Item | Decision |
|------|----------|
| **Goal** | Resolve `y_pred`, `y_prob` from a trained **classification** Keras model and run `_run_analysis_pipeline(...)` (shared with `analyze()`). |
| **Public API (until promotion)** | e.g. `from trustlens.experimental.keras import analyze_keras`; avoid silent heavy imports from `import trustlens`. |
| **Dependencies** | Optional extra, e.g. `keras = ["keras>=3"]` **and/or** overlap with TensorFlow extra when implementation uses `tf.keras`; document the **single supported install path for v1** in `pyproject.toml` and here once chosen. |
| **Import rule** | Lazy `import keras` or lazy `import tensorflow as tf` **only** inside experimental modules (see TensorFlow plan for TF-specific hygiene). |

---

## Prerequisites

1. **`_run_analysis_pipeline`** extracted from `trustlens.api.analyze()` so Keras code does not duplicate calibration / failure / bias / representation logic. See [XGBoost plan](IMPLEMENTATION_PLAN_XGBoost.md) PR A for resolver context; pipeline extraction can land in the same or adjacent PR.
2. Optional **`y_pred=`** / **`y_prob=`** on `analyze()` for power users who bypass Keras resolution.

---

## Objectives (Keras)

1. **Binary classification**
   - Sigmoid output `(n, 1)`: normalize to TrustLens binary convention (two-column `y_prob` or documented single-column path consistent with `api.py`).
   - Softmax output `(n, 2)`: use as-is; `y_pred = argmax(axis=1)`.
2. **Multiclass**
   - Softmax `(n, C)`, `C > 2`: `y_prob` as-is; `y_pred = argmax(axis=1)`.
3. **Input**
   - **v1:** NumPy `X` (and optional `embeddings`) only; call `model.predict(X, verbose=0)` then `np.asarray(..., dtype=np.float64)`.
4. **Examples**
   - `examples/keras_audit.py`: small `Sequential` model on synthetic data, `analyze_keras`, optional `report.save(...)`.
5. **Documentation**
   - `docs/EXPERIMENTAL.md`: Keras subsection covering install, API, limitations, promotion checklist.

---

## Non-goals (v1)

- Multi-label, regression, object detection, dict / ragged inputs.
- Custom training loops, callbacks, or layer freezing logic.
- **Native** multi-backend Keras 3 on JAX/Torch for CI unless maintainers explicitly add jobs (NumPy/TF path is acceptable for v1).
- Monkey-patching `predict_proba` onto Keras models.

---

## Technical design

### 1. Output shape matrix (canonical)

| Head | `predict` shape | `y_prob` (TrustLens) | `y_pred` |
|------|-----------------|----------------------|----------|
| Binary sigmoid | `(n, 1)` | `[1-p, p]` shape `(n, 2)` *recommended* | `argmax` or `(p > 0.5).astype(int)`; **pick one and test** |
| Binary softmax | `(n, 2)` | unchanged | `argmax(axis=1)` |
| Multiclass softmax | `(n, C)` | unchanged | `argmax(axis=1)` |

Centralize normalization in **`resolve_keras_predictions(model, X) -> tuple[y_pred, y_prob]`** in `trustlens/experimental/keras.py` (or submodule).

### 2. Model typing

- Prefer **`isinstance(model, keras.Model)`** when using the standalone `keras` package.
- If v1 targets **`tf.keras.Model`** only, state that explicitly in module docstring and still follow the shape matrix above (types live in TensorFlow plan).

### 3. Integration

```text
analyze_keras(model, X, y_true, *, embeddings=None, ...)
  -> resolve_keras_predictions(model, X)
  -> _run_analysis_pipeline(y_true, y_pred, y_prob, ...)
  -> TrustReport
```

### 4. Embeddings

Same as sklearn path: user-supplied `embeddings` NumPy array. Optional later: helper to attach a Keras intermediate layer (separate feature / not v1).

---

## Files to add or change (checklist)

| Path | Action |
|------|--------|
| `trustlens/experimental/keras.py` | `resolve_keras_predictions`, `analyze_keras`. |
| `trustlens/experimental/__init__.py` | Docstring / limited exports; no heavy imports at import time. |
| `trustlens/api.py` | Export `_run_analysis_pipeline` for experimental use **or** move pipeline to `trustlens/backends/pipeline.py` to avoid circular imports. |
| `pyproject.toml` | Optional `keras` and/or document use of `tensorflow` extra; one clear story. |
| `docs/EXPERIMENTAL.md` | Keras integration subsection. |
| `docs/getting_started.md` | One-line pointer to experimental Keras. |
| `examples/keras_audit.py` | End-to-end demo. |
| `tests/test_keras_experimental.py` | Shape unit tests (pure NumPy) + optional integration tests. |
| `pyproject.toml` markers | e.g. `requires_keras` / reuse `requires_tensorflow` if tests use tf.keras only. |

---

## Testing strategy

1. **Pure NumPy tests**: Feed `resolve_keras_predictions`-level helpers with fixed arrays mimicking `(n,1)`, `(n,2)`, `(n,C)` outputs; run in default CI without Keras installed.
2. **Integration tests**: Behind `pytest.importorskip("keras")` or `tensorflow` depending on v1 choice; binary + 3-class models.
3. **Import hygiene**: `import trustlens` must not import `keras` or `tensorflow` (subprocess test recommended).

---

## CI recommendations

- Default PR CI: **no** heavy Keras/TF install unless job time is acceptable.
- Optional: weekly / `workflow_dispatch` job installing the chosen extra and running marked tests. Coordinate exact markers with [TensorFlow plan](IMPLEMENTATION_PLAN_TensorFlow.md) to avoid duplicate jobs.

---

## Acceptance criteria (Keras experimental “ready”)

- [ ] `analyze_keras` returns `TrustReport` with calibration, failure, bias, representation populated for binary and 3-class toy models.
- [ ] Binary sigmoid and softmax paths both tested.
- [ ] `import trustlens` does not load Keras/TF (per implementation choice).
- [ ] `docs/EXPERIMENTAL.md` updated.
- [ ] `CHANGELOG.md` entry (Experimental).

---

## Suggested PR breakdown

1. **Pipeline extraction**: `_run_analysis_pipeline` only; no Keras.
2. **Experimental Keras**: `resolve_keras_predictions` + `analyze_keras` + tests + example + docs.

---

## Promotion to stable API

When promoted per `EXPERIMENTAL.md`:

- Document `framework="keras"` on main `analyze()` **or** keep `analyze_keras` as the supported entry point.
- Expand CI if Keras becomes a first-class optional extra.

---

## FAQ

**Q: Keras 3 multi-backend vs tf.keras only?**
**A:** v1 should pick **one** supported install to reduce support burden; document the other as “community tested” or future work.

**Q: Overlap with TensorFlow plan?**
**A:** Keras plan owns **shapes and API**; TensorFlow plan owns **TF package**, versions, SavedModel, and lazy-import policy.

---

## Document history

- **Scope:** Keras API only; XGBoost and TensorFlow have separate plans.