Implementation plan: Keras¶
This document is for contributors and maintainers integrating the Keras API (keras.Model, Sequential, predict) with TrustLens so classification models produce NumPy y_pred and y_prob for the existing analysis pipeline.
Related plans: XGBoost (tabular backend) · TensorFlow (TensorFlow package, tf.keras, SavedModel, CI)
Status target: Experimental until EXPERIMENTAL.md promotion criteria are met. Do not require Keras (or TensorFlow) for pip install trustlens.
Scope: “Keras” vs “TensorFlow” in this repo¶
| Topic | This document (Keras) | TensorFlow plan |
|---|---|---|
| | Primary | References this plan |
| | Primary | N/A |
| | Cross-reference | Primary |
| | Note: same API patterns | Primary for import/version/CI |
| SavedModel, serving, GPU runtime | Out of scope (v1) | Primary |
Many users only use tf.keras. Implementation may ship analyze_keras in trustlens.experimental.keras using tf.keras.Model first; this Keras plan still defines API semantics and tests that apply to any Keras-compatible predict output.
Executive summary¶
| Item | Decision |
|---|---|
| Goal | Resolve |
| Public API (until promotion) | e.g. |
| Dependencies | Optional extra, e.g. |
| Import rule | Lazy |
Prerequisites¶
- _run_analysis_pipeline extracted from trustlens.api.analyze() so Keras code does not duplicate calibration / failure / bias / representation logic. See XGBoost plan PR A for resolver context; pipeline extraction can land in the same or adjacent PR.
- Optional y_pred= / y_prob= on analyze() for power users who bypass Keras resolution.
Objectives (Keras)¶
- Binary classification
  - Sigmoid output (n, 1): normalize to TrustLens binary convention (two-column y_prob or documented single-column path consistent with api.py).
  - Softmax output (n, 2): use as-is; y_pred = argmax(axis=1).
- Multiclass softmax (n, C), C > 2: y_prob as-is; y_pred = argmax(axis=1).
- Input v1: NumPy X (and optional embeddings) only; call model.predict(X, verbose=0) then np.asarray(..., dtype=np.float64).
- Examples: examples/keras_audit.py, a small Sequential model on synthetic data, analyze_keras, optional report.save(...).
- Documentation: docs/EXPERIMENTAL.md, Keras subsection (install, API, limitations, promotion checklist).
Non-goals (v1)¶
- Multi-label, regression, object detection, dict / ragged inputs.
- Custom training loops, callbacks, or layer freezing logic.
- Native multi-backend Keras 3 on JAX/Torch for CI unless maintainers explicitly add jobs (numpy/TF path is acceptable for v1).
- Monkey-patching predict_proba onto Keras models.
Technical design¶
1. Output shape matrix (canonical)¶
| Head | | | |
|---|---|---|---|
| Binary sigmoid | | | |
| Binary softmax | | unchanged | |
| Multiclass softmax | | unchanged | |
Centralize normalization in resolve_keras_predictions(model, X) -> tuple[y_pred, y_prob] in trustlens/experimental/keras.py (or submodule).
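The normalization described by the shape matrix can be sketched on the raw array that predict returns, without importing Keras. The helper name normalize_keras_output is hypothetical, and expanding a sigmoid column p into two columns [1 - p, p] is an assumption; the plan leaves the final binary convention to whatever api.py documents.

```python
import numpy as np


def normalize_keras_output(raw):
    """Hypothetical helper: map raw predict() output to (y_pred, y_prob)."""
    y_prob = np.asarray(raw, dtype=np.float64)
    if y_prob.ndim == 1:
        # Assumption: treat a flat vector as a single sigmoid column.
        y_prob = y_prob.reshape(-1, 1)
    if y_prob.ndim != 2:
        raise ValueError(f"expected 2-D predictions, got shape {y_prob.shape}")
    if y_prob.shape[1] == 1:
        # Binary sigmoid: expand p to [1 - p, p] (assumed two-column convention).
        p = y_prob[:, 0]
        y_prob = np.column_stack([1.0 - p, p])
    # Binary softmax (n, 2) and multiclass (n, C) pass through unchanged.
    y_pred = np.argmax(y_prob, axis=1)
    return y_pred, y_prob
```

Because argmax returns the first maximal index, a sigmoid output of exactly 0.5 maps to class 0 under this sketch. Keeping the helper NumPy-only is what lets the shape tests run in default CI without Keras installed.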
2. Model typing¶
- Prefer isinstance(model, keras.Model) when using the standalone keras package.
- If v1 targets tf.keras.Model only, state that explicitly in module docstring and still follow the shape matrix above (types live in TensorFlow plan).
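One way to apply the isinstance check without importing Keras or TensorFlow is to look the package up in sys.modules: a model instance can normally exist only if its framework has already been imported by the caller. This is a sketch under that assumption; the function name is_keras_model is hypothetical and not part of the plan.

```python
import sys


def is_keras_model(model):
    """Hypothetical check: True if model is a keras.Model or tf.keras.Model.

    Only inspects frameworks the caller has already imported, so calling
    this never triggers a heavy import on its own.
    """
    keras = sys.modules.get("keras")
    model_cls = getattr(keras, "Model", None)
    if model_cls is not None and isinstance(model, model_cls):
        return True

    tf = sys.modules.get("tensorflow")
    tf_keras = getattr(tf, "keras", None)
    tf_model_cls = getattr(tf_keras, "Model", None)
    if tf_model_cls is not None and isinstance(model, tf_model_cls):
        return True

    return False
```

If v1 settles on tf.keras.Model only, the first branch could be dropped and the module docstring would state that choice.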
3. Integration¶
analyze_keras(model, X, y_true, *, embeddings=None, ...)
-> resolve_keras_predictions(model, X)
-> _run_analysis_pipeline(y_true, y_pred, y_prob, ...)
-> TrustReport
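The call flow above can be sketched end to end with stand-ins. Everything here is illustrative: _run_analysis_pipeline is a stub returning a plain dict instead of a TrustReport, FakeSigmoidModel is a duck-typed object in place of a real Keras model, and the two-column expansion of sigmoid output is an assumed binary convention.

```python
import numpy as np


def resolve_keras_predictions(model, X):
    """Sketch: call predict once and normalize to (y_pred, y_prob)."""
    y_prob = np.asarray(model.predict(X, verbose=0), dtype=np.float64)
    if y_prob.ndim == 2 and y_prob.shape[1] == 1:
        p = y_prob[:, 0]
        y_prob = np.column_stack([1.0 - p, p])  # assumed binary convention
    return np.argmax(y_prob, axis=1), y_prob


def _run_analysis_pipeline(y_true, y_pred, y_prob, embeddings=None):
    """Stub for the shared pipeline; the real one would build a TrustReport."""
    return {
        "accuracy": float(np.mean(np.asarray(y_true) == y_pred)),
        "n_classes": int(y_prob.shape[1]),
        "has_embeddings": embeddings is not None,
    }


def analyze_keras(model, X, y_true, *, embeddings=None):
    """Resolve predictions, then hand off to the shared pipeline."""
    y_pred, y_prob = resolve_keras_predictions(model, X)
    return _run_analysis_pipeline(y_true, y_pred, y_prob, embeddings=embeddings)


class FakeSigmoidModel:
    """Duck-typed stand-in exposing predict(X, verbose=0) with an (n, 1) output."""

    def predict(self, X, verbose=0):
        z = np.asarray(X, dtype=np.float64).sum(axis=1, keepdims=True)
        return 1.0 / (1.0 + np.exp(-z))


report = analyze_keras(FakeSigmoidModel(), np.array([[2.0], [-2.0]]), [1, 0])
```

The point of the split is that analyze_keras contains no analysis logic of its own; only the resolver knows about Keras output shapes.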
4. Embeddings¶
Same as sklearn path: user-supplied embeddings NumPy array. Optional later: helper to attach a Keras intermediate layer (separate feature / not v1).
Files to add or change (checklist)¶
Path |
Action |
|---|---|
|
|
|
Docstring / limited exports; no heavy imports at import time. |
|
Export |
|
Optional |
|
Keras integration subsection. |
|
One-line pointer to experimental Keras. |
|
End-to-end demo. |
|
Shape unit tests (pure NumPy) + optional integration tests. |
|
e.g. |
Testing strategy¶
- Pure NumPy tests: feed resolve_keras_predictions-level helpers with fixed arrays mimicking (n, 1), (n, 2), (n, C) outputs; run in default CI without Keras installed.
- Integration tests: behind pytest.importorskip("keras") or tensorflow depending on v1 choice; binary + 3-class models.
- Import hygiene: import trustlens must not import keras or tensorflow (subprocess test recommended).
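The import hygiene check needs a fresh interpreter, because a package already imported elsewhere in the test session would stay in sys.modules and mask the result. A minimal sketch of the subprocess approach follows; the helper name loaded_after_import is hypothetical.

```python
import json
import subprocess
import sys


def loaded_after_import(module, forbidden):
    """Import `module` in a fresh interpreter; return the forbidden packages it loaded."""
    code = (
        "import importlib, json, sys\n"
        f"importlib.import_module({module!r})\n"
        f"print(json.dumps([m for m in {list(forbidden)!r} if m in sys.modules]))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        check=True,
        capture_output=True,
        text=True,
    )
    # The report is the last line of stdout, in case the import itself prints.
    return json.loads(result.stdout.strip().splitlines()[-1])
```

In the repository, the test would presumably assert that loaded_after_import("trustlens", ["keras", "tensorflow"]) returns an empty list.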
CI recommendations¶
- Default PR CI: no heavy Keras/TF install unless job time is acceptable.
- Optional: weekly / workflow_dispatch job installing the chosen extra and running marked tests.
- Coordinate exact markers with TensorFlow plan to avoid duplicate jobs.
Acceptance criteria (Keras experimental “ready”)¶
- [ ] analyze_keras returns TrustReport with calibration, failure, bias, representation populated for binary and 3-class toy models.
- [ ] Binary sigmoid and softmax paths both tested.
- [ ] import trustlens does not load Keras/TF (per implementation choice).
- [ ] docs/EXPERIMENTAL.md updated.
- [ ] CHANGELOG.md entry (Experimental).
Suggested PR breakdown¶
1. Pipeline extraction: _run_analysis_pipeline only; no Keras.
2. Experimental Keras: resolve_keras_predictions + analyze_keras + tests + example + docs.
Promotion to stable API¶
When promoted per EXPERIMENTAL.md:
- Document framework="keras" on main analyze() or keep analyze_keras as the supported entry point.
- Expand CI if Keras becomes a first-class optional extra.
FAQ¶
Q: Keras 3 multi-backend vs tf.keras only? A: v1 should pick one supported install to reduce support burden; document the other as “community tested” or future work.
Q: Overlap with TensorFlow plan? A: Keras plan owns shapes and API; TensorFlow plan owns TF package, versions, SavedModel, and lazy-import policy.
Document history¶
Scope: Keras API only; XGBoost and TensorFlow have separate plans.