Representation Metrics

Representation metrics evaluate embedding geometry to estimate whether class structure is separable in latent space.

Why This Matters

Even when aggregate metrics look acceptable, weak latent separation can signal fragile generalization and higher failure risk.

When to Use

  • when embeddings are available from model internals

  • when diagnosing class overlap or feature-space quality

  • when comparing representation quality across model variants

Inputs and Assumptions

  • embeddings: latent vectors for each evaluated sample

  • y_true: labels aligned with embeddings

  • representation analysis is optional and only runs when embeddings are provided

Output and Interpretation

Key outputs include:

  • Silhouette score: higher values indicate better class separation

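  • Within/between class distances: mean pairwise distance inside each class versus across classes; a between/within ratio above 1 indicates that classes sit farther from each other than their own members do

  • CKA utility: a similarity score between two sets of embeddings (for example, two layers or two model variants)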

Low separability should prompt deeper feature and model diagnostics; it is not a conclusion on its own.

Limitations and Caveats

  • representation quality depends on embedding extraction method

  • silhouette estimates can be unstable with small or highly imbalanced samples

  • representation outputs should be interpreted with calibration and failure diagnostics, not in isolation

2D Embedding Visualization

Use plot_embedding_2d() to project high-dimensional embeddings into a 2D scatter plot color-coded by class label.

report = TrustReport(...)
report.plot_embedding_2d(method="umap", save_path="clusters.png")
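To make the idea of a 2D projection concrete, here is a minimal NumPy sketch of a PCA projection, the simplest of the three projection methods. It is an illustration only, not TrustLens code: `project_pca_2d` is a hypothetical helper, and the embeddings are synthetic.

```python
import numpy as np


def project_pca_2d(embeddings: np.ndarray) -> np.ndarray:
    """Project (n_samples, dim) embeddings onto their top two principal axes."""
    X = np.asarray(embeddings, dtype=float)
    if X.ndim != 2 or X.shape[0] < 2 or X.shape[1] < 2:
        raise ValueError("expected shape (n_samples >= 2, dim >= 2)")
    # Center each feature so the SVD captures variance, not the mean offset.
    Xc = X - X.mean(axis=0, keepdims=True)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T


rng = np.random.default_rng(0)
# Synthetic stand-in for real embeddings: two Gaussian blobs in 16 dimensions.
embeddings = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 16)),
    rng.normal(4.0, 1.0, size=(100, 16)),
])
y_true = np.repeat([0, 1], 100)

coords = project_pca_2d(embeddings)
print(coords.shape)  # (200, 2)
```

The resulting `coords` array can be passed to any scatter-plot routine and colored by `y_true`.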

Fallback Behavior

The method parameter controls the projection algorithm:

  • "umap" (default) — requires umap-learn; falls back to t-SNE, then PCA

  • "tsne" — uses scikit-learn TSNE; falls back to PCA

  • "pca" — uses scikit-learn PCA (always available)

If the requested library is not installed, TrustLens silently falls back to the next available method. No extra dependencies are forced on users.
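The fallback chain can be pictured as a small resolver that walks the order umap, t-SNE, PCA until it finds an importable backend. The sketch below is an assumption about how such logic might look, not the TrustLens implementation; `resolve_projection_method` and the module mapping are hypothetical, and it models only import availability (the conditions under which t-SNE would fall back to PCA are not stated above).

```python
from importlib.util import find_spec
from typing import Callable

# Hypothetical mapping from method name to the module that provides it.
_REQUIRED_MODULE = {"umap": "umap", "tsne": "sklearn", "pca": "sklearn"}
# Fallback order described in the text: UMAP, then t-SNE, then PCA.
_FALLBACK_ORDER = ["umap", "tsne", "pca"]


def _module_installed(name: str) -> bool:
    """Return True if a top-level module can be imported in this environment."""
    return find_spec(name) is not None


def resolve_projection_method(
    requested: str,
    is_available: Callable[[str], bool] = _module_installed,
) -> str:
    """Pick the first usable method at or after `requested` in the fallback order."""
    if requested not in _FALLBACK_ORDER:
        raise ValueError(f"unknown method: {requested!r}")
    start = _FALLBACK_ORDER.index(requested)
    for method in _FALLBACK_ORDER[start:]:
        if is_available(_REQUIRED_MODULE[method]):
            return method
    # The text treats PCA as always available, so it is the terminal choice.
    return "pca"


# The result depends on which optional packages are installed locally.
print(resolve_projection_method("umap"))
```

Passing `is_available` as a parameter keeps the resolver testable without installing or removing packages.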

Subsampling

When the number of samples exceeds n_max (default 5000), the function automatically subsamples to keep runtime and plot density manageable:

report.plot_embedding_2d(n_max=3000)
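A subsampling step of this kind might look like the sketch below. It assumes uniform random sampling without replacement, which the text does not confirm for plot_embedding_2d(); `subsample` and its `seed` parameter are hypothetical names used for illustration.

```python
import numpy as np


def subsample(embeddings, labels, n_max=5000, seed=0):
    """Return at most n_max rows, keeping embeddings and labels aligned."""
    embeddings = np.asarray(embeddings)
    labels = np.asarray(labels)
    if len(embeddings) != len(labels):
        raise ValueError("embeddings and labels must have the same length")
    if len(embeddings) <= n_max:
        return embeddings, labels
    rng = np.random.default_rng(seed)
    # Draw row indices once and apply them to both arrays to preserve pairing.
    idx = rng.choice(len(embeddings), size=n_max, replace=False)
    return embeddings[idx], labels[idx]


rng = np.random.default_rng(1)
embeddings = rng.normal(size=(8000, 32))
y_true = rng.integers(0, 4, size=8000)

sub_emb, sub_y = subsample(embeddings, y_true, n_max=3000)
print(sub_emb.shape, sub_y.shape)  # (3000, 32) (3000,)
```

Uniform sampling roughly preserves class proportions but can drop very rare classes entirely; a stratified draw is an alternative when that matters.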

Integration with report.plot()

When embeddings were passed to analyze(), calling report.plot() automatically generates both the separability scorecard and the 2D scatter projection. No extra call is needed.

API Reference

trustlens.metrics.representation

Representation space analysis.

Probes the geometry of learned embedding spaces to understand:

  • whether classes are well separated

  • how similar two representation layers are (CKA)

  • whether cluster structure aligns with ground-truth labels

Metrics implemented

  • embedding_separability — silhouette score + within/between class distance

  • centered_kernel_alignment — measures representational similarity between two sets of embeddings (e.g., two layers)

References

  • Kornblith, S., et al. (2019). Similarity of Neural Network Representations Revisited. ICML.

  • Rousseeuw, P. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics.

trustlens.metrics.representation.embedding_separability(embeddings: ndarray, y_true: ndarray, metric: str = 'euclidean', sample_limit: int = 5000) -> dict

Measure how well class embeddings are separated in latent space.

Uses the silhouette score as the primary separability measure, augmented with within-class and between-class mean distances.

Parameters:
  • embeddings (np.ndarray) – Latent representations, shape (n_samples, embedding_dim).

  • y_true (np.ndarray) – Ground-truth labels, shape (n_samples,).

  • metric (str) – Distance metric passed to silhouette_score. Default "euclidean".

  • sample_limit (int) – Maximum samples used for silhouette computation (avoids O(n²) cost). A random subsample is drawn when len(embeddings) > sample_limit.

Returns:

dict with keys:

  • silhouette_score: in [-1, 1]; 1.0 = perfect separation

  • within_class_distance: mean pairwise distance within classes

  • between_class_distance: mean pairwise distance across classes

  • separability_ratio: between / within (> 1 preferred)

Return type:

dict

Examples

>>> sep = embedding_separability(embeddings, y_true)
>>> print(f"Silhouette: {sep['silhouette_score']:.3f}")
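To show what these four quantities mean, here is a from-scratch NumPy sketch using Euclidean distance and the silhouette definition from Rousseeuw (1987). It is not the TrustLens implementation, which passes `metric` to silhouette_score and may aggregate distances differently; `separability_sketch` is a hypothetical name and the data is synthetic.

```python
import numpy as np


def separability_sketch(embeddings, y_true):
    """Silhouette plus within/between mean pairwise distances (Euclidean)."""
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(y_true)
    classes = np.unique(y)
    if len(classes) < 2:
        raise ValueError("need at least two classes")

    # Full pairwise distance matrix: O(n^2) memory, hence a sample limit in practice.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    same = y[:, None] == y[None, :]
    off_diag = ~np.eye(len(X), dtype=bool)
    if not (same & off_diag).any():
        raise ValueError("need at least one class with two or more samples")
    within = D[same & off_diag].mean()
    between = D[~same].mean()

    # Silhouette: s_i = (b_i - a_i) / max(a_i, b_i), where a_i is the mean distance
    # to the sample's own class and b_i the mean distance to the nearest other class.
    n = len(X)
    a = np.zeros(n)
    b = np.full(n, np.inf)
    singleton = np.zeros(n, dtype=bool)
    for c in classes:
        members = y == c
        size = int(members.sum())
        if size > 1:
            # Exclude the zero self-distance by dividing by size - 1.
            a[members] = D[np.ix_(members, members)].sum(axis=1) / (size - 1)
        else:
            singleton[members] = True
        mean_to_c = D[:, members].mean(axis=1)
        b[~members] = np.minimum(b[~members], mean_to_c[~members])
    denom = np.maximum(a, b)
    s = np.where(denom > 0, (b - a) / np.where(denom > 0, denom, 1.0), 0.0)
    s[singleton] = 0.0  # convention: a lone sample in its class scores 0

    return {
        "silhouette_score": float(s.mean()),
        "within_class_distance": float(within),
        "between_class_distance": float(between),
        "separability_ratio": float(between / within),
    }


rng = np.random.default_rng(0)
embeddings = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 16)),
    rng.normal(4.0, 1.0, size=(100, 16)),
])
y_true = np.repeat([0, 1], 100)
sep = separability_sketch(embeddings, y_true)
print({k: round(v, 3) for k, v in sep.items()})
```

On well-separated blobs like these, the silhouette is clearly positive and the ratio is well above 1; overlapping classes push both toward 0 and 1 respectively.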

trustlens.metrics.representation.centered_kernel_alignment(X: ndarray, Y: ndarray) -> float

Compute Centered Kernel Alignment (CKA) between two representation matrices.

CKA is a representational similarity metric that is invariant to orthogonal transformations and isotropic scaling, making it suitable for comparing representations across architectures and layers.

\[\text{CKA}(K, L) = \frac{\text{HSIC}(K, L)}{\sqrt{\text{HSIC}(K, K) \cdot \text{HSIC}(L, L)}}\]
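For reference, Kornblith et al. (2019) define the empirical HSIC estimator as follows, where K and L are the n × n Gram matrices of the two representations and H is the centering matrix. With linear kernels the ratio reduces to the second expression, written for column-centered X and Y. Which kernel and estimator TrustLens uses is not stated here, so treat this as background rather than a description of the implementation.

```latex
\[
\text{HSIC}(K, L) = \frac{1}{(n-1)^2}\,\operatorname{tr}(K H L H),
\qquad H = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^{\top}
\]
\[
\text{CKA}_{\text{linear}}(X, Y) =
\frac{\lVert Y^{\top} X \rVert_F^2}{\lVert X^{\top} X \rVert_F \,\lVert Y^{\top} Y \rVert_F}
\]
```

The normalization constant in HSIC cancels in the CKA ratio, which is why the score is unaffected by isotropic scaling.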
Parameters:
  • X (np.ndarray) – First representation matrix, shape (n_samples, d1).

  • Y (np.ndarray) – Second representation matrix, shape (n_samples, d2).

Returns:

CKA similarity score in [0, 1]. Higher → more similar representations.

Return type:

float

Raises:

ValueError – If X and Y have different numbers of samples.

Examples

>>> cka = centered_kernel_alignment(layer1_embeddings, layer2_embeddings)
>>> print(f"CKA similarity: {cka:.3f}")
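The invariance properties mentioned above can be checked with a small linear-kernel CKA written in NumPy. This is a sketch under the assumption of a linear kernel; `linear_cka` is a hypothetical helper and not the TrustLens function.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between (n_samples, d1) and (n_samples, d2) matrices."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must have the same number of samples")
    # Centering the features is equivalent to centering the linear Gram matrices.
    Xc = X - X.mean(axis=0, keepdims=True)
    Yc = Y - Y.mean(axis=0, keepdims=True)
    # With linear kernels, HSIC(K, L) is proportional to ||Yc^T Xc||_F^2.
    cross = np.linalg.norm(Yc.T @ Xc, ord="fro") ** 2
    norm_x = np.linalg.norm(Xc.T @ Xc, ord="fro")
    norm_y = np.linalg.norm(Yc.T @ Yc, ord="fro")
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("representations must have non-zero variance")
    return float(cross / (norm_x * norm_y))


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))

# Build a random orthogonal matrix and apply rotation plus isotropic scaling.
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
X_transformed = 3.0 * X @ Q

# An unrelated random representation, for contrast.
Y_random = rng.normal(size=(50, 8))

cka_same = linear_cka(X, X_transformed)
cka_random = linear_cka(X, Y_random)
print(f"rotated+scaled copy: {cka_same:.3f}")
print(f"unrelated features:  {cka_random:.3f}")
```

Note that with few samples relative to the feature dimension, unrelated representations can still score noticeably above 0, so CKA values are best compared across runs with the same sample count.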