playnano.analysis.modules.k_means_clustering module
K-Means clustering on features over the entire stack in 3D (x, y, time).
This module extracts a point-cloud from per-frame feature dictionaries (e.g. coordinates + timestamps), optionally normalizes each axis to [0,1], applies K-Means with a user-supplied k, then returns cluster assignments, cluster centers (in original coordinate units), and a summary.
- Parameters:
coord_key (str) – Key in previous_results whose value is features_per_frame (list of lists of dicts).
coord_columns (Sequence[str]) – Which keys in each feature dict to use (e.g. (“x”, “y”)).
use_time (bool) – If True and coord_columns has length 2, append the frame time as the third dimension.
k (int) – Number of clusters.
normalise (bool) – If True, min-max normalize each axis before clustering.
time_weight (float | None) – If given, multiply the time axis by this weight.
**kmeans_kwargs – Forwarded to sklearn.cluster.KMeans.
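As a rough illustration of the point-cloud step described above, the sketch below flattens a features_per_frame structure into (x, y, t) rows. The helper name build_point_cloud, the sample feature dicts, and the frame_times list are assumptions made for this example; they are not part of the playnano API, and the module's own extraction may differ.

```python
from typing import Sequence


def build_point_cloud(
    features_per_frame: list[list[dict]],
    frame_times: Sequence[float],
    coord_columns: Sequence[str] = ("centroid_x", "centroid_y"),
    use_time: bool = True,
) -> tuple[list[tuple[float, ...]], list[int]]:
    """Flatten per-frame feature dicts into coordinate rows.

    Hypothetical helper: returns the rows plus the frame index of each row.
    """
    points: list[tuple[float, ...]] = []
    frames: list[int] = []
    for frame_idx, features in enumerate(features_per_frame):
        for feat in features:
            row = [float(feat[col]) for col in coord_columns]
            # Append the frame time only when the spatial part is 2D.
            if use_time and len(coord_columns) == 2:
                row.append(float(frame_times[frame_idx]))
            points.append(tuple(row))
            frames.append(frame_idx)
    return points, frames


# Made-up data: two frames, three detected features in total.
features_per_frame = [
    [{"centroid_x": 1.0, "centroid_y": 2.0}],
    [{"centroid_x": 1.5, "centroid_y": 2.5}, {"centroid_x": 9.0, "centroid_y": 8.0}],
]
frame_times = [0.0, 0.5]

points, frames = build_point_cloud(features_per_frame, frame_times)
points_2d, _ = build_point_cloud(features_per_frame, frame_times, use_time=False)
print(points)
print(frames)
```

Keeping the frame index alongside each row makes it possible to report, per cluster, which frames its members came from.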
- class playnano.analysis.modules.k_means_clustering.KMeansClusteringModule
Bases: AnalysisModule
Cluster features across all frames using K-Means in 2D or 3D (x, y[, time]).
Extracts point coordinates from per-frame features, applies optional normalization and time weighting, then performs K-Means clustering. Returns cluster assignments, centers in original scale, and a summary report.
- Parameters:
coord_key (str) – Key in previous_results pointing to ‘features_per_frame’ structure.
coord_columns (Sequence[str]) – Keys to extract coordinates from each feature (e.g. (“x”, “y”)).
use_time (bool) – If True, appends frame timestamp as a third clustering dimension.
k (int) – Number of clusters to fit.
normalise (bool) – If True, normalize each axis to [0, 1] before clustering.
time_weight (float or None) – Optional multiplier for time axis after normalization.
**kmeans_kwargs – Additional keyword arguments passed to sklearn.cluster.KMeans.
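The normalization, time weighting, and rescaling of centers described above might look roughly like the following. This is a sketch under stated assumptions, not the module's actual implementation: the function name cluster_points is hypothetical, constant axes are given a span of 1 to avoid division by zero, time_weight is assumed to be non-zero, and the last column is assumed to be the time axis.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_points(points, k, normalise=True, time_weight=None, **kmeans_kwargs):
    """Hypothetical sketch: scale, weight the last (time) axis, fit K-Means."""
    pts = np.asarray(points, dtype=float)
    lo = pts.min(axis=0)
    span = pts.max(axis=0) - lo
    span[span == 0] = 1.0  # assumption: leave constant axes unscaled

    scaled = (pts - lo) / span if normalise else pts.copy()
    if time_weight is not None:
        scaled[:, -1] *= time_weight  # weight applied after normalization

    # Extra keyword arguments go straight to sklearn.cluster.KMeans.
    km = KMeans(n_clusters=k, **kmeans_kwargs).fit(scaled)

    # Undo the weighting and scaling so centers are in original units.
    centers = km.cluster_centers_.copy()
    if time_weight is not None:
        centers[:, -1] /= time_weight
    if normalise:
        centers = centers * span + lo
    return km.labels_, centers


# Made-up (x, y, t) points forming two well-separated groups.
points = [(0, 0, 0.0), (1, 1, 1.0), (100, 100, 9.0), (101, 99, 10.0)]
labels, centers = cluster_points(
    points, k=2, time_weight=0.5, n_init=10, random_state=0
)
print(labels)
print(centers)
```

Because min-max scaling and weighting are affine, the mean of a cluster in scaled space maps back to the mean of its members in original units. A time_weight below 1 makes temporal separation count for less than spatial separation; a value above 1 does the opposite.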
Version
-------
0.1.0
- property name: str
Name of the analysis module.
- Returns:
The string identifier for this module.
- Return type:
str
- requires = ['feature_detection', 'log_blob_detection']
- run(stack, previous_results: dict[str, Any] | None = None, *, detection_module: str = 'feature_detection', coord_key: str = 'features_per_frame', coord_columns: Sequence[str] = ('centroid_x', 'centroid_y'), use_time: bool = True, k: int, normalise: bool = True, time_weight: float | None = None, **kmeans_kwargs: Any) → dict[str, Any]
Perform K-Means clustering on features extracted from a stack.
Constructs a coordinate array from features (x, y[, t]), optionally applies normalization and time weighting, and fits k-means to assign clusters.
- Parameters:
stack (AFMImageStack) – The input image stack providing frame times and data context.
previous_results (dict[str, Any], optional) – Dictionary containing outputs from previous analysis steps. Must contain the selected detection_module and coord_key.
detection_module (str) – Key identifying which previous module’s output to use. Default is “feature_detection”.
coord_key (str) – Key under the detection module that holds per-frame feature dicts. Default is “features_per_frame”.
coord_columns (Sequence[str]) – Keys to extract from each feature for clustering coordinates. If any are missing, the module attempts to fall back to the “centroid” tuple. Default is (“centroid_x”, “centroid_y”).
use_time (bool) – If True and coord_columns has two entries, append the frame timestamp as a third dimension. Default is True.
k (int) – Number of clusters to compute.
normalise (bool) – Whether to min-max normalize each axis of the feature points before clustering. Default is True.
time_weight (float or None, optional) – Weighting factor for time axis (applied after normalization). Only used if time is included as a third dimension.
**kmeans_kwargs (dict) – Additional arguments forwarded to sklearn.cluster.KMeans.
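The coord_columns lookup with its “centroid” fallback could be sketched as below. The function name feature_coords is hypothetical, and the ordering of the “centroid” tuple is an assumption (taken here to match the order of coord_columns); check the detection module's output before relying on it.

```python
from typing import Sequence


def feature_coords(feature: dict, coord_columns: Sequence[str]) -> tuple[float, ...]:
    """Hypothetical lookup: named columns first, then a "centroid" tuple."""
    if all(col in feature for col in coord_columns):
        return tuple(float(feature[col]) for col in coord_columns)
    if "centroid" in feature:
        # Assumption: the tuple is ordered like coord_columns.
        centroid = feature["centroid"]
        return tuple(float(v) for v in centroid[: len(coord_columns)])
    missing = [col for col in coord_columns if col not in feature]
    raise KeyError(f"feature is missing coordinate keys: {missing}")


cols = ("centroid_x", "centroid_y")
named = feature_coords({"centroid_x": 3, "centroid_y": 4}, cols)
fallback = feature_coords({"centroid": (5, 6)}, cols)
print(named, fallback)
```

A feature with neither the named columns nor a “centroid” entry raises KeyError in this sketch, mirroring the documented behaviour of run.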
- Returns:
A dictionary with the following keys:
- “clusters” : list of dicts, each with:
id : int
frames : list of int
point_indices : list of int
coords : list of tuple. The normalized coordinates used in clustering for each point in the cluster (e.g., (x, y[, t])).
- “cluster_centers” : ndarray of shape (k, D). Cluster centers in original coordinate units.
- “summary” : dict with:
“n_clusters” : int
“members_per_cluster” : dict mapping cluster id to point count
- Return type:
dict[str, Any]
- Raises:
RuntimeError – If the required detection_module output is not found in previous_results.
KeyError – If the required coordinate keys are missing in any feature dictionary.
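To show how the returned dictionary might be consumed, the sketch below walks a made-up result with the documented keys. All values are illustrative, a nested list stands in for the ndarray of centers, and integer keys in members_per_cluster are an assumption.

```python
# Made-up result with the documented keys; values are illustrative only.
result = {
    "clusters": [
        {
            "id": 0,
            "frames": [0, 1],
            "point_indices": [0, 1],
            "coords": [(0.0, 0.0, 0.0), (0.1, 0.1, 1.0)],
        },
        {
            "id": 1,
            "frames": [1],
            "point_indices": [2],
            "coords": [(1.0, 1.0, 1.0)],
        },
    ],
    "cluster_centers": [[1.25, 2.25, 0.25], [9.0, 8.0, 0.5]],  # ndarray in practice
    "summary": {"n_clusters": 2, "members_per_cluster": {0: 2, 1: 1}},
}


def frames_by_cluster(result: dict) -> dict[int, list[int]]:
    """Map each cluster id to the sorted, unique frames it appears in."""
    return {c["id"]: sorted(set(c["frames"])) for c in result["clusters"]}


spans = frames_by_cluster(result)
for cluster in result["clusters"]:
    cid = cluster["id"]
    count = result["summary"]["members_per_cluster"][cid]
    print(f"cluster {cid}: {count} points, frames {spans[cid]}")
```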
- version = '0.1.0'