playnano.analysis.modules.dbscan_clustering module¶
DBSCAN clustering on features over the entire stack in 3D (x, y, time).
This module extracts feature points from a previous analysis step, optionally normalizes them, applies DBSCAN, and returns clusters, cluster cores, and a summary. Noise points (DBSCAN label -1) are omitted or optionally retained.
- param coord_key:
Key in previous_results containing features_per_frame.
- type coord_key:
str
- param coord_columns:
Which keys in each feature-dict to use (e.g. (“x”, “y”)).
- type coord_columns:
Sequence[str]
- param use_time:
If True and coord_columns length is 2, append frame time as the third dimension.
- type use_time:
bool
- param eps:
The maximum distance between two samples for them to be considered as in the same neighborhood (in normalized units if normalise=True).
- type eps:
float
- param min_samples:
The number of samples in a neighborhood for a point to be considered as a core point.
- type min_samples:
int
- param normalise:
If True, min-max normalize each axis before clustering.
- type normalise:
bool
- param time_weight:
If given, multiply the time axis by this weight.
- type time_weight:
float | None
- param **dbscan_kwargs:
Forwarded to sklearn.cluster.DBSCAN.
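The preprocessing described by coord_columns, use_time, normalise and time_weight can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the helper build_points, the frame_times list, and the assumed layout of features_per_frame (one list of feature-dicts per frame) are hypothetical.

```python
import numpy as np


def build_points(features_per_frame, frame_times,
                 coord_columns=("centroid_x", "centroid_y"),
                 use_time=True, normalise=True, time_weight=None):
    """Hypothetical helper: stack per-frame feature-dicts into an (N, D) array."""
    add_time = use_time and len(coord_columns) == 2
    rows = []
    for frame_idx, features in enumerate(features_per_frame):
        for feat in features:
            row = [float(feat[col]) for col in coord_columns]
            if add_time:
                # Append the frame timestamp as the third coordinate.
                row.append(float(frame_times[frame_idx]))
            rows.append(row)
    points = np.asarray(rows, dtype=float)
    if points.size == 0:
        return points
    if normalise:
        # Min-max normalise each axis independently to [0, 1].
        lo = points.min(axis=0)
        span = points.max(axis=0) - lo
        span[span == 0] = 1.0  # avoid division by zero on constant axes
        points = (points - lo) / span
    if time_weight is not None and add_time:
        # Scale the time axis after normalisation.
        points[:, -1] *= time_weight
    return points


# Assumed input layout: one list of feature-dicts per frame.
features_per_frame = [
    [{"centroid_x": 0.0, "centroid_y": 10.0}],
    [{"centroid_x": 5.0, "centroid_y": 20.0}],
    [{"centroid_x": 10.0, "centroid_y": 30.0}],
]
frame_times = [0.0, 1.0, 2.0]

points = build_points(features_per_frame, frame_times, time_weight=0.5)
print(points)
```

With time_weight below 1, points that are close in (x, y) but separated in time are pulled closer together, so a given eps tolerates longer gaps between frames.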
- class playnano.analysis.modules.dbscan_clustering.DBSCANClusteringModule[source]¶
Bases: AnalysisModule
DBSCAN clustering of features across an AFMImageStack in (x, y, time) space.
This module extracts coordinates from per-frame features, optionally adds time as a third dimension, normalizes the space, and applies DBSCAN clustering. It returns clusters with point metadata, core point means as cluster centers, and a summary of cluster sizes.
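The clustering step and the derived outputs can be sketched directly with sklearn.cluster.DBSCAN. This is an illustration under stated assumptions rather than the module's own code: the points array is invented and assumed to be already normalised, and the centres here are computed in the clustered space, whereas the module documents its cluster_centers in original coordinate units.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Invented, already-normalised (x, y, t) points: two tight groups and one outlier.
points = np.array([
    [0.10, 0.10, 0.00],
    [0.12, 0.11, 0.05],
    [0.11, 0.13, 0.10],
    [0.80, 0.82, 0.50],
    [0.81, 0.80, 0.55],
    [0.83, 0.81, 0.60],
    [0.50, 0.05, 0.95],  # isolated point, expected to be labelled as noise
])

# In scikit-learn, min_samples counts the point itself.
model = DBSCAN(eps=0.3, min_samples=3).fit(points)
labels = model.labels_  # -1 marks noise

# Drop the noise label and count the members of each cluster.
cluster_ids = sorted(set(labels.tolist()) - {-1})
members_per_cluster = {cid: int(np.sum(labels == cid)) for cid in cluster_ids}

# Cluster centre as the mean of each cluster's core points.
core_mask = np.zeros(len(points), dtype=bool)
core_mask[model.core_sample_indices_] = True
centers = np.array(
    [points[(labels == cid) & core_mask].mean(axis=0) for cid in cluster_ids]
)

print(labels.tolist())
print(members_per_cluster)
print(centers)
```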
Version¶
0.1.0
- property name: str¶
Name of the analysis module.
- Returns:
The string identifier for this module: “dbscan_clustering”.
- Return type:
str
- requires = ['feature_detection', 'log_blob_detection']¶
- run(stack, previous_results: dict[str, Any] | None = None, *, detection_module: str = 'feature_detection', coord_key: str = 'features_per_frame', coord_columns: Sequence[str] = ('centroid_x', 'centroid_y'), use_time: bool = True, eps: float = 0.3, min_samples: int = 5, normalise: bool = True, time_weight: float | None = None, **dbscan_kwargs: Any) dict[str, Any][source]¶
Perform DBSCAN clustering on detected features in (x, y[, t]) space.
- Parameters:
stack (AFMImageStack) – The input stack with .data and .time_for_frame() method.
previous_results (dict[str, Any], optional) – Output from previous analysis steps. Must contain features under the given detection_module and coord_key.
detection_module (str) – Which module’s output to use from previous_results. Default is “feature_detection”.
coord_key (str) – Key in previous_results[detection_module] containing the list of per-frame features. Default is “features_per_frame”.
coord_columns (Sequence[str]) – Keys used to extract coordinates from each feature. If the keys are missing, the module falls back to the centroid tuple. Default is (“centroid_x”, “centroid_y”).
use_time (bool) – Whether to append the frame timestamp as a third coordinate. Default is True.
eps (float) – Maximum distance for neighborhood inclusion (in normalized units if normalise=True). Default is 0.3.
min_samples (int) – Minimum number of points in a neighborhood to form a core point. Default is 5.
normalise (bool) – If True, normalize coordinate axes to [0, 1] range before clustering. Default is True.
time_weight (float or None, optional) – Scaling factor for the time axis (after normalization). If None, no weighting is applied. Default is None.
**dbscan_kwargs (dict) – Additional keyword arguments forwarded to sklearn.cluster.DBSCAN.
- Returns:
Output dictionary with the following keys:
- “clusters”: list of dicts, one per cluster, containing:
“id”: cluster ID (int)
“frames”: list of frame indices
“point_indices”: list of feature indices within frames
“coords”: list of 2D or 3D coordinates (post-normalization)
- “cluster_centers”: np.ndarray of shape (n_clusters, D)
Mean location of each cluster in original coordinate units.
- “summary”: dict with:
“n_clusters”: total number of clusters found
“members_per_cluster”: dict of cluster ID to count
- Return type:
dict[str, Any]
- version = '0.1.0'¶
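The dictionary returned by run() can be consumed along the following lines. The result below is a hand-written mock that follows the documented keys (“clusters”, “cluster_centers”, “summary”); its values are invented for illustration, and “cluster_centers” is a nested list here rather than the documented np.ndarray.

```python
# Mock of the documented return structure; all values are invented.
result = {
    "clusters": [
        {
            "id": 0,
            "frames": [0, 1, 2],
            "point_indices": [0, 0, 1],
            "coords": [[0.10, 0.10, 0.0], [0.12, 0.11, 0.5], [0.11, 0.13, 1.0]],
        },
        {
            "id": 1,
            "frames": [1, 2],
            "point_indices": [1, 0],
            "coords": [[0.80, 0.82, 0.5], [0.81, 0.80, 1.0]],
        },
    ],
    # An np.ndarray of shape (n_clusters, D) in the real output.
    "cluster_centers": [[12.0, 11.0, 1.0], [80.0, 81.0, 1.5]],
    "summary": {"n_clusters": 2, "members_per_cluster": {0: 3, 1: 2}},
}

# First and last frame in which each cluster appears.
frame_span = {}
for cluster in result["clusters"]:
    frames = cluster["frames"]
    frame_span[cluster["id"]] = (min(frames), max(frames))

for cid, (first, last) in frame_span.items():
    size = result["summary"]["members_per_cluster"][cid]
    print(f"cluster {cid}: {size} points, frames {first} to {last}")
```

Each entry in “frames” pairs with the entry at the same position in “point_indices” and “coords”, so a single point is identified by its frame index and its feature index within that frame.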