playnano.analysis.modules.dbscan_clustering module

DBSCAN clustering on features over the entire stack in 3D (x, y, time).

This module extracts feature points from a previous analysis step, optionally normalizes them, and applies DBSCAN. It returns the clusters, the cluster cores, and a summary; noise points (label -1) are omitted or optionally retained.

Parameters:
  • coord_key (str) – Key in previous_results containing features_per_frame.

  • coord_columns (Sequence[str]) – Which keys in each feature-dict to use (e.g. (“x”, “y”)).

  • use_time (bool) – If True and coord_columns length is 2, append frame time as the third dimension.

  • eps (float) – The maximum distance between two samples for them to be considered as in the same neighborhood (in normalized units if normalise=True).

  • min_samples (int) – The number of samples in a neighborhood for a point to be considered as a core point.

  • normalise (bool) – If True, min-max normalize each axis before clustering.

  • time_weight (float | None) – If given, multiply the time axis by this weight.

  • **dbscan_kwargs – Forwarded to sklearn.cluster.DBSCAN.
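
The preprocessing described above can be sketched as follows. This is a minimal illustration and not the module's actual code: the helper name build_points, the shape of features_per_frame (a list of per-frame lists of feature dicts), and the frame_times list are all assumptions made for the example.

```python
import numpy as np


def build_points(features_per_frame, frame_times, coord_columns=("x", "y"),
                 use_time=True, normalise=True, time_weight=None):
    """Hypothetical helper: stack feature coordinates into an (N, D) array."""
    # Time is only appended when exactly two spatial columns are given.
    add_time = use_time and len(coord_columns) == 2
    n_dims = len(coord_columns) + (1 if add_time else 0)

    rows = []
    for frame_idx, features in enumerate(features_per_frame):
        for feat in features:
            row = [float(feat[key]) for key in coord_columns]
            if add_time:
                row.append(float(frame_times[frame_idx]))
            rows.append(row)
    if not rows:
        return np.empty((0, n_dims))
    points = np.asarray(rows, dtype=float)

    if normalise:
        # Min-max normalise each axis to [0, 1]; constant axes map to 0.
        lo = points.min(axis=0)
        span = points.max(axis=0) - lo
        span[span == 0] = 1.0
        points = (points - lo) / span

    if add_time and time_weight is not None:
        # Stretch or compress the time axis relative to x and y.
        points[:, 2] *= time_weight
    return points


# Assumed input layout: one list of feature dicts per frame.
features_per_frame = [
    [{"x": 0.0, "y": 0.0}, {"x": 10.0, "y": 20.0}],
    [{"x": 5.0, "y": 10.0}],
]
points = build_points(features_per_frame, frame_times=[0.0, 2.0], time_weight=0.5)
print(points)
```

Because every axis is rescaled to [0, 1] before the weight is applied, a time_weight below 1 makes points in different frames look closer together, and a value above 1 pushes them apart.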

class playnano.analysis.modules.dbscan_clustering.DBSCANClusteringModule

Bases: AnalysisModule

DBSCAN clustering of features across an AFMImageStack in (x, y, time) space.

This module extracts coordinates from per-frame features, optionally adds time as a third dimension, normalizes the space, and applies DBSCAN clustering. It returns clusters with point metadata, core point means as cluster centers, and a summary of cluster sizes.

Version

0.1.0

property name: str

Name of the analysis module.

Returns:

The string identifier for this module: “dbscan_clustering”.

Return type:

str

requires = ['feature_detection', 'log_blob_detection']

run(stack, previous_results: dict[str, Any] | None = None, *, detection_module: str = 'feature_detection', coord_key: str = 'features_per_frame', coord_columns: Sequence[str] = ('centroid_x', 'centroid_y'), use_time: bool = True, eps: float = 0.3, min_samples: int = 5, normalise: bool = True, time_weight: float | None = None, **dbscan_kwargs: Any) → dict[str, Any]

Perform DBSCAN clustering on detected features in (x, y[, t]) space.

Parameters:
  • stack (AFMImageStack) – The input stack, which must provide a .data attribute and a .time_for_frame() method.

  • previous_results (dict[str, Any], optional) – Output from previous analysis steps. Must contain features under the given detection_module and coord_key.

  • detection_module (str) – Which module’s output to use from previous_results. Default is “feature_detection”.

  • coord_key (str) – Key in previous_results[detection_module] containing the list of per-frame features. Default is “features_per_frame”.

  • coord_columns (Sequence[str]) – Keys used to extract coordinates from each feature. If these keys are missing, the module falls back to the feature's centroid tuple. Default is (“centroid_x”, “centroid_y”).

  • use_time (bool) – Whether to append the frame timestamp as a third coordinate. Default is True.

  • eps (float) – Maximum distance for neighborhood inclusion (in normalized units if normalise=True). Default is 0.3.

  • min_samples (int) – Minimum number of points in a neighborhood to form a core point. Default is 5.

  • normalise (bool) – If True, normalize coordinate axes to [0, 1] range before clustering. Default is True.

  • time_weight (float or None, optional) – Scaling factor for the time axis (after normalization). If None, no weighting is applied.

  • **dbscan_kwargs (dict) – Additional keyword arguments forwarded to sklearn.cluster.DBSCAN.
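
To illustrate how eps and min_samples behave on normalised coordinates, the sketch below runs sklearn.cluster.DBSCAN directly on a small synthetic point set. The data and variable names are invented for the example, and it does not reproduce the module's internals.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Two tight synthetic groups in normalised (x, y, t) space plus one isolated point.
group_a = np.array([0.2, 0.2, 0.1]) + 0.01 * rng.standard_normal((10, 3))
group_b = np.array([0.8, 0.7, 0.9]) + 0.01 * rng.standard_normal((10, 3))
outlier = np.array([[0.5, 0.0, 0.5]])
points = np.vstack([group_a, group_b, outlier])

# eps is measured in the same (normalised) units as the points.
model = DBSCAN(eps=0.3, min_samples=5).fit(points)
labels = model.labels_

n_clusters = len(set(labels.tolist()) - {-1})  # label -1 marks noise
n_noise = int(np.sum(labels == -1))
print(n_clusters, n_noise)
```

With these values the two groups should come out as separate clusters and the isolated point as noise, since it lies more than eps away from every other point. On [0, 1]-normalised axes an eps of 0.3 spans almost a third of each axis, so smaller values may be needed for densely populated stacks.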

Returns:

Output dictionary with the following keys:

  • “clusters”: list of dicts, one per cluster, containing:
    • “id”: cluster ID (int)

    • “frames”: list of frame indices

    • “point_indices”: list of feature indices within frames

    • “coords”: list of 2D or 3D coordinates (post-normalization)

  • “cluster_centers”: np.ndarray of shape (n_clusters, D)

    Mean location of each cluster in original coordinate units.

  • “summary”: dict with:
    • “n_clusters”: total number of clusters found

    • “members_per_cluster”: dict of cluster ID to count

Return type:

dict[str, Any]
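
As a rough sketch of how a result with this layout could be assembled from DBSCAN labels, consider the hypothetical helper below. The name assemble_result and its arguments are assumptions for illustration only. As a simplification, each center here is the plain mean of all cluster members, whereas the class description mentions means of core points.

```python
import numpy as np


def assemble_result(labels, frames, point_indices, coords, raw_coords):
    """Hypothetical helper: group labelled points into the documented layout."""
    labels = np.asarray(labels)
    coords = np.asarray(coords, dtype=float)
    raw_coords = np.asarray(raw_coords, dtype=float)

    clusters, centers, members = [], [], {}
    for cid in sorted(set(labels.tolist()) - {-1}):  # skip noise (label -1)
        mask = labels == cid
        idx = np.flatnonzero(mask)
        clusters.append({
            "id": int(cid),
            "frames": [frames[i] for i in idx],
            "point_indices": [point_indices[i] for i in idx],
            "coords": coords[mask].tolist(),  # post-normalisation coordinates
        })
        # Simplification: mean over all members, in original units.
        centers.append(raw_coords[mask].mean(axis=0))
        members[int(cid)] = int(mask.sum())

    n_dims = raw_coords.shape[1] if raw_coords.ndim == 2 else 0
    return {
        "clusters": clusters,
        "cluster_centers": np.array(centers).reshape(len(centers), n_dims),
        "summary": {"n_clusters": len(clusters), "members_per_cluster": members},
    }


result = assemble_result(
    labels=[0, 0, -1, 1],
    frames=[0, 1, 1, 2],
    point_indices=[0, 0, 1, 0],
    coords=[[0.0, 0.0], [0.1, 0.0], [0.9, 0.9], [1.0, 1.0]],
    raw_coords=[[0.0, 0.0], [2.0, 0.0], [18.0, 18.0], [20.0, 20.0]],
)
print(result["summary"])
```

The third point carries label -1 and is therefore left out of every cluster, which matches the default behaviour of omitting noise.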

version = '0.1.0'