playnano.processing.video_processing module

Video processing functions for AFM time-series (stacks of frames).

This module provides functions that operate on 3D numpy arrays (time-series of 2D AFM frames). These include:

  • Frame alignment to compensate for drift

  • Cropping and padding utilities

  • Temporal (time-domain) filters

  • Future extensions such as spatio-temporal denoising

All functions follow a NumPy-style API: input stacks are 3D arrays with shape (n_frames, height, width). Outputs are processed stacks and a metadata dictionary.

playnano.processing.video_processing.align_frames(stack: ndarray, reference_frame: int = 0, method: str = 'fft_cross_correlation', mode: str = 'pad', debug: bool = False, max_shift: int | None = None, pre_filter_sigma: float | None = None, max_jump: int | None = None)[source]

Align a stack of AFM frames to a reference frame using integer-pixel shifts.

Alignment is performed using either FFT-based or full cross-correlation. Jump smoothing prevents abrupt unrealistic displacements between consecutive frames by limiting the change in shift relative to the previous frame.

Parameters:
  • stack (np.ndarray[float]) – 3D array of shape (n_frames, height, width) containing the input AFM image stack.

  • reference_frame (int, optional) – Index of the frame to use as the alignment reference (default 0). Must be within [0, n_frames-1].

  • method ({"fft_cross_correlation", "full_cross_correlation"}, optional) – Alignment method (default “fft_cross_correlation”). FFT-based cross-correlation is generally faster and uses less memory for large frames.

  • mode ({"pad", "crop", "crop_square"}, optional) – How to handle borders after shifting: - “pad”: keep all frames with NaN padding (default) - “crop”: crop to intersection of all frames - “crop_square”: crop to largest centered square

  • debug (bool, optional) – If True, returns additional diagnostic outputs.

  • max_shift (int, optional) – Maximum allowed shift in pixels. Detected shifts are clipped to this range.

  • pre_filter_sigma (float, optional) – Standard deviation of Gaussian filter applied to frames before cross-correlation.

  • max_jump (int, optional) – Maximum allowed change in shift between consecutive frames. If exceeded, the shift is replaced by a linear extrapolation from the previous two frames.

Returns:

  • aligned_stack (np.ndarray[float]) – Aligned 3D stack of frames. Shape may be larger than input to accommodate all shifts.

  • metadata (dict) – Dictionary containing alignment information:

    • “reference_frame”: int, index of the reference frame

    • “method”: str, the alignment method used

    • “mode”: str, border approach used

    • “shifts”: np.ndarray of shape (n_frames, 2), detected (dy, dx) shifts

    • “original_shape”: tuple of (height, width)

    • “aligned_shape”: tuple of (height, width) of the output canvas

    • “border_mask”: np.ndarray[bool], True where valid frame pixels exist

    • “pre_filter_sigma”: float or None

    • “max_shift”: int or None

    • “max_jump”: int or None

  • debug_outputs (dict, optional) – Returned only if debug=True. Contains “shifts”, a copy of the detected shifts array.

Raises:
  • ValueError – If stack.ndim is not 3.

  • ValueError – If method is not one of {“fft_cross_correlation”, “full_cross_correlation”}.

  • ValueError – If reference_frame is not in the range [0, n_frames-1].

Notes

  • Using fft_cross_correlation reduces memory usage compared to full cross-correlation because it leverages the FFT algorithm and avoids creating large full correlation matrices.

  • Padding with NaNs allows all frames to be placed without clipping, but may increase memory usage for large shifts.

  • The function does not interpolate subpixel shifts; all shifts are integer-valued.

Examples

>>> import numpy as np
>>> from playnano.processing.video_processing import align_frames
>>> stack = np.random.rand(10, 200, 200)  # 10 frames of 200x200 pixels
>>> aligned_stack, metadata = align_frames(stack, reference_frame=0)
>>> aligned_stack.shape
(10, 210, 210)  # padded to accommodate shifts
>>> metadata['shifts']
array([[ 0,  0],
       [ 1, -2],
       ...])
playnano.processing.video_processing.crop_square(stack: ndarray, pad=0) → tuple[ndarray, dict][source]

Crop aligned stack to the largest centered square region.

The square is computed from the finite-pixel intersection across frames, with optional outward padding filled with np.nan.

Parameters:
  • stack (ndarray of shape (n_frames, height, width)) – Input aligned stack with possible NaN padding.

  • pad (int or tuple, optional (default=0)) –

    Extra pixels to add around the square bounds. Accepts:

    • int: uniform pad

    • (v, h): vertical and horizontal pad

    • (top, bottom, left, right): per-side pad

Returns:

  • cropped (ndarray) – Cropped (and possibly padded) square stack.

  • meta (dict) – Metadata including the original shape, intersection shape, square size, bounds, padding details, and the offset within the intersection crop (for compatibility with the original function).
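
A minimal usage sketch (the NaN layout and the resulting shape are illustrative; the actual output depends on the finite-pixel intersection detected in your stack):

>>> import numpy as np
>>> from playnano.processing.video_processing import crop_square
>>> stack = np.full((5, 100, 120), np.nan)
>>> stack[:, 10:90, 20:110] = np.random.rand(5, 80, 90)  # finite 80x90 region shared by all frames
>>> square, meta = crop_square(stack, pad=0)
>>> # square.shape is expected to be (5, 80, 80): the largest centered square inside the 80x90 intersection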

playnano.processing.video_processing.intersection_crop(stack: ndarray, pad=0) → tuple[ndarray, dict][source]

Crop aligned stack to the largest common intersection region (finite across frames).

Optional padding can expand the crop beyond the intersection; regions outside the data are filled with NaN.

Parameters:
  • stack (ndarray of shape (n_frames, height, width)) – Input aligned stack with NaN padding.

  • pad (int or tuple, optional (default=0)) –

    Extra pixels to add around the intersection bounds. Accepts:

    • int: uniform pad

    • (v, h): vertical and horizontal pad

    • (top, bottom, left, right): per-side pad

Returns:

  • cropped (ndarray) – Cropped (and possibly padded) stack.

  • meta (dict) – Metadata including original shape, intersection bounds, requested bounds, actual padding applied, and new shape.
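
A minimal usage sketch (shapes are illustrative; the crop follows the finite-pixel intersection of the stack):

>>> import numpy as np
>>> from playnano.processing.video_processing import intersection_crop
>>> stack = np.full((5, 100, 120), np.nan)
>>> stack[:, 10:90, 20:110] = np.random.rand(5, 80, 90)  # finite 80x90 region shared by all frames
>>> cropped, meta = intersection_crop(stack)  # default pad=0
>>> # cropped.shape is expected to be (5, 80, 90); with pad > 0 the extra border is filled with NaN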

playnano.processing.video_processing.register_video_processing() → dict[str, Callable][source]

Return a dictionary of registered video processing filters.

Keys are names of the operations, values are the functions themselves. These functions should take a 3D stack (n_frames, H, W) and return either an ndarray (filtered stack) or a tuple (stack, metadata).
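
A sketch of how a pipeline might consume the registry; the key used below ("temporal_mean_filter") is hypothetical, and the actual names depend on what the module registers:

>>> import numpy as np
>>> from playnano.processing.video_processing import register_video_processing
>>> filters = register_video_processing()
>>> names = sorted(filters)  # inspect the registered operation names
>>> stack = np.random.rand(10, 64, 64)
>>> func = filters["temporal_mean_filter"]  # hypothetical key
>>> result = func(stack, window=3)
>>> filtered = result[0] if isinstance(result, tuple) else result  # some operations also return metadata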

playnano.processing.video_processing.replace_nan(stack: ndarray, mode: Literal['zero', 'mean', 'median', 'global_mean', 'constant'] = 'zero', value: float | None = None) → tuple[ndarray, dict][source]

Replace NaN values in a 2D frame or 3D AFM image stack using various strategies.

Primarily used in video pipelines after alignment, but also applicable to single frames.

Parameters:
  • stack (np.ndarray) – Input 3D array of shape (n_frames, height, width) or 2D frame (height, width) that may contain NaN values.

  • mode ({"zero", "mean", "median", "global_mean", "constant"}, optional) – Replacement strategy. Default is “zero”. - “zero” : Replace NaNs with 0. - “mean” : Replace NaNs with the mean of each frame. - “median” : Replace NaNs with the median of each frame. - “global_mean” : Replace NaNs with the mean of the entire stack. - “constant” : Replace NaNs with a user-specified constant value.

  • value (float, optional) – Constant value to use when mode=”constant”. Must be provided in that case.

Returns:

  • filled (np.ndarray) – Stack of the same shape as stack with NaNs replaced according to mode.

  • meta (dict) – Metadata about the NaN replacement operation (e.g., count, mode, constant used).

Raises:

ValueError – If mode is unknown or if mode=”constant” and value is not provided.

Notes

  • Frame-wise operations like “mean” and “median” compute statistics per frame independently.

  • Preserves the dtype of the input stack.
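
A minimal usage sketch, e.g. to fill the NaN border left by alignment before further filtering (the stack below is synthetic):

>>> import numpy as np
>>> from playnano.processing.video_processing import replace_nan
>>> stack = np.random.rand(5, 64, 64)
>>> stack[:, :4, :] = np.nan  # simulate NaN padding from alignment
>>> filled, meta = replace_nan(stack, mode="median")  # per-frame median fills the NaNs
>>> np.isnan(filled).any()
False
>>> filled_const, meta = replace_nan(stack, mode="constant", value=0.0)  # explicit fill value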

playnano.processing.video_processing.rolling_frame_align(stack: ndarray, window: int = 5, mode: str = 'pad', debug: bool = False, max_shift: int | None = None, pre_filter_sigma: float | None = None, max_jump: int | None = None)[source]

Align a stack of AFM frames using a rolling reference and integer pixel shifts.

This function computes frame-to-frame shifts relative to a rolling reference (average of the last window aligned frames) using phase cross-correlation. Each frame is then placed on a canvas large enough to accommodate all shifts. Optional jump smoothing prevents sudden unrealistic displacements between consecutive frames, and optional Gaussian pre-filtering can improve correlation robustness for noisy data.

Parameters:
  • stack (np.ndarray[float]) – 3D array of shape (n_frames, height, width) containing the image frames.

  • window (int, optional) – Number of previous aligned frames to average when building the rolling reference. Default is 5.

  • mode ({"pad", "crop", "crop_square"}, optional) – How to handle borders after shifting: - “pad”: keep all frames with NaN padding (default) - “crop”: crop to intersection of all frames - “crop_square”: crop to largest centered square

  • debug (bool, optional) – If True, returns additional diagnostic outputs such as the rolling reference frames. Default is False.

  • max_shift (int, optional) – Maximum allowed shift in pixels along either axis. Detected shifts are clipped. Default is None (no clipping).

  • pre_filter_sigma (float, optional) – Standard deviation of Gaussian filter applied to both reference and moving frames prior to cross-correlation. Helps reduce noise. Default is None.

  • max_jump (int, optional) – Maximum allowed jump in pixels between consecutive frame shifts. If exceeded, the shift is replaced by a linear extrapolation from the previous two shifts. Default is None (no jump smoothing).

Returns:

  • aligned_stack (np.ndarray[float]) – 3D array of shape (n_frames, canvas_height, canvas_width) containing the aligned frames. NaN values indicate areas outside the original frames after alignment.

  • metadata (dict) – Dictionary containing alignment information:

    • “window”: int, rolling reference window used

    • “method”: str, alignment method used

    • “mode”: str, border approach used

    • “shifts”: ndarray of shape (n_frames, 2), detected integer shifts (dy, dx)

    • “original_shape”: tuple of (height, width)

    • “aligned_shape”: tuple of (canvas_height, canvas_width)

    • “border_mask”: ndarray of shape (canvas_height, canvas_width), True where valid pixels exist

    • “pre_filter_sigma”: float or None

    • “max_shift”: int or None

    • “max_jump”: int or None

  • debug_outputs (dict, optional) – Returned only if debug=True. Contains:

    • “shifts”: copy of the detected shifts array

    • “aligned_refs”: deque of indices used for rolling reference

Notes

  • The rolling reference is computed using the last window aligned frames, ignoring NaN pixels.

  • Shifts are integer-valued; no subpixel interpolation is performed.

  • Padding ensures all frames fit without clipping, but increases memory usage.

  • Internally, a deque aligned_refs tracks which patches of which frames contribute to the rolling reference. Each entry stores:

    (frame_index, y0c, y1c, x0c, x1c, fy0, fy1, fx0, fx1),

    i.e. both the region of the canvas updated and the corresponding slice in the original frame. This allows exact removal of old contributions from rolling_sum and rolling_count when the window is exceeded, ensuring consistency without recomputation.

Examples

>>> import numpy as np
>>> from playnano.processing.video_processing import rolling_frame_align
>>> stack = np.random.rand(10, 200, 200)  # 10 frames of 200x200 pixels
>>> aligned_stack, metadata = rolling_frame_align(stack, window=3)
>>> aligned_stack.shape
(10, 210, 210)
>>> metadata['shifts']
array([[0, 0],
       [1, -1],
       ...])
playnano.processing.video_processing.temporal_mean_filter(stack: ndarray, window: int = 3) → ndarray[source]

Apply mean filter across the time dimension.

Parameters:
  • stack (ndarray of shape (n_frames, height, width)) – Input stack.

  • window (int, optional) – Window size (number of frames). Default is 3.

Returns:

filtered – Stack after temporal mean filtering.

Return type:

ndarray of shape (n_frames, height, width)
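
A minimal usage sketch (the output shape matches the input, per the return description above):

>>> import numpy as np
>>> from playnano.processing.video_processing import temporal_mean_filter
>>> stack = np.random.rand(20, 128, 128)
>>> smoothed = temporal_mean_filter(stack, window=5)  # average each pixel over a 5-frame window
>>> smoothed.shape
(20, 128, 128)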

playnano.processing.video_processing.temporal_median_filter(stack: ndarray, window: int = 3) → ndarray[source]

Apply median filter across the time dimension.

Parameters:
  • stack (ndarray of shape (n_frames, height, width)) – Input stack.

  • window (int, optional) – Window size (number of frames). Default is 3.

Returns:

filtered – Stack after temporal median filtering.

Return type:

ndarray of shape (n_frames, height, width)
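
A minimal usage sketch; the temporal median is useful for suppressing transient, single-frame artefacts (the spike below is synthetic):

>>> import numpy as np
>>> from playnano.processing.video_processing import temporal_median_filter
>>> stack = np.random.rand(20, 128, 128)
>>> stack[10, 50, 50] = 100.0  # transient spike in a single frame
>>> filtered = temporal_median_filter(stack, window=3)
>>> filtered.shape
(20, 128, 128)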