Processing

The playNano.processing subpackage provides tools for flattening, filtering and masking AFM image stacks. Processing is applied frame-by-frame to stacks shaped (n_frames, height, width) so that snapshots and per-step provenance are retained while preserving stack shape.

This page covers:

  • quick start and CLI examples
  • common filters and masks
  • how to supply pipelines (inline or YAML)
  • programmatic usage
  • a concise summary of what the pipeline records

See also the Command Line Interface (CLI), GUI: Interactive Playback and Analysis pages.

Quick start

Processing can be applied in batch mode using the process subcommand, which applies a series of filters and exports the results from the CLI:

playnano process ./tests/resources/sample_0.h5-jpk \
    --processing "remove_plane;threshold_mask:threshold=1.5;row_median_align;gaussian_filter:sigma=2.0" \
    --export tif,npz \
    --make-gif \
    --output-folder ./results \
    --output-name sample_processed

Or use a YAML pipeline file (e.g. pipeline.yaml):

filters:
  - name: remove_plane
  - name: threshold_mask
    threshold: 2
  - name: polynomial_flatten
    order: 2
  - name: gaussian_filter
    sigma: 2.0

and pass it to the process subcommand:

playnano process ./tests/resources/sample_0.h5-jpk --processing-file pipeline.yaml
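For reference, a pipeline file like the one above is plain YAML that parses into an ordered list of steps. This sketch (assuming only that each entry has a `name` plus optional parameters, and that PyYAML is installed) shows the structure such a file describes; it is illustrative, not the package's own loader:

```python
import yaml  # PyYAML

pipeline_yaml = """
filters:
  - name: remove_plane
  - name: threshold_mask
    threshold: 2
  - name: polynomial_flatten
    order: 2
  - name: gaussian_filter
    sigma: 2.0
"""

spec = yaml.safe_load(pipeline_yaml)

# Each entry becomes a (step_name, params) pair, in order.
steps = [(entry.pop("name"), entry) for entry in spec["filters"]]
for name, params in steps:
    print(name, params)
```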

Concepts & behaviour

  • Processing operations are 2D functions applied to each frame independently.

  • Supported step types:

      - Filters - modify image data (flattening, smoothing, alignment).
      - Masks - boolean masks used to exclude regions from subsequent filters.
      - Plugins - third-party filters registered via entry points.

  • The pipeline maintains snapshots for raw and intermediate results and records detailed provenance for reproducibility.

  • After a run, the pipeline updates ``stack.data`` to the final processed array (so downstream code sees processed frames by default).

Built-in filters and masks

A number of built-in functions are available for processing AFM data.

These functions take a NumPy array as an argument, along with any parameters, and return a NumPy array: filters return an array of floats, while mask functions return a boolean array.

Certain filters (e.g. remove_plane, row_median_align) support masked computation. When a binary mask is provided, the operation is applied to the full image, but its internal parameters are estimated only from unmasked pixels. This is useful when regions of the image contain artifacts, noise, or irrelevant features that should not influence the operation, but the correction itself must be applied globally (e.g. flattening based only on background pixels).
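To make the masked-computation idea concrete, here is an illustrative NumPy sketch (a simplified stand-in, not the library's own implementation) of plane removal where the plane is fitted only to unmasked background pixels but subtracted from every pixel:

```python
import numpy as np

def remove_plane_masked(frame, mask=None):
    """Fit z = a*x + b*y + c to unmasked pixels; subtract the plane everywhere."""
    h, w = frame.shape
    yy, xx = np.mgrid[0:h, 0:w]
    keep = np.ones((h, w), dtype=bool) if mask is None else ~mask
    # Least-squares plane fit using only background (unmasked) pixels.
    A = np.column_stack([xx[keep], yy[keep], np.ones(keep.sum())])
    a, b, c = np.linalg.lstsq(A, frame[keep], rcond=None)[0]
    plane = a * xx + b * yy + c
    return frame - plane  # correction applied globally

# Tilted background plus a tall feature that should not bias the fit
frame = 0.5 * np.arange(16.0).reshape(4, 4)   # plane: 0.5*x + 2*y
frame[1:3, 1:3] += 10.0                        # "feature"
mask = frame > 8.0                             # masks only the feature here
flat = remove_plane_masked(frame, mask)
```

With the mask supplied, the background flattens to zero and the feature's height is preserved; fitting without the mask would tilt the plane toward the feature.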

The output of each function is saved as a step snapshot, and the masks can also be reused in analysis pipelines.

Filters

  • remove_plane - fit and subtract a 2D plane (useful for tilt removal).

  • polynomial_flatten - fit and subtract a 2D polynomial surface.
      - parameter: order (int, default: 2)

  • row_median_align - subtract median per row to remove horizontal banding.

  • zero_mean - subtract the global mean (centres the data around zero, or centres the background around zero if a foreground mask is applied).

  • gaussian_filter - Gaussian smoothing.
      - parameter: sigma (float, default: 1.0)
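As an illustration of what the row-wise filters do, the effect of row_median_align can be sketched in a few lines of NumPy (a simplified stand-in, not the package's exact code):

```python
import numpy as np

def row_median_align_sketch(frame):
    """Subtract each row's median to suppress horizontal banding."""
    return frame - np.median(frame, axis=1, keepdims=True)

# Rows with different offsets ("banding") plus one feature pixel
frame = np.array([
    [1.0, 1.0, 1.0, 1.0, 5.0],
    [4.0, 4.0, 4.0, 4.0, 4.0],
    [-2.0, -2.0, -2.0, -2.0, -2.0],
])
aligned = row_median_align_sketch(frame)
```

The row offsets vanish while the isolated feature survives, because the median is robust to a few outlying pixels per row.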

Masks

  • mask_threshold - mask values above a threshold.
      - parameter: threshold (float, default: 0.0)

  • mask_below_threshold - mask values below a threshold.
      - parameter: threshold (float, default: 0.0)

  • mask_mean_offset - mask values beyond mean ± factor × std.
      - parameter: factor (float, default: 1.0)

  • mask_morphological - thresholding followed by morphological closing.
      - parameter: threshold (float)
      - parameter: structure_size (int, default: 3)

  • mask_adaptive - block-wise adaptive thresholding.
      - parameter: block_size (int, default: 5)
      - parameter: offset (float, default: 0.0)

Note

Masks are combined using logical OR (new masks overlay previous ones). Use the clear step to reset masks.
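The OR-overlay behaviour can be pictured with plain NumPy booleans (illustrative only; the pipeline manages this for you):

```python
import numpy as np

frame = np.array([[0.0, 1.0],
                  [3.0, -3.0]])

mask_high = frame > 2.0    # e.g. values above a threshold
mask_low = frame < -2.0    # e.g. values below a threshold

# A new mask overlays the previous one via logical OR,
# so pixels excluded by either mask stay excluded.
combined = mask_high | mask_low
```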

Plugins

Extend the pipeline by registering filter functions via entry points under playNano.filters. This can be any callable that accepts a 2D numpy array with optional parameters and returns a processed 2D array.

Example pyproject.toml fragment:

[project.entry-points."playNano.filters"]
my_plugin = "my_pkg.module:my_filter"

Plugin signature:

import numpy as np

def my_filter(frame: np.ndarray, **kwargs) -> np.ndarray:
    """Accept a 2D array (frame) and return a processed 2D array."""
    ...

When the plugin is installed, it appears in the same CLI/API list as the built-in filters.

CLI / GUI Usage

The processing pipeline can be defined in the CLI and run either in the CLI or in the GUI.

The playNano wizard allows processing pipelines to be built interactively. Launch it with the wizard subcommand, followed by the path to the file you are processing and flags that set the output folder and file name (see Command Line Interface (CLI)).

Programmatic usage

The processing pipeline can be used programmatically via the ProcessingPipeline class, which operates on an AFMImageStack object. Use the add_filter() and add_mask() methods to build the pipeline step-by-step, and call run() to execute it.

Build and run a pipeline from Python:

from playNano.afm_stack import AFMImageStack
from playNano.processing.pipeline import ProcessingPipeline

stack = AFMImageStack.load_afm_stack("data/sample.h5-jpk", channel="height_trace")

pipeline = ProcessingPipeline(stack)
pipeline.add_filter("remove_plane")
pipeline.add_mask("mask_threshold", threshold=2.0)
pipeline.add_filter("gaussian_filter", sigma=1.0)

pipeline.run()   # updates stack.processed and stack.data

After execution, the processed frames are available via stack.data, and intermediate snapshots can be accessed through stack.processed.

Saved data & exports

The processing system supports exporting processed results and snapshots to:

  • OME-TIFF - multi-frame TIFF, compatible with ImageJ/Fiji.

  • NPZ - numpy zipped archive containing arrays and metadata.

  • HDF5 - self-contained bundle including data, processed snapshots and provenance.

  • GIF - annotated animated GIF (requires timing metadata for correct frame rates).

Use the CLI flags --export, --make-gif, --output-folder and --output-name to control export behaviour (see the Command Line Interface (CLI) page for flag details).

What the pipeline records

After a run the following are available on the AFMImageStack:

  • stack.processed : dict - Snapshots keyed by step name (see detailed section below) and raw data preserved in raw.

  • stack.masks : dict - Boolean mask snapshots keyed by step name.

  • stack.provenance["processing"] : dict
      - steps : ordered list of per-step provenance records.
      - keys_by_name : mapping of step names to created snapshot keys.

  • stack.provenance["environment"] : metadata about OS / Python / package versions.

These records enable reproducibility and inspection of intermediate results.

Advanced / Implementation details

Snapshot key naming

  • Processed snapshot keys use the pattern:

    step_<idx>_<step_name>
    

    where idx is 1-based step index and step_name is the invoked step name with spaces replaced by underscores. A "raw" snapshot is created automatically (if missing) before the first processing step.

  • Mask snapshots are stored under stack.masks with similar keys. When masks are overlaid the new mask key concatenates the previous mask suffixes to preserve lineage.
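The documented key pattern is simple enough to reproduce when looking up snapshots by hand (a sketch of the naming rule as described above, not library code):

```python
def snapshot_key(idx: int, step_name: str) -> str:
    """1-based step index plus the step name with spaces replaced by underscores."""
    return f"step_{idx}_{step_name.replace(' ', '_')}"

key = snapshot_key(1, "remove_plane")
```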

Provenance record structure

For each run stack.provenance["processing"] is rebuilt and contains:

  • steps - list of dicts, each with fields:
      - index : int (1-based)
      - name : str (step name)
      - params : dict (keyword args passed)
      - timestamp : ISO-8601 UTC timestamp
      - step_type : "filter", "mask", "clear" or "plugin"
      - version : optional version string if provided via a decorator or plugin metadata
      - function_module : Python module path (where the function lives)
      - if mask: mask_key and a concise mask_summary (shape/dtype)
      - if filter/plugin: processed_key and an output_summary

  • keys_by_name - dict mapping step name to ordered list of created keys.

Other notes

  • Indexing in snapshot keys is 1-based (step_1_* is the first applied step).

  • After pipeline completion stack.data is overwritten with the final processed array so that subsequent consumers use the processed frames by default.

  • When you export a bundle (HDF5/NPZ) the provenance and snapshots are included.

  • If you pass log_to to programmatic run helpers, large arrays are sanitized (summarized) for JSON-friendly logging.

Inspecting results programmatically

# list snapshots
print(sorted(stack.processed.keys()))
print(sorted(stack.masks.keys()))

# walk provenance
for step in stack.provenance["processing"]["steps"]:
    print(step["index"], step["step_type"], step["name"], step.get("processed_key") or step.get("mask_key"))

# retrieve results produced by a named step
for key in stack.provenance["processing"]["keys_by_name"].get("polynomial_flatten", []):
    arr = stack.processed[key]
    # do stuff...

Tips & troubleshooting

  • If you expect a "raw" snapshot but do not see one, check whether you loaded an HDF5 bundle; bundles may already contain raw snapshots.

  • If a plugin filter does not appear in the CLI, ensure the package is installed and exposes the entry point group playNano.filters.

  • For large stacks, avoid asking the pipeline to write the entire record as raw JSON (use the HDF5 bundle instead).

See also