Processing ========== The ``playNano.processing`` subpackage provides tools for flattening, filtering and masking AFM image stacks. Processing is applied frame-by-frame to stacks shaped ``(n_frames, height, width)`` so that snapshots and per-step provenance are retained while preserving stack shape. This page covers: - quick start and CLI examples - common filters and masks - how to supply pipelines (inline or YAML) - programmatic usage - a concise summary of what the pipeline records See also the :doc:`cli`, :doc:`gui` and :doc:`analysis` pages. Quick start ----------- Processing can be applied directly in a batch mode using the ``process`` subcommand. This can be used to apply a series of filters and export the results in the CLI: .. code-block:: bash playnano process ./tests/resources/sample_0.h5-jpk \ --processing "remove_plane;threshold_mask:threshold=1.5;row_median_align;gaussian_filter:sigma=2.0" \ --export tif,npz \ --make-gif \ --output-folder ./results \ --output-name sample_processed Or use a YAML pipeline file: .. code-block:: yaml filters: - name: remove_plane - name: threshold_mask threshold: 2 - name: polynomial_flatten order: 2 - name: gaussian_filter sigma: 2.0 .. code-block:: bash playnano process ./tests/resources/sample_0.h5-jpk --processing-file pipeline.yaml Concepts & behaviour -------------------- - Processing operations are **2D functions** applied to each frame independently. - Supported step types: - **Filters** - modify image data (flattening, smoothing, alignment). - **Masks** - boolean masks used to exclude regions from subsequent filters. - **Plugins** - third-party filters registered via entry points. - The pipeline maintains snapshots for raw and intermediate results and records detailed provenance for reproducibility. - After a pipeline run the pipeline **updates ``stack.data``** to the final processed array (so downstream code sees processed frames by default). Built-in filters and masks -------------------------- There are a number of built in functions that can be used to process AFM data. These functions take a numpy array as a argument along with any parameters and return a numpy array. In the case of the filters this is an array of floats while the mask functions output a binary array. Certain filters (e.g. ``remove_plane``, ``row_median_align``) support masked computation. When a binary mask is provided, the operation is applied to the full image, but its internal parameters are estimated only from unmasked pixels. This is useful when regions of the image contain artifacts, noise, or irrelevant features that should not influence the operation, but the correction itself must be applied globally (i.e. flattening based on background pixels). The output of each function is saved as a step and the masks can also be used in analysis pipelines. Filters ^^^^^^^ - ``remove_plane`` - fit and subtract a 2D plane (useful for tilt removal). - ``polynomial_flatten`` - fit & subtract a 2D polynomial surface. - parameter: ``order`` (int, default: 2) - ``row_median_align`` - subtract median per row to remove horizontal banding. - ``zero_mean`` - subtract global mean (centres data around zero or background around zero if a foreground mask is applied). - ``gaussian_filter`` - gaussian smoothing. - parameter: ``sigma`` (float, default: 1.0) Masks ^^^^^ - ``mask_threshold`` - mask values above ``threshold``. - parameter: ``threshold`` (float, default: 0.0) - ``mask_below_threshold`` - mask values below ``threshold``. - parameter: ``threshold`` (float, default: 0.0) - ``mask_mean_offset`` - mask values beyond mean ± factor x std. - parameter: ``factor`` (float, default: 1.0) - ``mask_morphological`` - threshold + morphological closing (structure size param). - parameter: ``threshold`` (float) - parameter: ``structure_size`` (int, default= 3) - ``mask_adaptive`` - block-wise adaptive thresholding (``block_size``, ``offset``). - parameter: ``block_size`` (int, default: 5) - parameter: ``offset`` (float, default: 0.0) .. note:: Masks are combined using logical OR (new masks overlay previous ones). Use the ``clear`` step to reset masks. Plugins ------- Extend the pipeline by registering filter functions via entry points under ``playNano.filters``. This can be any callable that accepts a 2D numpy array with optional parameters and returns a processed 2D array. Example `pyproject.toml` fragment: .. code-block:: toml [project.entry-points."playNano.filters"] my_plugin = "my_pkg.module:my_filter" Plugin signature: .. code-block:: python def my_filter(frame: np.ndarray, **kwargs) -> np.ndarray: """ Accepts a 2D array (frame) and returns a processed 2D array. """ When the plugin is installed, it appears in the same CLI/API list as the built-in filters. CLI / GUI Usage --------------- The processing pipeline can defined in the CLI and run in the CLI or the GUI. The **playNano** wizard allows processing pipelines to be built interactively. To launch this you use the ``wizard`` subcommand followed by a path to the file you are processing and flags that define the output folder and file name (see :doc:`cli`). .. code-block:: playnano wizard .test/resources/sample_0.h5-jpk --output-folder ./results --output-name processed_sample Programmatic usage ------------------ The processing pipeline can be used programmatically via the :class:`~playNano.processing.pipeline.ProcessingPipeline` class, which operates on a :class:`~playNano.afm_stack.AFMImageStack` object. Use the ``add_filter()`` and ``add_mask()`` methods to build the pipeline step-by-step, and call ``run()`` to execute it. Build and run a pipeline from Python: .. code-block:: python from playNano.afm_stack import AFMImageStack from playNano.processing.pipeline import ProcessingPipeline stack = AFMImageStack.load_afm_stack("data/sample.h5-jpk", channel="height_trace") pipeline = ProcessingPipeline(stack) pipeline.add_filter("remove_plane") pipeline.add_mask("mask_threshold", threshold=2.0) pipeline.add_filter("gaussian_filter", sigma=1.0) pipeline.run() # updates stack.processed and stack.data After execution, the processed frames are available via ``stack.data``, and intermediate snapshots can be accessed through ``stack.processed``. Saved data & exports ^^^^^^^^^^^^^^^^^^^^ The processing system supports exporting processed results and snapshots to: - **OME-TIFF** - multi-frame TIFF, compatible with ImageJ/Fiji. - **NPZ** - numpy zipped archive containing arrays and metadata. - **HDF5** - self-contained bundle including data, processed snapshots and provenance. - **GIF** - annotated animated GIF (requires timing metadata for correct frame rates). Use the CLI flags ``--export``, ``--make-gif``, ``--output-folder`` and ``--output-name`` to control export behaviour (See :docs:`cli` for CLI flag details). What the pipeline records ^^^^^^^^^^^^^^^^^^^^^^^^^ After a run the following are available on the :class:`~playNano.afm_stack.AFMImageStack`: - ``stack.processed`` : dict - Snapshots keyed by step name (see detailed section below) and raw data preserved in ``raw``. - ``stack.masks`` : dict - Boolean mask snapshots keyed by step name. - ``stack.provenance["processing"]`` : dict - ``steps`` : ordered list of per-step provenance records. - ``keys_by_name`` : mapping of step names to created snapshot keys. - ``stack.provenance["environment"]`` : metadata about OS / Python / package versions. These records enable reproducibility and inspection of intermediate results. Advanced / Implementation details --------------------------------- Snapshot key naming ^^^^^^^^^^^^^^^^^^^ - Processed snapshot keys use the pattern:: step__ where ``idx`` is 1-based step index and ``step_name`` is the invoked step name with spaces replaced by underscores. A ``"raw"`` snapshot is created automatically (if missing) before the first processing step. - Mask snapshots are stored under ``stack.masks`` with similar keys. When masks are overlaid the new mask key concatenates the previous mask suffixes to preserve lineage. Provenance record structure ^^^^^^^^^^^^^^^^^^^^^^^^^^^ For each run ``stack.provenance["processing"]`` is rebuilt and contains: - ``steps`` - list of dicts, each with fields: - ``index`` : int (1-based) - ``name`` : str (step name) - ``params`` : dict (keyword args passed) - ``timestamp`` : ISO-8601 UTC timestamp - ``step_type`` : ``"filter"``, ``"mask"``, ``"clear"`` or ``"plugin"`` - ``version`` : optional version string if provided via a decorator or plugin metadata - ``function_module`` : Python module path (where the function lives) - *If mask*: ``mask_key`` and a concise ``mask_summary`` (shape/dtype) - *If filter/plugin*: ``processed_key`` and an ``output_summary`` - ``keys_by_name`` - dict mapping step name to ordered list of created keys. Other notes ^^^^^^^^^^^ - Indexing in snapshot keys is **1-based** (``step_1_*`` is the first applied step). - After pipeline completion ``stack.data`` is overwritten with the final processed array so that subsequent consumers use the processed frames by default. - When you export a bundle (HDF5/NPZ) the provenance and snapshots are included. - If you pass ``log_to`` to programmatic run helpers, large arrays are sanitized (summarized) for JSON-friendly logging. Inspecting results programmatically ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: python # list snapshots print(sorted(stack.processed.keys())) print(sorted(stack.masks.keys())) # walk provenance for step in stack.provenance["processing"]["steps"]: print(step["index"], step["step_type"], step["name"], step.get("processed_key") or step.get("mask_key")) # retrieve results produced by a named step for key in stack.provenance["processing"]["keys_by_name"].get("polynomial_flatten", []): arr = stack.processed[key] # do stuff... Tips & troubleshooting ---------------------- - If you expect a ``"raw"`` snapshot but do not see one, check whether you loaded an HDF5 bundle; bundles may already contain ``raw`` snapshots. - If a plugin filter does not appear in the CLI, ensure the package is installed and exposes the entry point group ``playNano.filters``. - For large stacks, avoid asking the pipeline to write the entire record as raw JSON (use the HDF5 bundle instead). See also -------- - :doc:`cli` - command-line reference - :doc:`gui` - GUI behaviour and export options - :doc:`analysis` - analysis pipeline and provenance for analysis steps