harpy.im.segment

harpy.im.segment(sdata, image_name, model=<function cellpose_callable>, output_labels_name='segmentation_mask', output_shapes_name='segmentation_mask_boundaries', labels_name_align=None, depth=100, chunks=None, boundary='reflect', trim=False, iou=True, iou_depth=2, iou_threshold=0.7, crd=None, to_coordinate_system='global', scale_factors=None, overwrite=False, **kwargs)

Segment images using a provided model and add segmentation results (labels element and shapes element) to the SpatialData object.

Parameters:
  • sdata (SpatialData) – The SpatialData object containing the image element to segment.

  • image_name (str) – The image element in sdata to be segmented.

  • model (Callable[..., ndarray] (default: <function cellpose_callable>)) – The segmentation model function used to process the images. The callable should take as input NumPy arrays of dimension (z,y,x,c) and return labels of dimension (z,y,x,c). It can accept an arbitrary number of additional parameters.

  • output_labels_name (str | list[str] (default: 'segmentation_mask')) – Name of the labels element in which segmentation results will be stored in sdata. Can be a list of strings if model returns a multi-channel mask. If provided as a list, its length should match the c dimension of the output of model.

  • output_shapes_name (str | list[str] | None (default: 'segmentation_mask_boundaries')) – Name of the shapes element where the boundaries obtained from output_labels_name will be stored. If set to None, shapes won’t be stored. Can be a list of strings if model returns a multi-channel mask. If provided as a list, its length should match the c dimension of the output of model.

  • labels_name_align (str | None (default: None)) – Name of the labels element in output_labels_name to align to if model returns a multi-channel mask.

  • depth (tuple[int, int] | int (default: 100)) – The overlap depth in the y and x dimensions. The depth parameter is passed to dask.array.map_overlap. If trim is set to False, it’s recommended to set the depth to a value greater than twice the estimated diameter of the cells/nuclei.

  • chunks (str | int | tuple[int, int] | None (default: None)) – Chunk sizes for processing. Can be a string, an integer or a tuple of integers. If chunks is a tuple, it contains the chunk sizes that will be used in the y and x dimensions. Chunking in the z or c dimension is not supported.

  • boundary (str (default: 'reflect')) – Boundary parameter passed to dask.array.map_overlap.

  • trim (bool (default: False)) – If set to True, overlapping regions will be processed using the squidpy algorithm. If set to False, the harpy algorithm will be employed instead. For dense cell distributions, we recommend setting trim to False.

  • iou (bool (default: True)) – If set to True, will try to harmonize labels across chunks using a label adjacency graph with an iou threshold (see harpy.image.segmentation.utils._link_labels). If set to False, conflicts will be resolved using an algorithm that only retains masks with the center in the chunk. Setting iou to False gives good results if there is reasonable agreement of the predicted labels across adjacent chunks.

  • iou_depth (tuple[int, int] | int (default: 2)) – iou depth used for harmonizing labels across chunks. Note that if labels_name_align is specified, iou_depth will also be used for harmonizing labels between different chunks.

  • iou_threshold (float (default: 0.7)) – iou threshold used for harmonizing labels across chunks. Note that if labels_name_align is specified, iou_threshold will also be used for harmonizing labels between different chunks.

  • crd (tuple[int, int, int, int] | None (default: None)) – The coordinates specifying the region of the image to be segmented. Defines the bounds (x_min, x_max, y_min, y_max).

  • to_coordinate_system (str (default: 'global')) – The coordinate system in which crd is specified. Ignored if crd is None.

  • scale_factors (Sequence[dict[str, int] | int] | None (default: None)) – Scale factors to apply for multiscale.

  • overwrite (bool (default: False)) – If True, overwrites the existing output elements if they exist. Otherwise, raises an error if they exist.

  • **kwargs (Any) – Additional keyword arguments passed to the provided model.
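
To make the iou_threshold parameter more concrete, the following minimal sketch computes the intersection over union (IoU) of two boolean masks with NumPy and compares it to the default threshold of 0.7. The helper mask_iou and the toy masks are hypothetical illustrations and not part of harpy, and the >= comparison is an assumption. The actual label linking is done in harpy.image.segmentation.utils._link_labels and may differ in detail.

```python
import numpy as np


def mask_iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection over union of two boolean masks of equal shape."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0  # two empty masks: define the IoU as 0
    return float(np.logical_and(mask_a, mask_b).sum() / union)


# Toy data: the same cell as predicted in two adjacent chunks,
# restricted to the region where the chunks overlap.
cell_in_chunk_a = np.zeros((4, 4), dtype=bool)
cell_in_chunk_a[0:2, 0:4] = True  # 8 pixels

cell_in_chunk_b = np.zeros((4, 4), dtype=bool)
cell_in_chunk_b[0:2, 1:4] = True  # 6 pixels, all inside the first mask

iou_threshold = 0.7  # default value of the iou_threshold parameter
iou = mask_iou(cell_in_chunk_a, cell_in_chunk_b)  # 6 / 8 = 0.75
same_object = iou >= iou_threshold  # assumed comparison, see lead-in
print(f"IoU = {iou:.2f}, treated as the same object: {same_object}")
```

With these toy masks the IoU is 0.75, so under the assumed comparison the two predictions would be treated as the same object. Raising iou_threshold makes the linking stricter, lowering it more permissive.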

Return type:

SpatialData

Returns:

Updated sdata object containing the segmentation results.

Raises:

TypeError – If the provided model is not a callable.

Example

import os
import tempfile

import harpy as hp
from spatialdata import read_zarr
from dask.distributed import LocalCluster, Client

# Start a local Dask cluster with a single worker and a single thread
cluster = LocalCluster(
    n_workers=1,
    threads_per_worker=1,
)
client = Client(cluster)

print(client.dashboard_link)

sdata = hp.datasets.resolve_example()

# Write to a Zarr store for optimal performance.
# tempfile.gettempdir() honors TMPDIR and falls back to a system default if it is unset.
sdata.write(
    os.path.join(tempfile.gettempdir(), "sdata.zarr"),
    overwrite=True,
)
sdata = read_zarr(sdata.path)

sdata = hp.im.segment(
    sdata,
    image_name="raw_image",
    crd=[1000, 2000, 3000, 4000],  # only segment a crop
    output_labels_name="segmentation_mask_computed",
    output_shapes_name=None,
    model=hp.im.cellpose_callable,
    # Keyword arguments passed to the model
    flow_threshold=0.8,
)

client.close()

hp.pl.plot_sdata(
    sdata,
    image_name="raw_image",
    labels_name="segmentation_mask_computed",
    crd=[1000, 2000, 3000, 4000],
)
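
The example above uses hp.im.cellpose_callable, but any callable that follows the contract described for the model parameter can be passed instead. As a minimal sketch of that contract, the hypothetical threshold_callable below takes a (z,y,x,c) array and returns a (z,y,x,1) array of integer labels by thresholding one channel and labeling connected components with scipy.ndimage.label. The function name, its threshold and channel parameters, and the uint32 output dtype are assumptions made for illustration. It is a toy stand-in, not a realistic segmentation model.

```python
import numpy as np
from scipy import ndimage


def threshold_callable(
    img: np.ndarray,
    threshold: float = 0.5,
    channel: int = 0,
) -> np.ndarray:
    """Toy model: label connected components above a threshold.

    Takes an array of dimension (z, y, x, c) and returns integer
    labels of dimension (z, y, x, 1), with 0 as background.
    """
    if img.ndim != 4:
        raise ValueError(f"expected a (z, y, x, c) array, got shape {img.shape}")
    foreground = img[..., channel] > threshold  # shape (z, y, x)
    labels, _ = ndimage.label(foreground)  # connected components
    # Restore the trailing channel axis (dtype is an assumption).
    return labels[..., np.newaxis].astype(np.uint32)


# Quick check on a synthetic image containing two separate blobs
img = np.zeros((1, 8, 8, 1), dtype=np.float32)
img[0, 1:3, 1:3, 0] = 1.0
img[0, 5:7, 5:7, 0] = 1.0

masks = threshold_callable(img, threshold=0.5)
print(masks.shape, int(masks.max()))

# Hypothetical use with harpy (not executed here); threshold would be
# forwarded to the model through **kwargs:
# sdata = hp.im.segment(
#     sdata,
#     image_name="raw_image",
#     model=threshold_callable,
#     output_labels_name="segmentation_mask_threshold",
#     output_shapes_name=None,
#     threshold=0.5,
# )
```

Because the sketch returns a single-channel mask, output_labels_name would be a single string here. A model returning several channels would need a list with one name per channel, as described for output_labels_name.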