harpy.im.segment_points

harpy.im.segment_points(sdata, labels_name, points_name, name_x='x', name_y='y', name_gene='gene', model=<function baysor_callable>, output_labels_name='segmentation_mask', output_shapes_name='segmentation_mask_boundaries', labels_name_align=None, depth=100, chunks=None, boundary='reflect', trim=False, iou=True, iou_depth=2, iou_threshold=0.7, crd=None, to_coordinate_system='global', scale_factors=None, overwrite=False, **kwargs)

Run segmentation on a points element (points_name) using a prior (labels_name), and add the segmentation results (a labels element and a shapes element) to the SpatialData object.

Currently only segmentation using a prior is supported, so labels_name must be provided. The points_name and labels_name elements must be registered, i.e. they must share the same coordinate space in sdata.

Parameters:
  • sdata (SpatialData) – The SpatialData object containing the points element to segment and the labels element used as a prior.

  • labels_name (str) – The labels element in sdata to be used as a prior.

  • points_name (str) – The points element in sdata to be used for segmentation.

  • name_x (str (default: 'x')) – Column name for the x-coordinates of the transcripts in the points element.

  • name_y (str (default: 'y')) – Column name for the y-coordinates of the transcripts in the points element.

  • name_gene (str (default: 'gene')) – Column name for the gene of each transcript in the points element.

  • model (Callable[..., ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]] (default: <function baysor_callable>)) – The segmentation model function. The callable should take as input a numpy array of dimension (z,y,x,c), a pandas dataframe with the transcripts, and the parameters ‘name_x’, ‘name_y’ and ‘name_gene’, which give the column names of the x location, the y location and the gene of each transcript. It should return labels of dimension (z,y,x,c). Currently only 2D segmentation is supported (y,x). It can take an arbitrary number of other parameters.

  • output_labels_name (str | list[str] (default: 'segmentation_mask')) – Name of the labels element in which segmentation results will be stored in sdata. Can be a list of strings if model returns a multi-channel mask. If provided as a list, its length should match the c dimension of the output of model.

  • output_shapes_name (str | list[str] | None (default: 'segmentation_mask_boundaries')) – Name of the shapes element in which the boundaries obtained from output_labels_name will be stored. If set to None, shapes won’t be stored. Can be a list of strings if model returns a multi-channel mask. If provided as a list, its length should match the c dimension of the output of model.

  • labels_name_align (str | None (default: None)) – Name of the labels element in output_labels_name to align to if model returns a multi-channel mask.

  • depth (tuple[int, int] | int (default: 100)) – The depth in the y and x dimensions. The depth parameter is passed to dask.array.map_overlap. If trim is set to False, it’s recommended to set the depth to a value greater than twice the estimated diameter of the cells/nuclei.

  • chunks (str | int | tuple[int, int] | None (default: None)) – Chunk sizes for processing. Can be a string, an integer or a tuple of integers. If chunks is a tuple, it contains the chunk sizes that will be used in the y and x dimensions. Chunking in the z or c dimension is not supported.

  • boundary (str (default: 'reflect')) – Boundary parameter passed to dask.array.map_overlap.

  • trim (bool (default: False)) – If set to True, overlapping regions will be processed using the squidpy algorithm. If set to False, the harpy algorithm will be employed instead. For dense cell distributions, we recommend setting trim to False.

  • iou (bool (default: True)) – If set to True, will try to harmonize labels across chunks using a label adjacency graph with an intersection-over-union (iou) threshold (see harpy.image.segmentation.utils._link_labels). If set to False, conflicts will be resolved using an algorithm that only retains masks with their center in the chunk. Setting iou to False gives good results if there is reasonable agreement of the predicted labels across adjacent chunks.

  • iou_depth (tuple[int, int] | int (default: 2)) – iou depth used for harmonizing labels across chunks. Note that if labels_name_align is specified, iou_depth will also be used for harmonizing labels between different chunks.

  • iou_threshold (float (default: 0.7)) – iou threshold used for harmonizing labels across chunks. Note that if labels_name_align is specified, iou_threshold will also be used for harmonizing labels between different chunks.

  • crd (tuple[int, int, int, int] | None (default: None)) – The coordinates specifying the region of the image to be segmented. Defines the bounds (x_min, x_max, y_min, y_max).

  • to_coordinate_system (str (default: 'global')) – The coordinate system in which crd is specified. Ignored if crd is None.

  • scale_factors (Sequence[dict[str, int] | int] | None (default: None)) – Scale factors to apply for multiscale.

  • overwrite (bool (default: False)) – If True, overwrites the existing output elements if they exist. Otherwise, raises an error if they exist.

  • **kwargs (Any) – Additional keyword arguments passed to the provided model.
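The contract described for model can be illustrated with a toy callable. This is a minimal sketch, not harpy's baysor_callable: the name toy_model, the min_transcripts parameter, the positional argument order, and the assumption that the input array holds the prior labels with transcript coordinates expressed in the pixel units of that array are all hypothetical.

```python
import numpy as np
import pandas as pd


def toy_model(labels, df, name_x="x", name_y="y", name_gene="gene", min_transcripts=1):
    """Hypothetical model: keep prior cells that contain enough transcripts.

    labels: integer array of shape (z, y, x, c), here with z == 1 and c == 1
            (assumed to hold the prior segmentation).
    df:     transcripts, with coordinates in the columns name_x and name_y.
    name_gene is accepted to match the documented contract but unused here.
    """
    prior = labels[0, :, :, 0]
    # Floor transcript coordinates to pixel indices.
    ys = np.floor(df[name_y].to_numpy()).astype(np.int64)
    xs = np.floor(df[name_x].to_numpy()).astype(np.int64)
    # Ignore transcripts that fall outside the array.
    inside = (ys >= 0) & (ys < prior.shape[0]) & (xs >= 0) & (xs < prior.shape[1])
    hit = prior[ys[inside], xs[inside]].astype(np.int64)
    # Count transcripts per prior label; index 0 is background.
    counts = np.bincount(hit, minlength=int(prior.max()) + 1)
    keep = np.flatnonzero(counts >= min_transcripts)
    keep = keep[keep != 0]
    out = np.where(np.isin(prior, keep), prior, 0)
    # Return labels with the documented (z, y, x, c) layout.
    return out[None, :, :, None]


# Two prior cells; both transcripts fall inside cell 1.
prior = np.zeros((1, 4, 4, 1), dtype=np.int64)
prior[0, :2, :2, 0] = 1
prior[0, 2:, 2:, 0] = 2
transcripts = pd.DataFrame({"x": [0.5, 1.2], "y": [0.1, 1.9], "gene": ["A", "B"]})
result = toy_model(prior, transcripts)
```

Under these assumptions, an extra parameter such as min_transcripts would be supplied to the model through **kwargs.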

Return type:

SpatialData

Returns:

Updated sdata object containing the segmentation results.
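A call following the signature above might look like the sketch below. The element names 'prior_labels' and 'transcripts' and the helper name segment_with_prior are hypothetical placeholders; the remaining arguments are the documented defaults, apart from overwrite. The import sits inside the helper so that defining it does not require harpy to be installed.

```python
def segment_with_prior(sdata):
    """Run harpy.im.segment_points on an existing SpatialData object."""
    import harpy  # imported lazily; only needed when the helper is called

    return harpy.im.segment_points(
        sdata,
        labels_name="prior_labels",   # hypothetical labels element (the prior)
        points_name="transcripts",    # hypothetical points element
        name_x="x",
        name_y="y",
        name_gene="gene",
        output_labels_name="segmentation_mask",
        output_shapes_name="segmentation_mask_boundaries",
        depth=100,                    # recommended: > 2x the cell diameter if trim=False
        trim=False,
        iou=True,
        iou_depth=2,
        iou_threshold=0.7,
        crd=None,                     # or (x_min, x_max, y_min, y_max)
        to_coordinate_system="global",
        overwrite=True,               # replace existing output elements
    )
```

The returned object is the updated sdata, so the result can be assigned back to the same variable.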

Raises:

TypeError – If the provided model is not callable.
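To give a feel for what iou_threshold measures, the sketch below computes the intersection over union of two overlapping masks and compares it with the default threshold of 0.7. It only illustrates the metric and is not harpy's _link_labels implementation; the helper name mask_iou and the use of >= for the comparison are assumptions.

```python
import numpy as np


def mask_iou(mask_a, mask_b):
    """Intersection over union of two boolean masks of equal shape."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection / union) if union else 0.0


# The same object as predicted in two adjacent chunks (toy data).
a = np.zeros((6, 6), dtype=bool)
a[1:5, 1:5] = True   # 16 pixels
b = np.zeros((6, 6), dtype=bool)
b[1:5, 2:5] = True   # 12 pixels, fully inside a

score = mask_iou(a, b)        # 12 / 16
same_object = score >= 0.7    # assumed comparison against iou_threshold
```

With these toy masks the overlap is 12 of 16 pixels, so the score exceeds 0.7 and the two labels would be treated as the same object under the assumed rule.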