harpy.im.filter_labels


harpy.im.filter_labels(sdata, labels_name, min_size=10, max_size=100000, chunks=None, output_labels_name=None, output_shapes_name=None, scale_factors=None, overwrite=False)

Filter labels in a labels element by global object size.

Labels in labels_name whose total size across the full image is smaller than min_size or larger than max_size are set to 0 in the output. Size is computed per object globally, so labels that span multiple chunks are filtered consistently.
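The size rule described above can be sketched on a small in-memory NumPy array. This illustrates the filtering semantics only and is not harpy's implementation: the helper name filter_labels_by_size is hypothetical, label 0 is assumed to be background, and both bounds are treated as inclusive, following the "smaller than" / "larger than" wording.

```python
import numpy as np


def filter_labels_by_size(mask, min_size=10, max_size=100000):
    """Illustrative in-memory equivalent (hypothetical helper, not harpy's code)."""
    # Count pixels per label over the whole array (global object size).
    labels, counts = np.unique(mask, return_counts=True)
    # Labels with a size outside [min_size, max_size] are dropped.
    # Label 0 is assumed to be background and is never treated as an object.
    drop = labels[(labels != 0) & ((counts < min_size) | (counts > max_size))]
    # Set every pixel of a dropped label to 0 and leave the rest untouched.
    return np.where(np.isin(mask, drop), 0, mask)


mask = np.array(
    [
        [1, 1, 0, 2],
        [1, 1, 0, 0],
        [3, 3, 3, 3],
        [3, 3, 3, 3],
    ]
)
# Label 1 has 4 pixels, label 2 has 1 pixel, label 3 has 8 pixels.
filtered = filter_labels_by_size(mask, min_size=2, max_size=4)
print(filtered)
```

Here only label 1 falls inside the size range, so labels 2 and 3 are set to 0 in the result.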

Parameters:
  • sdata (SpatialData) – The SpatialData object containing the labels element to be filtered.

  • labels_name (str) – The name of the labels element to be filtered.

  • min_size (int (default: 10)) – Labels in labels_name with a size smaller than min_size are set to 0.

  • max_size (int (default: 100000)) – Labels in labels_name with a size larger than max_size are set to 0.

  • chunks (str | int | tuple[int, int] | None (default: None)) – The desired chunk size for the Dask computation, or “auto” to allow the function to choose an optimal chunk size based on the data.

  • output_labels_name (str | None (default: None)) – The name of the output labels element where results will be stored. Required despite the None default: a ValueError is raised if it is not provided.

  • output_shapes_name (str | None (default: None)) – The name for the new shapes element generated from the filtered labels element. If None, no shapes element is created.

  • scale_factors (Sequence[dict[str, int] | int] | None (default: None)) – Scale factors to apply when storing the output labels element as a multiscale element.

  • overwrite (bool (default: False)) – If True, overwrites output_labels_name or output_shapes_name if they already exist in sdata.

Return type:

SpatialData

Returns:

The modified SpatialData object with the filtered labels element.

Raises:
  • ValueError – If output_labels_name is not provided.

  • ValueError – If min_size or max_size is negative.

  • ValueError – If min_size is larger than max_size.

See also

harpy.utils.get_instance_size

Compute global object sizes for a labels mask.

Notes

The function works with Dask arrays and can handle large datasets that do not fit into memory.
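To see why object size has to be computed globally rather than per chunk, consider the following NumPy-only sketch. It is not harpy's internal code: the two-block split stands in for Dask chunks, and the variable names are hypothetical. An object that straddles a chunk boundary looks small inside each chunk, but its summed count gives the true size.

```python
import numpy as np

# Object 1 straddles the boundary between two column blocks.
mask = np.array(
    [
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [2, 0, 0, 0],
    ]
)
# Hypothetical chunking into a left and a right block.
chunks = [mask[:, :2], mask[:, 2:]]

n_labels = int(mask.max()) + 1
# Per-chunk pixel counts, indexed by label ID.
per_chunk = [np.bincount(c.ravel(), minlength=n_labels) for c in chunks]
# Summing the per-chunk counts gives the global size of each object.
global_sizes = np.sum(per_chunk, axis=0)

print(per_chunk[0][1], per_chunk[1][1])  # size of object 1 within each chunk
print(global_sizes[1])  # size of object 1 across the full image
```

With min_size=3, a per-chunk test would remove object 1 from both blocks (2 pixels in each), whereas the global count of 4 keeps it.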

Example

import harpy as hp

# Load the example dataset that contains the "masks_whole" labels element.
sdata = hp.datasets.mibi_example()

sdata = hp.im.filter_labels(
    sdata,
    labels_name="masks_whole",
    min_size=100,
    max_size=1000,
    chunks=256,
    output_labels_name="masks_whole_filtered",
    output_shapes_name="masks_whole_filtered_boundaries",
    overwrite=True,
)