harpy.tb.allocate

harpy.tb.allocate(sdata, labels_name, points_name='transcripts', output_table_name='table_transcriptomics', to_coordinate_system='global', chunks=None, name_gene_column='gene', append=False, update_shapes_elements=False, region_key='fov_labels', instance_key='cell_ID', spatial_key='spatial', cell_index_name='cells', overwrite=False)

Allocates transcripts to instances using the provided labels_name and points_name, and returns the updated SpatialData object with a table element (sdata.tables[output_table_name]) holding the anndata.AnnData object with the transcript counts.

It requires that labels_name and points_name are registered. The transformation between to_coordinate_system and points_name must be a spatialdata.transformations.Identity transformation. The transformation between to_coordinate_system and labels_name can be a spatialdata.transformations.Identity, a spatialdata.transformations.Translation, or a spatialdata.transformations.Sequence of translations.
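One way to read the translation requirement is sketched below in plain Python. This is an illustration only, not harpy's implementation. It assumes a small 2D mask stored as nested lists indexed as mask[y][x], transcripts given as (x, y, gene) tuples in the target coordinate system, and a hypothetical helper named allocate_counts. With at most a translation between the mask and the coordinate system, finding the mask pixel for a transcript reduces to subtracting an offset.

```python
from collections import Counter, defaultdict
from math import floor


def allocate_counts(mask, transcripts, translation=(0.0, 0.0)):
    """Count transcripts per (cell id, gene) using a label mask.

    mask: nested lists indexed as mask[y][x]; 0 is background.
    transcripts: iterable of (x, y, gene) in the target coordinate system.
    translation: (tx, ty) offset of the mask origin in that coordinate system.
    """
    tx, ty = translation
    counts = defaultdict(Counter)
    for x, y, gene in transcripts:
        # Undo the translation to get the pixel position inside the mask.
        col = floor(x - tx)
        row = floor(y - ty)
        if not (0 <= row < len(mask) and 0 <= col < len(mask[0])):
            continue  # transcript falls outside the mask
        cell_id = mask[row][col]
        if cell_id != 0:
            counts[cell_id][gene] += 1
    return {cell_id: dict(genes) for cell_id, genes in counts.items()}


# Toy 3x4 mask with two cells, placed at offset (10, 20).
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 2, 2],
]
transcripts = [
    (10.5, 20.5, "GeneA"),  # inside cell 1
    (11.2, 21.7, "GeneA"),  # inside cell 1
    (13.9, 22.1, "GeneB"),  # inside cell 2
    (12.5, 20.5, "GeneB"),  # background pixel, ignored
    (50.0, 50.0, "GeneA"),  # outside the mask, ignored
]
result = allocate_counts(mask, transcripts, translation=(10.0, 20.0))
print(result)
```

Transcripts that land on background or outside the mask contribute to no cell in this sketch.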

Parameters:
  • sdata (SpatialData) – The SpatialData object.

  • labels_name (str) – The labels element (i.e. segmentation mask) in sdata to be used to allocate the transcripts to cells.

  • points_name (str (default: 'transcripts')) – The points element in sdata that contains the transcripts.

  • output_table_name (str (default: 'table_transcriptomics')) – The table element in sdata in which to save the AnnData object with the transcript counts per cell.

  • to_coordinate_system (str (default: 'global')) – The coordinate system that holds labels_name and points_name. This should be the intrinsic coordinate system in pixels.

  • chunks (str | tuple[int, ...] | int | None (default: None)) – Chunk size for processing. Consider setting chunks to None and rechunking labels_name to the desired chunk size on disk, e.g. with harpy.im.add_labels().

  • name_gene_column (str (default: 'gene')) – Column name in the points_name representing gene information.

  • append (bool (default: False)) – If set to True, and labels_name does not yet exist as a region_key value in sdata.tables[output_table_name].obs, the transcript counts obtained during the current function call are appended (along axis=0) to the existing transcript counts in sdata.tables[output_table_name]. If False, and overwrite is set to True, any existing data in sdata.tables[output_table_name] is overwritten by the newly extracted transcript counts.

  • update_shapes_elements (bool (default: False)) – Whether to filter the shapes elements associated with labels_name. If set to True, cells that do not appear in the resulting output_table_name (with region_key equal to labels_name) are removed from the shapes elements (via instance_key) in the sdata object. Filtered shapes are added to sdata with the prefix 'filtered_segmentation'. This parameter is deprecated and will be removed in a future version.

  • instance_key (str (default: 'cell_ID')) – Instance key. The name of the column in AnnData table .obs that will hold the instance ids.

  • region_key (str (default: 'fov_labels')) – Region key. The name of the column in AnnData table .obs that will hold the name of the element(s) that are annotated by the resulting table.

  • spatial_key (str (default: 'spatial')) – The key in the AnnData table .obsm that will hold the x and y center of the instances. This center is calculated by taking the average x,y coordinate of the transcripts found inside the cell.

  • cell_index_name (str (default: 'cells')) – The name of the index of the resulting AnnData table.

  • overwrite (bool (default: False)) – If True, overwrites the output_table_name if it already exists in sdata.
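The spatial_key description above says the stored center is the average x,y coordinate of the transcripts found inside each cell. A minimal pure-Python sketch of that calculation follows. It is not harpy's code; the helper name transcript_centers and the (cell_id, x, y) record layout are assumptions made for illustration.

```python
from collections import defaultdict


def transcript_centers(assigned):
    """Return {cell_id: (mean_x, mean_y)} from (cell_id, x, y) records."""
    # Per cell: running sum of x, running sum of y, transcript count.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for cell_id, x, y in assigned:
        acc = sums[cell_id]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {cid: (sx / n, sy / n) for cid, (sx, sy, n) in sums.items()}


# Transcripts already assigned to cells (toy values).
assigned = [
    (1, 2.0, 4.0),
    (1, 4.0, 8.0),
    (2, 10.0, 10.0),
]
centers = transcript_centers(assigned)
print(centers)
```

Because the center is averaged over transcripts rather than mask pixels, it can differ from the geometric centroid of the segmented cell when transcripts are unevenly distributed.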

Return type:

SpatialData

Returns:

An updated SpatialData object with an AnnData table added to sdata.tables at slot output_table_name.

Example

import harpy as hp

sdata = hp.datasets.resolve_example_multiple_coordinate_systems()

# Create an AnnData table with transcript count per cell with name 'my_table'
sdata = hp.tb.allocate(
    sdata,
    labels_name="labels_a1_1",
    points_name="points_a1_1",
    output_table_name="my_table",
    to_coordinate_system="a1_1",
    overwrite=True,
)

# Append transcript count per cell from different sample to 'my_table'
sdata = hp.tb.allocate(
    sdata,
    labels_name="labels_a1_2",
    points_name="points_a1_2",
    output_table_name="my_table",
    to_coordinate_system="a1_2",
    append=True,
    overwrite=True,
)
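
The append behaviour used in the second call can be pictured as stacking rows along axis 0, with the region_key column recording which labels element each row came from. The sketch below uses plain Python dictionaries instead of AnnData. The helper append_rows is hypothetical, and raising an error for an already present region is an assumption of this sketch, not documented harpy behaviour.

```python
def append_rows(table, new_rows, region, region_key="fov_labels"):
    """Return a new table with new_rows stacked below the existing rows."""
    existing_regions = {row[region_key] for row in table}
    if region in existing_regions:
        # Assumption for this sketch only: refuse to append a region twice.
        raise ValueError(f"region {region!r} is already present in the table")
    # Tag every new row with the labels element it was computed from.
    tagged = [{**row, region_key: region} for row in new_rows]
    return table + tagged


table = append_rows(
    [],
    [{"cell_ID": 1, "GeneA": 3}, {"cell_ID": 2, "GeneA": 0}],
    "labels_a1_1",
)
table = append_rows(table, [{"cell_ID": 1, "GeneA": 5}], "labels_a1_2")
print(len(table))
print([(row["fov_labels"], row["cell_ID"]) for row in table])
```

Instance ids can repeat across labels elements (cell 1 appears in both regions here), so a row is identified by the pair of region_key and instance_key rather than by instance_key alone.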