API#

Import Harpy as:

import harpy as hp

IO#

Input and output functions.

io.macsima(path[, image_name, ...])

Read MACSima formatted dataset.

io.merscope(path[, to_coordinate_system, ...])

Read MERSCOPE data from Vizgen.

io.xenium(path[, to_coordinate_system, ...])

Read a 10X Genomics Xenium dataset into a SpatialData object.

io.visium(path[, dataset_id, counts_file, ...])

Read 10x Genomics Visium formatted dataset.

io.visium_hd(path[, dataset_id, ...])

Read 10x Genomics Visium HD formatted dataset.

io.read_transcripts(sdata, path_count_matrix)

Reads transcript information from a file with each row listing the x and y coordinates, along with the gene name.

io.read_resolve_transcripts(sdata, ...[, ...])

Reads and adds transcripts from Resolve Biosciences’ Molecular Cartography technology to a SpatialData object.

io.read_merscope_transcripts(sdata, ...[, ...])

Reads and adds MERSCOPE transcript information to a SpatialData object.

io.read_stereoseq_transcripts(sdata, ...[, ...])

Reads and adds Stereo-seq transcript information to a SpatialData object.

io.convert_to_zarr_2(sdata, output[, overwrite])

Convert a backed Zarr v3 SpatialData object into a Zarr v2 SpatialData store.

Image#

Operations on image and labels elements.

im.add_image(sdata, arr, output_image_name)

Add an image element to a SpatialData object.

im.add_labels(sdata, arr, output_labels_name)

Add a labels element to a SpatialData object.

im.get_dataarray(sdata, element_name[, scale])

Retrieve the highest-resolution xarray.DataArray from an element in sdata.images or sdata.labels.

im.map_image(sdata, image_name, ...[, ...])

Apply a specified function to an image element of a SpatialData object.

im.tiling_correction(sdata[, image_name, ...])

Correct for the tiling effect that occurs in some image data (e.g. Resolve data).

im.enhance_contrast(sdata[, image_name, ...])

Enhance the contrast of an image in a SpatialData object.

im.normalize(sdata, image_name, ...[, ...])

Normalize the intensity of an image element in a SpatialData object using specified percentiles.
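As an illustration of what percentile-based normalization means, the sketch below rescales an array so that a lower percentile maps to 0 and an upper percentile maps to 1. It uses plain NumPy rather than harpy; the function name percentile_normalize, its parameters, and the final clipping step are assumptions made for this example and are not taken from the im.normalize signature.

```python
import numpy as np


def percentile_normalize(img, q_min=5.0, q_max=95.0, eps=1e-20):
    """Rescale so the q_min percentile maps to 0 and the q_max percentile to 1.

    Hypothetical helper for illustration only; not harpy's implementation.
    """
    lo, hi = np.percentile(img, [q_min, q_max])
    out = (img.astype(np.float32) - lo) / (hi - lo + eps)
    # Clipping is an assumption of this sketch.
    return np.clip(out, 0.0, 1.0)


# A ramp from 0 to 100: the 5th percentile is 5 and the 95th is 95.
img = np.arange(101, dtype=np.float32).reshape(1, 101)
norm = percentile_normalize(img)
```

Values below the lower percentile end up at 0 and values above the upper percentile at 1, so a few extreme pixels no longer dominate the intensity range.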

im.min_max_filtering(sdata[, image_name, ...])

Apply min-max filtering to an image in a SpatialData object using Dask (via dask_image.ndfilters.maximum_filter and dask_image.ndfilters.minimum_filter).
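One common reading of min-max filtering is background subtraction: a minimum filter followed by a maximum filter estimates a smooth background, which is then subtracted from the image. The sketch below shows that idea on a small in-memory array with NumPy only. Whether harpy combines the two filters in exactly this way is an assumption; the helper names and the edge padding are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _window_filter(img, size, reducer):
    """Apply reducer (np.min or np.max) over a size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return reducer(windows, axis=(-2, -1))


def min_max_filter(img, size=5):
    """Subtract a background estimated by a min filter followed by a max filter.

    Illustrative sketch; size is assumed to be odd.
    """
    background = _window_filter(_window_filter(img, size, np.min), size, np.max)
    return img - background


# Flat background of 10 with one bright pixel that is smaller than the window.
img = np.full((9, 9), 10)
img[4, 4] = 50
result = min_max_filter(img, size=5)
```

Structures smaller than the filter window survive the subtraction, while the flat background is removed.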

im.gaussian_filtering(sdata[, image_name, ...])

Apply Gaussian filtering to an image in a SpatialData object using dask.

im.transcript_density(sdata[, image_name, ...])

Calculate the transcript density using a Gaussian filter and add it to the provided spatialdata.SpatialData object.

im.combine(sdata, image_name, output_image_name)

Combines specific channels within an image element of a SpatialData object.

im.segment(sdata, image_name[, model, ...])

Segment images using a provided model and add segmentation results (labels element and shapes element) to the SpatialData object.

im.segment_points(sdata, labels_name, ...[, ...])

Segment images using a points_name and a prior (labels_name) and add segmentation results (labels element and shapes element) to the SpatialData object.

im.cellpose_callable(img[, batch_size, ...])

Perform cell segmentation using the Cellpose model.

im.instanseg_callable(img[, device, ...])

Perform segmentation using InstanSeg.

im.baysor_callable(img, df, name_x, name_y, ...)

Perform cell segmentation using the Baysor algorithm.

im.add_grid_labels(sdata, shape, size, ...)

Adds a grid-based labels element to the SpatialData object using either a hexagonal or square grid.

im.expand_labels(sdata, labels_name[, ...])

Expand cells in the labels element labels_name using skimage.segmentation.expand_labels.

im.align_labels(sdata, labels_name_1, ...[, ...])

Align two labels elements.

im.map_labels(sdata, func, labels_name[, ...])

Apply a specified function to a labels element in a SpatialData object.

im.filter_labels(sdata, labels_name[, ...])

Filter labels in a labels element by global object size.
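To make "global object size" concrete, the sketch below counts the pixels of each label over the whole array and sets labels outside a size range to background. It is a NumPy-only illustration; the function name filter_labels_by_size and the parameters min_size and max_size are assumptions and may not match the im.filter_labels signature.

```python
import numpy as np


def filter_labels_by_size(labels, min_size=0, max_size=None):
    """Set labels whose total pixel count is outside [min_size, max_size] to 0.

    Hypothetical helper for illustration only.
    """
    # Size of every label, computed over the full array (index = label id).
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_size
    if max_size is not None:
        keep &= sizes <= max_size
    keep[0] = False  # background always stays background
    return np.where(keep[labels], labels, 0)


labels = np.array([[1, 0, 2, 2],
                   [0, 0, 2, 2]])
filtered = filter_labels_by_size(labels, min_size=2)
```

Here label 1 covers a single pixel and is removed, while label 2 covers four pixels and is kept.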

im.merge_labels(sdata, ...[, threshold, ...])

Merge two labels elements using a global object-level overlap rule.

im.match_labels_to_reference(sdata, ...[, ...])

Match source labels to reference labels based on an overlap score.

im.rasterize(sdata, shapes_name, ...[, ...])

Given a shapes element in a SpatialData object, corresponding masks are created, and added as a labels element to the SpatialData object.

im.pixel_clustering_preprocess(sdata, ...[, ...])

Preprocess image elements specified in image_name.

im.flowsom(sdata, image_name, ...[, ...])

Applies FlowSOM clustering on image element(s) of a SpatialData object.

Shape#

Operations on shapes (polygons) elements.

sh.vectorize(sdata, labels_name, ...[, ...])

Vectorize a labels element.

sh.add_shapes(sdata, input, output_shapes_name)

Add a shapes element to a SpatialData object.

sh.filter_shapes(sdata, table_name, ...)

Filter shapes in a SpatialData object.

sh.create_voronoi_boundaries(sdata[, ...])

Create Voronoi boundaries from the shapes element of the provided SpatialData object.

Table#

Operations on table (AnnData object) elements.

tb.add_table(sdata, adata, ...[, ...])

Add an AnnData object as a table element to a SpatialData object.

tb.allocate(sdata, labels_name[, ...])

Allocates transcripts to instances via provided labels_name and points_name and returns updated SpatialData object with a table element (sdata.tables[output_table_name]) holding the anndata.AnnData object with transcript counts.
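Conceptually, allocating transcripts to instances means looking up the label under each transcript coordinate and counting genes per label. The sketch below shows this on a toy labels array with plain Python and NumPy. The function allocate_counts and the (x, y, gene) tuple layout are assumptions for illustration; harpy operates on SpatialData elements and returns an AnnData table instead.

```python
from collections import Counter

import numpy as np


def allocate_counts(labels, transcripts):
    """Count transcripts per (instance id, gene); label 0 is background.

    Hypothetical helper; transcripts is an iterable of (x, y, gene).
    """
    counts = Counter()
    for x, y, gene in transcripts:
        # Labels arrays are indexed as [y, x].
        instance = int(labels[int(y), int(x)])
        if instance != 0:
            counts[(instance, gene)] += 1
    return counts


labels = np.array([[0, 1, 1],
                   [2, 2, 0],
                   [2, 2, 0]])
transcripts = [
    (1.2, 0.4, "GeneA"),  # falls in instance 1
    (2.7, 0.1, "GeneA"),  # falls in instance 1
    (0.5, 1.5, "GeneB"),  # falls in instance 2
    (2.5, 2.5, "GeneB"),  # falls on background, not allocated
]
counts = allocate_counts(labels, transcripts)
```

Transcripts that land on background are the ones that remain unassigned after allocation.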

tb.bin_counts(sdata, table_name, ...[, ...])

Bins gene counts from barcodes to cells or regions defined in labels_name and returns an updated SpatialData object with a table element (sdata.tables[output_table_name]) holding an AnnData object with the binned counts per cell or region.

tb.allocate_intensity(sdata[, image_name, ...])

Allocates intensity values from a specified image element to corresponding cells in a SpatialData object and returns an updated SpatialData object with a table element (sdata.tables[output_table_name]) holding an AnnData object with intensity values for each cell and each (specified) channel.

tb.add_regionprops(sdata, labels_name, ...)

Calculates region property features from the specified labels element, and adds the results to the AnnData object that annotates the labels element.

tb.add_feature_matrix(sdata, labels_name, ...)

Compute per-instance feature matrices from labels and optional image data.

tb.extract_instances(sdata, image_name, ...)

Extract per-label instance windows from image_name/labels_name of size diameter in y and x using dask.array.map_overlap() and dask.array.map_blocks().

tb.ZarrIterableInstances(zarr_path, instance_ids)

Chunk-wise iterable dataset that:

tb.ZarrDataLoader(*args[, start_epoch])

DataLoader that increments epoch and forwards it to epoch-aware datasets.

tb.featurize(sdata, image_name, labels_name, ...)

Extract per-instance feature vectors from image_name and labels_name using a user-provided embedding model.

tb.preprocess_transcriptomics(sdata, ...[, ...])

Preprocess a table (AnnData) attribute of a SpatialData object for transcriptomics data.

tb.preprocess_proteomics(sdata, labels_name, ...)

Preprocess a table (AnnData) attribute of a SpatialData object for proteomics data.

tb.filter_on_size(sdata, labels_name, ...[, ...])

Filters instances by size and returns the updated SpatialData object.

tb.leiden(sdata, labels_name, table_name, ...)

Applies Leiden clustering on the table_name of the SpatialData object with optional UMAP calculation and gene ranking.

tb.kmeans(sdata, labels_name, table_name, ...)

Applies KMeans clustering on the table_name of the SpatialData object with optional UMAP calculation and gene ranking.

tb.score_genes(sdata, labels_name, ...[, ...])

Loads marker genes from a CSV file and scores cells for each cell type with those markers via scanpy's score_genes() function.

tb.score_genes_iter(sdata, labels_name, ...)

Iterative annotation algorithm.

tb.correct_marker_genes(sdata, labels_name, ...)

Correct celltype expression in sdata.tables[table_name] using celltype_correction_dict.

tb.cluster_cleanliness(sdata, labels_name, ...)

Re-calculates annotations, potentially following corrections to the list of celltypes, or after a manual update of the assigned scores per cell type via e.g. correct_marker_genes.

tb.nhood_enrichment(sdata, labels_name, ...)

Calculate the neighborhood enrichment using squidpy via squidpy.gr.spatial_neighbors() and squidpy.gr.nhood_enrichment().

tb.nhood_kmeans(sdata, table_name, ...[, ...])

Cluster cells (instances) based on neighborhood cell-type composition using KMeans.

tb.cluster_intensity(sdata, table_name, ...)

Calculates weighted (by instance size) average intensity per cluster.
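A size-weighted average means larger instances contribute proportionally more to the cluster mean. The sketch below computes this per cluster with np.average. The function name cluster_weighted_mean and its array inputs are assumptions for illustration; tb.cluster_intensity reads these values from a table element instead.

```python
import numpy as np


def cluster_weighted_mean(intensity, sizes, clusters):
    """Average per-instance intensities per cluster, weighted by instance size.

    Hypothetical helper. intensity has shape (n_instances, n_channels).
    """
    result = {}
    for cluster in np.unique(clusters):
        mask = clusters == cluster
        result[str(cluster)] = np.average(
            intensity[mask], axis=0, weights=sizes[mask]
        )
    return result


intensity = np.array([[1.0], [3.0], [10.0]])  # one channel, three instances
sizes = np.array([1, 3, 5])                   # instance sizes in pixels
clusters = np.array(["a", "a", "b"])
means = cluster_weighted_mean(intensity, sizes, clusters)
```

For cluster "a" the result is (1 * 1 + 3 * 3) / 4 = 2.5, compared with an unweighted mean of 2.0.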

tb.cluster_intensity_SOM(sdata, mapping, ...)

Calculates average intensity of each channel in image_name per SOM cluster as available in the labels_name, and saves it as a table element in sdata as output_table_name.

tb.spatial_pixel_neighbors(sdata, labels_name)

Computes spatial pixel neighbors and performs neighborhood enrichment analysis.

tb.cell_clustering_preprocess(sdata, ...[, ...])

Preprocesses spatial data for cell clustering.

tb.flowsom(sdata, cells_labels_name, ...[, ...])

Run FlowSOM cell clustering on pixel-cluster-derived cell features.

tb.weighted_channel_expression(sdata, ...[, ...])

Calculation of weighted channel expression in the context of cell clustering.

Points#

Operations on points (Dask DataFrame object) elements.

pt.add_points(sdata, ddf, ...[, ...])

Add a points element to a SpatialData object.

Externals#

External integrations.

externals.ilastik.run_object_classification(...)

Run ilastik headless object classification and add predicted labels to a table element.

Quality Control#

Quality control functions.

image_histogram(sdata, image_name, channel, bins)

Generate and visualize a histogram for a specified image channel within an image of a SpatialData object.

segmentation_coverage(sdata, labels_name[, ...])

Calculate coverage statistics for a segmentation labels element.

segmentation_histogram(sdata, labels_name[, ...])

Plot a histogram of segmented instance sizes for a labels element.

analyse_genes_left_out(sdata, labels_name, ...)

Analyse and visualize the proportion of genes that could not be assigned to an instance during the allocation step.

metric_histogram(sdata, table_name[, ...])

Plot a QC metric histogram for an AnnData table.

metrics_histogram(sdata, table_name[, ...])

Plot a standard panel of QC metric histograms for an AnnData table.

obs_scatter(sdata, table_name[, ...])

Plot the relationship between two observation-level columns.

Plotting#

Plotting functions.

General plots#

plot_sdata(sdata, image_name[, channel, ...])

Light wrapper around spatialdata-plot to plot a SpatialData object.

plot_sdata_genes(sdata, points_name[, ...])

Light wrapper around spatialdata-plot to visualize gene expression from a SpatialData object.

plot_instance_density(sdata, table_name[, ...])

Plot an instance density heatmap from centroids stored in sdata.tables[table_name].obsm[spatial_key].

Proteomics plots#

cluster_intensity_heatmap(sdata, table_name, ...)

Generate and visualize a heatmap of mean channel intensities per cluster for each channel.

pixel_clusters(sdata, labels_name[, crd, ...])

Visualize spatial distribution of pixel clusters based on labels in a SpatialData object, obtained using harpy.im.flowsom().

pixel_clusters_heatmap(sdata, table_name[, ...])

Generate and visualize a heatmap of mean channel intensities for clusters or metaclusters.

snr_ratio(sdata[, ax, loglog, color])

Plot the signal-to-noise ratio.

group_snr_ratio(sdata, groupby[, ax, ...])

Plot the signal-to-noise ratio.

snr_clustermap(sdata[, signal_threshold, ...])

signal_clustermap(sdata[, signal_threshold, ...])

clustermap(*args, **kwargs)

Transcriptomics plots#

plot_transcript_density(sdata, bin_size, ...)

Plot a transcript density heatmap from a SpatialData object.

Utils#

Utility functions.

utils.RasterAggregator(mask_dask_array[, ...])

Helper class to calculate aggregated 'sum', 'mean', 'var', 'kurtosis', 'skew', 'area', 'min', 'max' and 'center of mass' of image and labels using Dask.
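The kind of per-label aggregation described above can be sketched for 'area', 'sum' and 'mean' with np.bincount on in-memory arrays. This is a NumPy-only illustration and not the RasterAggregator API; the function aggregate_per_label and its return layout are hypothetical, and the Dask-based chunked computation is not shown.

```python
import numpy as np


def aggregate_per_label(mask, image):
    """Return per-label area, intensity sum and mean (index = label id).

    Hypothetical helper for illustration only.
    """
    n_labels = int(mask.max()) + 1
    flat_mask = mask.ravel()
    area = np.bincount(flat_mask, minlength=n_labels)
    total = np.bincount(flat_mask, weights=image.ravel(), minlength=n_labels)
    # Avoid division by zero for label ids that do not occur in the mask.
    mean = np.divide(total, area, out=np.zeros_like(total), where=area > 0)
    return area, total, mean


mask = np.array([[0, 1],
                 [1, 2]])
image = np.array([[5.0, 2.0],
                  [4.0, 7.0]])
area, total, mean = aggregate_per_label(mask, image)
```

Index 0 of each returned array corresponds to the background label.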

utils.Featurizer(mask_dask_array[, ...])

Helper class to featurize images and labels using Dask.

utils.kronos_embedding(array, ...[, ...])

Compute KRONOS embeddings for multi-channel instance windows using a pre-trained vision transformer.

utils.bounding_box_query(sdata, labels_name, ...)

Query the labels elements of a SpatialData object and the corresponding instances it annotates in sdata.tables via a bounding box query.

Datasets#

Dataset loaders.

datasets.cluster_blobs([shape, ...])

Differs from spatialdata.datasets.make_blobs in that it generates cells with multiple image channels and known ground truth cell types.

datasets.multisample_blobs([n_samples, prefix])

Multisample blobs.

datasets.pixie_example()

Example Pixie dataset, loaded from an S3 bucket.

datasets.macsima_example()

Example proteomics dataset generated using the MACSima platform.

datasets.macsima_colorectal_carcinoma([...])

Load the Colorectal Carcinoma MACSima dataset as a spatialdata.SpatialData object.

datasets.macsima_colorectal_carcinoma_course(...)

Colorectal carcinoma MACSima course dataset.

datasets.macsima_tonsil([filter_regex, path])

Tonsil proteomics dataset generated using the MACSima platform.

datasets.codex_example([path])

Example annotated CODEX dataset (cHL maps dataset), Shaban, M.

datasets.mibi_example()

Example proteomics dataset.

datasets.vectra_example([path])

Example proteomics dataset LuCa-7color_[13860,52919]_1x1 from PerkinElmer.

datasets.resolve_example([path])

Example transcriptomics dataset.

datasets.merscope_mouse_liver([output, ...])

Example transcriptomics dataset.

datasets.xenium_human_lung_cancer([output, path])

Example transcriptomics dataset.

datasets.xenium_human_ovarian_cancer([...])

Example transcriptomics dataset.

datasets.xenium_human_ovarian_cancer_course(...)

Human ovarian cancer Xenium course dataset.

datasets.visium_hd_example([bin_size, ...])

Example transcriptomics dataset.

datasets.visium_hd_example_custom_binning([path])

Example transcriptomics dataset.

datasets.get_registry([path])

Get the Pooch registry.

datasets.get_spatialdata_registry([path])

Get the Pooch SpatialData registry.