Harpy


Single-cell spatial omics analysis that makes you happy.

💫 If you find Harpy useful, please give us a ⭐ on GitHub! It helps others discover the project and supports continued development.

Why Harpy?

Harpy is a spatial omics analysis library for spatial transcriptomics and proteomics. Within the scverse stack, it bridges SpatialData and downstream analysis tools such as AnnData, Scanpy, and Squidpy. It provides scalable, image- and geometry-aware computation to transform raw spatial data into analysis-ready representations, with a strong emphasis on interoperability and large-scale workflows.

In practice, Harpy offers fast, out-of-core image preprocessing, tiled segmentation, and efficient aggregation workflows to generate AnnData tables and compute per-cell features from images, segmentation masks, and transcript coordinates. It also supports deep feature extraction, pixel- and cell-level clustering, and the construction of single-cell representations from highly multiplexed images.

  • Multi-platform support for spatial transcriptomics and proteomics data.

  • Interoperable outputs built on SpatialData.

  • Scales to (very) large images: tiled workflows with Dask; optional GPU acceleration with CuPy and PyTorch.

  • Scalable computational building blocks for segmentation, feature extraction, clustering, and spatial analysis.
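To make the idea of a tiled workflow concrete, here is a minimal, library-free sketch of how an image can be split into tiles with an overlap margin, so that each tile can be processed independently and only its core region written back. This is an illustration of the general technique, not Harpy's implementation: the function `iter_tiles` and its parameters are hypothetical, and in Harpy the chunking is delegated to Dask.

```python
from typing import Iterator, Tuple

Window = Tuple[slice, slice]


def iter_tiles(
    height: int, width: int, tile: int, overlap: int = 0
) -> Iterator[Tuple[Window, Window]]:
    """Yield (padded, core) windows covering a height x width image.

    The core windows partition the image without gaps or overlap. Each
    padded window extends its core by up to `overlap` pixels on every
    side, clipped at the image border.
    """
    if tile <= 0 or overlap < 0:
        raise ValueError("tile must be positive and overlap non-negative")
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            y1, x1 = min(y0 + tile, height), min(x0 + tile, width)
            core = (slice(y0, y1), slice(x0, x1))
            padded = (
                slice(max(y0 - overlap, 0), min(y1 + overlap, height)),
                slice(max(x0 - overlap, 0), min(x1 + overlap, width)),
            )
            yield padded, core


# A 5 x 7 image cut into tiles of at most 4 x 4 pixels with a 1-pixel margin.
tiles = list(iter_tiles(height=5, width=7, tile=4, overlap=1))
```

Reading the padded window and keeping only the core region of the result is what lets neighbourhood operations (filters, segmentation) avoid seams at tile borders.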

For loading and browsing SpatialData stores in napari, alongside feature extraction and interactive object classification workflows, see the napari-harpy package.

For interactive visualization of Harpy outputs using Vitessce, see the harpy_vitessce package.

Note for users upgrading to Harpy 0.4.0: parameters that refer to SpatialData elements now use the *_name convention instead of the older *_layer naming to stay aligned with scverse naming conventions. For example, img_layer becomes image_name, labels_layer becomes labels_name, and table_layer becomes table_name.
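When updating older scripts, the renames listed in the note can be applied mechanically. The sketch below is one possible way to do that for keyword arguments stored in a dictionary. The helper `migrate_kwargs` is hypothetical and not part of Harpy, the `chunks` key in the example is only a placeholder for an unrelated argument, and only the three renames named in the note are covered.

```python
# Renames named in the 0.4.0 upgrade note; other parameters are left untouched.
RENAMED = {
    "img_layer": "image_name",
    "labels_layer": "labels_name",
    "table_layer": "table_name",
}


def migrate_kwargs(kwargs: dict) -> dict:
    """Return a copy of `kwargs` with pre-0.4.0 names replaced by the new ones."""
    migrated = {}
    for key, value in kwargs.items():
        new_key = RENAMED.get(key, key)
        if new_key in migrated:
            # Both the old and the new spelling were passed for the same parameter.
            raise TypeError(f"{new_key!r} given more than once")
        migrated[new_key] = value
    return migrated


# Hypothetical keyword arguments from a script written before the upgrade.
old_call = {"img_layer": "raw_image", "labels_layer": "segmentation_mask", "chunks": 1024}
new_call = migrate_kwargs(old_call)
```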


Explore how to use Harpy for segmentation, shallow and deep feature extraction, clustering, and spatial analysis of gigapixel-scale multiplexed data with these step-by-step notebooks:



  • 🔧 Technology-specific advice

    Learn which technologies Harpy supports. 👉 Notebook


  • 🧩 Pixel and Cell Clustering

    Learn how to perform unsupervised pixel- and cell-level clustering using Harpy together with FlowSOM. 👉 Tutorial


  • ✂️ Cell Segmentation

    Explore segmentation workflows in Harpy using different tools:

    💡 Want us to add support for another segmentation method? 👉 Open an issue and let us know!


  • 🧪 Single-cell representations from highly multiplexed images and downstream use with PyTorch

    Learn how single-cell representations can be generated from highly multiplexed images. These representations can then be used downstream to train classifiers in PyTorch. 👉 Tutorial


  • 🧠 Deep Feature Extraction

    Discover how Harpy enables fast, scalable extraction of deep, cell-level features from highly multiplexed imaging data with the KRONOS foundation model for proteomics. 👉 Tutorial

    💡 Want us to add support for another deep feature extraction method? 👉 Open an issue and let us know!


  • 🔬 Shallow Feature Extraction

    Learn to extract shallow features (such as the mean, median, and standard deviation of intensities) from multiplexed imaging data with Harpy. 👉 Tutorial



  • 🌐 Multiple samples and coordinate systems

    Learn how to work with multiple samples, intrinsic and micron coordinates. πŸ‘‰ Tutorial


  • 📐 Unifying Raster and Vector Annotations

    Learn how to convert a segmentation mask (array) into its vectorized form, and segmentation boundaries (polygons) into their rasterized equivalents. This conversion is useful, for example, when integrating annotations (e.g., from QuPath) into downstream spatial omics analysis. 👉 Tutorial


📚 For a complete list of tutorials, visit the tutorials section.
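As a rough illustration of what shallow feature extraction computes, the sketch below derives the mean, median, and standard deviation of intensities per cell from a single-channel image and a segmentation mask. It uses plain Python lists so that it runs without dependencies, and it is not Harpy's API: the function name `shallow_features`, the convention that label 0 is background, and the use of the population standard deviation are all assumptions made for this example. Harpy itself operates on chunked, multi-channel arrays.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def shallow_features(image, mask):
    """Compute per-label intensity statistics.

    `image` and `mask` are equally shaped 2D lists. Pixels whose label is 0
    are treated as background and skipped (an assumption of this sketch).
    """
    pixels = defaultdict(list)
    for image_row, mask_row in zip(image, mask):
        for intensity, label in zip(image_row, mask_row):
            if label != 0:
                pixels[label].append(intensity)
    return {
        label: {"mean": mean(values), "median": median(values), "std": pstdev(values)}
        for label, values in sorted(pixels.items())
    }


# Toy data: two cells (labels 1 and 2) separated by a background column.
image = [
    [1.0, 3.0, 0.0, 9.0],
    [5.0, 7.0, 0.0, 9.0],
]
mask = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
]
features = shallow_features(image, mask)
```

The result maps each cell label to its statistics, which corresponds to one row per cell in a feature table.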

  • Installation

    Learn how to install Harpy. 👉 Installation

  • Quickstart

    Run a short, end-to-end example. 👉 Quickstart

  • Tutorials

    Tutorials to help you get up to speed with Harpy. 👉 Tutorials

  • Technology-specific advice

    Learn which technologies Harpy supports. 👉 Technology-specific advice

  • API

    Browse the detailed API documentation for Harpy. 👉 API

  • Computational Benchmark

    Explore Harpy's benchmark performance. 👉 Benchmark results Harpy

  • HPC

    Learn how to run Harpy in a High-Performance Computing (HPC) environment. 👉 HPC tutorials

  • Contributing

    Learn how to contribute to Harpy. 👉 Development

If you use Harpy for spatial proteomics analysis, please cite:

Benjamin Rombaut, Arne Defauw, Frank Vernaillen, Julien Mortier, Evelien Van Hamme, Sofie Van Gassen, Ruth Seurinck, Yvan Saeys. Scalable analysis of whole slide spatial proteomics with Harpy. Bioinformatics (2026), btag122. https://doi.org/10.1093/bioinformatics/btag122

If you use Harpy for spatial transcriptomics analysis, please cite:

Lotte Pollaris, Bavo Vanneste, Benjamin Rombaut, Arne Defauw, Frank Vernaillen, Julien Mortier, Wout Vanhenden, Liesbet Martens, Tinne Thone, Jean-Francois Hastir, Anna Bujko, Wouter Saelens, Jean-Christophe Marine, Hilde Nelissen, Evelien Van Hamme, Ruth Seurinck, Charlotte L. Scott, Martin Guilliams, Yvan Saeys. SPArrOW: a flexible, interactive and scalable pipeline for spatial transcriptomics analysis. https://doi.org/10.1101/2024.07.04.601829