Interactive napari cell annotation on spatial proteomics


Harpy works with SpatialData and AnnData objects. This allows for interoperability with other libraries in the scverse ecosystem that also work with these objects.

AnnData objects can be converted to pandas DataFrames, which can be stored as features in a napari Labels layer. Many napari plugins can read these features, including plugins for interactive cell type labeling.

In this notebook, we load an artificial example dataset and perform annotation using napari-clusters-plotter. The clustering result can be visualized within napari and saved back to the SpatialData object within the notebook.

# Install napari-clusters-plotter as shown here https://github.com/BiAPoL/napari-clusters-plotter/tree/main?tab=readme-ov-file#installation
# e.g. conda install -c conda-forge napari-clusters-plotter

# load some example SpatialData
from harpy.datasets import multisample_blobs

sdata = multisample_blobs(n_samples=1)
sdata

INFO     no axes information specified in the object, setting `dims` to: ('c', 'y', 'x')                           
INFO     no axes information specified in the object, setting `dims` to: ('y', 'x')

SpatialData object
├── Images
│     └── 'sample_0_image': DataArray[cyx] (11, 512, 512)
├── Labels
│     └── 'sample_0_labels': DataArray[yx] (512, 512)
├── Points
│     └── 'sample_0_points': DataFrame with shape: (<Delayed>, 2) (2D points)
└── Tables
      ├── 'sample_0_table': AnnData (20, 11)
      └── 'table': AnnData (20, 11)
with coordinate systems:
    ▸ 'sample_0', with elements:
        sample_0_image (Images), sample_0_labels (Labels), sample_0_points (Points)

table = sdata["sample_0_table"]
table

AnnData object with n_obs × n_vars = 20 × 11
    obs: 'instance_id', 'region', 'fov_labels', 'cell_ID', 'phenotype', 'area', 'eccentricity', 'major_axis_length', 'minor_axis_length', 'perimeter', 'centroid-0', 'centroid-1', 'convex_area', 'equivalent_diameter', '_major_minor_axis_ratio', '_perim_square_over_area', '_major_axis_equiv_diam_ratio', '_convex_hull_resid', '_centroid_dif'
    var: 'cycle'
    uns: 'spatialdata_attrs'

Here we create the DataFrame of features to be used in napari. The DataFrame needs a ‘label’ and an ‘index’ column that refer back to the cell instances in the label mask. The index of the AnnData table holds these instance IDs, so we copy it into both columns. The DataFrame should also contain every feature you want to visualize in napari and cluster on.

df = table.to_df()
df["label"] = df.index.astype(int)
df["index"] = df.index.astype(int)
df

	nucleus	lineage_0	lineage_1	lineage_2	lineage_3	lineage_4	lineage_5	lineage_6	lineage_7	lineage_8	lineage_9	label	index
cells
1	130579.074583	73143.384718	90176.329291	78887.342227	83547.398858	0.0	85104.954042	89203.618952	71052.452962	166453.784109	79783.723973	1	1
2	97747.257070	44127.352051	54403.315415	47592.677537	50404.086393	0.0	51343.758329	103611.315408	42865.894408	48557.293269	48133.464008	2	2
3	129783.693347	72871.920340	89159.713324	77997.994298	82605.515103	0.0	132335.765591	88197.968976	70251.432797	79581.243727	78884.270554	3	3
4	126967.536598	69764.494608	86010.594981	159287.142678	79687.890840	0.0	81180.370216	85082.819414	67770.154343	76768.146443	76776.118424	4	4
5	107214.353348	52460.254002	64676.705321	56579.963133	124573.992918	0.0	61039.388909	63979.053282	50960.585762	57726.734557	57222.870407	5	5
6	118023.115808	137276.241101	76316.189314	66762.324339	70706.128294	0.0	72387.349397	75492.984966	60131.660871	71197.154810	67520.931832	6	6
7	122155.172414	140294.370188	79746.467902	69763.173487	73884.244499	0.0	75261.651691	78886.261965	62834.473349	71177.144253	75451.055432	7	7
8	132338.563601	152777.165022	92088.042864	80559.732355	85318.580915	0.0	86909.155844	91094.776083	72558.745575	82192.529441	81475.117149	8	8
9	127852.325618	70794.839797	183547.310962	76354.365832	80864.793725	0.0	82372.337711	86339.399487	68771.045311	77901.927901	77221.965853	9	9
10	121525.452585	67926.259433	80034.961874	70015.551499	74151.531060	0.0	75534.839518	79171.644022	63061.785821	154850.113126	70811.124788	10	10
11	131650.132124	74095.756472	91350.353696	79914.392959	84635.119838	0.0	86212.953157	139548.028686	71977.499640	81534.110207	80822.444886	11	11
12	127371.369690	70297.364119	86667.554141	76338.564150	80296.556433	0.0	81793.614099	85732.692114	68287.790846	77354.510681	157361.012640	12	12
13	114577.728962	59503.671129	73360.327287	64176.500530	67967.553929	0.0	115488.462960	74399.400815	57802.654475	65477.239746	64905.725802	13	13
14	102792.173341	47854.334318	58998.202313	51612.340103	54661.199666	0.0	58107.597624	106489.632074	46486.334359	52658.426977	52198.801229	14	14
15	98593.953283	48764.333471	56132.317387	49105.229349	52005.988117	0.0	52975.523944	55526.831603	44228.223439	50100.501715	126998.845549	15	15
16	128578.399102	71259.826508	87854.145733	76855.867998	81395.920777	0.0	82913.366444	86906.484229	145386.128521	78413.594590	77729.166492	16	16
17	133312.011550	75540.130776	93131.207064	81472.304999	86285.061323	0.0	87893.654127	141309.672418	73380.683836	83123.598249	82398.059181	17	17
18	98854.304635	45097.277102	55599.107503	48655.494694	51511.974897	0.0	101403.272287	54999.373325	43808.092452	49624.589015	49191.749709	18	18
19	103946.301444	48853.336484	60229.842722	52689.794012	55802.301254	0.0	93048.637568	59580.157919	47456.778298	53757.718888	75974.922923	19	19
20	101242.245589	47964.507147	59134.031156	51731.164816	54787.043628	0.0	68469.878151	58496.166608	46593.357705	52779.660047	111525.597766	20	20
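Before handing the features to napari, it can help to confirm that every cell in the label mask has a row in the DataFrame. The sketch below is not part of Harpy or napari: `missing_labels` is a hypothetical helper, and the toy mask and feature table are made up for illustration. It assumes that 0 marks the background in the label mask.

```python
import numpy as np
import pandas as pd


def missing_labels(mask: np.ndarray, features: pd.DataFrame) -> set:
    """Return label IDs that occur in the mask but not in features['label']."""
    # Assumption: 0 is the background value and is not a cell.
    mask_ids = set(np.unique(mask).tolist()) - {0}
    return mask_ids - set(features["label"].tolist())


# Toy stand-ins for the label mask and the feature table (hypothetical data).
toy_mask = np.array([[0, 1, 1], [2, 2, 0], [0, 3, 3]])
toy_features = pd.DataFrame({"label": [1, 2], "nucleus": [10.0, 20.0]})

# Cell 3 is in the mask but has no feature row.
print(missing_labels(toy_mask, toy_features))
```

On the example dataset, a call along the lines of `missing_labels(np.asarray(sdata["sample_0_labels"]), df)` should return an empty set if the table covers all cells, assuming the labels element fits in memory as a NumPy array.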

We add the image and the label mask as napari layers and attach the features to the Labels layer. Then we run napari, which opens in a new window. The workflow with napari-clusters-plotter is as follows:

In the napari window, open the Plotter widget via Plugins > napari-clusters-plotter > Plotter Widget.
In the widget that opens on the right, select the Labels layer that holds the features DataFrame. Choose a feature for the x-axis and one for the y-axis, then click Plot to show the cells in a scatter plot.
Draw around a group of cells in the scatter plot to assign a cell type to those cells.
- Hold SHIFT while drawing to create a new cluster.

import napari

viewer = napari.view_image(sdata["sample_0_image"], name="image")
labels_view = viewer.add_labels(sdata["sample_0_labels"], name="labels", features=df)

napari.run()
# Do annotation in napari window

After annotation, the cluster assignments are stored in the features DataFrame of the Labels layer, in the column MANUAL_CLUSTER_ID. From there they can be written back to the SpatialData object, as shown in the code below.

# Here we add a dummy labeling to the feature table to simulate annotation
# COMMENT OUT THIS LINE IF YOU ARE DOING REAL ANNOTATION
labels_view.features["MANUAL_CLUSTER_ID"] = labels_view.features["label"] + 1
labels_view.features

	nucleus	lineage_0	lineage_1	lineage_2	lineage_3	lineage_5	lineage_6	lineage_7	lineage_8	lineage_9	label	index	MANUAL_CLUSTER_ID
0	130579.074583	73143.384718	90176.329291	78887.342227	83547.398858	85104.954042	89203.618952	71052.452962	166453.784109	79783.723973	1	1	2
1	97747.257070	44127.352051	54403.315415	47592.677537	50404.086393	51343.758329	103611.315408	42865.894408	48557.293269	48133.464008	2	2	3
2	129783.693347	72871.920340	89159.713324	77997.994298	82605.515103	132335.765591	88197.968976	70251.432797	79581.243727	78884.270554	3	3	4
3	126967.536598	69764.494608	86010.594981	159287.142678	79687.890840	81180.370216	85082.819414	67770.154343	76768.146443	76776.118424	4	4	5
4	107214.353348	52460.254002	64676.705321	56579.963133	124573.992918	61039.388909	63979.053282	50960.585762	57726.734557	57222.870407	5	5	6
5	118023.115808	137276.241101	76316.189314	66762.324339	70706.128294	72387.349397	75492.984966	60131.660871	71197.154810	67520.931832	6	6	7
6	122155.172414	140294.370188	79746.467902	69763.173487	73884.244499	75261.651691	78886.261965	62834.473349	71177.144253	75451.055432	7	7	8
7	132338.563601	152777.165022	92088.042864	80559.732355	85318.580915	86909.155844	91094.776083	72558.745575	82192.529441	81475.117149	8	8	9
8	127852.325618	70794.839797	183547.310962	76354.365832	80864.793725	82372.337711	86339.399487	68771.045311	77901.927901	77221.965853	9	9	10
9	121525.452585	67926.259433	80034.961874	70015.551499	74151.531060	75534.839518	79171.644022	63061.785821	154850.113126	70811.124788	10	10	11
10	131650.132124	74095.756472	91350.353696	79914.392959	84635.119838	86212.953157	139548.028686	71977.499640	81534.110207	80822.444886	11	11	12
11	127371.369690	70297.364119	86667.554141	76338.564150	80296.556433	81793.614099	85732.692114	68287.790846	77354.510681	157361.012640	12	12	13
12	114577.728962	59503.671129	73360.327287	64176.500530	67967.553929	115488.462960	74399.400815	57802.654475	65477.239746	64905.725802	13	13	14
13	102792.173341	47854.334318	58998.202313	51612.340103	54661.199666	58107.597624	106489.632074	46486.334359	52658.426977	52198.801229	14	14	15
14	98593.953283	48764.333471	56132.317387	49105.229349	52005.988117	52975.523944	55526.831603	44228.223439	50100.501715	126998.845549	15	15	16
15	128578.399102	71259.826508	87854.145733	76855.867998	81395.920777	82913.366444	86906.484229	145386.128521	78413.594590	77729.166492	16	16	17
16	133312.011550	75540.130776	93131.207064	81472.304999	86285.061323	87893.654127	141309.672418	73380.683836	83123.598249	82398.059181	17	17	18
17	98854.304635	45097.277102	55599.107503	48655.494694	51511.974897	101403.272287	54999.373325	43808.092452	49624.589015	49191.749709	18	18	19
18	103946.301444	48853.336484	60229.842722	52689.794012	55802.301254	93048.637568	59580.157919	47456.778298	53757.718888	75974.922923	19	19	20
19	101242.245589	47964.507147	59134.031156	51731.164816	54787.043628	68469.878151	58496.166608	46593.357705	52779.660047	111525.597766	20	20	21
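The cluster IDs are plain integers. If readable cell type names are preferred, they can be mapped with a dictionary before being written back. This is a minimal sketch on a toy table: the names in `cluster_names` and the `cell_type` column are hypothetical, and which ID unannotated cells receive may depend on the plugin version.

```python
import pandas as pd

# Toy feature table as it might look after annotation (hypothetical values).
features = pd.DataFrame(
    {"label": [1, 2, 3, 4], "MANUAL_CLUSTER_ID": [1, 2, 1, 0]}
)

# Hypothetical mapping from cluster ID to cell type name.
cluster_names = {0: "unassigned", 1: "type_A", 2: "type_B"}

# Map the IDs to names and store them as a categorical column.
features["cell_type"] = (
    features["MANUAL_CLUSTER_ID"].map(cluster_names).astype("category")
)
print(features["cell_type"].tolist())
```

IDs that are missing from the dictionary become NaN, which makes forgotten clusters easy to spot.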

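# The features table has a positional index (0..19), while table.obs is indexed
# by cell ID. Assign by position so pandas does not align on mismatched indices.
# The row order is unchanged because df was created from table.to_df().
table.obs["manual_clustering"] = labels_view.features["MANUAL_CLUSTER_ID"].to_numpy()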
table

AnnData object with n_obs × n_vars = 20 × 11
    obs: 'instance_id', 'region', 'fov_labels', 'cell_ID', 'phenotype', 'area', 'eccentricity', 'major_axis_length', 'minor_axis_length', 'perimeter', 'centroid-0', 'centroid-1', 'convex_area', 'equivalent_diameter', '_major_minor_axis_ratio', '_perim_square_over_area', '_major_axis_equiv_diam_ratio', '_convex_hull_resid', '_centroid_dif', 'manual_clustering'
    var: 'cycle'
    uns: 'spatialdata_attrs'
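The features table of the napari layer carries a positional index rather than the cell index of the AnnData table, so a more defensive way to write annotations back is to match rows on the label column instead of relying on row order. The sketch below uses toy DataFrames in place of `table.obs` and `labels_view.features`, and assumes the obs index holds the cell IDs as strings.

```python
import pandas as pd

# Toy stand-in for table.obs: indexed by cell ID stored as strings (assumption).
obs = pd.DataFrame(
    {"area": [5.0, 7.0, 6.0]},
    index=pd.Index(["1", "2", "3"], name="cells"),
)

# Toy stand-in for labels_view.features: positional index, rows in another order.
features = pd.DataFrame({"label": [3, 1, 2], "MANUAL_CLUSTER_ID": [30, 10, 20]})

# Build a lookup keyed by label, then reorder it to follow the cells in obs.
lookup = features.set_index("label")["MANUAL_CLUSTER_ID"]
obs["manual_clustering"] = lookup.reindex(obs.index.astype(int)).to_numpy()
print(obs["manual_clustering"].tolist())
```

Cells without a matching label would receive NaN, which signals an incomplete annotation rather than silently shifting values to the wrong cell.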