Quality Control on IMC spatial proteomics with Harpy

Introduction

This notebook outlines some quality control steps for highly multiplexed spatial proteomics data using the Harpy package. Most steps follow the Bodenmiller lab's resource on Imaging Mass Cytometry data analysis in R. For comparison, this notebook uses the same dataset (IMC data from the Hyperion imaging system, processed with steinbock). The dataset is available in Python as a SpatialData dataset.

The levels of quality control are:

  • Image level: to give an overview of the quality of the images in the dataset.

  • Segmentation level: to give an overview of the quality of the segmentation in the dataset.

  • Cell level: to give an overview of the quality of the calculated features per cell in the dataset.

We start by loading in an example dataset and visualising the data:

import harpy
import matplotlib.pyplot as plt
import scanpy as sc
import spatialdata_plot  # noqa
from harpy.datasets.proteomics import imc_example

plt.viridis()

sdata = imc_example()
sdata
SpatialData object
├── Images
│     ├── 'Patient1_001_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient1_002_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient1_003_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient2_001_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient2_002_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient2_003_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient2_004_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient3_001_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient3_002_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient3_003_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient4_005_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient4_006_image': DataArray[cyx] (40, 600, 600)
│     ├── 'Patient4_007_image': DataArray[cyx] (40, 600, 600)
│     └── 'Patient4_008_image': DataArray[cyx] (40, 600, 600)
├── Labels
│     ├── 'Patient1_001_labels': DataArray[yx] (600, 600)
│     ├── 'Patient1_002_labels': DataArray[yx] (600, 600)
│     ├── 'Patient1_003_labels': DataArray[yx] (600, 600)
│     ├── 'Patient2_001_labels': DataArray[yx] (600, 600)
│     ├── 'Patient2_002_labels': DataArray[yx] (600, 600)
│     ├── 'Patient2_003_labels': DataArray[yx] (600, 600)
│     ├── 'Patient2_004_labels': DataArray[yx] (600, 600)
│     ├── 'Patient3_001_labels': DataArray[yx] (600, 600)
│     ├── 'Patient3_002_labels': DataArray[yx] (600, 600)
│     ├── 'Patient3_003_labels': DataArray[yx] (600, 600)
│     ├── 'Patient4_005_labels': DataArray[yx] (600, 600)
│     ├── 'Patient4_006_labels': DataArray[yx] (600, 600)
│     ├── 'Patient4_007_labels': DataArray[yx] (600, 600)
│     └── 'Patient4_008_labels': DataArray[yx] (600, 600)
└── Tables
      └── 'table': AnnData (47859, 40)
with coordinate systems:
    ▸ 'Patient1_001', with elements:
        Patient1_001_image (Images), Patient1_001_labels (Labels)
    ▸ 'Patient1_002', with elements:
        Patient1_002_image (Images), Patient1_002_labels (Labels)
    ▸ 'Patient1_003', with elements:
        Patient1_003_image (Images), Patient1_003_labels (Labels)
    ▸ 'Patient2_001', with elements:
        Patient2_001_image (Images), Patient2_001_labels (Labels)
    ▸ 'Patient2_002', with elements:
        Patient2_002_image (Images), Patient2_002_labels (Labels)
    ▸ 'Patient2_003', with elements:
        Patient2_003_image (Images), Patient2_003_labels (Labels)
    ▸ 'Patient2_004', with elements:
        Patient2_004_image (Images), Patient2_004_labels (Labels)
    ▸ 'Patient3_001', with elements:
        Patient3_001_image (Images), Patient3_001_labels (Labels)
    ▸ 'Patient3_002', with elements:
        Patient3_002_image (Images), Patient3_002_labels (Labels)
    ▸ 'Patient3_003', with elements:
        Patient3_003_image (Images), Patient3_003_labels (Labels)
    ▸ 'Patient4_005', with elements:
        Patient4_005_image (Images), Patient4_005_labels (Labels)
    ▸ 'Patient4_006', with elements:
        Patient4_006_image (Images), Patient4_006_labels (Labels)
    ▸ 'Patient4_007', with elements:
        Patient4_007_image (Images), Patient4_007_labels (Labels)
    ▸ 'Patient4_008', with elements:
        Patient4_008_image (Images), Patient4_008_labels (Labels)
<Figure size 640x480 with 0 Axes>
table = sdata["table"]
table
AnnData object with n_obs × n_vars = 47859 × 40
    obs: 'Image', 'area', 'centroid-0', 'centroid-1', 'axis_major_length', 'axis_minor_length', 'eccentricity', 'image', 'image_width_px', 'image_height_px', 'image_num_channels', 'image_source_file', 'image_recovery_file', 'image_recovered', 'image_acquisition_id', 'image_acquisition_description', 'image_acquisition_start_x_um', 'image_acquisition_start_y_um', 'image_acquisition_end_x_um', 'image_acquisition_end_y_um', 'image_acquisition_width_um', 'image_acquisition_height_um', 'cell_id', 'region', 'sample_id', 'patient_id', 'ROI', 'indication'
    var: 'channel', 'name', 'keep', 'ilastik', 'deepcell', 'Tube Number', 'Target', 'Antibody Clone', 'Stock Concentration', 'Final Concentration / Dilution', 'uL to add'
    uns: 'spatialdata_attrs'
    obsm: 'spatial'
table.to_df()
MPO HistoneH3 SMA CD16 CD38 HLADR CD27 CD15 CD45RA CD163 ... VISTA CD40 CD4 CD14 Ecad CD303 CD206 cleavedPARP DNA1 DNA2
Object 1 in Patient1_001.tiff 0.575106 3.127308 0.260094 2.034775 0.253014 4.111773 2.307456 7.300499 1.535942 2.166430 ... 5.509909 5.162859 7.398499 23.098799 140.415451 9.861239 1.502746 0.672675 75.195312 142.235321
Object 2 in Patient1_001.tiff 0.416667 11.359788 1.672038 2.588054 0.682667 7.646656 1.587182 8.823277 2.303773 6.265749 ... 4.544585 5.389460 7.441268 33.736485 65.583473 4.470543 9.239043 0.939726 73.576645 129.302673
Object 3 in Patient1_001.tiff 0.497549 2.384144 0.153519 2.294307 1.190298 13.193821 2.657502 5.005493 1.638779 8.174764 ... 4.400037 5.853950 7.854741 26.709986 106.939453 6.576359 4.165048 1.148231 71.255905 119.832062
Object 4 in Patient1_001.tiff 0.890154 7.712961 1.193948 15.629084 2.126060 138.590393 6.149451 5.903136 2.424042 19.541658 ... 4.765517 15.262775 24.683125 81.283867 96.895164 7.853501 6.633181 1.260584 59.181263 99.848030
Object 5 in Patient1_001.tiff 0.181818 1.451272 0.298670 0.608422 0.291779 5.911579 0.958761 8.756398 1.549757 0.529756 ... 2.797691 1.359595 4.253781 8.609590 74.317711 4.970934 1.286630 0.512645 10.164272 12.806628
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Object 2841 in Patient4_008.tiff 0.127660 7.424975 0.473410 0.160480 0.230139 0.281806 0.189742 1.311998 0.682346 0.183106 ... 0.547729 0.195831 1.727591 3.871677 25.318060 1.705843 0.298129 0.258668 103.932419 180.427872
Object 2842 in Patient4_008.tiff 0.117647 8.225169 0.339634 0.280664 0.374028 0.180113 0.557850 1.664211 0.889153 0.000000 ... 0.495479 0.305937 1.954891 5.834383 32.321331 1.414846 0.499866 0.118527 105.321434 187.471375
Object 2843 in Patient4_008.tiff 0.146341 4.159492 0.390537 0.325213 0.340863 0.524208 0.496114 1.371000 0.791754 0.094179 ... 0.305323 0.074817 1.589985 4.659936 33.211224 1.783822 0.370879 0.313851 88.465622 164.969193
Object 2844 in Patient4_008.tiff 0.167724 2.898857 0.334794 0.620313 0.083333 0.536346 0.639925 13.249680 0.893656 0.114361 ... 0.459324 0.282099 1.955293 6.207744 28.697414 1.973373 0.534088 0.325632 28.828003 50.687141
Object 2845 in Patient4_008.tiff 0.213438 3.275088 0.259993 0.273648 0.094500 0.419652 0.666213 0.861737 0.667140 0.053631 ... 0.359852 0.340135 2.530452 8.510514 22.324598 2.136185 0.507724 0.215658 43.971691 73.308266

47859 rows × 40 columns

table.obs
Image area centroid-0 centroid-1 axis_major_length axis_minor_length eccentricity image image_width_px image_height_px ... image_acquisition_end_x_um image_acquisition_end_y_um image_acquisition_width_um image_acquisition_height_um cell_id region sample_id patient_id ROI indication
Object 1 in Patient1_001.tiff Patient1_001.tiff 12 0.416667 468.583333 7.406234 1.895294 0.966702 Patient1_001.tiff 600 600 ... 38100.828 17156.254 600.0 600.0 1 Patient1_001_labels Patient1_001 Patient1 001 SCCHN
Object 2 in Patient1_001.tiff Patient1_001.tiff 24 0.416667 515.833333 16.480040 1.962838 0.992882 Patient1_001.tiff 600 600 ... 38100.828 17156.254 600.0 600.0 2 Patient1_001_labels Patient1_001 Patient1 001 SCCHN
Object 3 in Patient1_001.tiff Patient1_001.tiff 17 0.470588 587.235294 9.850849 1.985817 0.979470 Patient1_001.tiff 600 600 ... 38100.828 17156.254 600.0 600.0 3 Patient1_001_labels Patient1_001 Patient1 001 SCCHN
Object 4 in Patient1_001.tiff Patient1_001.tiff 24 1.250000 192.250000 8.082904 3.915780 0.874818 Patient1_001.tiff 600 600 ... 38100.828 17156.254 600.0 600.0 4 Patient1_001_labels Patient1_001 Patient1 001 SCCHN
Object 5 in Patient1_001.tiff Patient1_001.tiff 22 0.909091 231.772727 8.793666 3.116532 0.935091 Patient1_001.tiff 600 600 ... 38100.828 17156.254 600.0 600.0 5 Patient1_001_labels Patient1_001 Patient1 001 SCCHN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Object 2841 in Patient4_008.tiff Patient4_008.tiff 47 597.255319 357.680851 12.607573 5.160594 0.912389 Patient4_008.tiff 600 600 ... 28302.627 6595.411 600.0 600.0 2841 Patient4_008_labels Patient4_008 Patient4 008 CRC
Object 2842 in Patient4_008.tiff Patient4_008.tiff 17 597.764706 367.058824 5.415619 3.980897 0.677984 Patient4_008.tiff 600 600 ... 28302.627 6595.411 600.0 600.0 2842 Patient4_008_labels Patient4_008 Patient4 008 CRC
Object 2843 in Patient4_008.tiff Patient4_008.tiff 41 597.731707 136.341463 13.102653 4.112542 0.949466 Patient4_008.tiff 600 600 ... 28302.627 6595.411 600.0 600.0 2843 Patient4_008_labels Patient4_008 Patient4 008 CRC
Object 2844 in Patient4_008.tiff Patient4_008.tiff 24 597.833333 232.291667 7.790156 4.103809 0.849993 Patient4_008.tiff 600 600 ... 28302.627 6595.411 600.0 600.0 2844 Patient4_008_labels Patient4_008 Patient4 008 CRC
Object 2845 in Patient4_008.tiff Patient4_008.tiff 30 597.666667 338.666667 9.220415 4.253829 0.887219 Patient4_008.tiff 600 600 ... 28302.627 6595.411 600.0 600.0 2845 Patient4_008_labels Patient4_008 Patient4 008 CRC

47859 rows × 28 columns

selected_markers = ["MPO", "HistoneH3", "SMA"]
sdata.pl.render_images(channel=selected_markers).pl.render_labels().pl.show()
../../_images/835a04ae903dd4fa24cf18d750e380558f1b87ba6d660152688e768e3976632a.png

Image level quality control

df = harpy.pl.calculate_snr_ratio(sdata)
df
                  image cycle      channel          snr      signal
0    Patient1_001_image  None          MPO   547.535156    1.241055
1    Patient1_001_image  None    HistoneH3     5.745501    7.948751
2    Patient1_001_image  None          SMA    13.119742    2.827469
3    Patient1_001_image  None         CD16    12.360662    8.994999
4    Patient1_001_image  None         CD38    13.662168    5.337214
...                 ...   ...          ...          ...         ...
555  Patient4_008_image  None        CD303     9.104072    3.356459
556  Patient4_008_image  None        CD206    36.433674   14.416249
557  Patient4_008_image  None  cleavedPARP  1098.977173    1.388665
558  Patient4_008_image  None         DNA1     6.867725  168.084061
559  Patient4_008_image  None         DNA2     6.872324  296.049530

560 rows × 5 columns

harpy.pl.snr_ratio(sdata)
<Axes: xlabel='Signal intensity', ylabel='Signal-to-noise ratio'>
../../_images/ac056b01e5c69793a4dbed153e4aa78d678447e3582ba21676cef282bf1290c8.png
harpy.pl.snr_ratio(sdata, signal_threshold=2)
<Axes: xlabel='Signal intensity', ylabel='Signal-to-noise ratio'>
../../_images/c62147efd39c62ae0ae7eb5c75d622deb1d59ac24d05f79efe2ae3e21330dce2.png

The plots above summarize all samples together. It is also useful to inspect every channel in every sample separately, without the bias introduced by cell segmentation. There are multiple ways to aggregate the channel signal at the image level into a heatmap. One way is to build heatmaps from the SNR and signal values calculated above. These values again depend on unsupervised Otsu thresholding, which is not a very good proxy for a good cell segmentation.
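
To make the Otsu-based idea concrete, the sketch below derives a signal and an SNR value for a single channel image. It assumes a common definition: signal is the mean of the pixels above the Otsu threshold, noise is the mean of the pixels at or below it, and SNR is their ratio. Whether `harpy.pl.calculate_snr_ratio` computes it exactly this way is an assumption, and the helpers `otsu_threshold` and `channel_snr` are hypothetical names used only for illustration.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return the Otsu threshold: the cut that maximizes between-class variance."""
    counts, edges = np.histogram(values, bins=bins)
    counts = counts.astype(float)
    centers = (edges[:-1] + edges[1:]) / 2
    weight_low = np.cumsum(counts)               # pixels in bins 0..i
    weight_high = weight_low[-1] - weight_low    # pixels in bins i+1..end
    sum_low = np.cumsum(counts * centers)
    mean_low = sum_low / np.maximum(weight_low, 1)
    mean_high = (sum_low[-1] - sum_low) / np.maximum(weight_high, 1)
    between = weight_low * weight_high * (mean_low - mean_high) ** 2
    # Use the right edge of the best bin as the threshold.
    return edges[1:][np.argmax(between)]


def channel_snr(channel):
    """Return (signal, snr) for one channel (hypothetical helper)."""
    pixels = np.asarray(channel, dtype=float).ravel()
    threshold = otsu_threshold(pixels)
    signal = pixels[pixels > threshold].mean()
    noise = pixels[pixels <= threshold].mean()
    # Note: a noise mean of zero would give an infinite ratio.
    return signal, signal / noise


# Synthetic channel: dim background (mean 1) with one bright square.
rng = np.random.default_rng(0)
image = rng.gamma(shape=2.0, scale=0.5, size=(100, 100))
image[40:60, 40:60] += 20.0

signal, snr = channel_snr(image)
print(f"signal={signal:.2f}, snr={snr:.2f}")
```

Because the threshold is chosen per channel from the pixel histogram alone, a channel with no clear foreground can still be split in two, which is one reason these values are only a rough quality indicator.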

When plotting the heatmap you can use the z_score or standard_scale options, but it is also worth leaving the data largely untransformed so that outliers remain visible.
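
As a small illustration of what the two options do, the sketch below applies both transformations column-wise to a toy matrix with NumPy. It assumes the options behave as in seaborn's `clustermap` (z-scoring with the sample standard deviation, and standard scaling as "subtract the minimum, divide by the maximum"); that the Harpy plotting functions forward them unchanged is an assumption.

```python
import numpy as np

# Toy matrix: rows are images, columns are channels.
# The last row is an outlier in the first channel.
values = np.array(
    [
        [1.0, 10.0],
        [2.0, 12.0],
        [3.0, 11.0],
        [50.0, 13.0],
    ]
)

# z-score per column: zero mean, unit (sample) standard deviation.
z_scored = (values - values.mean(axis=0)) / values.std(axis=0, ddof=1)

# Standard scale per column: subtract the minimum, then divide by the
# new maximum, so every column spans [0, 1].
shifted = values - values.min(axis=0)
standard_scaled = shifted / shifted.max(axis=0)

print(np.round(z_scored, 2))
print(np.round(standard_scaled, 2))
```

After either transformation both columns cover a comparable range, so the difference in absolute magnitude between channels is no longer visible in the heatmap.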

harpy.pl.signal_clustermap(sdata, signal_threshold=2, figsize=(12, 10))
<seaborn.matrix.ClusterGrid at 0x377fc6c80>
../../_images/3bf1bb03e8e6e56de028fe4c89fb430d8efb7b20b7ff26feaa09bf21b77393a2.png
harpy.pl.snr_clustermap(sdata, signal_threshold=2, figsize=(12, 10))
<seaborn.matrix.ClusterGrid at 0x36d5080a0>
../../_images/ced5e2f69997f6a4cfcebb50812adc94d042a1f48b8a206b54f4f01d462a7f6a.png

Another way is to normalize each image with a percentile (quantile) normalization. The choice of the lower and upper percentile can greatly influence the ends of the intensity distributions. Here we clip the signal below the 5th and above the 95th percentile. We also apply an arcsinh transformation to each channel to dampen extreme outliers and make the heatmap more interpretable.
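
The sketch below shows one way such a normalization could look for a single channel: clip to the 5th and 95th percentile, rescale to [0, 1], apply arcsinh, and take the mean. The helper `percentile_normalize`, the order of the steps, and the rescaling to [0, 1] are assumptions made for illustration; the internals of `harpy.pl.calculate_mean_norm` may differ.

```python
import numpy as np


def percentile_normalize(channel, q_min=5, q_max=95):
    """Clip to the [q_min, q_max] percentiles and rescale to [0, 1]."""
    channel = np.asarray(channel, dtype=float)
    low, high = np.percentile(channel, [q_min, q_max])
    if high == low:
        # Constant channel: nothing to rescale.
        return np.zeros_like(channel)
    clipped = np.clip(channel, low, high)
    return (clipped - low) / (high - low)


# Synthetic channel with a single extreme hot pixel.
rng = np.random.default_rng(0)
channel = rng.exponential(scale=2.0, size=(600, 600))
channel[0, 0] = 10_000.0

normalized = percentile_normalize(channel)
transformed = np.arcsinh(normalized)
mean_intensity = transformed.mean()
print(f"mean normalized intensity: {mean_intensity:.4f}")
```

The hot pixel ends up at 1.0 like every other value above the 95th percentile, so it no longer dominates the per-image mean.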

df_norm = harpy.pl.calculate_mean_norm(
    sdata, overwrite=True, q_min=5, q_max=95, c_mask=["DNA1", "DNA2"]
)
df_norm
MPO HistoneH3 SMA CD16 CD38 HLADR CD27 CD15 CD45RA CD163 ... CD33 Ki67 VISTA CD40 CD4 CD14 Ecad CD303 CD206 cleavedPARP
image_name
Patient1_001_image 0.001373 0.055785 0.005045 0.026264 0.008826 0.175229 0.020593 0.054659 0.013744 0.056766 ... 0.074157 0.267642 0.032680 0.034993 0.074261 0.218794 0.456826 0.036260 0.032173 0.003076
Patient1_002_image 0.006060 0.049233 0.004587 0.020164 0.005141 0.165615 0.015160 0.111997 0.013541 0.043220 ... 0.037956 0.307081 0.037028 0.025349 0.062264 0.176832 0.312511 0.025535 0.018039 0.001985
Patient1_003_image 0.010610 0.047956 0.003759 0.042909 0.017612 0.257181 0.047959 0.131205 0.043712 0.066042 ... 0.078412 0.092601 0.061601 0.071824 0.114782 0.317894 0.144735 0.034393 0.046012 0.002193
Patient2_001_image 0.000752 0.058385 0.006775 0.016455 0.010782 0.205677 0.022859 0.013033 0.011812 0.026477 ... 0.068141 0.045461 0.014171 0.020201 0.050557 0.102343 0.127614 0.020617 0.021694 0.001510
Patient2_002_image 0.000641 0.041701 0.005403 0.013765 0.009545 0.148287 0.022571 0.044532 0.009762 0.026233 ... 0.083496 0.051572 0.013299 0.015754 0.041932 0.113859 0.188180 0.026002 0.025739 0.001604
Patient2_003_image 0.000477 0.040524 0.004184 0.014256 0.009770 0.149964 0.016900 0.032347 0.006899 0.016842 ... 0.075266 0.052448 0.010327 0.015825 0.029393 0.077996 0.205089 0.021494 0.016302 0.001290
Patient2_004_image 0.000430 0.020642 0.001822 0.007410 0.007910 0.154417 0.018306 0.014674 0.016942 0.014809 ... 0.031859 0.035523 0.010955 0.019163 0.038357 0.058413 0.082950 0.011072 0.012548 0.000811
Patient3_001_image 0.000831 0.016921 0.001898 0.006824 0.002759 0.043634 0.009610 0.005878 0.020288 0.006521 ... 0.021315 0.009402 0.010975 0.035498 0.041559 0.035015 0.108840 0.012112 0.017181 0.001813
Patient3_002_image 0.001961 0.055722 0.007662 0.017035 0.003206 0.191106 0.008463 0.007178 0.004085 0.032297 ... 0.039019 0.017989 0.017073 0.024589 0.048552 0.113121 0.109767 0.013799 0.116642 0.001494
Patient3_003_image 0.001900 0.015013 0.001557 0.010953 0.006222 0.065199 0.014200 0.016212 0.008011 0.006861 ... 0.034591 0.009602 0.017635 0.026982 0.047254 0.051440 0.205255 0.019356 0.024959 0.002253
Patient4_005_image 0.001648 0.038940 0.007374 0.002212 0.001156 0.044527 0.003054 0.023514 0.002895 0.003408 ... 0.007986 0.035216 0.002902 0.002356 0.009998 0.036888 0.058972 0.005537 0.005516 0.000484
Patient4_006_image 0.000639 0.037834 0.005459 0.002180 0.001767 0.123937 0.011612 0.006143 0.018318 0.008780 ... 0.007986 0.008681 0.006178 0.010628 0.035626 0.034305 0.008730 0.004045 0.013496 0.000243
Patient4_007_image 0.000936 0.045258 0.009955 0.004031 0.003228 0.100138 0.006063 0.006296 0.007925 0.014336 ... 0.010433 0.007212 0.009297 0.009211 0.020034 0.069162 0.010025 0.007196 0.010052 0.000590
Patient4_008_image 0.002200 0.037542 0.006024 0.004075 0.001709 0.027580 0.003400 0.040889 0.004455 0.003270 ... 0.018520 0.039438 0.004279 0.003409 0.012010 0.046535 0.126114 0.007212 0.004356 0.000545

14 rows × 38 columns

df_metadata = table.obs.groupby("sample_id").first()[
    ["ROI", "patient_id", "indication"]
]
df_metadata["image_name"] = df_metadata.index.astype(str) + "_image"
df_metadata.reset_index(inplace=True)
df_metadata.set_index("image_name", inplace=True)
df_metadata.drop("sample_id", axis=1, inplace=True)
df_metadata
                    ROI patient_id indication
image_name
Patient1_001_image  001   Patient1      SCCHN
Patient1_002_image  002   Patient1      SCCHN
Patient1_003_image  003   Patient1      SCCHN
Patient2_001_image  001   Patient2        BCC
Patient2_002_image  002   Patient2        BCC
Patient2_003_image  003   Patient2        BCC
Patient2_004_image  004   Patient2        BCC
Patient3_001_image  001   Patient3      NSCLC
Patient3_002_image  002   Patient3      NSCLC
Patient3_003_image  003   Patient3      NSCLC
Patient4_005_image  005   Patient4        CRC
Patient4_006_image  006   Patient4        CRC
Patient4_007_image  007   Patient4        CRC
Patient4_008_image  008   Patient4        CRC
harpy.pl.clustermap(
    df_norm, row_colors=harpy.pl.make_cols_colors(df_metadata), figsize=(12, 10)
)
<seaborn.matrix.ClusterGrid at 0x3693fc9a0>
../../_images/c812176141174074e3157854281b6700f085d9c3446d9e57b714f6d51648dd76.png

Segmentation level quality control

harpy.pl.segmentation_coverage(sdata)
<Axes: xlabel='Percentile of covered area', ylabel='name'>
../../_images/1c6c0eabe45f329d9256910844cdae8f069c3836dea96d8f07dd5fcdc2c6489e.png
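
As a rough illustration of what a coverage value means, the sketch below computes the fraction of pixels in a label image that belong to any segment. It assumes the usual convention that label 0 is background; the helper `covered_fraction` is hypothetical and not part of Harpy.

```python
import numpy as np


def covered_fraction(labels):
    """Fraction of pixels assigned to any segment (label 0 is background)."""
    labels = np.asarray(labels)
    return np.count_nonzero(labels) / labels.size


# Toy 600 x 600 label image with two rectangular "cells".
labels = np.zeros((600, 600), dtype=np.int32)
labels[100:200, 100:250] = 1  # 100 x 150 pixels
labels[300:420, 300:400] = 2  # 120 x 100 pixels

coverage = covered_fraction(labels)
print(f"covered fraction: {coverage:.3f}")
```

A low value for one image compared with the others can point to sparse tissue or to a segmentation that missed cells, which is why the images with the lowest coverage are worth inspecting visually.
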
sdata.pl.render_images(channel=selected_markers).pl.render_labels().pl.show(
    coordinate_systems=["Patient4_005", "Patient4_007"]
)
../../_images/00a823b0cbcdea66a347eeeb80e5e3a73683dc43cafb0181c5f21cb295dedd28.png
harpy.pl.segmentation_size_boxplot(sdata)
<Axes: title={'center': 'area'}>
../../_images/cf922c447df1df82458cc3656fe9e19eaa310a55aab7cbc98f187a00d9e73cb1.png
table.obs["area"].describe()
count    47859.000000
mean        76.377296
std         41.443170
min          3.000000
25%         47.000000
50%         70.000000
75%         98.000000
max        466.000000
Name: area, dtype: float64
# Count the objects smaller than 5 pixels.
(table.obs["area"] < 5).sum()
65
# Preview the table without these objects (not assigned, so `table` is unchanged).
table[table.obs["area"] >= 5]
View of AnnData object with n_obs × n_vars = 47794 × 40
    obs: 'Image', 'area', 'centroid-0', 'centroid-1', 'axis_major_length', 'axis_minor_length', 'eccentricity', 'image', 'image_width_px', 'image_height_px', 'image_num_channels', 'image_source_file', 'image_recovery_file', 'image_recovered', 'image_acquisition_id', 'image_acquisition_description', 'image_acquisition_start_x_um', 'image_acquisition_start_y_um', 'image_acquisition_end_x_um', 'image_acquisition_end_y_um', 'image_acquisition_width_um', 'image_acquisition_height_um', 'cell_id', 'region', 'sample_id', 'patient_id', 'ROI', 'indication'
    var: 'channel', 'name', 'keep', 'ilastik', 'deepcell', 'Tube Number', 'Target', 'Antibody Clone', 'Stock Concentration', 'Final Concentration / Dilution', 'uL to add'
    uns: 'spatialdata_attrs'
    obsm: 'spatial'
harpy.pl.calculate_segments_per_area(sdata)
              sample_id  image_width_px  image_height_px  cells_per_mm2
sample_id
Patient2_003       2612           600.0            600.0    7255.555556
Patient4_005       2795           600.0            600.0    7763.888889
Patient4_008       2845           600.0            600.0    7902.777778
Patient2_002       3033           600.0            600.0    8425.0
Patient4_007       3135           600.0            600.0    8708.333333
Patient2_001       3185           600.0            600.0    8847.222222
Patient1_002       3304           600.0            600.0    9177.777778
Patient3_002       3393           600.0            600.0    9425.0
Patient1_001       3567           600.0            600.0    9908.333333
Patient3_003       3816           600.0            600.0    10600.0
Patient1_003       3884           600.0            600.0    10788.888889
Patient3_001       3938           600.0            600.0    10938.888889
Patient2_004       3980           600.0            600.0    11055.555556
Patient4_006       4372           600.0            600.0    12144.444444
harpy.pl.segments_per_area(sdata)
<Axes: xlabel='sample_id'>
../../_images/ff47d4a047a58b12996685c3c0b9289f5eaa9ad1f60ff50b5119156468f9c8cf.png
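
The density values in the table above can be reproduced with simple arithmetic: the number of segmented cells divided by the image area in mm². The sketch below assumes a pixel size of 1 µm, which matches the 600 px and 600 µm image dimensions reported in `table.obs`; the function name `cells_per_mm2` is only illustrative.

```python
def cells_per_mm2(n_cells, width_um, height_um):
    """Cell density for a rectangular image, given its size in micrometres."""
    area_mm2 = (width_um / 1000) * (height_um / 1000)
    return n_cells / area_mm2


# Patient2_003: 2612 cells in a 600 x 600 micrometre image (0.36 mm^2).
density = cells_per_mm2(2612, 600.0, 600.0)
print(round(density, 6))
```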

Cell level quality control

import numpy as np

table.layers["arcsinh"] = np.arcsinh(table.to_df())
table
AnnData object with n_obs × n_vars = 47859 × 40
    obs: 'Image', 'area', 'centroid-0', 'centroid-1', 'axis_major_length', 'axis_minor_length', 'eccentricity', 'image', 'image_width_px', 'image_height_px', 'image_num_channels', 'image_source_file', 'image_recovery_file', 'image_recovered', 'image_acquisition_id', 'image_acquisition_description', 'image_acquisition_start_x_um', 'image_acquisition_start_y_um', 'image_acquisition_end_x_um', 'image_acquisition_end_y_um', 'image_acquisition_width_um', 'image_acquisition_height_um', 'cell_id', 'region', 'sample_id', 'patient_id', 'ROI', 'indication'
    var: 'channel', 'name', 'keep', 'ilastik', 'deepcell', 'Tube Number', 'Target', 'Antibody Clone', 'Stock Concentration', 'Final Concentration / Dilution', 'uL to add'
    uns: 'spatialdata_attrs'
    obsm: 'spatial'
    layers: 'arcsinh'
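
A short numerical check shows why arcsinh is a convenient transform for intensity data: it is close to the identity near zero and close to a logarithm for large values, and unlike a plain log it is defined at zero. The sketch uses plain NumPy and made-up values. Some cytometry workflows divide the intensities by a cofactor before the transform; the cell above applies it to the raw values.

```python
import numpy as np

small = np.array([0.0, 0.01, 0.1])
large = np.array([100.0, 1_000.0, 10_000.0])

# Near zero, arcsinh(x) is almost equal to x ...
small_error = np.abs(np.arcsinh(small) - small)

# ... while for large x it approaches log(2 * x).
large_error = np.abs(np.arcsinh(large) - np.log(2 * large))

print(small_error.max(), large_error.max())
```
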
used_var_names = [x for x in table.var_names if x not in ["DNA1", "DNA2"]]
sc.tl.dendrogram(table, groupby="sample_id", var_names=used_var_names)
sc.pl.heatmap(
    table,
    layer="arcsinh",
    groupby="sample_id",
    var_names=used_var_names,
    swap_axes=True,
    dendrogram=True,
    figsize=(12, 10),
)
../../_images/477ed7fb76e7501a9ccec9d6c44cc0096b044ba2880e28230f6a081771dc02dd.png
adata = sc.pp.subsample(table, fraction=0.1, copy=True)
adata
AnnData object with n_obs × n_vars = 4785 × 40
    obs: 'Image', 'area', 'centroid-0', 'centroid-1', 'axis_major_length', 'axis_minor_length', 'eccentricity', 'image', 'image_width_px', 'image_height_px', 'image_num_channels', 'image_source_file', 'image_recovery_file', 'image_recovered', 'image_acquisition_id', 'image_acquisition_description', 'image_acquisition_start_x_um', 'image_acquisition_start_y_um', 'image_acquisition_end_x_um', 'image_acquisition_end_y_um', 'image_acquisition_width_um', 'image_acquisition_height_um', 'cell_id', 'region', 'sample_id', 'patient_id', 'ROI', 'indication'
    var: 'channel', 'name', 'keep', 'ilastik', 'deepcell', 'Tube Number', 'Target', 'Antibody Clone', 'Stock Concentration', 'Final Concentration / Dilution', 'uL to add'
    uns: 'spatialdata_attrs', 'dendrogram_sample_id', 'sample_id_colors'
    obsm: 'spatial'
    layers: 'arcsinh'
harpy.pl.ridgeplot_channel_sample(
    adata, y="patient_id", value_vars=["Ecad", "CD40", "PD1"]
)
../../_images/e0724e3d5a5311e9d710b173dd4d5eb456573355d237cfc9d233783c27d7bd67.png ../../_images/96a6884f313f96164e939797ed711e9952ab205d90b25fd24ee673ec00daa75c.png ../../_images/83150186931301a08004ba45cd712034308b21401d3fc1915bc3e1041567063c.png
sc.pp.neighbors(table, n_neighbors=10, random_state=42)
sc.tl.umap(table, random_state=42)
markers = ["patient_id", "ROI", "indication"]
fig, axes = plt.subplots(len(markers), 1, figsize=(10, 30))
for c, axs in zip(markers, axes, strict=False):
    sc.pl.umap(table, color=c, ax=axs, show=False)
../../_images/42d683c1bf60c3f8b9565b4dc59e4e403be9b7b146c9742f22539b95c49bd59d.png
markers = ["Ecad", "CD45RO"]
fig, axes = plt.subplots(1, len(markers), figsize=(5 * len(markers), 10))
for c, axs in zip(markers, axes, strict=False):
    sc.pl.umap(table, color=c, ax=axs, show=False)
../../_images/987478f6a4149de170b4cc5aef71e2fd7e1ad129aca25d78105e3069a0dc12a2.png