stSME Comparison¶

This tutorial shows the stSME normalization effect between of two scenarios:

Normal (without stSME)
stSME applied on raw gene counts

In this tutorial we use Mouse Brain (Coronal) Visium dataset from 10x genomics website.

[1]:

import os
import platform

# Only constrain threads on macOS where BLAS/numba deadlocks are common.
# Must run before any numpy/scanpy import.
if platform.system() == "Darwin":
    os.environ["OPENBLAS_NUM_THREADS"] = "1"
    os.environ["MKL_NUM_THREADS"] = "1"
    os.environ["NUMBA_NUM_THREADS"] = "1"
    n_cpus = 1
else:
    n_cpus = None

[2]:

import scanpy as sc
import stlearn as st
import pathlib as pathlib
import numpy as np
import random as random
import os as os

st.settings.set_figure_params(dpi=120)

# Make sure all the seeds are set
seed = 0
np.random.seed(seed)
random.seed(seed)
os.environ['PYTHONHASHSEED'] = str(seed)

# Ignore all warnings
import warnings

warnings.filterwarnings("ignore")

[3]:

st.settings.datasetdir = pathlib.Path.cwd().parent / "data"

[4]:

data = sc.datasets.visium_sge(sample_id="V1_Adult_Mouse_Brain")
data = st.convert_scanpy(data)

[5]:

# pre-processing for gene count table
st.pp.filter_genes(data, min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)

Normalization step is finished in adata.X
Log transformation step is finished in adata.X

[6]:

# pre-processing for spot image
st.pp.tiling(data, out_path="tiling")

# this step uses deep learning model to extract high-level features from tile images
# may need few minutes to be completed
st.pp.extract_feature(data)

Tiling image: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ [ time left: 00:00 ]
Extract feature: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉ [ time left: 00:00 ]

The morphology feature is added to adata.obsm['X_morphology']!

[7]:

# run PCA for gene expression data
st.em.run_pca(data, n_comps=50)

PCA is done! Generated in adata.obsm['X_pca'], adata.uns['pca'] and adata.varm['PCs']

Normal¶

[8]:

data_normal = data.copy()

stSME¶

[9]:

data_sme = data.copy()
# apply stSME to normalise log transformed data
st.spatial.sme.sme_normalize(data_sme, use_data="raw")
data_sme.X = data_sme.obsm['raw_SME_normalized']
st.em.run_pca(data_sme, n_comps=50)

Adjusting data: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ [ time left: 00:00 ]

The data adjusted by SME is added to adata.obsm['raw_SME_normalized']
PCA is done! Generated in adata.obsm['X_pca'], adata.uns['pca'] and adata.varm['PCs']

[10]:

import matplotlib.pyplot as plt

def plot_comparison(i):
    fig, axes = plt.subplots(1, 2, figsize=(12, 5))

    # Plot into the axes but suppress individual titles
    st.pl.gene_plot(data_normal, gene_symbols=i, size=3, ax=axes[0])
    st.pl.gene_plot(data_sme, gene_symbols=i, size=3, ax=axes[1])

    # Remove individual titles set by the function
    axes[0].set_title("")
    axes[1].set_title("")

    # Set subplot labels and shared title
    axes[0].set_title("Normal", fontsize=12)
    axes[1].set_title("stSME", fontsize=12)
    fig.suptitle(i, fontsize=16, fontweight="bold")

    plt.tight_layout()
    plt.show()

Marker gene for CA3¶

[11]:

plot_comparison("Lhfpl1")

../_images/tutorials_stsme_comparison_15_0.png

Marker gene for DG¶

[12]:

plot_comparison("Pla2g2f")

../_images/tutorials_stsme_comparison_17_0.png

Tutorial by Xiao Tan