Working with Old Spatial Transcriptomics data

Preparing the files

You need to prepare three files: count_matrix_file, spatial_file and image_file.

The files should follow the formats shown below:

[1]:

import pandas as pd

Format for the count_matrix_file

[2]:

pd.read_table("../../../UQ/Developmental_heart/ST_Samples_4.5-5PCW/ST_Sample_4.5-5PCW_1/ST_Sample_4.5-5.tsv",index_col=0).head()

[2]:

	ENSG00000000003.14	ENSG00000000005.5	ENSG00000000419.12	ENSG00000000457.13	ENSG00000000460.16	ENSG00000000938.12	ENSG00000000971.15	ENSG00000001036.13	ENSG00000001084.10	ENSG00000001167.14	...	ENSG00000283667.1	ENSG00000283669.1	ENSG00000283674.1	ENSG00000283675.1	ENSG00000283683.1	ENSG00000283684.1	ENSG00000283689.1	ENSG00000283692.1	ENSG00000283696.1	ENSG00000283698.1
17x20	0	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
17x21	0	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
17x22	0	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
18x18	0	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
18x19	0	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0

5 rows × 39739 columns
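As a small illustration of this layout, the sketch below builds a toy count matrix in memory and splits its row labels into integer coordinates. It assumes the row labels follow the XxY pattern seen in the output above; the gene names, counts, and the names toy_tsv and coords are made up for the example.

```python
import io
import pandas as pd

# Toy count matrix: rows are spots labelled "XxY", columns are genes.
# Gene names and counts are illustrative only.
toy_tsv = (
    "\tGENE_A\tGENE_B\n"
    "17x20\t0\t3\n"
    "17x21\t1\t0\n"
    "18x18\t0\t2\n"
)
counts = pd.read_table(io.StringIO(toy_tsv), index_col=0)

# Split each "XxY" label into integer x and y coordinates.
coords = counts.index.to_series().str.split("x", expand=True).astype(int)
coords.columns = ["x", "y"]
print(coords)
```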

Format for the spatial_file (the selected column is optional). Its x and y columns should match the spot coordinates used in the count_matrix_file (the row labels, such as 17x20 in the example above). The pixel_x and pixel_y columns should match the pixel coordinates of the image_file.

[3]:

pd.read_table("../../../UQ/Developmental_heart/ST_Samples_4.5-5PCW/ST_Sample_4.5-5PCW_1/spot_data-all-ST_Sample_4.5-5PCW_1.tsv")

[3]:

	x	y	new_x	new_y	pixel_x	pixel_y	selected
0	1	1	0.96	1.03	1787.5	522.3	0
1	1	2	0.99	2.03	1795.7	741.1	0
2	1	3	0.94	3.09	1783.2	972.1	0
3	1	4	0.97	4.04	1790.8	1181.2	0
4	1	5	0.99	5.02	1794.1	1395.5	0
...	...	...	...	...	...	...	...
1132	33	31	33.02	30.93	8732.3	7059.4	0
1133	33	32	33.01	31.89	8730.2	7269.5	0
1134	33	33	33.03	32.91	8734.7	7492.1	0
1135	33	34	33.02	33.95	8731.8	7721.1	0
1136	33	35	33.01	34.93	8731.0	7934.6	0

1137 rows × 7 columns
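Before loading the data, it can help to confirm that every spot in the count matrix has an entry in the spatial table. The following is a minimal sketch on toy tables, assuming the count matrix labels its rows as XxY; the values and the names spatial_labels and missing are introduced here for illustration only.

```python
import pandas as pd

# Toy stand-ins for the two tables (values are illustrative only).
counts = pd.DataFrame(
    {"GENE_A": [0, 1, 0], "GENE_B": [3, 0, 2]},
    index=["1x1", "1x2", "2x5"],
)
spatial = pd.DataFrame(
    {
        "x": [1, 1, 1],
        "y": [1, 2, 3],
        "pixel_x": [1787.5, 1795.7, 1783.2],
        "pixel_y": [522.3, 741.1, 972.1],
    }
)

# Rebuild "XxY" labels from the spatial table and compare with the count matrix.
spatial_labels = pd.Index(spatial["x"].astype(str) + "x" + spatial["y"].astype(str))
missing = counts.index.difference(spatial_labels)
print("Spots without a spatial entry:", list(missing))
```

An empty list would mean that every spot in the count matrix can be placed on the image.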

Read the old ST-seq dataset

[1]:

import stlearn as st

[2]:

data = st.ReadOldST(count_matrix_file="[PATH to count_matrix_file file]",
                    spatial_file="[PATH to spatial_file file]",
                    image_file="[PATH to image_file file]")

Added tissue image to the object!

[3]:

data

[3]:

AnnData object with n_obs × n_vars = 676 × 26907
    obs: 'imagecol', 'imagerow'
    uns: 'spatial'
    obsm: 'spatial'

[4]:

st.pl.QC_plot(data,dpi=150)

[Figure: output of st.pl.QC_plot]

Run preprocessing for count data

[5]:

st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)

Normalization step is finished in adata.X
Log transformation step is finished in adata.X
Scale step is finished in adata.X
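To make the effect of these steps concrete, the sketch below reproduces them on a toy matrix in plain NumPy. It is not stlearn's implementation: it assumes that normalization scales each spot to the median total count, that the log step is log(1 + x), and that scaling gives each gene zero mean and unit sample standard deviation. The matrix values are made up.

```python
import numpy as np

# Toy counts: 4 spots (rows) x 3 genes (columns); values are made up.
X = np.array([[1, 0, 3],
              [2, 2, 0],
              [0, 5, 5],
              [4, 1, 1]], dtype=float)

# 1. Total-count normalization: scale every spot to the median total count
#    (assumed convention).
totals = X.sum(axis=1, keepdims=True)
X_norm = X / totals * np.median(totals)

# 2. Log transformation: log(1 + x) keeps zeros at zero.
X_log = np.log1p(X_norm)

# 3. Scaling: centre each gene and divide by its sample standard deviation
#    (ddof=1 is an assumption about the convention).
X_scaled = (X_log - X_log.mean(axis=0)) / X_log.std(axis=0, ddof=1)

print(X_scaled.round(2))
```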

Run PCA

[6]:

st.em.run_pca(data,n_comps=50,random_state=0)

PCA is done! Generated in adata.obsm['X_pca'], adata.uns['pca'] and adata.varm['PCs']
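For readers who want to see what a PCA embedding contains, here is a minimal NumPy sketch based on the singular value decomposition. It is an illustration on random toy data, not the routine that st.em.run_pca calls; the names components and explained are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))   # toy matrix: 20 spots x 6 genes
n_comps = 3

# Centre the genes, then take the leading right singular vectors as components.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = Vt[:n_comps]              # shape (n_comps, n_genes)
X_pca = X_centered @ components.T      # shape (n_spots, n_comps)

# Fraction of the total variance captured by each kept component.
explained = (S[:n_comps] ** 2) / (S ** 2).sum()
print(X_pca.shape, explained.round(3))
```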

Create k-nearest-neighbor graph

[7]:

st.pp.neighbors(data,n_neighbors=30,use_rep='X_pca',random_state=0)

Created k-Nearest-Neighbor graph in adata.uns['neighbors']
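The idea behind a k-nearest-neighbor graph can be sketched with a brute-force search in NumPy: for each spot, find the spots closest to it in the PCA space. This is only an illustration on random toy data; the library may use a different search strategy, distance weighting, and neighbor-counting convention.

```python
import numpy as np

rng = np.random.default_rng(0)
X_pca = rng.normal(size=(10, 4))   # toy embedding: 10 spots x 4 components
n_neighbors = 3

# Pairwise Euclidean distances between all spots.
diff = X_pca[:, None, :] - X_pca[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Exclude each spot from its own neighbour list, then keep the closest ones.
np.fill_diagonal(dist, np.inf)
knn_idx = np.argsort(dist, axis=1)[:, :n_neighbors]
print(knn_idx)
```

Row i of knn_idx lists the indices of the spots nearest to spot i, which are the edges of the graph for that spot.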

Run and plot Louvain clustering

[8]:

st.tl.clustering.louvain(data,random_state=0)

Applying Louvain clustering ...
Louvain clustering is done! The labels are stored in adata.obs['louvain']

[9]:

st.pl.cluster_plot(data,use_label="louvain",tissue_alpha=1,spot_size=15,show_legend=True)

[Figure: output of st.pl.cluster_plot with Louvain labels]

Run and plot k-means clustering

[10]:

st.tl.clustering.kmeans(data,n_clusters=5)

Applying Kmeans clustering ...
Kmeans clustering is done! The labels are stored in adata.obs["kmeans"]
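As a rough picture of what k-means does, the sketch below runs a few rounds of the standard assign-and-update loop on toy data in NumPy. It is not the code behind st.tl.clustering.kmeans; the data, the fixed starting centres, and the iteration count are assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy embedding: two well separated groups of 15 spots each.
X = np.vstack([rng.normal(0, 0.5, size=(15, 2)),
               rng.normal(5, 0.5, size=(15, 2))])
n_clusters = 2

# Start from one point of each group so neither centre begins empty.
centers = X[[0, 15]].copy()
for _ in range(10):
    # Assignment step: label each spot with its nearest centre.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
    labels = d.argmin(axis=1)
    # Update step: move each centre to the mean of its assigned spots.
    centers = np.array([X[labels == k].mean(axis=0) for k in range(n_clusters)])
print(labels)
```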

[11]:

st.pl.cluster_plot(data,use_label="kmeans",tissue_alpha=1,spot_size=15,show_legend=True)

[Figure: output of st.pl.cluster_plot with k-means labels]