1. Leiden clustering

sc.tl.leiden(adata, resolution=0.05, key_added='leiden_r0.05', random_state=10)

2. Compute marker genes for each cluster

sc.tl.rank_genes_groups(adata, groupby='leiden_r0.05', key_added='rank_genes_r0.05')  

# By default, rank_genes_groups uses adata.raw if it has been set

3. Provide known marker genes for each cell type

marker_genes = dict()

marker_genes['T'] = ['CD3G','CD3D','CD3E','CD2']
marker_genes['CD8+T'] = ['CD8A','GZMA']
marker_genes['CD4+T'] = ['CD4','FOXP3']

4. Count the overlap between the top 300 marker genes of each cluster in the dataset and the marker genes provided above

cell_annotation = sc.tl.marker_gene_overlap(adata, marker_genes, key='rank_genes_r0.05', top_n_markers=300)

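5. Normalize the overlap counts into fractions (normalize='reference' divides each count by the number of genes in the corresponding reference marker list)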

cell_annotation_norm = sc.tl.marker_gene_overlap(adata, marker_genes, key='rank_genes_r0.05', normalize='reference', top_n_markers=300)
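To make the counting and the normalization in steps 4 and 5 concrete, here is a minimal pure-Python sketch of the same idea. It is not scanpy's implementation: the helper `overlap_table` and the toy gene lists in `top_genes_per_cluster` are hypothetical and only illustrate how an overlap count becomes a fraction of the reference marker list.

```python
# Toy stand-in for the top-ranked genes of each cluster (made-up values,
# not real output of sc.tl.rank_genes_groups).
top_genes_per_cluster = {
    "0": ["CD3D", "CD3E", "CD8A", "GZMA", "NKG7"],
    "1": ["CD3G", "CD4", "FOXP3", "IL2RA", "CD2"],
}

# Reference markers, same as in step 3.
marker_genes = {
    "T": ["CD3G", "CD3D", "CD3E", "CD2"],
    "CD8+T": ["CD8A", "GZMA"],
    "CD4+T": ["CD4", "FOXP3"],
}


def overlap_table(reference, data, normalize=False):
    """Hypothetical helper: rows are cell types, columns are clusters."""
    table = {}
    for cell_type, ref_genes in reference.items():
        ref_set = set(ref_genes)
        row = {}
        for cluster, genes in data.items():
            # Number of reference markers found among the cluster's top genes.
            count = len(ref_set & set(genes))
            # With normalize=True, divide by the size of the reference list.
            row[cluster] = count / len(ref_set) if normalize else count
        table[cell_type] = row
    return table


counts = overlap_table(marker_genes, top_genes_per_cluster)
fractions = overlap_table(marker_genes, top_genes_per_cluster, normalize=True)
print(counts)
print(fractions)
```

In this toy example, cluster "0" contains two of the four T markers (fraction 0.5) and both CD8+T markers (fraction 1.0), which shows why the normalized table is easier to compare across reference lists of different lengths.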
 

6. Rename the clusters with the matching cell types
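The original gives no code for this step. One possible approach, sketched here in pure Python, is to pick the cell type with the highest normalized overlap for each cluster and then map every cell's cluster ID to that label. The scores in `overlap_scores`, the helper `best_labels`, and the "Unknown" fallback are all hypothetical and stand in for the table from step 5.

```python
# Hypothetical normalized overlap scores: {cell_type: {cluster: fraction}}.
overlap_scores = {
    "T": {"0": 0.5, "1": 0.5, "2": 0.0},
    "CD8+T": {"0": 1.0, "1": 0.0, "2": 0.0},
    "CD4+T": {"0": 0.0, "1": 1.0, "2": 0.0},
}


def best_labels(scores, unknown="Unknown"):
    """Hypothetical helper: choose the highest-scoring cell type per cluster."""
    clusters = next(iter(scores.values())).keys()
    labels = {}
    for cluster in clusters:
        best_type = max(scores, key=lambda cell_type: scores[cell_type][cluster])
        # A cluster with no overlap at all is left unlabeled.
        labels[cluster] = best_type if scores[best_type][cluster] > 0 else unknown
    return labels


cluster_to_label = best_labels(overlap_scores)

# Map per-cell cluster IDs to labels (toy stand-in for adata.obs['leiden_r0.05']).
cell_clusters = ["0", "1", "1", "2", "0"]
cell_labels = [cluster_to_label[c] for c in cell_clusters]
print(cluster_to_label)
print(cell_labels)
```

With the real objects, and assuming the table from step 5 has cell types as rows and clusters as columns, the same choice could likely be made with `cell_annotation_norm.idxmax()` followed by `adata.obs['leiden_r0.05'].map(...)`. Taking the highest overlap is only a heuristic, so the resulting labels are worth checking against the expression of the marker genes.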
