stlearn.pp.normalize_total

stlearn.pp.normalize_total(adata: AnnData, target_sum: Optional[float] = None, exclude_highly_expressed: bool = False, max_fraction: float = 0.05, key_added: Optional[str] = None, layers: Optional[Union[Literal['all'], Iterable[str]]] = None, layer_norm: Optional[str] = None, inplace: bool = True) → Optional[Dict[str, ndarray]]

Wrapper around scanpy.pp.normalize_total.

Normalize counts per cell. If target_sum=1e6, this is counts-per-million (CPM) normalization.

If exclude_highly_expressed=True, very highly expressed genes are excluded from the computation of the normalization factor (size factor) for each cell. This is meaningful because such genes can strongly influence the resulting normalized values for all other genes [Weinreb17].

Similar functions are used, for example, by Seurat [Satija15], Cell Ranger [Zheng17] or SPRING [Weinreb17].

Parameters:
  • adata – The annotated data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.

  • target_sum – If None, after normalization, each observation (cell) has a total count equal to the median of total counts for observations (cells) before normalization.

  • exclude_highly_expressed – Exclude (very) highly expressed genes from the computation of the normalization factor (size factor) for each cell. A gene is considered highly expressed if it has more than max_fraction of the total counts in at least one cell. The genes that are not excluded will sum to target_sum.

  • max_fraction – If exclude_highly_expressed=True, consider a gene highly expressed if it has more counts than max_fraction of the original total counts in at least one cell.

  • key_added – Name of the field in adata.obs where the normalization factor is stored.

  • layers – List of layers to normalize. Set to ‘all’ to normalize all layers.

  • layer_norm – Specifies how to normalize layers:

    • If None, after normalization, for each layer in layers each cell has a total count equal to the median of the counts_per_cell before normalization of the layer.

    • If ‘after’, for each layer in layers each cell has a total count equal to target_sum.

    • If ‘X’, for each layer in layers each cell has a total count equal to the median of total counts for observations (cells) of adata.X before normalization.

  • inplace – Whether to update adata in place or return a dictionary with normalized copies of adata.X and adata.layers.
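
The interaction of target_sum, exclude_highly_expressed and max_fraction can be illustrated with a small pure-Python sketch. It is not the library's implementation: the function name normalize_total_sketch is hypothetical, the input is assumed to be a plain list of per-cell count rows rather than an AnnData object, and cells whose retained total is zero are simply left unchanged.

```python
from statistics import median


def normalize_total_sketch(counts, target_sum=None,
                           exclude_highly_expressed=False, max_fraction=0.05):
    """Illustrative per-cell normalization.

    counts is a list of rows (cells); each row is a list of per-gene counts.
    """
    n_genes = len(counts[0])
    totals = [sum(row) for row in counts]

    # Genes that contribute to the size factor; by default all of them.
    keep = [True] * n_genes
    if exclude_highly_expressed:
        for row, total in zip(counts, totals):
            for gene, value in enumerate(row):
                # Drop a gene if it exceeds max_fraction of the total in any one cell.
                if value > max_fraction * total:
                    keep[gene] = False

    # Per-cell totals over the retained genes only (the size factors).
    size = [sum(v for v, k in zip(row, keep) if k) for row in counts]

    # With target_sum=None, fall back to the median of the per-cell totals.
    if target_sum is None:
        target_sum = median(size)

    # Scale every gene, including excluded ones, by the per-cell factor.
    # Assumption: cells with a zero retained total are returned unchanged.
    return [[v * target_sum / s for v in row] if s else list(row)
            for row, s in zip(counts, size)]


counts = [[1, 1, 2], [2, 2, 4], [3, 3, 6]]
print(normalize_total_sketch(counts))  # every row becomes [2.0, 2.0, 4.0]
```

Passing target_sum=1e6 to the same sketch gives CPM values. With exclude_highly_expressed=True, only the retained genes of each cell sum to target_sum, so the row totals can exceed it.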

Returns:

  Returns a dictionary with normalized copies of adata.X and adata.layers, or updates adata with the normalized version of the original adata.X and adata.layers, depending on inplace.
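
A call might look like the following sketch, which uses only the parameters listed in the signature above. The helper name normalize_to_cpm and the obs column name "norm_factor" are hypothetical, and adata is assumed to be an AnnData object holding raw counts in adata.X.

```python
def normalize_to_cpm(adata):
    """Normalize adata.X in place so that each cell sums to one million (CPM)."""
    # Imported lazily so this sketch can be loaded without stlearn installed.
    import stlearn as st

    # target_sum=1e6 requests CPM normalization; key_added names the adata.obs
    # field that receives the per-cell normalization factor ("norm_factor" is
    # a hypothetical name chosen for this example).
    st.pp.normalize_total(adata, target_sum=1e6, key_added="norm_factor",
                          inplace=True)
    return adata
```

With inplace=True the function updates adata rather than returning a dictionary, so the helper returns the same object only for convenience.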