opendvp.tl.impute_gaussian
- opendvp.tl.impute_gaussian(adata, mean_shift=-1.8, std_dev_shift=0.3, perSample=False, layer_key='unimputed', uns_key='impute_gaussian_qc_metrics')
Impute missing values in an AnnData object using a Gaussian distribution.
This function imputes missing values in the data matrix using a Gaussian distribution, with the mean shifted and the standard deviation scaled. Imputation can be performed per protein (column) or per sample (row).
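The idea described above can be sketched with plain NumPy. This is an illustration only, not opendvp's actual code: the helper name `impute_gaussian_sketch`, the `per_sample` and `seed` arguments, the use of NaN to mark missing values, the sample standard deviation (`ddof=1`), and the rule of skipping vectors with fewer than two observed values are all assumptions made for the example.

```python
import numpy as np


def impute_gaussian_sketch(X, mean_shift=-1.8, std_dev_shift=0.3,
                           per_sample=False, seed=0):
    """Illustrative down-shifted Gaussian imputation (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    out = np.array(X, dtype=float)  # copy, so the caller's matrix is untouched

    # Iterate over rows when imputing per sample, over columns otherwise.
    vectors = out if per_sample else out.T
    for vec in vectors:  # each vec is a writable view into `out`
        missing = np.isnan(vec)
        observed = vec[~missing]
        if not missing.any() or observed.size < 2:
            continue  # nothing to impute, or too few values to estimate a spread

        mu = observed.mean()
        sigma = observed.std(ddof=1)  # assumption: sample standard deviation

        # Shift the mean by `mean_shift` standard deviations and
        # scale the standard deviation by `std_dev_shift`.
        vec[missing] = rng.normal(
            loc=mu + mean_shift * sigma,
            scale=std_dev_shift * sigma,
            size=int(missing.sum()),
        )
    return out


# Hypothetical 4 samples x 2 proteins matrix with two missing values.
X = np.array([[20.0, 15.0],
              [22.0, np.nan],
              [np.nan, 17.0],
              [21.0, 16.0]])
imputed = impute_gaussian_sketch(X)
```

With the defaults, each missing entry is drawn from a narrow distribution centred below the observed values of its column, which is the usual way of modelling values that fell under the detection limit.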
The original, un-imputed data matrix is stored in adata.layers. A DataFrame with quality control metrics for the imputation is stored in adata.uns. The QC metrics include the number of imputed values, the mean and standard deviation used for imputation, and a NumPy array of the imputed values themselves for each feature.
- Return type: ad.AnnData
Parameters
- adata : ad.AnnData
AnnData object with missing values to impute.
- mean_shift : float, default -1.8
Number of standard deviations by which to shift the mean of the Gaussian distribution; a negative value shifts it toward lower values.
- std_dev_shift : float, default 0.3
Factor to scale the standard deviation of the Gaussian distribution.
- perSample : bool, default False
If True, impute per sample (row); if False, impute per protein (column).
- layer_key : str, default 'unimputed'
Key under which to store the original, un-imputed data matrix in adata.layers.
- uns_key : str, default 'impute_gaussian_qc_metrics'
Key under which to store the imputation QC metrics DataFrame in adata.uns.
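To make the effect of mean_shift and std_dev_shift concrete, here is a small worked example using only the standard library. The observed values are made up, and using the sample standard deviation is an assumption; the source does not say which estimator the function uses.

```python
import statistics

observed = [20.0, 22.0, 21.0]           # hypothetical observed values for one protein
mean_shift, std_dev_shift = -1.8, 0.3   # the documented defaults

mu = statistics.mean(observed)          # 21.0
sigma = statistics.stdev(observed)      # 1.0 (sample standard deviation, an assumption)

# Mean is shifted by `mean_shift` standard deviations; spread is scaled by `std_dev_shift`.
imputation_mean = mu + mean_shift * sigma   # about 19.2
imputation_std = std_dev_shift * sigma      # about 0.3
```

Under these assumptions, missing values for this protein would be drawn from a Gaussian centred near 19.2 with a standard deviation near 0.3, so below and much tighter than the observed distribution.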
Returns:
- ad.AnnData
AnnData object with imputed values in .X, the original matrix in .layers[layer_key], and QC metrics in .uns[uns_key].
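Because the original matrix is kept next to the imputed one, the result can be sanity-checked after the call. The helper below is a hypothetical sketch, not part of opendvp: `summarize_imputation` is an invented name, and it assumes both matrices are dense arrays in which missing values are NaN. The two small arrays stand in for `.layers[layer_key]` and `.X`.

```python
import numpy as np


def summarize_imputation(original, imputed):
    """Compare an un-imputed matrix with its imputed counterpart (hypothetical helper)."""
    original = np.asarray(original, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    was_missing = np.isnan(original)
    return {
        # How many entries were filled in.
        "n_imputed": int(was_missing.sum()),
        # Observed entries should be carried over unchanged.
        "observed_unchanged": bool(
            np.array_equal(original[~was_missing], imputed[~was_missing])
        ),
        # No missing values should remain after imputation.
        "no_missing_left": not bool(np.isnan(imputed).any()),
    }


# Stand-ins for adata.layers["unimputed"] and adata.X after imputation.
original = np.array([[20.0, np.nan],
                     [22.0, 17.0]])
imputed = np.array([[20.0, 14.3],
                    [22.0, 17.0]])
report = summarize_imputation(original, imputed)
```

On a real result, the same check could be applied to `adata.layers[layer_key]` and `adata.X`, provided both are dense; a sparse matrix would need converting first.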