opendvp.tl.filter_features_byNaNs

opendvp.tl.filter_features_byNaNs(adata, threshold=0.7, grouping=None, valid_in_ANY_or_ALL_groups='ANY')

Filter out proteins that have a NaN proportion above the threshold, for each group in the grouping variable.

Return type:

AnnData

Parameters

adata : AnnData

AnnData object to filter.

threshold : float, default 0.7

Proportion of valid (non-NaN) values above which a protein is considered valid. Must be between 0 and 1.

grouping : Optional[str], default None

Name of the column in adata.obs that defines the groups. If provided, NaN counting and the validity check are done per group.

valid_in_ANY_or_ALL_groups : {‘ANY’, ‘ALL’}, default ‘ANY’

‘ANY’ keeps a protein if it passes the threshold in at least one group. ‘ALL’ keeps a protein only if it passes the validity threshold in every group (more stringent).
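The difference between ‘ANY’ and ‘ALL’ can be illustrated with a small pure-Python sketch. This is not the opendvp implementation: the helper names `valid_fraction` and `keep_feature` are hypothetical, and the sketch assumes that a group passes when its fraction of non-NaN values is at least `threshold`. The library's exact comparison may differ.

```python
import math


def valid_fraction(values):
    """Return the fraction of entries that are not NaN."""
    return sum(not math.isnan(v) for v in values) / len(values)


def keep_feature(values, groups, threshold=0.7, mode="ANY"):
    """Decide whether one feature passes the per-group validity check.

    Assumption (not taken from the library): a group passes when its
    fraction of non-NaN values is at least `threshold`.
    """
    # Collect this feature's values separately for each group label.
    per_group = {}
    for value, group in zip(values, groups):
        per_group.setdefault(group, []).append(value)

    passes = [valid_fraction(vals) >= threshold for vals in per_group.values()]
    return any(passes) if mode == "ANY" else all(passes)


nan = float("nan")
groups = ["ctrl"] * 4 + ["treated"] * 4
# Fully measured in "ctrl", but only 1 of 4 values present in "treated".
protein = [1.0, 2.0, 3.0, 4.0, nan, nan, nan, 5.0]

print(keep_feature(protein, groups, mode="ANY"))  # True: "ctrl" passes
print(keep_feature(protein, groups, mode="ALL"))  # False: "treated" fails
```

With ‘ANY’, a protein detected consistently in only one condition survives filtering, which may matter when presence or absence between groups is itself of interest. ‘ALL’ removes it.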

Returns:

AnnData

Filtered AnnData object. The quality control metrics (e.g., NaN counts, valid proportions) are added to adata.var. A complete QC matrix for all initial features is stored in adata.uns['filter_features_byNaNs_qc_metrics']. The adata.var of the returned object will contain its original columns, plus ‘mean’ and ‘nan_proportions’ (derived from ‘overall_mean’ and ‘overall_nan_proportions’).