opendvp.plotting.imputation_qc

imputation_qc(adata, unimputed_layer='unimputed', return_fig=False, ax=None, highlight_genes=None, show_highlighted_genes_names=True)

Generate a quality control plot to visualize the effect of imputation.

This function creates a scatter plot comparing the mean expression of genes before and after imputation. The x-axis represents the mean of the raw (unimputed) data, and the y-axis shows the difference between the raw mean and the imputed mean. This helps to identify genes that were most affected by the imputation process. The plot also includes a 2D histogram and kernel density estimate to visualize the distribution of data points.
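The quantities on the two axes can be sketched with plain NumPy. This is only an illustration of the description above, not the library's implementation: the NaN-aware mean for the raw data and the sign convention of the difference (imputed minus raw) are assumptions, and the names `raw_mean`, `mean_shift`, and `most_affected` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars = 100, 50

# Simulated imputed matrix, plus a raw copy with ~10% missing values
X_imputed = rng.random((n_obs, n_vars))
X_raw = X_imputed.copy()
X_raw[rng.random(X_raw.shape) < 0.1] = np.nan

# x-axis: per-gene mean of the raw data, ignoring missing values (assumed)
raw_mean = np.nanmean(X_raw, axis=0)

# y-axis: per-gene difference between the two means
# (sign convention assumed here: imputed minus raw)
imputed_mean = X_imputed.mean(axis=0)
mean_shift = imputed_mean - raw_mean

# Genes whose mean moved the most, in either direction
most_affected = np.argsort(-np.abs(mean_shift))[:5]
```

Genes with a large absolute `mean_shift` are the ones the description refers to as most affected by imputation.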

Return type:

Figure | None

Parameters:

adata

An AnnData object containing the imputed data in adata.X and the unimputed data in a specified layer.

unimputed_layer

The name of the layer in adata.layers that contains the unimputed data. Defaults to "unimputed".

return_fig

If True, returns the matplotlib Figure object. Defaults to False.

ax

A matplotlib Axes object to plot on. If None, a new figure and axes are created. Defaults to None.

highlight_genes

A list of gene names to highlight in the plot. These genes will be plotted in a different color. Defaults to None.

show_highlighted_genes_names

If True, displays the names of the highlighted genes on the plot. Defaults to True.
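As a rough sketch of how a list such as highlight_genes could be mapped onto the plotted points, the snippet below builds a boolean mask over the gene names. It is not taken from the library: the colors, the variable names, and the choice to silently ignore names that are not present are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical gene names, mirroring adata.var_names
var_names = np.array([f"Gene_{i}" for i in range(50)])
highlight_genes = ["Gene_5", "Gene_10"]

# True for every gene that should be drawn in the highlight color
is_highlighted = np.isin(var_names, highlight_genes)

# Per-point colors (illustrative choices, not the library's palette)
colors = np.where(is_highlighted, "red", "lightgray")

# Names that would be annotated when show_highlighted_genes_names is True
labels = var_names[is_highlighted].tolist()
```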

Returns:

A matplotlib Figure object if return_fig is True, otherwise None.

Example:

>>> import anndata
>>> import numpy as np
>>> import pandas as pd
>>> from opendvp.plotting import imputation_qc
>>> # Create a dummy AnnData object
>>> n_obs, n_vars = 100, 50
>>> X_imputed = np.random.rand(n_obs, n_vars)
>>> X_raw = X_imputed.copy()
>>> # Introduce some NaNs to simulate unimputed data
>>> X_raw[np.random.choice([True, False], size=X_raw.shape, p=[0.1, 0.9])] = np.nan
>>> adata = anndata.AnnData(X_imputed)
>>> adata.layers["unimputed"] = X_raw
>>> adata.var_names = [f"Gene_{i}" for i in range(n_vars)]
>>> adata.obs_names = [f"Cell_{i}" for i in range(n_obs)]
>>> # Generate the QC plot
>>> imputation_qc(adata, highlight_genes=["Gene_5", "Gene_10"])