opendvp.tl.phenotype_cells

opendvp.tl.phenotype_cells(adata, phenotype, gate=0.5, label='phenotype', imageid='imageid', pheno_threshold_percent=None, pheno_threshold_abs=None, verbose=True)
Parameters:
  • adata (anndata.AnnData) – The input AnnData object containing single-cell data for phenotyping.

  • phenotype (pd.DataFrame) – A DataFrame specifying the gating strategy for cell phenotyping. It should outline the workflow for phenotype classification based on marker expression levels. An example workflow is available in the ajitjohnson/scimap repository on GitHub.

  • gate (float, optional) – The threshold value for determining positive cell classification based on scaled data. By convention, values above this threshold are considered to indicate positive cells.

  • label (str) – The name of the column in adata.obs where the final phenotype classifications will be stored. This label will be used to access the phenotyping results within the AnnData object.

  • imageid (str, optional) – The name of the column in adata.obs that contains unique image identifiers. This is crucial for analyses that require differentiation of data based on the source image, especially when using phenotype threshold parameters (pheno_threshold_percent or pheno_threshold_abs).

  • pheno_threshold_percent (float, optional) – A threshold value (between 0 and 100) specifying the minimum percentage of cells that must exhibit a particular phenotype for it to be considered valid. Phenotypes not meeting this threshold are reclassified as ‘unknown’. This parameter is useful for minimizing the impact of low-frequency false positives.

  • pheno_threshold_abs (int, optional) – Similar to pheno_threshold_percent, but uses an absolute cell count instead of a percentage. Phenotypes with cell counts below this threshold are reclassified as ‘unknown’. This can help in addressing rare phenotype classifications that may not be meaningful.

  • verbose (bool) – If set to True, the function will print detailed messages about its progress and the steps being executed.
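
The interaction between imageid, pheno_threshold_percent and pheno_threshold_abs described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name apply_pheno_threshold and the sample labels are hypothetical, and it assumes that thresholds are evaluated separately within each image and that a phenotype failing either supplied threshold is relabelled.

```python
from collections import Counter, defaultdict


def apply_pheno_threshold(labels, image_ids, percent=None, absolute=None):
    """Relabel rare phenotypes as 'unknown', image by image (illustrative only)."""
    # Count each phenotype separately within every image.
    counts = defaultdict(Counter)
    for image, pheno in zip(image_ids, labels):
        counts[image][pheno] += 1

    # Collect the (image, phenotype) pairs that fall below a threshold.
    # Assumption: failing either supplied threshold is enough to be relabelled.
    failed = set()
    for image, per_pheno in counts.items():
        total = sum(per_pheno.values())
        for pheno, n in per_pheno.items():
            below_percent = percent is not None and 100 * n / total < percent
            below_abs = absolute is not None and n < absolute
            if below_percent or below_abs:
                failed.add((image, pheno))

    return [
        "unknown" if (image, pheno) in failed else pheno
        for image, pheno in zip(image_ids, labels)
    ]


# Hypothetical data: B cells are rare in img1 (10%) but common in img2 (50%).
labels = ["T cell"] * 9 + ["B cell"] + ["T cell"] * 5 + ["B cell"] * 5
image_ids = ["img1"] * 10 + ["img2"] * 10
relabelled = apply_pheno_threshold(labels, image_ids, percent=20)
```

With a 20 percent threshold, the single B cell in img1 becomes 'unknown' while the B cells in img2 keep their label, which is why the image identifier matters whenever a threshold is set.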

Returns:

The input AnnData object, updated to include the phenotype classifications for each cell. The phenotyping results can be found in adata.obs[label], where label is the name specified by the user for the phenotype column.

Return type:

adata (anndata.AnnData)

Example

```python
import pandas as pd
import opendvp

# Load the phenotype workflow CSV file
phenotype = pd.read_csv('path/to/csv/file/')

# Apply phenotyping to cells based on the specified workflow
adata = opendvp.tl.phenotype_cells(adata, phenotype=phenotype, gate=0.5, label="phenotype")
```