Postprocessing

class palantir.presults.PResults(pseudotime, entropy, branch_probs, waypoints)

Bases: object

Container of palantir results

property branch_probs

property entropy

classmethod load(pkl_file)

property pseudotime

save(pkl_file: str)

property waypoints
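The save/load pair suggests a pickle round trip. The following is a minimal sketch of that pattern, not Palantir's implementation: ResultsSketch is a hypothetical stand-in class, and storing the four fields as a single pickled dict is an assumption inferred only from the pkl_file argument name.

```python
import os
import pickle
import tempfile


class ResultsSketch:
    """Hypothetical stand-in for a pickle-backed results container."""

    def __init__(self, pseudotime, entropy, branch_probs, waypoints):
        self._pseudotime = pseudotime
        self._entropy = entropy
        self._branch_probs = branch_probs
        self._waypoints = waypoints

    # Read-only accessors mirroring the documented properties.
    @property
    def pseudotime(self):
        return self._pseudotime

    @property
    def entropy(self):
        return self._entropy

    @property
    def branch_probs(self):
        return self._branch_probs

    @property
    def waypoints(self):
        return self._waypoints

    def save(self, pkl_file: str) -> None:
        # Assumption: the four fields are written as one pickled dict.
        fields = {
            "pseudotime": self._pseudotime,
            "entropy": self._entropy,
            "branch_probs": self._branch_probs,
            "waypoints": self._waypoints,
        }
        with open(pkl_file, "wb") as fh:
            pickle.dump(fields, fh)

    @classmethod
    def load(cls, pkl_file: str) -> "ResultsSketch":
        # Rebuild the container from the pickled dict of fields.
        with open(pkl_file, "rb") as fh:
            fields = pickle.load(fh)
        return cls(**fields)


# Round trip through a temporary file, using toy values in place of real results.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.pkl")
    original = ResultsSketch(
        pseudotime={"cell_1": 0.0, "cell_2": 1.0},
        entropy={"cell_1": 0.9, "cell_2": 0.1},
        branch_probs={"cell_1": {"A": 0.5, "B": 0.5}},
        waypoints=["cell_1"],
    )
    original.save(path)
    restored = ResultsSketch.load(path)
```

As with any pickle file, only load results from sources you trust.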

palantir.presults.cluster_gene_trends(data: AnnData | DataFrame, branch_name: str, genes: List[str] | None = None, gene_trend_key: str | None = 'gene_trends', n_neighbors: int = 150, **kwargs) → Series

Cluster gene trends using the Leiden algorithm.

This function applies the Leiden clustering algorithm to gene expression trends along the pseudotemporal trajectory. If the input is an AnnData object, it uses the gene trends stored in the varm attribute accessed using the gene_trend_key. If the input is a DataFrame, it directly uses the input data for clustering.

Parameters:

data (Union[AnnData, pd.DataFrame]) – AnnData object or a DataFrame of gene expression trends.
branch_name (str) – Name of the branch for which the gene trends are to be clustered.
genes (list of str, optional) – List of genes to be considered for clustering. If None, all genes are considered. Default is None.
gene_trend_key (str, optional) – Key to access gene trends in the AnnData object’s varm. Default is ‘gene_trends’.
n_neighbors (int, optional) – The number of nearest neighbors to use for the k-NN graph construction. Default is 150.
**kwargs – Additional keyword arguments passed to scanpy.tl.leiden.

Returns:

A pandas Series with the cluster labels for all passed genes.

Return type:

pd.Series

Raises:

KeyError – If gene_trend_key is None when data is an AnnData object.

palantir.presults.compute_gene_trends(ad: AnnData, lineages: List[str] | None = None, masks_key: str = 'branch_masks', expression_key: str = None, pseudo_time_key: str = 'palantir_pseudotime', gene_trend_key: str = 'gene_trends', save_as_df: bool = None, **kwargs) → DataFrame

Compute gene expression trends along pseudotime in the given AnnData object.

This function computes the gene expression trends for each branch of the pseudotime trajectory using mellon.FunctionEstimator. The computed gene trends are stored in the varm attribute of the AnnData object, with keys in the format ‘{gene_trend_key}_{branch}’. Each key maps to a 2D numpy array where rows correspond to genes and columns correspond to the pseudotime grid. The pseudotime grid for each branch is stored in the uns attribute of the AnnData object, with keys in the format ‘{gene_trend_key}_{branch}_pseudotime’.

Parameters:

ad (AnnData) – AnnData object containing the gene expression data and pseudotime.
lineages (List[str], optional) – Subset of lineages for which to compute the trends. If None, all columns of the fate probability matrix are used. Default is None.
masks_key (str, optional) – Key to access the branch cell selection masks from obsm of the AnnData object. Default is ‘branch_masks’.
expression_key (str, optional) – Key to access the gene expression data in the layers of the AnnData object. If None, uses raw expression data in .X. Default is None.
pseudo_time_key (str, optional) – Key to access the pseudotime values in the AnnData object. Default is ‘palantir_pseudotime’.
gene_trend_key (str, optional) – Key base to store the gene expression trends in varm of the AnnData object. Default is ‘gene_trends’.
save_as_df (bool, optional) – If True, the trends will be saved in the varm of the AnnData object as a pandas DataFrame with pseudotime as column names. If False, the trends will be saved as a numpy array and the pseudotime in uns[gene_trend_key + “_” + lineage_name + “_pseudotime”]. The option to save as a numpy array exists because some versions of AnnData cannot write h5ad files with DataFrames in ad.varm. Default is palantir.SAVE_AS_DF=True.
**kwargs – Additional arguments to be passed to mellon.FunctionEstimator.

Returns:

A dictionary containing gene expression trends for each branch. The keys of the dictionary are the branch names. The value for each branch is a sub-dictionary with a key ‘trends’ that maps to a DataFrame. The DataFrame contains the gene expression trends, indexed by gene names and columns representing pseudotime points.

Return type:

dict
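The storage key formats described above can be sketched as a small helper. This is an illustration only: trend_keys is a hypothetical function that is not part of Palantir, "Ery" is a made-up branch name, and the default gene_trend_key is taken from the function signature.

```python
from typing import Tuple


def trend_keys(branch: str, gene_trend_key: str = "gene_trends") -> Tuple[str, str]:
    """Return the (varm key, uns key) pair for one branch.

    Follows the documented formats '{gene_trend_key}_{branch}' and
    '{gene_trend_key}_{branch}_pseudotime'.
    """
    varm_key = f"{gene_trend_key}_{branch}"
    uns_key = f"{varm_key}_pseudotime"
    return varm_key, uns_key


# "Ery" is a hypothetical branch name used only for illustration.
varm_key, uns_key = trend_keys("Ery")
```

With an AnnData object ad, the trends for that branch would then be looked up as ad.varm[varm_key], and the pseudotime grid as ad.uns[uns_key].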

palantir.presults.compute_gene_trends_legacy(data: AnnData | PResults, gene_exprs: DataFrame | None = None, lineages: List[str] | None = None, n_splines: int = 4, spline_order: int = 2, n_jobs: int = -1, expression_key: str = 'MAGIC_imputed_data', pseudo_time_key: str = 'palantir_pseudotime', fate_prob_key: str = 'palantir_fate_probabilities', gene_trend_key: str = 'palantir_gene_trends', save_as_df: bool = None) → Dict[str, Dict[str, DataFrame]]

Computes gene expression trends along pseudotemporal trajectories.

This function calculates gene expression trends and their standard deviations along pseudotemporal trajectories computed by Palantir.

Parameters:

data (Union[AnnData, palantir.presults.PResults]) – Either a Scanpy AnnData object or a Palantir results object.
gene_exprs (pd.DataFrame, optional) – DataFrame of gene expressions with cells as rows and genes as columns. If not provided, gene expressions will be fetched from data using expression_key.
lineages (List[str], optional) – List of lineages for which to compute the trends. If None, all columns of the fate probability matrix are used. Default is None.
n_splines (int, optional) – Number of splines to use. Must be non-negative. Default is 4.
spline_order (int, optional) – Order of the splines to use. Must be non-negative. Default is 2.
n_jobs (int, optional) – Number of cores to use. If -1, all available cores will be used. Default is -1.
expression_key (str, optional) – Key to access gene expression matrix from a layer of the AnnData object. Default is ‘MAGIC_imputed_data’. Ignored if gene_exprs is provided.
pseudo_time_key (str, optional) – Key to access pseudotime from the obs attribute of the AnnData object. Default is ‘palantir_pseudotime’.
fate_prob_key (str, optional) – Key to access fate probabilities from the obsm attribute of the AnnData object. Default is ‘palantir_fate_probabilities’.
gene_trend_key (str, optional) – Key to store the computed gene trends in the varm attribute of the AnnData object. Default is ‘palantir_gene_trends’. The trends for each lineage are stored under ‘varm[gene_trend_key + “_” + lineage_name]’. The pseudotime points at which the trends are computed are stored in the uns attribute under ‘uns[gene_trend_key + “_” + lineage_name + “_pseudotime”]’.
save_as_df (bool, optional) – If True, the trends will be saved in the varm of the AnnData object as a pandas DataFrame with pseudotime as column names. If False, the trends will be saved as a numpy array and the pseudotime in uns[gene_trend_key + “_” + lineage_name + “_pseudotime”]. The option to save as a numpy array exists because some versions of AnnData cannot write h5ad files with DataFrames in ad.varm. Default is palantir.SAVE_AS_DF=True.

Returns:

Dictionary of gene expression trends and standard deviations for each branch.

Return type:

Dict[str, Dict[str, pd.DataFrame]]

palantir.presults.gam_fit_predict(x, y, weights=None, pred_x=None, n_splines=4, spline_order=2)

Compute the expression trend of an individual gene using pyGAM.

This function requires the optional pygam package. If not installed, it will raise an ImportError with instructions on how to install it.

Parameters:

x (array-like) – Pseudotime axis
y (array-like) – MAGIC-imputed expression for one gene
weights (array-like, optional) – Lineage branch weights
pred_x (array-like, optional) – Pseudotime axis for predicted values
n_splines (int, optional) – Number of splines to use. Must be non-negative.
spline_order (int, optional) – Order of spline to use. Must be non-negative.

Returns:

Predicted values and their standard deviations.

Return type:

tuple

Raises:

ImportError – If pygam is not installed. Install with pip install pygam or pip install palantir[gam].
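The optional-dependency behaviour described above can be sketched with a lazy import guard. This is a generic pattern under stated assumptions, not Palantir's own code: require_optional and load_pygam are hypothetical helper names, and the error message wording is illustrative.

```python
import importlib


def require_optional(module_name: str, install_hint: str):
    """Import an optional dependency, or raise ImportError with install instructions."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a message that tells the user how to fix the problem.
        raise ImportError(
            f"{module_name} is required for this function. "
            f"Install with {install_hint}."
        ) from exc


def load_pygam():
    """Return the pygam module, deferring the import until it is needed."""
    return require_optional(
        "pygam", "pip install pygam or pip install palantir[gam]"
    )
```

Deferring the import to call time keeps the rest of the package usable when the optional package is absent.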

palantir.presults.select_branch_cells(ad: AnnData, pseudo_time_key: str = 'palantir_pseudotime', fate_prob_key: str = 'palantir_fate_probabilities', q: float = 0.01, eps: float = 0.01, masks_key: str = 'branch_masks', save_as_df: bool = None)

Selects cells along specific branches of pseudotime ordering.

This function identifies cells that are most likely to follow a certain lineage or “fate” by looking at their pseudotime order and fate probabilities. These cells are expected to be along the path of differentiation towards that specific fate.

Parameters:

ad (AnnData) – Annotated data matrix. The pseudotime and fate probabilities should be stored under the keys provided.
pseudo_time_key (str, optional) – Key to access the pseudotime from obs of the AnnData object. Default is ‘palantir_pseudotime’.
fate_prob_key (str, optional) – Key to access the fate probabilities from obsm of the AnnData object. Default is ‘palantir_fate_probabilities’.
q (float, optional) – Quantile used to determine the threshold for the fate probability. This parameter should be between 0 and 1. Default is 1e-2.
eps (float, optional) – A small constant subtracted from the fate probability threshold. Default is 1e-2.
masks_key (str, optional) – Key under which the resulting branch cell selection masks are stored in the obsm of the AnnData object. Default is ‘branch_masks’.
save_as_df (bool, optional) – If True, the masks will be saved in obsm of the AnnData object as pandas DataFrame. If False, the masks will be saved as numpy array. The option to save as numpy array is there due to some versions of AnnData not being able to write h5ad files with DataFrames in ad.obsm. Default is palantir.SAVE_AS_DF.

Returns:

masks – An array of boolean masks that indicates whether each cell is on the path to each fate.

Return type:

np.ndarray
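The functions on this page can be chained as in the sketch below, which uses only the signatures documented above. It has not been checked against a specific Palantir release. gene_trend_workflow is a hypothetical wrapper name. The sketch assumes that ad already holds pseudotime in obs[‘palantir_pseudotime’] and fate probabilities in obsm[‘palantir_fate_probabilities’], and that branch_name is one of the columns of the fate probability matrix.

```python
def gene_trend_workflow(ad, branch_name, genes=None):
    """Select branch cells, fit gene trends, and cluster them for one branch (sketch)."""
    # Imported lazily so that defining this function does not require palantir.
    from palantir import presults

    # 1. Compute boolean branch masks; they are stored in ad.obsm["branch_masks"].
    masks = presults.select_branch_cells(
        ad, q=0.01, eps=0.01, masks_key="branch_masks"
    )

    # 2. Fit trends for the requested branch using the masks from step 1.
    #    expression_key is left at None, so the expression data in ad.X is used.
    trends = presults.compute_gene_trends(
        ad,
        lineages=[branch_name],
        masks_key="branch_masks",
        gene_trend_key="gene_trends",
    )

    # 3. Cluster the fitted trends; genes=None clusters all genes.
    clusters = presults.cluster_gene_trends(
        ad, branch_name, genes=genes, gene_trend_key="gene_trends"
    )
    return masks, trends, clusters
```

Passing the same masks_key and gene_trend_key to each step keeps the keys written by one call consistent with the keys read by the next.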