Core
Core functions for running Palantir
- palantir.core.identify_terminal_states(ms_data: DataFrame, early_cell: str, knn: int = 30, num_waypoints: int = 1200, n_jobs: int = -1, max_iterations: int = 25, seed: int = 20) → Tuple[ndarray, Index]
Identify terminal states from multi-scale data.
This function identifies terminal states by sampling waypoints, constructing a pseudotime ordering, building a Markov chain, and analyzing its properties.
- Parameters:
ms_data (pd.DataFrame) – Multi-scale space diffusion components.
early_cell (str) – Start cell for pseudotime construction.
knn (int, optional) – Number of nearest neighbors for graph construction. Default is 30.
num_waypoints (int, optional) – Number of waypoints to sample. Default is 1200.
n_jobs (int, optional) – Number of jobs for parallel processing. Default is -1.
max_iterations (int, optional) – Maximum number of iterations for pseudotime convergence. Default is 25.
seed (int, optional) – Random seed for waypoint sampling. Default is 20.
- Returns:
terminal_states (np.ndarray) – Array of identified terminal state cells.
excluded_boundaries (pd.Index) – Boundary cells that are not terminal states.
- Return type:
Tuple[np.ndarray, pd.Index]
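A call using the documented signature might look like the sketch below. The wrapper name `find_terminal_states` is hypothetical, and the check that `early_cell` is a row label of `ms_data` is an assumption about the expected input, not documented behavior. Palantir is imported inside the function so that the input check can run even where the package is not installed.

```python
def find_terminal_states(ms_data, early_cell, **overrides):
    """Hypothetical wrapper around palantir.core.identify_terminal_states.

    ms_data: DataFrame of multiscale diffusion components, indexed by cell name.
    early_cell: label of the start cell.
    overrides: any documented keyword argument to change from its default.
    """
    # Assumption: the start cell must be one of the row labels of ms_data.
    if early_cell not in ms_data.index:
        raise KeyError(f"early_cell {early_cell!r} is not in ms_data.index")

    # Imported lazily so the check above works without Palantir installed.
    from palantir.core import identify_terminal_states

    # Defaults copied from the signature documented above.
    settings = dict(knn=30, num_waypoints=1200, n_jobs=-1, max_iterations=25, seed=20)
    settings.update(overrides)

    terminal_states, excluded_boundaries = identify_terminal_states(
        ms_data, early_cell, **settings
    )
    return terminal_states, excluded_boundaries
```

With real data this would be called as `find_terminal_states(ms_data, "some_cell_name", num_waypoints=500)`, where the cell name and the override are placeholders.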
- palantir.core.run_palantir(data: DataFrame | AnnData, early_cell, terminal_states: List | Dict | Series | None = None, knn: int = 30, num_waypoints: int = 1200, n_jobs: int = -1, scale_components: bool = True, use_early_cell_as_start: bool = False, max_iterations: int = 25, eigvec_key: str = 'DM_EigenVectors_multiscaled', pseudo_time_key: str = 'palantir_pseudotime', entropy_key: str = 'palantir_entropy', fate_prob_key: str = 'palantir_fate_probabilities', save_as_df: bool = None, waypoints_key: str = 'palantir_waypoints', seed: int = 20) → object | None
Executes the Palantir algorithm to derive pseudotemporal ordering of cells, their fate probabilities, and state entropy based on the multiscale diffusion map results.
- Parameters:
data (Union[pd.DataFrame, AnnData]) – Either a DataFrame of multiscale space diffusion components or a Scanpy AnnData object.
early_cell (str) – Start cell for pseudotime construction.
terminal_states (List/Series/Dict, optional) – User-defined terminal states structure in the format {terminal_name:cell_name}. Default is None.
knn (int, optional) – Number of nearest neighbors for graph construction. Default is 30.
num_waypoints (int, optional) – Number of waypoints to sample. Default is 1200.
n_jobs (int, optional) – Number of jobs for parallel processing. Default is -1.
scale_components (bool, optional) – If True, components are scaled. Default is True.
use_early_cell_as_start (bool, optional) – If True, the early cell is used as start. Default is False.
max_iterations (int, optional) – Maximum number of iterations for pseudotime convergence. Default is 25.
eigvec_key (str, optional) – Key to access multiscale space diffusion components from obsm of the AnnData object. Default is ‘DM_EigenVectors_multiscaled’.
pseudo_time_key (str, optional) – Key to store the pseudotime in obs of the AnnData object. Default is ‘palantir_pseudotime’.
entropy_key (str, optional) – Key to store the entropy in obs of the AnnData object. Default is ‘palantir_entropy’.
fate_prob_key (str, optional) – Key to store the fate probabilities in obsm of the AnnData object. Default is ‘palantir_fate_probabilities’. If save_as_df is True, the fate probabilities are stored as a pandas DataFrame with terminal state names as columns. If False, the fate probabilities are stored as a numpy array and the terminal state names are stored in uns[fate_prob_key + "_columns"].
save_as_df (bool, optional) – If True, the fate probabilities are saved as a pandas DataFrame; if False, they are saved as a numpy array. The option exists because some versions of AnnData cannot write h5ad files with DataFrames in ad.obsm. Default is the value of palantir.SAVE_AS_DF, which is True.
waypoints_key (str, optional) – Key to store the waypoints in uns of the AnnData object. Default is ‘palantir_waypoints’.
seed (int, optional) – The seed for the random number generator used in waypoint sampling. Default is 20.
- Returns:
PResults object with pseudotime, entropy, branch probabilities, and waypoints. If an AnnData object is passed as data, the result is written to its obs, obsm, and uns attributes using the provided keys and None is returned.
- Return type:
Optional[PResults]
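Because the fate probabilities can be stored either as a DataFrame or as a plain array plus a separate list of names, downstream code may want to read both layouts the same way. The helper below is a minimal sketch of that, based only on the storage convention described for fate_prob_key and save_as_df. The function name `fate_probabilities_frame` is hypothetical, and plain dictionaries with made-up values stand in for the obsm and uns attributes of an AnnData object.

```python
import numpy as np
import pandas as pd


def fate_probabilities_frame(obsm, uns, obs_names,
                             fate_prob_key="palantir_fate_probabilities"):
    """Return fate probabilities as a DataFrame for either storage layout."""
    stored = obsm[fate_prob_key]
    if isinstance(stored, pd.DataFrame):
        # save_as_df=True: the columns already carry the terminal state names.
        return stored
    # save_as_df=False: plain array, with names kept separately in uns.
    columns = list(uns[fate_prob_key + "_columns"])
    return pd.DataFrame(np.asarray(stored), index=obs_names, columns=columns)


# Stand-in data shaped like the documented layout (values are made up).
obs_names = pd.Index(["cell_0", "cell_1", "cell_2"])
obsm = {"palantir_fate_probabilities": np.array([[0.7, 0.3], [0.4, 0.6], [0.1, 0.9]])}
uns = {"palantir_fate_probabilities_columns": ["fate_A", "fate_B"]}

fate_df = fate_probabilities_frame(obsm, uns, obs_names)
print(fate_df.idxmax(axis=1))  # most probable terminal state per cell
```

With an AnnData object `ad` that run_palantir has written to, the equivalent call would be `fate_probabilities_frame(ad.obsm, ad.uns, ad.obs_names)`.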