mlchem.chem.calculator.descriptors.get_fingerprint_df

get_fingerprint_df(mol_input_list: list[str | Mol] | ndarray[str | Mol], fp_type: Literal['m', 'ap', 'rk', 'tt', 'mac'] = 'm', radius: int = 2, nBits: int = 2048, include_chirality: bool = False, include_bit_info: bool = False) → DataFrame | tuple[DataFrame, dict]

Generate a DataFrame of fingerprints for a list of molecules.

This function computes a fingerprint for each molecule in the input and collects the results in a DataFrame. If include_bit_info is True, the bit information is returned as well.

Parameters:
  • mol_input_list (list or np.ndarray of str or rdkit.Chem.rdchem.Mol) – List or array of molecules in SMILES format or as RDKit Mol objects.

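  • fp_type ({'m', 'ap', 'rk', 'tt', 'mac'}, optional) – Type of fingerprint to generate. The codes presumably stand for Morgan ('m'), atom pair ('ap'), RDKit ('rk'), topological torsion ('tt'), and MACCS keys ('mac'). Default is 'm'.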

  • radius (int, optional) – Radius or path length, depending on the fingerprint type. Default is 2.

  • nBits (int, optional) – Length of the fingerprint in bits. Default is 2048.

  • include_chirality (bool, optional) – Whether to include chirality information in the fingerprint. Default is False.

  • include_bit_info (bool, optional) – Whether to return bit information. Default is False.
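
As a rough illustration of what nBits and the bit information refer to, the sketch below folds integer feature identifiers into a fixed-length bit vector and records which features set each bit. It is a conceptual toy only: fold_features and its inputs are hypothetical, it does not use mlchem or RDKit, and the structure of the dictionary actually returned by get_fingerprint_df may well differ.

```python
def fold_features(feature_ids, n_bits=2048):
    """Fold integer feature identifiers into a bit vector of length n_bits.

    Hypothetical helper for illustration; not part of mlchem.
    """
    bits = [0] * n_bits
    bit_info = {}
    for fid in feature_ids:
        idx = fid % n_bits  # features whose ids differ by a multiple of n_bits collide
        bits[idx] = 1
        bit_info.setdefault(idx, []).append(fid)
    return bits, bit_info


# Features 5 and 2053 land on the same bit when n_bits is 2048.
bits, bit_info = fold_features([5, 2053, 4100], n_bits=2048)
```

A smaller n_bits makes such collisions more frequent, which is why a record of the features behind each bit is useful when interpreting a fingerprint.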

Returns:

DataFrame of fingerprints. If include_bit_info is True, also returns a dictionary of bit information.

Return type:

pd.DataFrame or tuple of (pd.DataFrame, dict)
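
Because the return value is either a single DataFrame or a (DataFrame, dict) tuple, calling code that toggles include_bit_info may want to normalise both shapes. The following is a minimal sketch of one way to do that; unpack_fingerprint_result is a hypothetical helper, not part of mlchem, and plain stand-in values are used in place of real DataFrames.

```python
def unpack_fingerprint_result(result):
    """Return (fingerprints, bit_info), with bit_info set to None if absent.

    Hypothetical helper for illustration; not part of mlchem.
    """
    if isinstance(result, tuple):
        fingerprints, bit_info = result
        return fingerprints, bit_info
    return result, None


# Stand-ins for the two possible return shapes.
table_only = "fingerprint table"
table_with_info = ("fingerprint table", {0: "details"})

fps_a, info_a = unpack_fingerprint_result(table_only)
fps_b, info_b = unpack_fingerprint_result(table_with_info)
```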

Examples

>>> from mlchem.chem.calculator.descriptors import get_fingerprint_df
>>> get_fingerprint_df(["CCO", "c1ccccc1"], fp_type='m')