mlchem.chem.calculator.descriptors.get_chemotypes

get_chemotypes(mol_input_list: list | ndarray[str | Mol], chemotype_dict: dict | None = None) → DataFrame

Identify chemotypes for a list of molecules.

This function applies a dictionary of chemotype definitions to each molecule in the input list. Each chemotype is defined by a function and its arguments. If no dictionary is provided, a default one is used.

Parameters:
  • mol_input_list (list or np.ndarray of str or rdkit.Chem.rdchem.Mol) – List or array of molecules in SMILES format or as RDKit Mol objects.

  • chemotype_dict (dict, optional) – Dictionary of chemotype definitions. Each key should map to a tuple of (function, argument_dict). If None, a default dictionary is used.

Returns:

DataFrame containing the identified chemotypes for each molecule.

Return type:

pd.DataFrame

Examples

>>> from mlchem.chem.calculator.descriptors import get_chemotypes
>>> get_chemotypes(["CCO", "c1ccccc1"])
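
The (function, argument_dict) layout described for chemotype_dict can be illustrated with a small standalone sketch. It does not use mlchem or RDKit: count_substring, apply_chemotype_dict, and toy_chemotypes are hypothetical names invented for this illustration, and the calling convention function(molecule, **argument_dict) is an assumption, not confirmed behaviour of get_chemotypes. The sketch also returns a plain list of dicts rather than a DataFrame.

```python
# Illustrative sketch only: these helpers are hypothetical and are not part of mlchem.

def count_substring(smiles: str, pattern: str) -> int:
    """Toy stand-in for a chemotype function: count a substring in a SMILES string."""
    return smiles.count(pattern)


def apply_chemotype_dict(smiles_list, chemotype_dict):
    """Apply every (function, argument_dict) entry to every molecule.

    Assumption: each function is called as function(molecule, **argument_dict).
    """
    rows = []
    for smiles in smiles_list:
        row = {"molecule": smiles}
        for name, (func, kwargs) in chemotype_dict.items():
            row[name] = func(smiles, **kwargs)
        rows.append(row)
    return rows


# Each key maps to a (function, argument_dict) tuple.
toy_chemotypes = {
    "oxygen_atoms": (count_substring, {"pattern": "O"}),
    "aromatic_carbons": (count_substring, {"pattern": "c"}),
}

rows = apply_chemotype_dict(["CCO", "c1ccccc1"], toy_chemotypes)
for row in rows:
    print(row)
```

Keeping the arguments in a separate dictionary lets one function serve several chemotype entries with different settings. A real definition would presumably rely on substructure matching rather than substring counting.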