mlchem.chem package

Subpackages

Submodules

mlchem.chem.manipulation module

class MolCleaner

Bases: object

A class to clean and process SMILES strings.

The MolCleaner class provides methods to clean and process a list of SMILES strings. The cleaning process includes steps such as initialising SMILES, removing carbon ions, inorganics, organometallics, and mixtures, desalting, neutralising, and performing a final quality check.

Parameters:
  • input_smiles_list (list of str) – A list of SMILES strings representing the molecules to be cleaned.

  • id_list (list of int, optional) – A list of IDs corresponding to the SMILES strings. If not provided, IDs will be generated automatically.

input_smiles_list

The original list of SMILES strings.

Type:

list of str

input_id_list

The original list of IDs.

Type:

list of int

df_input

A DataFrame containing the input IDs and SMILES strings.

Type:

pandas.DataFrame

n_steps

The number of cleaning steps performed.

Type:

int

smiles

The current list of SMILES strings after cleaning steps.

Type:

list of str

ids

The current list of IDs after cleaning steps.

Type:

list of int

IsIsomeric

A flag indicating whether the SMILES strings are isomeric.

Type:

bool

IsCanonical

A flag indicating whether the SMILES strings are canonical.

Type:

bool

IsKekulised

A flag indicating whether the SMILES strings are kekulised.

Type:

bool

df_accepted

A DataFrame to store accepted SMILES strings.

Type:

pandas.DataFrame

df_rejected

A DataFrame to store rejected SMILES strings and the reason for rejection.

Type:

pandas.DataFrame

Examples

>>> cleaner = MolCleaner([smiles_1, smiles_2], [id_1, id_2])
>>> cleaner.initialise_smiles()
>>> cleaner.remove_carbon_ions()
>>> cleaner.remove_inorganics()
>>> cleaner.remove_organometallics()
>>> cleaner.remove_mixtures()
>>> cleaner.desalt_smiles(method='largest')
>>> cleaner.neutralise_smiles()
>>> cleaner.quality_checker()

Alternatively:

>>> cleaner.full_clean()

__init__(input_smiles_list: list[str], id_list: list[int] | None = None) None

Initialise the MolCleaner instance.

This constructor sets up the initial state of the MolCleaner object, including the input SMILES strings, optional IDs, and internal tracking variables for cleaning steps and SMILES processing.

Parameters:
  • input_smiles_list (list of str) – A list of SMILES strings representing the molecules to be cleaned.

  • id_list (list of int, optional) – A list of IDs corresponding to the SMILES strings. If not provided, IDs will be automatically generated as a range from 0 to the number of SMILES strings.

Return type:

None
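
The default ID behaviour described above can be pictured with a minimal sketch. The helper name `resolve_ids` is hypothetical and not part of mlchem, and the length check is an assumption added for illustration rather than documented behaviour.

```python
def resolve_ids(input_smiles_list, id_list=None):
    """Return one ID per SMILES string (hypothetical helper, not mlchem API)."""
    if id_list is None:
        # Documented default: a range from 0 to the number of SMILES strings.
        return list(range(len(input_smiles_list)))
    # Assumption for this sketch: IDs and SMILES must pair up one-to-one.
    if len(id_list) != len(input_smiles_list):
        raise ValueError("id_list and input_smiles_list differ in length")
    return list(id_list)


print(resolve_ids(["CCO", "c1ccccc1"]))
print(resolve_ids(["CCO", "c1ccccc1"], [101, 102]))
```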

desalt_smiles(method: Literal['chembl', 'rdkit', 'largest'] = 'largest', dehydrate: bool = True, isomeric: bool | None = None, canonical: bool | None = None, kekulise: bool | None = None, verbose: bool = False) None

Desalt the SMILES strings using the specified method.

This method processes the input SMILES strings to remove salts using one of the available methods: ‘chembl’, ‘rdkit’, or ‘largest’. It updates the SMILES strings and IDs, and tracks rejected SMILES strings with reasons.

Parameters:
  • method ({'chembl', 'rdkit', 'largest'}, optional) – The method to use for desalting. Default is ‘largest’.

  • dehydrate (bool, optional) – Whether to remove water fragments before desalting. Default is True.

  • isomeric (bool, optional) – Whether to generate isomeric SMILES. Default is None.

  • canonical (bool, optional) – Whether to generate canonical SMILES. Default is None.

  • kekulise (bool, optional) – Whether to kekulise the SMILES. Default is None.

  • verbose (bool, optional) – Whether to print verbose output. Default is False.

Return type:

None
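
To give a feel for what the 'largest' option means, here is a deliberately crude, string-only sketch. It is not the mlchem implementation: the function name is hypothetical, water is assumed to appear as the bare fragment `O`, and fragment size is approximated by counting letters rather than atoms.

```python
def keep_largest_fragment(smiles, dehydrate=True):
    """Crude stand-in for 'largest' desalting (not the mlchem implementation)."""
    # In SMILES, '.' separates disconnected components.
    fragments = smiles.split(".")
    if dehydrate:
        # Assumption: water appears as the bare fragment 'O'.
        remaining = [frag for frag in fragments if frag != "O"]
        # Keep the original list if water was the only component.
        fragments = remaining or fragments
    # Assumption: the number of letters is a rough proxy for fragment size.
    return max(fragments, key=lambda frag: sum(ch.isalpha() for ch in frag))


print(keep_largest_fragment("CC(=O)[O-].[Na+]"))
print(keep_largest_fragment("CCN.Cl.O"))
```

A chemistry-aware version would parse each fragment and compare heavy-atom counts instead of characters.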

full_clean(desalting_method: Literal['rdkit', 'chembl', 'largest'] = 'largest') None

Perform a full cleaning process on the SMILES strings.

This method sequentially applies all cleaning steps to the input SMILES strings, including initialisation, filtering, desalting, neutralisation, and quality checking.

Parameters:

desalting_method ({'rdkit', 'chembl', 'largest'}, optional) – The method to use for desalting. Default is ‘largest’.

Return type:

None

Examples

>>> cleaner = MolCleaner(smiles_list)
>>> cleaner.full_clean(desalting_method='chembl')

initialise_smiles(isomeric: bool = False, canonical: bool = True, kekulise: bool = True) None

Initialise the SMILES strings with specified options.

This method processes the input SMILES strings according to the specified options for isomeric, canonical, and kekulised representations. It updates the SMILES strings and IDs, and tracks rejected SMILES strings with reasons.

Parameters:
  • isomeric (bool, optional) – Whether to generate isomeric SMILES. Default is False.

  • canonical (bool, optional) – Whether to generate canonical SMILES. Default is True.

  • kekulise (bool, optional) – Whether to kekulise the SMILES. Default is True.

Return type:

None

neutralise_smiles(isomeric: bool | None = None, canonical: bool | None = None, kekulise: bool | None = None) None

Neutralise the SMILES strings.

This method processes the input SMILES strings to neutralise charged species. It updates the SMILES strings and IDs, and tracks rejected SMILES strings with reasons.

Parameters:
  • isomeric (bool, optional) – Whether to generate isomeric SMILES. Default is None.

  • canonical (bool, optional) – Whether to generate canonical SMILES. Default is None.

  • kekulise (bool, optional) – Whether to kekulise the SMILES. Default is None.

Return type:

None

quality_checker() None

Perform quality check on SMILES strings.

This method evaluates the structural integrity of accepted SMILES strings using the ChEMBL structure checker. It assigns a priority score and message to each molecule, indicating the severity of any issues found.

Return type:

None

Examples

>>> cleaner.quality_checker()
>>> cleaner.df_accepted[['id', 'PRIORITY', 'MESSAGES']]

remove_carbon_ions() None

Remove SMILES strings containing carbon ions.

This method filters out SMILES strings that contain carbon ions. It updates the internal SMILES list and IDs, and logs rejected entries with the reason for rejection.

Return type:

None

remove_inorganics() None

Remove inorganic SMILES strings.

This method filters out SMILES strings that are classified as inorganic. It updates the internal SMILES list and IDs, and logs rejected entries with the reason for rejection.

Return type:

None

remove_mixtures() None

Remove SMILES strings that are mixtures.

This method filters out SMILES strings that represent mixtures of multiple components, unless they are simple binary mixtures involving metals or halogens. It updates the internal SMILES list and IDs, and logs rejected entries with the reason for rejection.

Return type:

None

remove_organometallics() None

Remove SMILES strings containing organometallic compounds.

This method filters out SMILES strings that contain organometallic structures, excluding simple metal salts. It updates the internal SMILES list and IDs, and logs rejected entries with the reason for rejection.

Return type:

None
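
Each of the filtering steps above is described as updating the SMILES list and IDs while logging rejected entries with a reason. The following is a minimal sketch of that bookkeeping pattern using plain lists and dictionaries. The function name, the record keys, and the toy predicate are all assumptions for illustration; mlchem itself stores results in `df_accepted` and `df_rejected`.

```python
def apply_filter(smiles, ids, is_rejected, reason):
    """Split parallel SMILES/ID lists into kept and rejected entries."""
    kept_smiles, kept_ids, rejected = [], [], []
    for smi, mol_id in zip(smiles, ids):
        if is_rejected(smi):
            # Record why the entry was dropped so it can be audited later.
            rejected.append({"id": mol_id, "smiles": smi, "reason": reason})
        else:
            kept_smiles.append(smi)
            kept_ids.append(mol_id)
    return kept_smiles, kept_ids, rejected


# Toy predicate only: treat any multi-component SMILES as a mixture.
smiles, ids, rejected = apply_filter(
    ["CCO", "CCO.CCN", "c1ccccc1"],
    [0, 1, 2],
    is_rejected=lambda smi: "." in smi,
    reason="mixture",
)
print(smiles, ids, rejected)
```

Keeping the IDs in step with the SMILES list is what allows a rejected entry to be traced back to its original input row.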

class MolGenerator

Bases: object

A class to generate SMILES strings using SELFIES fragments.

The MolGenerator class allows for the generation of new molecules represented as SMILES strings by combining SELFIES fragments either randomly or through substitution into a template molecule.

Examples

>>> generator = MolGenerator()
>>> generator.generate_smiles(template_smiles='c1ccccc1',
...                           n_molecules=20,
...                           n_fragments=5,
...                           n_substitutions=1,
...                           attempt_limit=1000)
>>> cleaner = MolCleaner(generator.smiles_generated)
>>> cleaner.initialise_smiles()
>>> cleaner.neutralise_smiles()
>>> generator.smiles_generated = cleaner.smiles

__init__(dictionary: dict = None)

Initialise the MolGenerator instance.

This constructor sets up the SELFIES fragment dictionary used for molecule generation. If a custom dictionary is provided, it is used to populate the internal fragment bag; otherwise, a default dictionary is used.

Parameters:

dictionary (dict, optional) – A custom dictionary of SELFIES fragments and their frequencies. If None, a default dictionary is used.

Return type:

None
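
The idea of a fragment dictionary with frequencies can be illustrated with a small weighted-sampling sketch. The dictionary contents, the function name, and the use of `random.choices` are assumptions made for this example; the actual default dictionary and sampling strategy in mlchem are not described here.

```python
import random

# Hypothetical fragment dictionary: SELFIES token -> relative frequency.
fragment_frequencies = {"[C]": 5, "[O]": 2, "[N]": 2, "[=C]": 1}


def sample_fragments(frequencies, n_fragments, rng):
    """Draw n_fragments tokens, weighted by their frequencies."""
    tokens = list(frequencies)
    weights = [frequencies[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=n_fragments)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
selfies_string = "".join(sample_fragments(fragment_frequencies, 5, rng))
print(selfies_string)
```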

generate_smiles(n_molecules: int, n_fragments: int, template_smiles: str = None, substitution_sites: list = None, n_substitutions: int = None, include_extremities: bool = True, attempt_limit: int = 1000) None

Generate SMILES strings using a template or random fragments.

This method generates a specified number of SMILES strings either by randomly combining SELFIES fragments or by substituting fragments into a template molecule at specified positions.

Parameters:
  • n_molecules (int) – The number of molecules to generate.

  • n_fragments (int) – The number of fragments to use for each molecule.

  • template_smiles (str, optional) – A template SMILES string to use for generating molecules.

  • substitution_sites (list of int, optional) – Indices in the SELFIES string where substitutions should occur.

  • n_substitutions (int, optional) – The number of substitutions to make in the template.

  • include_extremities (bool, optional) – Whether to include the start and end of the SELFIES string as possible substitution sites. Default is True.

  • attempt_limit (int, optional) – The maximum number of attempts to generate the specified number of molecules. Default is 1000.

Return type:

None

Updates:
  • template_smiles (str) – The template SMILES string used for generation.

  • smiles_generated (list of str) – The list of generated SMILES strings.

  • selfies_generated (list of str) – The list of generated SELFIES strings.

  • mols_generated (list of rdkit.Chem.rdchem.Mol) – The list of RDKit Mol objects corresponding to the generated SMILES.

  • pattern_atoms (list) – A list of pattern atom matches for each generated molecule.

  • double_legend (list of str) – A list of strings combining SMILES and SELFIES for each molecule.
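
The role of `attempt_limit` can be sketched as a bounded generation loop: keep proposing candidates until enough have been collected or the attempt budget runs out. The function name, the duplicate check, and the toy candidate generator below are assumptions for illustration, not the mlchem implementation.

```python
import random


def generate_unique(n_molecules, make_candidate, attempt_limit=1000):
    """Collect up to n_molecules distinct candidates within attempt_limit tries."""
    generated = []
    seen = set()
    for _ in range(attempt_limit):
        if len(generated) == n_molecules:
            break
        candidate = make_candidate()
        # Assumption: failed (None) and duplicate candidates are skipped.
        if candidate is not None and candidate not in seen:
            seen.add(candidate)
            generated.append(candidate)
    return generated


rng = random.Random(1)
tokens = ["[C]", "[N]", "[O]"]
results = generate_unique(5, lambda: "".join(rng.choices(tokens, k=3)))
print(len(results), results)
```

With a loop of this shape, fewer than `n_molecules` results come back when the limit is reached first, so callers would check the length of the output.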

Examples

>>> generator.generate_smiles(template_smiles='c1ccccc1',
...                           n_molecules=5,
...                           n_fragments=3,
...                           n_substitutions=1)

class PatternRecognition

Bases: object

A utility class for recognising chemical patterns using SMARTS.

This class provides a reference for common SMARTS-based pattern matching used in cheminformatics, particularly with RDKit. It includes a vocabulary of generic chemical groupings and links to external resources for further information on SMARTS syntax and usage.

References

SMARTS Vocabulary

  • Alkyl (ALK): alkyl side chains (not an H atom)

  • AlkylH (ALH): alkyl side chains including an H atom

  • Alkenyl (AEL): alkenyl side chains

  • AlkenylH (AEH): alkenyl side chains or an H atom

  • Alkynyl (AYL): alkynyl side chains

  • AlkynylH (AYH): alkynyl side chains or an H atom

  • Alkoxy (AOX): alkoxy side chains

  • AlkoxyH (AOH): alkoxy side chains or an H atom

  • Carbocyclic (CBC): carbocyclic side chains

  • CarbocyclicH (CBH): carbocyclic side chains or an H atom

  • Carbocycloalkyl (CAL): cycloalkyl side chains

  • CarbocycloalkylH (CAH): cycloalkyl side chains or an H atom

  • Carbocycloalkenyl (CEL): cycloalkenyl side chains

  • CarbocycloalkenylH (CEH): cycloalkenyl side chains or an H atom

  • Carboaryl (ARY): all-carbon aryl side chains

  • CarboarylH (ARH): all-carbon aryl side chains or an H atom

  • Cyclic (CYC): cyclic side chains

  • CyclicH (CYH): cyclic side chains or an H atom

  • Acyclic (ACY): acyclic side chains (not an H atom)

  • AcyclicH (ACH): acyclic side chains or an H atom

  • Carboacyclic (ABC): all-carbon acyclic side chains

  • CarboacyclicH (ABH): all-carbon acyclic side chains or an H atom

  • Heteroacyclic (AHC): acyclic side chains with at least one heteroatom

  • HeteroacyclicH (AHH): acyclic side chains with at least one heteroatom or an H atom

  • Heterocyclic (CHC): cyclic side chains with at least one heteroatom

  • HeterocyclicH (CHH): cyclic side chains with at least one heteroatom or an H atom

  • Heteroaryl (HAR): aryl side chains with at least one heteroatom

  • HeteroarylH (HAH): aryl side chains with at least one heteroatom or an H atom

  • NoCarbonRing (CXX): ring containing no carbon atoms

  • NoCarbonRingH (CXH): ring containing no carbon atoms or an H atom

  • Group (G): any group (not H atom)

  • GroupH (GH): any group (including H atom)

  • Group* (G*): any group with a ring closure

  • GroupH* (GH*): any group with a ring closure or an H atom

class Atoms

Bases: object

static get_ring_size(atom: Atom) int

Get the size of the ring an atom belongs to.

This method checks whether the given RDKit atom is part of a ring, and if so, determines the size of the smallest ring it is in. If the atom is not in any ring, the method returns 0.

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The atom whose ring membership and size is to be evaluated.

Returns:

The size of the smallest ring the atom is part of, or 0 if the atom is not in a ring.

Return type:

int
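
The "smallest ring, or 0" rule can be shown without RDKit by working on ring membership directly. In the sketch below, rings are given as tuples of atom indices (the same shape RDKit returns from `mol.GetRingInfo().AtomRings()`); the function name and the example ring data are hypothetical.

```python
def smallest_ring_size(atom_index, rings):
    """Return the size of the smallest ring containing atom_index, or 0."""
    sizes = [len(ring) for ring in rings if atom_index in ring]
    return min(sizes, default=0)


# Hypothetical ring membership for a fused six- and five-membered ring system.
rings = [(0, 1, 2, 3, 4, 5), (4, 5, 6, 7, 8)]
print(smallest_ring_size(0, rings))  # atom in the six-membered ring only
print(smallest_ring_size(4, rings))  # fusion atom, member of both rings
print(smallest_ring_size(9, rings))  # atom outside every ring
```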

Examples

>>> atom = mol.GetAtomWithIdx(3)
>>> PatternRecognition.Atoms.get_ring_size(atom)

static is_SP(atom: Atom) int

Check if an atom is SP-hybridised.

This method evaluates whether the given RDKit atom is SP-hybridised (i.e., has linear geometry with two electron domains).

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The atom to check.

Returns:

1 if the atom is SP-hybridised, 0 otherwise.

Return type:

int

Examples

>>> atom = mol.GetAtomWithIdx(0)
>>> PatternRecognition.Atoms.is_SP(atom)

static is_SP2(atom: Atom) int

Check if an atom is SP2-hybridised.

This method evaluates whether the given RDKit atom is SP2-hybridised, which typically corresponds to trigonal planar geometry with three electron domains.

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The atom to check.

Returns:

1 if the atom is SP2-hybridised, 0 otherwise.

Return type:

int

Examples

>>> atom = mol.GetAtomWithIdx(1)
>>> PatternRecognition.Atoms.is_SP2(atom)

static is_SP3(atom: Atom) int

Check if an atom is SP3-hybridised.

This method determines whether the given RDKit atom is SP3-hybridised, which typically corresponds to tetrahedral geometry with four electron domains.

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The atom to check.

Returns:

1 if the atom is SP3-hybridised, 0 otherwise.

Return type:

int

Examples

>>> atom = mol.GetAtomWithIdx(2)
>>> PatternRecognition.Atoms.is_SP3(atom)

class Base

Bases: object

static check_smarts_pattern(target: str | Mol, smarts_pattern: str, generic_keywords: list = []) tuple[bool, ndarray[int], str]

Check if a given SMARTS pattern matches a target molecule, optionally using generic keywords.

This function supports both standard SMARTS syntax and generic keywords for more intuitive pattern matching.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or RDKit Mol object representing the target molecule.

  • smarts_pattern (str) – A SMARTS pattern to match against the target.

  • generic_keywords (list of str, optional) – Generic keywords to substitute into the SMARTS pattern.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • numpy.ndarray of int: Atom indices where the pattern matches.

  • str: The final SMARTS pattern used for matching.

Return type:

tuple of (bool, numpy.ndarray of int, str)

Examples

>>> from mlchem.chem.manipulation import PatternRecognition as pr
>>> pr.Base.check_smarts_pattern('CCCCc1ccccc1', smarts_pattern='CC*', generic_keywords=['CYC'])
(True, array([3, 4, 5]), 'CC* |$;;CYC$|')
>>> pr.Base.check_smarts_pattern('CCCCc1ccccc1', smarts_pattern='[R]')
(True, array([4, 5, 6, 7, 8, 9]), '[R]')
>>> pr.Base.check_smarts_pattern('CCCCc1ccccc1', smarts_pattern='[*]', generic_keywords=['CYC'])
(True, array([4, 5, 6, 7, 8, 9]), '* |$CYC$|')

static check_smiles_pattern(target: str | Mol, smiles_pattern: str) tuple[bool, list[int]]

Check if a given SMILES pattern matches a target molecule.

This function checks whether a SMILES pattern matches any substructure within the target molecule.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or RDKit Mol object representing the target molecule.

  • smiles_pattern (str) – A SMILES pattern to match against the target.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices where the pattern matches.

Return type:

tuple of (bool, list of int)

static count_atoms(target: str | Mol) int

Count the number of atoms in a target molecule.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – The target molecule.

Returns:

The number of atoms in the molecule.

Return type:

int

static count_bonds(target: str | Mol) int

Count the number of bonds in a target molecule.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – The target molecule.

Returns:

The number of bonds in the molecule.

Return type:

int

static get_MCS(input1: str | Mol, input2: str | Mol, threshold: float = 0.0, completeAromaticRings: bool = False, similarity_type: Literal['tanimoto', 'johnson'] = 'tanimoto') tuple[bool, str, list[int], list[int], float]

Find the Maximum Common Substructure (MCS) between two molecules.

This function identifies the MCS between two molecules, which can be provided as SMILES strings or RDKit molecule objects. It returns the SMARTS string of the MCS, atom indices in both molecules, and a similarity score.

More information: https://greglandrum.github.io/rdkit-blog/posts/2023-11-08-introducingrascalmces.html

Parameters:
  • input1 (str or rdkit.Chem.rdchem.Mol) – The first molecule in SMILES format or as an RDKit object.

  • input2 (str or rdkit.Chem.rdchem.Mol) – The second molecule in SMILES format or as an RDKit object.

  • threshold (float, optional) – Similarity threshold for MCS detection. Must be in the interval [0, 1). Default is 0.0.

  • completeAromaticRings (bool, optional) – Whether to require complete aromatic ring matches. Default is False.

  • similarity_type ({'tanimoto', 'johnson'}, optional) – The similarity metric to use. Default is ‘tanimoto’.

Returns:

A tuple containing:
  • bool: Whether an MCS was found.

  • str: SMARTS string of the MCS.

  • list of int: Atom indices in the first molecule.

  • list of int: Atom indices in the second molecule.

  • float: Similarity score.

Return type:

tuple
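
As background for the `similarity_type` option, the Johnson similarity is commonly defined from the atom and bond counts of the two molecules and of their common substructure. The sketch below computes that textbook definition from plain counts. The function name is hypothetical, and whether mlchem computes its returned score in exactly this way is an assumption.

```python
def johnson_similarity(common_atoms, common_bonds, atoms1, bonds1, atoms2, bonds2):
    """Johnson similarity: (Vc + Ec)^2 / ((V1 + E1) * (V2 + E2))."""
    numerator = (common_atoms + common_bonds) ** 2
    denominator = (atoms1 + bonds1) * (atoms2 + bonds2)
    return numerator / denominator


# Benzene (6 atoms, 6 bonds) against toluene (7 atoms, 7 bonds),
# taking the benzene ring (6 atoms, 6 bonds) as the common substructure.
score = johnson_similarity(6, 6, 6, 6, 7, 7)
print(round(score, 3))
```

Under this definition two identical molecules score 1.0, and the score falls towards 0 as the shared substructure shrinks relative to both molecules.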

static get_atoms(target: str | Mol) list

Retrieve a list of atoms from a target molecule.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – The target molecule.

Returns:

A list of atom objects in the molecule.

Return type:

list of rdkit.Chem.rdchem.Atom

static get_bonds(target: str | Mol) list

Retrieve a list of bonds from a target molecule.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – The target molecule.

Returns:

A list of bond objects in the molecule.

Return type:

list of rdkit.Chem.rdchem.Bond

static get_stereoisomers(target: str | Mol, drawer=None) tuple[list[Mol], list]

Retrieve stereoisomers and their images for a target molecule.

This function takes a target molecule, which can be either a SMILES string or an RDKit molecule object, and returns a list of stereoisomers and their corresponding images. An optional MolDrawer instance can be provided to customise drawing options.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

  • drawer (MolDrawer, optional) – An instance of MolDrawer to preserve drawing options. Default is None.

Returns:

A tuple containing a list of stereoisomer molecules and their corresponding images.

Return type:

tuple of (list of rdkit.Chem.rdchem.Mol, list)

static get_tautomers(target: str | Mol) list[str]

Retrieve a list of tautomers for a target molecule.

This function takes a target molecule, which can be either a SMILES string or an RDKit molecule object, and returns a list of SMILES strings representing the tautomers of the molecule.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

A list of SMILES strings representing the tautomers of the target molecule.

Return type:

list of str

static has_carbon_ion(target: str | Mol) bool

Detect the presence of carbon ions in a target molecule.

This function checks if a molecule contains charged carbon atoms (carbocations or carbanions). Carbanions that are part of nitrile groups are excluded unless carbocations are also present.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

True if the molecule contains carbon ions, False otherwise.

Return type:

bool

static has_metal_salt(target: str | Mol, custom_metals: list | None = None) bool

Determine whether a target molecule contains a metal salt.

This function checks for the presence of metal salts in a molecule. A custom list of metal elements can be provided, otherwise a default list is used.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

  • custom_metals (list of str, optional) – A custom list of metal elements to check for. Default is None.

Returns:

True if the molecule contains a metal salt, False otherwise.

Return type:

bool

static is_organic(target: str | Mol) bool

Determine whether a target molecule is organic.

This function checks if a molecule contains carbon atoms and is not classified as a carbonic acid. A molecule is considered inorganic if the only carbon atoms present belong to carbonic acid groups.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

True if the molecule is organic, False otherwise.

Return type:

bool

static pattern_abs_fraction_greater_than(target: str | Mol, func, threshold: float, hidden_pattern_function=None) bool

Determine if the fraction of atoms belonging to a pattern exceeds a given threshold.

This function calculates the fraction of atoms in a target molecule that match a given pattern. The pattern is defined by a function (e.g. a SMARTS matcher). An optional hidden pattern function can be provided to refine the numerator (e.g. to count only aromatic carbon atoms).

The denominator is always the total number of atoms in the molecule.

Use this when you want to know:

“Does this pattern make up more than X% of the molecule?”

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – The target molecule, as a SMILES string or RDKit Mol object.

  • func (callable) – A function that returns a tuple like (True, [atom_indices]) for the pattern.

  • threshold (float) – The minimum fraction of atoms that must match the pattern.

  • hidden_pattern_function (callable, optional) – A secondary function to refine the atom subset in the numerator.

Returns:

True if the fraction of matching atoms exceeds the threshold.

Return type:

bool
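
The arithmetic behind this check is simple enough to sketch without any chemistry toolkit: count the matched atoms and divide by the total atom count. The function below is a hypothetical stand-in working on precomputed indices, and the strict `>` comparison is an assumption based on the word "exceeds".

```python
def abs_fraction_greater_than(n_atoms, matched_indices, threshold):
    """True if matched atoms exceed `threshold` as a fraction of all atoms."""
    # Use a set so an atom matched more than once is counted only once.
    fraction = len(set(matched_indices)) / n_atoms
    return fraction > threshold


# 'CCCC(=O)O' has 6 heavy atoms; indices 0 to 3 are the carbons.
print(abs_fraction_greater_than(6, [0, 1, 2, 3], 0.6))  # 4/6 is about 0.67
print(abs_fraction_greater_than(6, [4, 5], 0.6))        # 2/6 is about 0.33
```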

Examples

>>> from mlchem.chem.manipulation import PatternRecognition as pr
>>> # Define a pattern function to find carbon atoms (aliphatic or aromatic)
>>> def check_carbon(target):
...     return pr.Base.check_smarts_pattern(target, smarts_pattern='[#6]')
>>> # Check if more than 60% of atoms are carbon
>>> pr.Base.pattern_abs_fraction_greater_than('CCCC(=O)O', check_carbon, threshold=0.6)
True
>>> # Example with a hidden pattern function (e.g. aromatic carbon among all atoms)
>>> def check_aromatic(target, pattern_function):
...     return pr.Base.check_smarts_pattern(target, smarts_pattern='[a]')
>>> pr.Base.pattern_abs_fraction_greater_than('OCCc1ccccc1', check_aromatic,
...     threshold=0.5, hidden_pattern_function=check_carbon)
True

Use Cases
  • “Are more than 30% of all atoms aromatic carbon?”

  • “Do heteroatoms make up more than 20% of the molecule?”

Notes

This method always uses the total number of atoms in the molecule as the denominator. To compare two patterns directly, use pattern_rel_fraction_greater_than.

>>> # Using abs method with hidden pattern
>>> pr.Base.pattern_abs_fraction_greater_than(
...     target,
...     func=check_aromatic,
...     threshold=0.5,
...     hidden_pattern_function=check_carbon)

static pattern_rel_fraction_greater_than(target: str | Mol, func1, func2, threshold: float, hidden_pattern_function=None) bool

Determine if the fraction of atoms belonging to one pattern exceeds a given threshold relative to another pattern.

This function compares the number of atoms matching a primary pattern to those matching a secondary pattern. An optional hidden pattern function can be passed if the primary function requires two arguments.

Use this when you want to know:

“Does pattern A make up more than X% of pattern B?”

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – The target molecule, as a SMILES string or RDKit Mol object.

  • func1 (callable) – Function identifying the primary pattern (numerator).

  • func2 (callable) – Function identifying the reference pattern (denominator).

  • threshold (float) – The minimum relative fraction required (e.g. 0.5 means 50% of func2 atoms must match func1).

  • hidden_pattern_function (callable, optional) – A secondary function to pass into func1 if it requires it.

Returns:

True if the relative fraction exceeds the threshold.

Return type:

bool
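
The relative check differs from the absolute one only in its denominator: the atoms matched by the reference pattern replace the total atom count. The sketch below works on precomputed atom indices. The function name is hypothetical, and both the strict `>` comparison and the handling of an empty reference set are assumptions.

```python
def rel_fraction_greater_than(primary_indices, reference_indices, threshold):
    """True if primary matches exceed `threshold` as a fraction of reference matches."""
    reference = set(reference_indices)
    if not reference:
        # Assumption: with nothing to compare against, the check fails.
        return False
    return len(set(primary_indices)) / len(reference) > threshold


# 'CC(C)C(=O)O': atoms 0 to 3 are carbons, of which 0 to 2 are sp3 carbons
# and atom 3 is the carbonyl carbon.
print(rel_fraction_greater_than([0, 1, 2], [0, 1, 2, 3], 0.3))  # 3/4 = 0.75
print(rel_fraction_greater_than([3], [0, 1, 2, 3], 0.3))        # 1/4 = 0.25
```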

Examples

>>> from mlchem.chem.manipulation import PatternRecognition as pr
>>> # Define pattern functions
>>> def check_carbon(target):
...     return pr.Base.check_smarts_pattern(target, smarts_pattern='[C]')
>>> def check_alkyl_carbon(target):
...     return pr.Base.check_smarts_pattern(target, smarts_pattern='[CX4]')
>>> # Check if more than 30% of carbon atoms are alkyl
>>> pr.Base.pattern_rel_fraction_greater_than('CC(C)C(=O)O', check_alkyl_carbon, check_carbon, threshold=0.3)
True

Use Cases
  • “Are more than 30% of carbon atoms alkyl?”

  • “Are more than 50% of ring atoms aromatic?”

Notes

This method uses the number of atoms matched by func2 as the denominator. If func1 requires a second argument (e.g. a filtering function), it will be passed hidden_pattern_function.

class Bonds

Bases: object

static check_aromatic_bonds(target: str | Mol) tuple[bool, list[int], str]

Check for aromatic bonds in a molecule.

This function takes a molecule and performs an aromatic bond pattern check using the SMARTS pattern *:*.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices that match the pattern.

  • str: The SMARTS string representing the matched pattern.

Return type:

tuple

static check_bonds(target: str | Mol) tuple[bool, list[int], str]

Check the bonds in a molecule.

This function takes a molecule and performs a generic bond pattern check using the SMARTS pattern *~*.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices that match the pattern.

  • str: The SMARTS string representing the matched pattern.

Return type:

tuple

static check_cyclic_bonds(target: str | Mol) tuple[bool, list[int], str]

Check for cyclic bonds in a molecule.

This function takes a molecule and performs a cyclic bond pattern check using the SMARTS pattern *@*.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices that match the pattern.

  • str: The SMARTS string representing the matched pattern.

Return type:

tuple

static check_double_bonds(target: str | Mol) tuple[bool, list[int], str]

Check for double bonds in a molecule.

This function takes a molecule and performs a double bond pattern check using the SMARTS pattern *=*.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices that match the pattern.

  • str: The SMARTS string representing the matched pattern.

Return type:

tuple

static check_rotatable_bonds(target: str | Mol) tuple[bool, list[int], str]

Check for rotatable bonds in a molecule.

This function takes a molecule and performs a rotatable bond pattern check using the SMARTS pattern: [!$(*#*)&!D1]-!@[!$(*#*)&!D1].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices that match the pattern.

  • str: The SMARTS string representing the matched pattern.

Return type:

tuple

static check_single_bonds(target: str | Mol) tuple[bool, list[int], str]

Check for single bonds in a molecule.

This function takes a molecule and performs a single bond pattern check using the SMARTS pattern *-*.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices that match the pattern.

  • str: The SMARTS string representing the matched pattern.

Return type:

tuple

static check_triple_bonds(target: str | Mol) tuple[bool, list[int], str]

Check for triple bonds in a molecule.

This function takes a molecule and performs a triple bond pattern check using the SMARTS pattern *#*.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices that match the pattern.

  • str: The SMARTS string representing the matched pattern.

Return type:

tuple

static is_dative_bond(bond: Bond) int

Check if a bond is a dative bond.

This function takes an RDKit bond object and checks whether it is a dative bond.

Parameters:

bond (rdkit.Chem.rdchem.Bond) – An RDKit bond object.

Returns:

1 if the bond is a dative bond, 0 otherwise.

Return type:

int

static is_double_bond(bond: Bond) int

Check if a bond is a double bond.

This function takes an RDKit bond object and checks whether it is a double bond.

Parameters:

bond (rdkit.Chem.rdchem.Bond) – An RDKit bond object.

Returns:

1 if the bond is a double bond, 0 otherwise.

Return type:

int

static is_single_bond(bond: Bond) int

Check if a bond is a single bond.

This function takes an RDKit bond object and checks whether it is a single bond.

Parameters:

bond (rdkit.Chem.rdchem.Bond) – An RDKit bond object.

Returns:

1 if the bond is a single bond, 0 otherwise.

Return type:

int

static is_triple_bond(bond: Bond) int

Check if a bond is a triple bond.

This function takes an RDKit bond object and checks whether it is a triple bond.

Parameters:

bond (rdkit.Chem.rdchem.Bond) – An RDKit bond object.

Returns:

1 if the bond is a triple bond, 0 otherwise.

Return type:

int

class MolPatterns

Bases: object

static alpha_nitroalkane(target: str | Mol) tuple[bool, list[int], str]

Check for alpha-nitroalkane groups in a molecule.

SMARTS pattern used: - [CX4H1,H2,H3]#7D3[#8]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing:
  • bool: Whether the pattern was found.

  • list of int: Atom indices matching the pattern.

  • str: The SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.alpha_nitroalkane("CC(N(=O)=O)C")

static check_acetylenic_carbon(target: str | Mol) tuple[bool, list[int], str]

Check for acetylenic carbon atoms in a molecule.

The SMARTS pattern used to identify acetylenic carbon atoms is [$([CX2]#C)].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_acetylenic_carbon("CC#C")
static check_acyl_halide(target: str | Mol) tuple[bool, list[int], str]

Check for acyl halides in a molecule.

The SMARTS pattern used to identify acyl halides is [CX3](=[OX1])[F,Cl,Br,I].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_acyl_halide("CC(=O)Cl")
static check_alanine(target: str | Mol) tuple[bool, list[int], str]

Check for alanine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH3X4])[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_alanine("CC(C(=O)O)N")
static check_alcohol(target: str | Mol) tuple[bool, list[int], str]

Check for alcohol groups in a molecule.

SMARTS pattern used: - [#6][OX2H]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_alcohol("CCO")
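Most pattern checks in this class document the same three-part return value, so a caller can unpack it in one step. A minimal sketch of that unpacking follows. The tuple assigned to `result` is made-up illustrative data rather than real output, and `summarise` is a hypothetical helper, not an mlchem function.

```python
def summarise(name: str, result: tuple[bool, list[int], str]) -> str:
    """Turn a (found, atom_indices, smarts) tuple into a one-line report."""
    found, atom_indices, smarts = result
    if not found:
        return f"{name}: no match"
    return f"{name}: {len(atom_indices)} atom(s) matched {smarts!r} at {atom_indices}"


# Illustrative values only; a real call to a check would supply this tuple.
result = (True, [1, 2], "[#6][OX2H]")
report = summarise("alcohol", result)
print(report)
```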
static check_aldehyde(target: str | Mol) tuple[bool, list[int], str]

Check for aldehyde groups in a molecule.

The SMARTS pattern used to identify aldehyde groups is [CX3H1](=O)[#6].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_aldehyde("CC=O")
static check_alkali_metals(target: str | Mol) tuple[bool, list[int], str]

Check for alkali metals in a molecule.

SMARTS pattern used: - [Li,Na,K,Rb,Cs,Fr]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any alkali metal was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_alkali_metals("[Na+]")
static check_alkaline_earth_metals(target: str | Mol) tuple[bool, list[int], str]

Check for alkaline earth metals in a molecule.

SMARTS pattern used: - [Be,Mg,Ca,Sr,Ba,Ra]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any alkaline earth metal was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_alkaline_earth_metals("[Mg++]")
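The element-class patterns above are plain SMARTS atom lists: element symbols joined by commas inside one pair of square brackets, which matches any single atom of a listed element. As a small sketch, the helper below (a hypothetical name, not part of mlchem) rebuilds the two documented strings from symbol lists.

```python
def element_class_smarts(symbols: list[str]) -> str:
    """Build a SMARTS atom list such as [Li,Na,K] from element symbols."""
    if not symbols:
        raise ValueError("at least one element symbol is required")
    return "[" + ",".join(symbols) + "]"


alkali = element_class_smarts(["Li", "Na", "K", "Rb", "Cs", "Fr"])
alkaline_earth = element_class_smarts(["Be", "Mg", "Ca", "Sr", "Ba", "Ra"])
print(alkali)
print(alkaline_earth)
```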
static check_alkyl_carbon(target: str | Mol) tuple[bool, list[int], str]

Check for alkyl carbon atoms in a molecule.

The SMARTS pattern used to identify alkyl carbon atoms is [CX4].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_alkyl_carbon("CC")
static check_allenic_carbon(target: str | Mol) tuple[bool, list[int], str]

Check for allenic carbon atoms in a molecule.

The SMARTS pattern used to identify allenic carbon atoms is [$([CX2](=C)=C)].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_allenic_carbon("C=C=C")
static check_alpha_dicarbonyl(target: str | Mol) tuple[bool, list[int], str]

Check for alpha-dicarbonyl groups in a molecule.

The SMARTS pattern used to identify alpha-dicarbonyl groups is O=[#6][#6]=O.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_alpha_dicarbonyl("O=CC=O")
static check_alpha_diketone(target: str | Mol) tuple[bool, list[int], str]

Check for alpha-diketone groups in a molecule.

The SMARTS pattern used to identify alpha-diketone groups is O=[#6D3]([#6])[#6D3]([#6])=O.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_alpha_diketone("CC(=O)C(=O)C")
static check_amide(target: str | Mol) tuple[bool, list[int], str]

Check for amide groups in a molecule.

The SMARTS pattern used to identify amide groups is [#7X3]#6X3[#6].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_amide("CC(=O)NC")
static check_amine(target: str | Mol) tuple[bool, list[int], tuple[str, str, str, str]]

Check for amine groups in a molecule.

This function checks for primary, secondary, tertiary, and quaternary amine groups in the target molecule. The results are combined and returned as a single tuple.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any amine group was found. - A list of atom indices matching any of the amine group patterns. - A tuple of SMARTS strings representing the matched patterns for each type of amine group.

Return type:

tuple[bool, list[int], tuple[str, str, str, str]]

Examples

>>> MolPatterns.check_amine("CN(C)C")
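One way to picture how four per-class results could be combined into the single tuple described above is sketched here. This is an assumption about the combination logic, not the library's code: `merge_results` is a hypothetical helper, the input tuples are invented data, and whether the real method deduplicates or sorts the indices is not stated in this reference.

```python
Result = tuple[bool, list[int], str]


def merge_results(results: list[Result]) -> tuple[bool, list[int], tuple[str, ...]]:
    """Combine several (found, indices, pattern) tuples into one."""
    found = any(r[0] for r in results)
    # Assumption: report each matched atom index once, in ascending order.
    indices = sorted({i for r in results for i in r[1]})
    patterns = tuple(r[2] for r in results)
    return found, indices, patterns


# Invented per-class results for a tertiary amine; the labels stand in for SMARTS.
per_class = [
    (False, [], "PRIMARY"),
    (False, [], "SECONDARY"),
    (True, [0, 1, 2, 3], "TERTIARY"),
    (False, [], "QUATERNARY"),
]
merged = merge_results(per_class)
print(merged)
```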
static check_amine_primary(target: str | Mol) tuple[bool, list[int], tuple[str, str]]

Check for primary amine groups in a molecule.

SMARTS patterns used: - Nitrogen: [#7] - Amine: [#6][#7D1&!$(NC=O)]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> MolPatterns.check_amine_primary("CCN")
static check_amine_quaternary(target: str | Mol) tuple[bool, list[int], tuple[str, str]]

Check for quaternary amine groups in a molecule.

SMARTS patterns used: - Nitrogen: [#7] - Amine: [#6][#7D4+&!$NC=O)([#6])([#6])

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> MolPatterns.check_amine_quaternary("C[N+](C)(C)C")
static check_amine_secondary(target: str | Mol) tuple[bool, list[int], tuple[str, str]]

Check for secondary amine groups in a molecule.

SMARTS patterns used: - Nitrogen: [#7] - Amine: [#6][#7D2&!$(NC=O)][#6]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> MolPatterns.check_amine_secondary("CCNC")
static check_amine_tertiary(target: str | Mol) tuple[bool, list[int], tuple[str, str]]

Check for tertiary amine groups in a molecule.

SMARTS patterns used: - Nitrogen: [#7] - Amine: [#6][#7D3&!$(NC=O)]([#6])[#6]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> MolPatterns.check_amine_tertiary("CN(C)C")
static check_aminoacid(target: str | Mol) tuple[bool, list[int], str]

Check for generic amino acid residues in a molecule, including proline and glycine.

SMARTS patterns used: - Generic amino acid: [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([*])[CX3](=[OX1])[OX2H,OX1-,N] - Glycine: [$([$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H2][CX3](=[OX1])[OX2H,OX1-,N])] - Proline: [$([NX3H,NX4H2+]),$([NX3](C)(C)(C))]1[CX4H]([CH2][CH2][CH2]1)[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any of the patterns were found. - A list of atom indices matching the first found pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_aminoacid("C(C(=O)O)N")
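The return description above says the indices and pattern come from the first pattern that is found. A minimal sketch of such "first match wins" logic follows. `first_match` and the three `fake_*` checks are hypothetical stand-ins, the labels replace real SMARTS strings, and the empty result returned when nothing matches is an assumption rather than documented behaviour.

```python
from typing import Callable

Result = tuple[bool, list[int], str]
Check = Callable[[str], Result]


def first_match(target: str, checks: list[Check]) -> Result:
    """Try each check in order and return the first positive result."""
    for check in checks:
        found, atom_indices, smarts = check(target)
        if found:
            return True, atom_indices, smarts
    # Assumption: nothing matched, so report an empty result.
    return False, [], ""


def fake_generic(target: str) -> Result:
    return False, [], "GENERIC"


def fake_glycine(target: str) -> Result:
    return target == "C(C(=O)O)N", [0, 1, 2, 3, 4], "GLYCINE"


def fake_proline(target: str) -> Result:
    return False, [], "PROLINE"


outcome = first_match("C(C(=O)O)N", [fake_generic, fake_glycine, fake_proline])
print(outcome)
```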
static check_anhydride(target: str | Mol) tuple[bool, list[int], str]

Check for anhydride groups in a molecule.

The SMARTS pattern used to identify anhydride groups is [CX3](=[OX1])[OX2][CX3](=[OX1]).

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_anhydride("CC(=O)OC(=O)C")
static check_arginine(target: str | Mol) tuple[bool, list[int], str]

Check for arginine residues in a molecule.

SMARTS pattern used: - [CX3[OX2])CH1X4[CH2X4][CH2X4][CH2X4][ND2]=CD3[NX3]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_arginine("NC(CCCNC(N)=N)C(=O)O")
static check_asparagine(target: str | Mol) tuple[bool, list[int], str]

Check for asparagine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH2X4][CX3](=[OX1])[NX3H2])[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_asparagine("NC(CC(=O)N)C(=O)O")
static check_aspartate(target: str | Mol) tuple[bool, list[int], str]

Check for aspartate residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH2X4][CX3](=[OX1])[OH0-,OH])[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_aspartate("NC(CC(=O)O)C(=O)O")
static check_azide(target: str | Mol) tuple[bool, list[int], str]

Check for azide groups in a molecule.

SMARTS pattern used: - [$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_azide("CCN=[N+]=[N-]")
static check_azo(target: str | Mol) tuple[bool, list[int], str]

Check for azo groups in a molecule.

SMARTS pattern used: - [#6][#7D2]=[#7D2][#6]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_azo("C1=CC=C(C=C1)N=NC2=CC=CC=C2")
static check_azoxy(target: str | Mol) tuple[bool, list[int], str]

Check for azoxy groups in a molecule.

SMARTS pattern used: - [$([NX2]=[NX3+]([O-])[#6]),$([NX2]=[NX3+0](=[O])[#6])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_azoxy("CC1=NN(O)=CC=C1")
static check_beta_dicarbonyl(target: str | Mol) tuple[bool, list[int], str]

Check for beta-dicarbonyl groups in a molecule.

The SMARTS pattern used to identify beta-dicarbonyl groups is O=[#6][#6][#6]=O.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_beta_dicarbonyl("O=CCC=O")
static check_beta_diketone(target: str | Mol) tuple[bool, list[int], str]

Check for beta-diketone groups in a molecule.

The SMARTS pattern used to identify beta-diketone groups is O=[#6D3]([#6])[#6][#6D3]([#6])=O.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_beta_diketone("CC(=O)CC(=O)C")
static check_boron_group_elements(target: str | Mol) tuple[bool, list[int], str]

Check for boron group elements in a molecule.

SMARTS pattern used: - [B,Al,Ga,In,Tl,Nh]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any boron group element was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_boron_group_elements("B(Cl)(Cl)Cl")
static check_bromine(target: str | Mol) tuple[bool, list[int], str]

Check for bromine atoms in a molecule.

SMARTS pattern used: - [Br]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether bromine was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_bromine("CCBr")
static check_carbamate(target: str | Mol) tuple[bool, list[int], str]

Check for carbamate groups in a molecule.

SMARTS pattern used: - [NX3,NX4+][CX3](=[OX1])[OX2,OX1-]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbamate("CN(C)C(=O)OC")
static check_carbanion(target: str | Mol) tuple[bool, list[int], str]

Check for carbanions in a molecule.

The SMARTS pattern used to identify carbanions is [#6-].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbanion("[CH2-]C")
static check_carbocation(target: str | Mol) tuple[bool, list[int], str]

Check for carbocations in a molecule.

The SMARTS pattern used to identify carbocations is [#6+].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbocation("[CH3+]")
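The boolean in position 0 of each result is enough to drive a simple screening step, for example setting aside structures that contain carbon ions. The sketch below shows one possible shape for that step. `partition` is a hypothetical helper, and the two `fake_*` checks use crude substring tests purely so the example runs without RDKit; they are not chemistry-aware and do not reflect how the documented checks work internally.

```python
from typing import Callable

Result = tuple[bool, list[int], str]
Check = Callable[[str], Result]


def partition(smiles_list: list[str], checks: dict[str, Check]) -> tuple[list[str], list[tuple[str, str]]]:
    """Split SMILES into accepted entries and (smiles, reason) rejections."""
    accepted: list[str] = []
    rejected: list[tuple[str, str]] = []
    for smi in smiles_list:
        # Collect the name of every check whose first element is True.
        reasons = [name for name, check in checks.items() if check(smi)[0]]
        if reasons:
            rejected.append((smi, ", ".join(reasons)))
        else:
            accepted.append(smi)
    return accepted, rejected


def fake_carbanion(smi: str) -> Result:
    # Stand-in only: a substring test, not a substructure search.
    return "[CH2-]" in smi, [], "[#6-]"


def fake_carbocation(smi: str) -> Result:
    # Stand-in only: a substring test, not a substructure search.
    return "[CH3+]" in smi, [], "[#6+]"


accepted, rejected = partition(
    ["CCO", "[CH2-]C", "[CH3+]"],
    {"carbanion": fake_carbanion, "carbocation": fake_carbocation},
)
print(accepted)
print(rejected)
```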
static check_carbon(target: str | Mol) tuple[bool, list[int], str]

Check for carbon atoms in a molecule.

The SMARTS pattern used to identify carbon atoms is [#6].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbon("CCO")
static check_carbon_group_elements(target: str | Mol) tuple[bool, list[int], str]

Check for carbon group elements in a molecule (excluding carbon).

SMARTS pattern used: - [Si,Ge,Sn,Pb,Fl]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any carbon group element was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbon_group_elements("[Si](Cl)(Cl)(Cl)Cl")
static check_carbonate_ester(target: str | Mol) tuple[bool, list[int], tuple[str, str]]

Check for carbonate esters in a molecule.

This function identifies mono- and diesters of carbonic acid in the target molecule.

SMARTS patterns used: - Monoester: [CX3](=[OX1])([OX2H0])[OX2H,OX1H0-1] - Diester: [CX3](=[OX1])([OX2H0])[OX2H0]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> MolPatterns.check_carbonate_ester("COC(=O)OC")
static check_carbonic_acid(target: str | Mol) tuple[bool, list[int], str]

Check for carbonic acid groups in a molecule.

The SMARTS pattern used to identify carbonic acid groups is [CX3](=[OX1])([OX2])[OX2H,OX1H0-1]. This pattern matches both the acid and its conjugate base, but not carbonic acid diesters.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbonic_acid("O=C(O)O")
static check_carbonyl(target: str | Mol) tuple[bool, list[int], str]

Check for carbonyl groups in a molecule.

The SMARTS pattern used to identify carbonyl groups is [$([CX3]=[OX1]),$([CX3+]-[OX1-])].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbonyl("CC(=O)O")
static check_carbosulphone(target: str | Mol) tuple[bool, list[int], str]

Check for carbosulphone groups in a molecule.

SMARTS pattern used: - [$([#16X4](=[OX1])(=[OX1])([#6])[#6]),$([#16X4+2]([OX1-])([OX1-])([#6])[#6])]

Carbosulphones are sulphones with two carbon substituents.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbosulphone("CCS(=O)(=O)C")
static check_carbosulphoxide(target: str | Mol) tuple[bool, list[int], str]

Check for carbosulphoxide groups in a molecule.

SMARTS pattern used: - [$([#16X3](=[OX1])([#6])[#6]),$([#16X3+]([OX1-])([#6])[#6])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carbosulphoxide("CCS(=O)C")
static check_carboxyl(target: str | Mol) tuple[bool, list[int], str]

Check for carboxyl groups in a molecule.

The SMARTS pattern used to identify carboxyl groups is [CX3](=O)[OX1H0-,OX2H1].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_carboxyl("CC(=O)O")
static check_chalcogens(target: str | Mol) tuple[bool, list[int], str]

Check for chalcogens in a molecule (excluding oxygen and sulphur).

SMARTS pattern used: - [Se,Te,Po,Lv]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any chalcogen was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_chalcogens("[Se]=C")
static check_chlorine(target: str | Mol) tuple[bool, list[int], str]

Check for chlorine atoms in a molecule.

SMARTS pattern used: - [Cl]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether chlorine was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_chlorine("CCCl")
static check_cyanamide(target: str | Mol) tuple[bool, list[int], str]

Check for cyanamide groups in a molecule.

SMARTS pattern used: - [NX3][CX2]#[NX1]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_cyanamide("NC#N")
static check_cyanate(target: str | Mol) tuple[bool, list[int], str]

Check for cyanate groups in a molecule.

SMARTS pattern used: - [#8D2][#6D2]#[#7D1]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_cyanate("OC#N")
static check_cysteine(target: str | Mol) tuple[bool, list[int], str]

Check for cysteine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH2X4][SX2H,SX1H0-])[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_cysteine("C(C(C(=O)O)N)S")
static check_delta_dicarbonyl(target: str | Mol) tuple[bool, list[int], str]

Check for delta-dicarbonyl groups in a molecule.

The SMARTS pattern used to identify delta-dicarbonyl groups is O=[#6][#6][#6][#6][#6]=O.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_delta_dicarbonyl("O=CCCCC=O")
static check_diazo(target: str | Mol) tuple[bool, list[int], str]

Check for diazo groups in a molecule.

SMARTS pattern used: - [$([#6]=[N+]=[N-]),$([#6-]-[N+]#[N])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_diazo("C=[N+]=[N-]")
static check_disulphide(target: str | Mol) tuple[bool, list[int], str]

Check for disulphide groups in a molecule.

SMARTS pattern used: - [#16X2H0][#16X2H0]

Disulphides contain an S-S bond, commonly found in biological systems such as cystine.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_disulphide("CSSC")
static check_enamine(target: str | Mol) tuple[bool, list[int], str]

Check for enamine groups in a molecule.

The SMARTS pattern used to identify enamine groups is [NX3][CX3]=[CX3].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_enamine("C=CN")
static check_enol(target: str | Mol) tuple[bool, list[int], str]

Check for enol groups in a molecule. Matches both enol and enolate forms.

SMARTS pattern used: - [$([OX2H][#6X3]=[#6]),$([OX1-][#6X3]=[#6])]

Enols are vinylic alcohols with the structure HOCR’=CR₂, tautomeric with aldehydes or ketones.

Reference

https://doi.org/10.1351/goldbook.E02124

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_enol("C=C(O)C")
static check_ester(target: str | Mol) tuple[bool, list[int], str]

Check for ester groups in a molecule.

The SMARTS pattern used to identify ester groups is [#6][CX3](=O)[OX2H0][#6].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_ester("CC(=O)OC")
static check_ether(target: str | Mol) tuple[bool, list[int], str]

Check for ether groups in a molecule.

The SMARTS pattern used to identify ether groups is [OD2]([#6])[#6].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_ether("COC")
static check_fluorine(target: str | Mol) tuple[bool, list[int], str]

Check for fluorine atoms in a molecule.

SMARTS pattern used: - [F]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether fluorine was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_fluorine("CCF")
static check_gamma_dicarbonyl(target: str | Mol) tuple[bool, list[int], str]

Check for gamma-dicarbonyl groups in a molecule.

The SMARTS pattern used to identify gamma-dicarbonyl groups is O=[#6][#6][#6][#6]=O.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_gamma_dicarbonyl("O=CCCC=O")
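The alpha-, beta-, gamma- and delta-dicarbonyl patterns documented in this class differ only in how many carbon atoms the chain contains, counting both carbonyl carbons: two, three, four and five respectively. The sketch below regenerates the four documented strings from that count. `dicarbonyl_smarts` and `CHAIN_LENGTH` are hypothetical names used only for illustration.

```python
# Number of [#6] atoms in the chain, carbonyl carbons included.
CHAIN_LENGTH = {"alpha": 2, "beta": 3, "gamma": 4, "delta": 5}


def dicarbonyl_smarts(position: str) -> str:
    """Return the dicarbonyl SMARTS for a Greek-letter position label."""
    n_carbons = CHAIN_LENGTH[position]
    return "O=" + "[#6]" * n_carbons + "=O"


patterns = {name: dicarbonyl_smarts(name) for name in CHAIN_LENGTH}
for name, smarts in patterns.items():
    print(name, smarts)
```

Because the chain atoms are written as [#6] with no bond or degree constraints, these patterns accept aldehyde and ketone carbonyls alike, which is what separates them from the stricter diketone checks.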
static check_gamma_diketone(target: str | Mol) tuple[bool, list[int], str]

Check for gamma-diketone groups in a molecule.

The SMARTS pattern used to identify gamma-diketone groups is O=[#6D3]([#6])[#6][#6][#6D3]([#6])=O.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_gamma_diketone("CC(=O)CCC(=O)C")
static check_glutamate(target: str | Mol) tuple[bool, list[int], str]

Check for glutamate residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH2X4][CH2X4][CX3](=[OX1])[OH0-,OH])[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_glutamate("NC(CCC(=O)O)C(=O)O")
static check_glycine(target: str | Mol) tuple[bool, list[int], str]

Check for glycine residues in a molecule.

SMARTS pattern used: - [$([$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H2][CX3](=[OX1])[OX2H,OX1-,N])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_glycine("C(C(=O)O)N")
static check_haloalkane(target: str | Mol) tuple[bool, list[int], str]

Check for haloalkane groups in a molecule.

SMARTS pattern used: - [CX4]-[F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_haloalkane("CCl")
static check_haloalkane_primary(target: str | Mol) tuple[bool, list[int], str]

Check for primary haloalkane groups in a molecule.

SMARTS pattern used: - [CX4H3,CX4H2]-[F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_haloalkane_primary("CCl")
static check_haloalkane_secondary(target: str | Mol) tuple[bool, list[int], str]

Check for secondary haloalkane groups in a molecule.

SMARTS pattern used: - [CX4H1]-[F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_haloalkane_secondary("CC(Cl)CC")
static check_haloalkane_tertiary(target: str | Mol) tuple[bool, list[int], str]

Check for tertiary haloalkane groups in a molecule.

SMARTS pattern used: - [CX4H0]-[F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_haloalkane_tertiary("C(C)(C)(Cl)C")
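
The primary, secondary and tertiary checks above differ only in the hydrogen count required on the halogen-bearing sp3 carbon: H3 or H2, H1, and H0 respectively. A minimal sketch of that mapping in plain Python follows; classify_halo_carbon is a hypothetical helper for illustration and is not part of mlchem.

```python
def classify_halo_carbon(h_count: int) -> str:
    """Map the hydrogen count of a halogen-bearing sp3 carbon to a class.

    Hypothetical helper mirroring the SMARTS patterns documented above:
    [CX4H3,CX4H2] -> primary, [CX4H1] -> secondary, [CX4H0] -> tertiary.
    """
    if not 0 <= h_count <= 3:
        raise ValueError("an sp3 carbon bonded to a halogen carries 0 to 3 hydrogens")
    if h_count >= 2:
        return "primary"
    if h_count == 1:
        return "secondary"
    return "tertiary"


print(classify_halo_carbon(2))  # primary
```

Under these patterns a halomethane carbon (H3) is classed as primary, and any sp3 carbon without hydrogens that bears a halogen, such as the carbon in CCl4, satisfies the tertiary pattern.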
static check_haloalkene(target: str | Mol) tuple[bool, list[int], str]

Check for haloalkene groups in a molecule.

SMARTS pattern used: - [C&!c]=[C&!c][F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_haloalkene("C=CCl")
static check_halogen(target: str | Mol) tuple[bool, list[int], str]

Check for halogen atoms in a molecule.

SMARTS pattern used: - [F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any halogen atom was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_halogen("CCCl")
static check_halogen_carbon(target: str | Mol) tuple[bool, list[int], str]

Check for carbon atoms connected to halogens in a molecule.

SMARTS pattern used: - [#6]~[F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_halogen_carbon("CCCl")
static check_halogen_nitrogen(target: str | Mol) tuple[bool, list[int], str]

Check for nitrogen atoms connected to halogens in a molecule.

SMARTS pattern used: - [#7]~[F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_halogen_nitrogen("NCl")
static check_halogen_oxygen(target: str | Mol) tuple[bool, list[int], str]

Check for oxygen atoms connected to halogens in a molecule.

SMARTS pattern used: - [#8]~[F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_halogen_oxygen("OCl")
static check_hbond_acceptors(target: str | Mol) tuple[bool, list[int], str]

Check for hydrogen bond acceptors in a molecule.

SMARTS pattern used: - H-bond acceptor: [!$([#6,F,Cl,Br,I,o,s,nX3,#7v5,#15v5,#16v4,#16v6,*+1,*+2,*+3])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_hbond_acceptors("CC(=O)O")
static check_hbond_acceptors_higher_than(target: str | Mol, n: int) tuple[bool, list[int], str]

Check for a number of hydrogen bond acceptors strictly higher than a threshold.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • n (int) – Minimum number of hydrogen bond acceptors (exclusive).

Returns:

A tuple containing: - A boolean indicating whether the number of acceptors is greater than n. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_hbond_acceptors_higher_than("CC(=O)O", 1)
static check_hbond_acceptors_lower_than(target: str | Mol, n: int) tuple[bool, list[int], str]

Check for a number of hydrogen bond acceptors strictly lower than a threshold.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • n (int) – Maximum number of hydrogen bond acceptors (exclusive).

Returns:

A tuple containing: - A boolean indicating whether the number of acceptors is less than n. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_hbond_acceptors_lower_than("CC(=O)O", 3)
static check_hbond_donors(target: str | Mol) tuple[bool, list[int], str]

Check for hydrogen bond donors in a molecule.

SMARTS pattern used: - H-bond donor: [!$([#6,H0,-,-2,-3])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_hbond_donors("CC(O)N")
static check_hbond_donors_higher_than(target: str | Mol, n: int) tuple[bool, list[int], str]

Check for a number of hydrogen bond donors strictly higher than a threshold.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • n (int) – Minimum number of hydrogen bond donors (exclusive).

Returns:

A tuple containing: - A boolean indicating whether the number of donors is greater than n. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_hbond_donors_higher_than("CC(O)N", 1)
static check_hbond_donors_lower_than(target: str | Mol, n: int) tuple[bool, list[int], str]

Check for a number of hydrogen bond donors strictly lower than a threshold.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • n (int) – Maximum number of hydrogen bond donors (exclusive).

Returns:

A tuple containing: - A boolean indicating whether the number of donors is less than n. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_hbond_donors_lower_than("CC(O)N", 3)
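
The four threshold checks use strict comparisons, so a molecule with exactly n donors or acceptors satisfies neither the higher_than nor the lower_than variant. A minimal sketch of that comparison, assuming the count is simply the number of matched atom indices (an assumption; the mlchem implementation is not shown here), with hypothetical helper names:

```python
def count_higher_than(indices: list[int], n: int) -> bool:
    """Return True only if strictly more than n atoms matched."""
    return len(indices) > n


def count_lower_than(indices: list[int], n: int) -> bool:
    """Return True only if strictly fewer than n atoms matched."""
    return len(indices) < n


matched = [2, 3]  # illustrative atom indices, not taken from a real molecule
print(count_higher_than(matched, 2), count_lower_than(matched, 2))  # False False
```

To express an inclusive bound such as "at most 5 acceptors" with an exclusive threshold, pass n + 1 to the lower_than variant (here, 6).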
static check_histidine(target: str | Mol) tuple[bool, list[int], str]

Check for histidine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH2X4][#6X3]1:[$([#7X3H+,#7X2H0+0]:[#6X3H]:[#7X3H]),$([#7X3H])]:[#6X3H]:[$([#7X3H+,#7X2H0+0]:[#6X3H]:[#7X3H]),$([#7X3H])]:[#6X3H]1)[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_histidine("NC(Cc1c[nH]cn1)C(=O)O")
static check_hydrazine(target: str | Mol) tuple[bool, list[int], str]

Check for hydrazine groups in a molecule.

SMARTS pattern used: - [NX3][NX3]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_hydrazine("NN")
static check_hydrazone(target: str | Mol) tuple[bool, list[int], str]

Check for hydrazone groups in a molecule.

Hydrazones are compounds with the structure R₂C=NNR₂, derived from aldehydes or ketones by replacing =O with =NNH₂ or analogues. Reference: https://doi.org/10.1351/goldbook.H02884

SMARTS pattern used: - [#7X3][#7D2]=#6D3[#6]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_hydrazone("CC(C)=NN(C)C")
static check_imide(target: str | Mol) tuple[bool, list[int], str]

Check for imide groups in a molecule.

SMARTS pattern used: - [#6][#6D3#7X3]#6D3[#6]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_imide("O=C1NC(=O)CC1")
static check_imine(target: str | Mol) tuple[bool, list[int], str]

Check for imine groups in a molecule.

SMARTS pattern used: - [$([CX3]([#6])[#6]),$([CX3H][#6])]=[$([NX2][#6]),$([NX2H])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_imine("CC=NC")
static check_iminium(target: str | Mol) tuple[bool, list[int], str]

Check for iminium groups in a molecule.

SMARTS pattern used: - [NX3+]=[CX3]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_iminium("C=[N+](C)C")
static check_iodine(target: str | Mol) tuple[bool, list[int], str]

Check for iodine atoms in a molecule.

SMARTS pattern used: - [I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether iodine was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_iodine("CCI")
static check_isocyanate(target: str | Mol) tuple[bool, list[int], str]

Check for isocyanate groups in a molecule.

SMARTS pattern used: - [#7D2]=[#6D2]=[#8D1]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_isocyanate("CN=C=O")
static check_isoleucine(target: str | Mol) tuple[bool, list[int], str]

Check for isoleucine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CHX4]([CH3X4])[CH2X4][CH3X4])[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_isoleucine("CCC(C)C(N)C(=O)O")
static check_isonitrile(target: str | Mol) tuple[bool, list[int], str]

Check for isonitrile groups in a molecule.

SMARTS pattern used: - [CX1-]#[NX2+]

Isonitriles are isomeric forms of hydrocyanic acid and its derivatives (RN≡C). Reference: https://doi.org/10.1351/goldbook.I03270

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_isonitrile("C[N+]#[C-]")
static check_isothiocyanate(target: str | Mol) tuple[bool, list[int], str]

Check for isothiocyanate groups in a molecule.

SMARTS pattern used: - [#7D2]=[#6]=[#16D1]

Isothiocyanates are sulphur analogues of isocyanates (RN=C=S). Reference: https://doi.org/10.1351/goldbook.I03320

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_isothiocyanate("CN=C=S")
static check_ketone(target: str | Mol) tuple[bool, list[int], str]

Check for ketone groups in a molecule.

The SMARTS pattern used to identify ketone groups is [#6][CX3](=O)[#6].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_ketone("CC(=O)C")
static check_leucine(target: str | Mol) tuple[bool, list[int], str]

Check for leucine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH2X4][CHX4]([CH3X4])[CH3X4])[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_leucine("CC(C)CC(N)C(=O)O")
static check_lysine(target: str | Mol) tuple[bool, list[int], str]

Check for lysine residues in a molecule.

SMARTS pattern used: - CX3([OX2])CH1X4[CH2X4][CH2X4][CH2X4]CD3[CD1]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_lysine("NCCCCC(N)C(=O)O")
static check_methionine(target: str | Mol) tuple[bool, list[int], str]

Check for methionine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH2X4][CH2X4][SX2][CH3X4])[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_methionine("CSCCC(N)C(=O)O")
static check_n_oxide(target: str | Mol) tuple[bool, list[int], str]

Check for N-oxide groups in a molecule.

SMARTS pattern used: - [$([#7X3H1,#7X3&!#7X3H2,#7X3H0,#7X4+][#8]); !$(#7~[O]); !$([#7]=[#7])]

N-oxides are derived from tertiary amines by attachment of an oxygen atom to nitrogen. Reference: https://doi.org/10.1351/goldbook.A00273

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_n_oxide("C[N+](C)(C)[O-]")
static check_neg_charge_1(target: str | Mol) tuple[bool, list[int], str]

Check for negatively charged atoms in a molecule.

SMARTS pattern used: - Negative charge: [-]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_neg_charge_1("[O-]C(=O)C")
static check_neg_charge_2(target: str | Mol) tuple[bool, list[int], str]

Check for two negatively charged atoms in a molecule.

SMARTS pattern used: - Two negative charges: [-].[-]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_neg_charge_2("[O-].[O-]C(=O)C")
static check_neg_charge_3(target: str | Mol) tuple[bool, list[int], str]

Check for three negatively charged atoms in a molecule.

SMARTS pattern used: - Three negative charges: [-].[-].[-]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_neg_charge_3("[O-].[O-].[O-]C(=O)C")
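
In Daylight-style SMARTS the dot joins query atoms that need not be bonded to each other, and each query atom must map to a different atom of the target, so [-].[-].[-] asks for at least three atoms carrying a negative charge. A minimal sketch of that counting logic in plain Python, assuming [-] is read as a formal charge of exactly -1 and using a hypothetical helper that takes a list of per-atom formal charges rather than a molecule:

```python
def has_negative_atoms(formal_charges: list[int], n: int) -> bool:
    """Return True if at least n atoms carry a formal charge of exactly -1.

    Hypothetical helper; it mimics what a pattern of n dot-separated [-]
    atoms requires, without parsing any molecule.
    """
    return sum(1 for charge in formal_charges if charge == -1) >= n


# Per-atom formal charges for "[O-].[O-]C(=O)C", listed in atom order.
charges = [-1, -1, 0, 0, 0]
print(has_negative_atoms(charges, 2), has_negative_atoms(charges, 3))  # True False
```

Because the requirement is "at least n", an input that satisfies the three-charge check also satisfies the two-charge and one-charge checks.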
static check_nitrate(target: str | Mol) tuple[bool, list[int], str]

Check for nitrate groups in a molecule.

SMARTS pattern used: - [$([NX3](=[OX1])(=[OX1])O),$([NX3+]([OX1-])(=[OX1])O)]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_nitrate("CO[N+](=O)[O-]")
static check_nitrile(target: str | Mol) tuple[bool, list[int], str]

Check for nitrile groups in a molecule.

SMARTS pattern used: - [NX1]#[CX2]

Nitriles are compounds with the structure RC≡N, i.e., C-substituted derivatives of hydrocyanic acid. Reference: https://doi.org/10.1351/goldbook.N04151

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_nitrile("CC#N")
static check_nitro(target: str | Mol) tuple[bool, list[int], str]

Check for nitro groups in a molecule.

SMARTS pattern used: - [$([NX3](=O)=O),$([NX3+](=O)[O-])][!#8]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_nitro("C[N+](=O)[O-]")
static check_nitrogen(target: str | Mol) tuple[bool, list[int], str]

Check for nitrogen atoms in a molecule.

The SMARTS pattern used to identify nitrogen atoms is [#7].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_nitrogen("CN")
static check_nitrogen_group_elements(target: str | Mol) tuple[bool, list[int], str]

Check for nitrogen group elements in a molecule (excluding nitrogen and phosphorus).

SMARTS pattern used: - [As,Sb,Bi]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any nitrogen group element was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_nitrogen_group_elements("[AsH3]")
static check_nitroso(target: str | Mol) tuple[bool, list[int], str]

Check for nitroso groups in a molecule.

SMARTS pattern used: - [NX2]=[OX1]

Nitroso compounds carry the nitroso group (-NO) attached to carbon or another element. Reference: https://doi.org/10.1351/goldbook.N04169

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_nitroso("C1=CC=CC=C1N=O")
static check_noble_gases(target: str | Mol) tuple[bool, list[int], str]

Check for noble gases in a molecule.

SMARTS pattern used: - [He,Ne,Ar,Kr,Xe,Rn]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any noble gas was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_noble_gases("[Ar]")
static check_oxohalide(target: str | Mol) tuple[bool, list[int], str]

Check for oxohalide groups in a molecule.

SMARTS pattern used: - [#8]=[*H0]~[F,Cl,Br,I]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_oxohalide("CC(=O)Cl")
static check_oxygen(target: str | Mol) tuple[bool, list[int], str]

Check for oxygen atoms in a molecule.

SMARTS pattern used: - [#8]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_oxygen("CCO")
static check_pattern_aliphatic(target: str | Mol, pattern_function) tuple[bool, list[int], tuple[str, str]]

Check if a pattern is aliphatic in a molecule.

This function takes a molecule and a pattern function, and checks whether the pattern is aliphatic in the molecule.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • pattern_function (Callable) – A function that takes a molecule and returns a tuple of (bool, list of atom indices, SMARTS string).

Returns:

A tuple containing: - A boolean indicating whether the pattern is aliphatic. - A list of atom indices matching the aliphatic pattern. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> MolPatterns.check_pattern_aliphatic("CC", some_function)
static check_pattern_aromatic(target: str | Mol, pattern_function) tuple[bool, list[int], tuple[str, str]]

Check if a pattern is aromatic in a molecule.

This function takes a molecule and a pattern function, and checks whether the pattern is aromatic in the molecule.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • pattern_function (Callable) – A function that takes a molecule and returns a tuple of (bool, list of atom indices, SMARTS string).

Returns:

A tuple containing: - A boolean indicating whether the pattern is aromatic. - A list of atom indices matching the aromatic pattern. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> MolPatterns.check_pattern_aromatic("c1ccccc1", some_function)
static check_pattern_aromatic_substituent(target: str | Mol, pattern_function) tuple[bool, list[int], tuple[str, str]]

Check if a pattern is an aromatic substituent in a molecule.

This function takes a molecule and a pattern function, and checks whether the pattern is an aromatic substituent in the molecule.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • pattern_function (Callable) – A function that takes a molecule and returns a tuple of (bool, list of atom indices, SMARTS string).

Returns:

A tuple containing: - A boolean indicating whether the pattern is an aromatic substituent. - A list of atom indices matching the substituent pattern. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> MolPatterns.check_pattern_aromatic_substituent("c1ccccc1C", some_function)
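
The three check_pattern_* helpers take a pattern_function whose return value has the same (bool, list of atom indices, SMARTS string) shape as the check_* methods documented here, so passing one of those methods in place of some_function is presumably the intended use (an assumption; the source does not say so explicitly). A minimal sketch of that callable contract in plain Python, using a toy stand-in that treats every character of the input string as one atom; toy_check_oxygen and apply_pattern are hypothetical names, not mlchem functions:

```python
from typing import Callable

# Shape of the value every pattern function is documented to return.
PatternResult = tuple[bool, list[int], str]
PatternFunction = Callable[[str], PatternResult]


def toy_check_oxygen(target: str) -> PatternResult:
    """Toy stand-in for a check_* method.

    Treats each character of `target` as one atom, which only holds for
    simple strings such as "CCO"; real SMILES parsing needs RDKit.
    """
    indices = [i for i, symbol in enumerate(target) if symbol == "O"]
    return bool(indices), indices, "[#8]"


def apply_pattern(target: str, pattern_function: PatternFunction) -> PatternResult:
    """Call the supplied pattern function and pass its result through."""
    return pattern_function(target)


found, atom_indices, smarts = apply_pattern("CCO", toy_check_oxygen)
print(found, atom_indices, smarts)  # True [2] [#8]
```

Note that the documented helpers return a tuple of two SMARTS strings as their third element, whereas the pattern function itself returns a single string.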
static check_phenylalanine(target: str | Mol) tuple[bool, list[int], str]

Check for phenylalanine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$([NX3H](C)(C))][CX4H]([CH2X4][cX3]1[cX3H][cX3H][cX3H][cX3H][cX3H]1)[CX3](=[OX1])[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_phenylalanine("NC(CC1=CC=CC=C1)C(=O)O")
static check_phosphoric_acid(target: str | Mol) tuple[bool, list[int], str]

Check for phosphoric acid groups in a molecule.

SMARTS pattern used: - [$(P(=[OX1])([$([OX2H]),$([OX1-]),$([OX2]P)])([$([OX2H]),$([OX1-]),$([OX2]P)])[$([OX2H]),$([OX1-]),$([OX2]P)]),$([P+]([OX1-])([$([OX2H]),$([OX1-]),$([OX2]P)])([$([OX2H]),$([OX1-]),$([OX2]P)])[$([OX2H]),$([OX1-]),$([OX2]P)])]

This pattern matches orthophosphoric acid and polyphosphoric acid anhydrides, but not mono- or di-esters of monophosphoric acid.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_phosphoric_acid("OP(=O)(O)O")
static check_phosphoric_ester(target: str | Mol) tuple[bool, list[int], str]

Check for phosphoric ester groups in a molecule.

SMARTS pattern used: - [$(P(=[OX1])([OX2][#6])([$([OX2H]),$([OX1-]),$([OX2][#6])])[$([OX2H]),$([OX1-]),$([OX2][#6]),$([OX2]P)]),$([P+]([OX1-])([OX2][#6])([$([OX2H]),$([OX1-]),$([OX2][#6])])[$([OX2H]),$([OX1-]),$([OX2][#6]),$([OX2]P)])]

This pattern matches both neutral and charged forms of phosphoric esters, but not non-ester phosphoric acids.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_phosphoric_ester("COP(=O)(OC)OC")
static check_phosphorus(target: str | Mol) tuple[bool, list[int], str]

Check for phosphorus atoms in a molecule.

SMARTS pattern used: - [P]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_phosphorus("CP(=O)(O)O")
static check_pos_charge_1(target: str | Mol) tuple[bool, list[int], str]

Check for positively charged atoms in a molecule.

SMARTS pattern used: - [+]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any positive charge was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_pos_charge_1("C[N+](C)(C)C")
static check_pos_charge_2(target: str | Mol) tuple[bool, list[int], str]

Check for two positively charged atoms in a molecule.

SMARTS pattern used: - [+].[+]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether two positive charges were found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_pos_charge_2("[Na+].[Na+]")
static check_pos_charge_3(target: str | Mol) tuple[bool, list[int], str]

Check for three positively charged atoms in a molecule.

SMARTS pattern used: - [+].[+].[+]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether three positive charges were found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_pos_charge_3("[Na+].[Na+].[K+]")
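
The three positive-charge checks differ only in how many `[+]` terms appear in the dot-separated SMARTS pattern. A minimal sketch of that construction follows. `positive_charge_smarts` is a hypothetical helper, not part of mlchem, and it only builds the pattern string without running any substructure match.

```python
def positive_charge_smarts(n_charges: int) -> str:
    """Join n_charges copies of '[+]' with '.', so the terms need not be bonded."""
    if n_charges < 1:
        raise ValueError("n_charges must be at least 1")
    return ".".join(["[+]"] * n_charges)


# Reproduces the patterns quoted for check_pos_charge_1, _2 and _3.
for n in (1, 2, 3):
    print(n, positive_charge_smarts(n))
```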
static check_proline(target: str | Mol) tuple[bool, list[int], str]

Check for proline residues in a molecule.

SMARTS pattern used: - [$([NX3H,NX4H2+]),$(NX3(C)(C))]1CX4HCX3[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_proline("C1CC(NC1)C(=O)O")
static check_serine(target: str | Mol) tuple[bool, list[int], str]

Check for serine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4HCX3[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_serine("NC(CO)C(=O)O")
static check_sulphamic_acid(target: str | Mol) tuple[bool, list[int], str]

Check for sulphamic acid groups in a molecule.

SMARTS pattern used: - $([#16X4(=[OX1])(=[OX1])[OX2H,OX1H0-]), $(#16X4+2([OX1-])([OX1-])[OX2H,OX1H0-])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphamic_acid("NS(=O)(=O)O")
static check_sulphamic_ester(target: str | Mol) tuple[bool, list[int], str]

Check for sulphamic ester groups in a molecule.

SMARTS pattern used: - $([#16X4(=[OX1])(=[OX1])[OX2][#6]), $(#16X4+2([OX1-])([OX1-])[OX2][#6])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphamic_ester("NS(=O)(=O)OC")
static check_sulphenic_acid(target: str | Mol) tuple[bool, list[int], str]

Check for sulphenic acid groups in a molecule.

SMARTS pattern used: - [#16X2][OX2H,OX1H0-]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphenic_acid("CSO")
static check_sulphenic_ester(target: str | Mol) tuple[bool, list[int], str]

Check for sulphenic ester groups in a molecule.

SMARTS pattern used: - [#16X2][OX2H0][#6]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphenic_ester("CSOC")
static check_sulphide(target: str | Mol) tuple[bool, list[int], str]

Check for sulphide groups in a molecule.

SMARTS pattern used: - [#6][#16D2][#6]

Sulphides are compounds with the structure R-S-R’ (R ≠ H), also known as thioethers.

Reference

https://doi.org/10.1351/goldbook.S06102

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphide("CCSC")
static check_sulphinic_acid(target: str | Mol) tuple[bool, list[int], str]

Check for sulphinic acid groups in a molecule.

SMARTS pattern used: - [$([#6]#16X3[OX2H,OX1H0-]), $([#6]#16X3+[OX2H,OX1H0-])]

Sulphinic acids (RS(=O)OH) and their conjugate bases (sulphinates) are included.

Reference

https://doi.org/10.1351/goldbook.S06109

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphinic_acid("CS(=O)O")
static check_sulphinic_ester(target: str | Mol) tuple[bool, list[int], str]

Check for sulphinic ester groups in a molecule.

SMARTS pattern used: - [$([#6]#16X3[OX2][#6]), $([#6]#16X3+[OX2][#6])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphinic_ester("CS(=O)OC")
static check_sulphonamide(target: str | Mol) tuple[bool, list[int], str]

Check for sulphonamide groups in a molecule.

SMARTS pattern used: - $([SX4(=[OX1])([!O])[NX3]), $(SX4+2([OX1-])([!O])[NX3])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphonamide("CS(=O)(=O)N")
static check_sulphone(target: str | Mol) tuple[bool, list[int], str]

Check for sulphone groups in a molecule.

SMARTS pattern used: - $([#16X4=[OX1]), $(#16X4+2[OX1-])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphone("CS(=O)(=O)C")
static check_sulphonic_acid(target: str | Mol) tuple[bool, list[int], str]

Check for sulphonic acid groups in a molecule.

SMARTS pattern used: - $([#16X4(=[OX1])[OX2H,OX1H0-]), $([#16X42([OX1-])[OX2H,OX1H0-])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphonic_acid("CS(=O)(=O)O")
static check_sulphonic_ester(target: str | Mol) tuple[bool, list[int], str]

Check for sulphonic ester groups in a molecule.

SMARTS pattern used: - $([#16X4(=[OX1])[OX2H0]), $(#16X4+2([OX1-])[OX2H0])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphonic_ester("CS(=O)(=O)OC")
static check_sulphoxide(target: str | Mol) tuple[bool, list[int], str]

Check for sulphoxide groups in a molecule.

SMARTS pattern used: - [$([#16X3]=[OX1]), $([#16X3+][OX1-])]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphoxide("CS(=O)C")
static check_sulphur(target: str | Mol) tuple[bool, list[int], str]

Check for sulphur atoms in a molecule.

SMARTS pattern used: - [#16]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphur("CCS")
static check_sulphuric_acid(target: str | Mol) tuple[bool, list[int], str]

Check for sulphuric acid groups in a molecule.

SMARTS pattern used: - $([SX4(=[OX1])([OX2H1,OX1-])[OX2H1,OX1-])]

Matches both acid and conjugate base forms.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphuric_acid("OS(=O)(=O)O")
static check_sulphuric_ester(target: str | Mol) tuple[bool, list[int], str]

Check for sulphuric ester groups in a molecule.

SMARTS pattern used: - $([SX4(=[OX1])([OX2H1])[OX2H0][#6]), $(SX4(=[OX1])([OX2H0][#6])[OX2H0][#6])]

Matches both mono- and di-esters of sulphuric acid.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_sulphuric_ester("COS(=O)(=O)OC")
static check_thioaldehyde(target: str | Mol) tuple[bool, list[int], str]

Check for thioaldehyde groups in a molecule.

SMARTS pattern used: - [#6][#6X3H1]=[#16X1]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thioaldehyde("CC=S")
static check_thioanhydride(target: str | Mol) tuple[bool, list[int], str]

Check for thioanhydride groups in a molecule.

SMARTS pattern used: - CX3[SX2]CX3

Thioanhydrides are compounds with the structure acyl-S-acyl, also called diacylsulfanes.

Reference

https://doi.org/10.1351/goldbook.T06351

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thioanhydride("CC(=O)SC(=O)C")
static check_thiocarbamate(target: str | Mol) tuple[bool, list[int], str]

Check for thiocarbamate groups in a molecule.

SMARTS pattern used: - [$([#6][#8D2]CD3[#7X3,#7X4+]), $([#6][#16D2]CD3[#7X3,#7X4+])]

Matches both O- and S-organyl thiocarbamates and their conjugate bases.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thiocarbamate("COC(=S)NC")
static check_thiocarbonyl(target: str | Mol) tuple[bool, list[int], str]

Check for thiocarbonyl groups in a molecule.

SMARTS pattern used: - [#6X3]=[#16X1]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thiocarbonyl("CC(=S)C")
static check_thiocarboxylic(target: str | Mol) tuple[bool, list[int], str]

Check for thiocarboxylic groups in a molecule.

SMARTS pattern used: - $([$([CX3[OX2H1]),$(CX3[OX1-])]), $($([CX3[SX2H1]),$(CX3[SX1-])]), $($([CX3[SX2H1]),$(CX3[SX1-])])]

Thiocarboxylic acids are compounds where one or both oxygens of a carboxy group are replaced by sulphur.

Reference

https://doi.org/10.1351/goldbook.T06352

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thiocarboxylic("CC(=S)S")
static check_thiocyanate(target: str | Mol) tuple[bool, list[int], str]

Check for thiocyanate groups in a molecule.

SMARTS pattern used: - [#16D2]#[#7]

Thiocyanates are salts and esters of thiocyanic acid (HSC≡N), e.g. methyl thiocyanate (CH₃SC≡N).

Reference

https://doi.org/10.1351/goldbook.T06353

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thiocyanate("CSC#N")
static check_thioester(target: str | Mol) tuple[bool, list[int], str]

Check for thioester groups in a molecule.

SMARTS pattern used: - [$(S([#6])CX3),$(O([#6])CX3),$(#16CX3)]

Matches mono- and di-thioesters.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thioester("CC(=O)SC")
static check_thioketone(target: str | Mol) tuple[bool, list[int], str]

Check for thioketone groups in a molecule.

SMARTS pattern used: - [#6]#6D3=[#16X1]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thioketone("CC(=S)C")
static check_thiol(target: str | Mol) tuple[bool, list[int], str]

Check for thiol groups in a molecule.

SMARTS pattern used: - [#16X2H]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_thiol("CCS")
static check_threonine(target: str | Mol) tuple[bool, list[int], str]

Check for threonine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4H[OX2H])CX3[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_threonine("CC(O)C(N)C(=O)O")
static check_transition_metals(target: str | Mol) tuple[bool, list[int], str]

Check for transition metals in a molecule.

SMARTS pattern used: - [Sc,Ti,V,Cr,Mn,Fe,Co,Ni,Cu,Zn,Y,Zr,Nb,Mo,Tc,Ru,Rh,Pd,Ag,Cd,La,Ce,Pr,Nd,Pm,Sm,Eu,Gd,Tb,Dy,Ho,Er,Tm,Yb,Lu,Ac,Th,Pa,U,Np,Pu,Am,Cm,Bk,Cf,Es,Fm,Md,No,Lr,Rf,Db,Sg,Bh,Hs,Mt,Ds,Rg,Cn]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether any transition metal was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_transition_metals("[Fe++]")
static check_tryptophan(target: str | Mol) tuple[bool, list[int], str]

Check for tryptophan residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4HCX3[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_tryptophan("NC(Cc1c[nH]c2ccccc12)C(=O)O")
static check_tyrosine(target: str | Mol) tuple[bool, list[int], str]

Check for tyrosine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4H[cX3H][cX3H]1)CX3[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_tyrosine("NC(Cc1ccc(O)cc1)C(=O)O")
static check_unbranched_rotatable_carbons(target: str | Mol, n_units: int) tuple[bool, list[int], str]

Check for unbranched rotatable carbon chains in a molecule.

SMARTS pattern used: - Unbranched rotatable carbon: [R0;CD2]- repeated n_units times.

Matches: Specifically carbon atoms (C) that are non-cyclic (R0) and have two connections (D2).

Use case: More specific than check_unbranched_rotatable_chain: it only detects unbranched chains made of aliphatic carbon atoms.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • n_units (int) – Number of repeated rotatable carbon units.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_unbranched_rotatable_carbons("CCCC", 3)
static check_unbranched_rotatable_chain(target: str | Mol, n_units: int) tuple[bool, list[int], str]

Check for unbranched rotatable chains in a molecule.

SMARTS pattern used: - Unbranched rotatable chain: [R0;D2]- repeated n_units times.

Matches: Any non-cyclic, aliphatic atom with two connections (degree 2), regardless of element type.

Use case: More general than check_unbranched_rotatable_carbons: it detects unbranched, rotatable chains of any atoms (not just carbon) that are not part of a ring.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • n_units (int) – Number of repeated rotatable units.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_unbranched_rotatable_chain("CCCC", 3)
static check_unbranched_structure(target: str | Mol, n_units: int) tuple[bool, list[int], str]

Check for unbranched chains in a molecule.

SMARTS pattern used: - Unbranched chain: [R0;D2]~ repeated n_units times

Matches: Any non-cyclic atom with two connections (degree 2), joined to its neighbours by any bond type (~), forming a linear, unbranched chain.

Use case: The most general of the unbranched checks: it detects chains regardless of bond type (e.g. C=C-C≡C) and is not limited to rotatable bonds.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • n_units (int) – Number of repeated unbranched units.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_unbranched_structure("CC=CCCC", 4)
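
The three unbranched-chain checks are described as one atom primitive and one bond symbol repeated n_units times. The sketch below shows one way such a pattern string could be assembled. `CHAIN_UNITS` and `build_chain_smarts` are hypothetical names, and joining the units with the bond symbol (so that no bond is left dangling at the end) is an assumption rather than mlchem's confirmed behaviour.

```python
# (atom primitive, bond symbol) for each of the three checks documented above.
CHAIN_UNITS = {
    "rotatable_carbons": ("[R0;CD2]", "-"),  # acyclic, degree-2 aliphatic carbon
    "rotatable_chain": ("[R0;D2]", "-"),     # acyclic, degree-2 atom of any element
    "structure": ("[R0;D2]", "~"),           # same atoms, any bond type
}


def build_chain_smarts(kind: str, n_units: int) -> str:
    """Repeat the atom primitive n_units times, joined by the bond symbol."""
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    atom, bond = CHAIN_UNITS[kind]
    # Assumption: units are joined by the bond symbol, with no trailing bond.
    return bond.join([atom] * n_units)


for kind in CHAIN_UNITS:
    print(kind, build_chain_smarts(kind, 3))
```

Keeping the atom primitive and the bond symbol in one table makes the difference between the three checks visible at a glance: only the element restriction and the bond symbol change.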
static check_valine(target: str | Mol) tuple[bool, list[int], str]

Check for valine residues in a molecule.

SMARTS pattern used: - [$([NX3H2,NX4H3+]),$(NX3H(C))][CX4H](3X4])CX3[OX2H,OX1-,N]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_valine("CC(C)C(N)C(=O)O")
static check_vinylic_carbon(target: str | Mol) tuple[bool, list[int], str]

Check for vinylic carbon atoms in a molecule.

The SMARTS pattern used to identify vinylic carbon atoms is [$([CX3]=[CX3])].

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_vinylic_carbon("C=CC")
static check_zwitterion(target: str | Mol) tuple[bool, list[int], str]

Check for zwitterions in a molecule.

SMARTS pattern used: - Zwitterion: multiple patterns with oppositely charged atoms separated by up to ten bonds.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> MolPatterns.check_zwitterion("C[N+](C)(C)CC(=O)[O-]")
class Rings

Bases: object

static check_heterocycle(target: str | Mol) tuple[bool, list[int], str]

Check for heterocycles in a molecule.

This method uses RDKit’s generic matcher shortcut CHC to identify any heterocyclic ring system.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether a heterocycle was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_heterocycle("c1ccncc1")
static check_heterocycle_N(target: str | Mol) tuple[bool, list[int], str]

Check for nitrogen-containing heterocycles in a molecule.

SMARTS pattern used: - Nitrogen in ring: [#7R]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether a nitrogen heterocycle was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_heterocycle_N("c1ccncc1")
static check_heterocycle_O(target: str | Mol) tuple[bool, list[int], str]

Check for oxygen-containing heterocycles in a molecule.

SMARTS pattern used: - Oxygen in ring: [#8R]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether an oxygen heterocycle was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_heterocycle_O("C1COC1")
static check_heterocycle_S(target: str | Mol) tuple[bool, list[int], str]

Check for sulphur-containing heterocycles in a molecule.

SMARTS pattern used: - Sulphur in ring: [#16R]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether a sulphur heterocycle was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_heterocycle_S("C1CSC1")
static check_macrocycle(target: str | Mol) tuple[bool, list[int], str]

Check for macrocycles in a molecule.

SMARTS pattern used: - Macrocycle: [r;!r3;!r4;!r5;!r6;!r7]

This pattern matches ring atoms that are not in any 3- to 7-membered ring, so only atoms whose smallest ring has 8 or more members are reported.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_macrocycle("C1CCCCCCCCCCCC1")
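
The macrocycle pattern `[r;!r3;!r4;!r5;!r6;!r7]` reads as "a ring atom that is not r3, not r4, not r5, not r6 and not r7". The sketch below restates that predicate in plain Python, assuming (as RDKit does) that an atom's `r` value is the size of the smallest ring containing it. `is_macrocycle_atom` is a hypothetical helper and does not call mlchem or RDKit.

```python
from typing import Optional

# Ring sizes explicitly excluded by the !r3 ... !r7 terms.
SMALL_RING_SIZES = frozenset({3, 4, 5, 6, 7})


def is_macrocycle_atom(smallest_ring_size: Optional[int]) -> bool:
    """Return True for a ring atom whose smallest ring is outside sizes 3-7.

    None stands for an atom that is not in any ring (it fails the leading 'r').
    """
    if smallest_ring_size is None:
        return False
    return smallest_ring_size not in SMALL_RING_SIZES


print(is_macrocycle_atom(13))    # atom of a 13-membered ring
print(is_macrocycle_atom(6))     # atom of a 6-membered ring
print(is_macrocycle_atom(None))  # acyclic atom
```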
static check_meta_substituted_aromatic_r6(target: str | Mol) tuple[bool, list[int], str]

Check for meta-substituted aromatic 6-membered rings.

SMARTS pattern used: - Meta substitution: a1(-[*&!#1&!a&!R])aa(-[*&!#1&!a&!R])aaa1

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_meta_substituted_aromatic_r6("c1(C)cc(C)ccc1")
static check_ortho_substituted_aromatic_r6(target: str | Mol) tuple[bool, list[int], str]

Check for ortho-substituted aromatic 6-membered rings.

SMARTS pattern used: - Ortho substitution: a1(-[*&!#1&!a&!R])a(-[*&!#1&!a&!R])aaaa1

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_ortho_substituted_aromatic_r6("c1(C)c(C)cccc1")
static check_para_substituted_aromatic_r6(target: str | Mol) tuple[bool, list[int], str]

Check for para-substituted aromatic 6-membered rings.

SMARTS pattern used: - Para substitution: a1(-[*&!#1&!a&!R])aaa(-[*&!#1&!a&!R])aa1

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_para_substituted_aromatic_r6("c1(C)ccc(C)cc1")
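
The ortho, meta and para patterns share one substituent term and differ only in how many aromatic atoms separate the two substituted positions. The sketch below rebuilds the three documented SMARTS strings from that single parameter. `substituted_r6_smarts` and `RING_GAP` are hypothetical names used for illustration only.

```python
# Substituent term quoted in the patterns above:
# a single bond to an atom that is not hydrogen, not aromatic and not in a ring.
SUBSTITUENT = "-[*&!#1&!a&!R]"

# Number of ring bonds between the two substituted aromatic atoms.
RING_GAP = {"ortho": 1, "meta": 2, "para": 3}


def substituted_r6_smarts(position: str) -> str:
    """Build the six-membered aromatic ring pattern for the given position."""
    gap = RING_GAP[position]
    # First atom opens ring closure 1; the last of the 'gap' atoms carries
    # the second substituent; the remaining atoms close the six-membered ring.
    return f"a1({SUBSTITUENT}){'a' * gap}({SUBSTITUENT}){'a' * (5 - gap)}1"


for position in RING_GAP:
    print(position, substituted_r6_smarts(position))
```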
static check_pattern_cyclic(target: str | Mol, pattern_function: Callable) tuple[bool, list[int], tuple[str, str]]

Check whether a given pattern overlaps with ring atoms in a molecule.

This method checks if the atoms matched by a custom pattern function intersect with atoms that are part of a ring.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • pattern_function (Callable) – A function that returns a SMARTS match result for a specific pattern.

Returns:

A tuple containing: - A boolean indicating whether the pattern overlaps with ring atoms. - A list of atom indices in the intersection. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> Rings.check_pattern_cyclic("C1CCCOC1", MolPatterns.check_oxygen)
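
The overlap test described for check_pattern_cyclic amounts to intersecting two sets of atom indices. The sketch below illustrates that step in plain Python. `ring_overlap` is a hypothetical helper, and the index lists are written by hand for "C1CCCOC1" (atoms 0 to 5 all in the ring, oxygen at index 4) rather than obtained from mlchem.

```python
from typing import Iterable, List, Tuple


def ring_overlap(pattern_atoms: Iterable[int],
                 ring_atoms: Iterable[int]) -> Tuple[bool, List[int]]:
    """Return whether any pattern atom is a ring atom, plus the shared indices."""
    overlap = sorted(set(pattern_atoms) & set(ring_atoms))
    return bool(overlap), overlap


# Hand-written indices for tetrahydropyran, "C1CCCOC1".
oxygen_atoms = [4]               # atoms an oxygen check would report
ring_atoms = [0, 1, 2, 3, 4, 5]  # atoms a ring check would report

print(ring_overlap(oxygen_atoms, ring_atoms))
```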
static check_pattern_cyclic_substituent(target: str | Mol, pattern_function: Callable) tuple[bool, list[int], tuple[str, str]]

Check whether a given pattern overlaps with ring atoms that are connected to non-ring atoms.

This identifies ring atoms that are part of a substituent group (i.e. connected to atoms outside the ring via non-ring bonds).

SMARTS pattern used: - Ring atom with non-ring bond: [R]!@[*]

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • pattern_function (Callable) – A function that returns a SMARTS match result for a specific pattern.

Returns:

A tuple containing: - A boolean indicating whether the pattern overlaps with ring substituents. - A list of atom indices in the intersection. - A tuple of SMARTS strings representing the matched patterns.

Return type:

tuple[bool, list[int], tuple[str, str]]

Examples

>>> Rings.check_pattern_cyclic_substituent("c1ccccc1C(=O)O", MolPatterns.check_carboxylic_acid)
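
The pattern `[R]!@[*]` selects a ring atom together with a neighbour reached through a non-ring bond. The sketch below imitates that selection on a hand-written bond list and then intersects it with a set of pattern atoms. Everything here is an assumption for illustration: `ring_substituent_overlap` is a hypothetical helper, the carboxylic acid indices for "c1ccccc1C(=O)O" are typed in by hand, and mlchem may collect or report the matched atoms differently.

```python
from typing import Iterable, List, Tuple

Bond = Tuple[int, int, bool]  # (atom_a, atom_b, bond_is_in_ring)


def ring_substituent_overlap(pattern_atoms: Iterable[int],
                             ring_atoms: Iterable[int],
                             bonds: Iterable[Bond]) -> Tuple[bool, List[int]]:
    """Intersect pattern atoms with atoms on either end of a ring-to-substituent bond."""
    ring_set = set(ring_atoms)
    attachment_atoms = set()
    for atom_a, atom_b, bond_in_ring in bonds:
        # [R]!@[*]: a ring atom joined to any atom through a non-ring bond.
        if not bond_in_ring and (atom_a in ring_set or atom_b in ring_set):
            attachment_atoms.update((atom_a, atom_b))
    overlap = sorted(attachment_atoms & set(pattern_atoms))
    return bool(overlap), overlap


# Benzoic acid, "c1ccccc1C(=O)O": ring atoms 0-5, carboxyl carbon 6, oxygens 7 and 8.
ring_bonds = [(i, (i + 1) % 6, True) for i in range(6)]
chain_bonds = [(5, 6, False), (6, 7, False), (6, 8, False)]
acid_atoms = [6, 7, 8]  # assumed match of a carboxylic acid check

print(ring_substituent_overlap(acid_atoms, range(6), ring_bonds + chain_bonds))
```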
static check_ring(target: str | Mol) tuple[bool, list[int], str]

Check for ring atoms in a molecule.

SMARTS pattern used: - Ring atom: [R]

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_ring("c1ccccc1")
static check_ring_fusion(target: str | Mol) tuple[bool, list[int], str]

Check for fused ring systems in a molecule.

SMARTS pattern used: - Fused rings: [#6R2,#6R3,#6R4]

This pattern matches carbon atoms that belong to two, three, or four rings, which indicates ring fusion.

Parameters:

target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

Returns:

A tuple containing: - A boolean indicating whether fused rings were found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_ring_fusion("c1ccc2ccccc2c1")
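
The ring-fusion pattern `[#6R2,#6R3,#6R4]` selects carbons by how many rings they belong to. The sketch below reproduces that counting step in plain Python. `atoms_in_multiple_rings` is a hypothetical helper, the ring lists for naphthalene are written by hand from the SMILES "c1ccc2ccccc2c1", and the element check (carbon only) is left out for brevity.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def atoms_in_multiple_rings(rings: Iterable[Sequence[int]]) -> List[int]:
    """Return indices of atoms that belong to two, three or four of the given rings."""
    ring_count = Counter(atom for ring in rings for atom in set(ring))
    return sorted(atom for atom, n in ring_count.items() if 2 <= n <= 4)


# Naphthalene: ring closure 1 joins atoms 0 and 9, ring closure 2 joins atoms 3 and 8.
naphthalene_rings = [(0, 1, 2, 3, 8, 9), (3, 4, 5, 6, 7, 8)]
print(atoms_in_multiple_rings(naphthalene_rings))  # [3, 8]
```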
static check_ring_size(target: str | Mol, size: int) tuple[bool, list[int], str]

Check for rings of a specific size in a molecule.

SMARTS pattern used:
  • Ring of size N: [rN], where N is the value passed as size.

Parameters:
  • target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.

  • size (int) – The size of the ring to detect.

Returns:

A tuple containing:
  • A boolean indicating whether a ring of the specified size was found.

  • A list of atom indices matching the pattern.

  • A SMARTS string representing the matched pattern.

Return type:

tuple[bool, list[int], str]

Examples

>>> Rings.check_ring_size("C1CCCCC1", 6)
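
As a rough sketch of how the [rN] pattern could be assembled from the size argument, the hypothetical helper below builds the SMARTS string and rejects sizes that cannot describe a ring. The helper name and its validation rules are assumptions; the documentation above only states the pattern template.

```python
def ring_size_smarts(size: int) -> str:
    """Build the SMARTS for an atom in a ring of the given size."""
    if isinstance(size, bool) or not isinstance(size, int):
        raise TypeError("size must be an int")
    if size < 3:
        raise ValueError("a ring needs at least three atoms")
    return f"[r{size}]"


six_membered = ring_size_smarts(6)   # "[r6]"
```

In SMARTS the r primitive is defined in terms of the smallest ring containing the atom, so an atom shared by a five- and a six-membered ring may not match [r6].
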
class PropManager

Bases: object

Manage molecular properties using RDKit.

The PropManager class provides a structured interface for manipulating molecular properties via RDKit. It is organised into five logical subsections (Base, Mol, Atom, Bond, and Conformation), each containing the methods relevant to its domain.

Original RDKit reference: https://www.rdkit.org/docs/source/rdkit.Chem.rdchem.html#

Base Methods

  • assign_atom_mapnumbers(mol, atom_ids)

    Assign map numbers to atoms in a molecule.

  • assign_atom_labels(mol, prop_values, atom_ids)

    Assign labels to atoms in a molecule.

  • assign_atom_notes(mol, prop_values, atom_ids)

    Assign notes to atoms in a molecule.

  • assign_bond_notes(mol, prop_values, bond_ids)

    Assign notes to bonds in a molecule.

  • clear_all_atomprops(mol)

    Clear all properties from atoms in a molecule.

  • clear_prop(rdkit_obj, prop)

    Clear a property from an RDKit object.

  • get_props_dict(rdkit_obj)

    Get all properties of an RDKit object as a dictionary.

  • get_prop_names(rdkit_obj)

    Get the names of all properties of an RDKit object.

  • get_prop(rdkit_obj, prop)

    Get the value of a property from an RDKit object.

  • set_prop(rdkit_obj, prop_name, prop_val)

    Set a property on an RDKit object.

  • get_owning_mol(rdkit_obj)

    Get the molecule that owns the RDKit object.

Mol Methods

  • get_atoms(mol)

    Get the atoms of a molecule.

  • get_atoms_from_idx(mol, idx)

    Retrieve atom(s) from a molecule by index.

  • get_bonds(mol)

    Get the bonds of a molecule.

  • get_bonds_from_idx(mol, idx)

    Retrieve bond(s) from a molecule by index.

  • get_bond_between_atoms(mol, idx1, idx2)

    Retrieve the bond between two atoms.

  • get_coordinates(conf_or_mol, is_3d, canonOrient, bondLength)

    Get coordinates of a molecule or conformer.

  • get_conformer(mol, id)

    Get a specific conformer from a molecule.

  • get_conformers(mol)

    Get all conformers from a molecule.

  • get_conf_ids(mol)

    Get all conformer IDs from a molecule.

  • get_distance_matrix(mol, is_3d)

    Get the distance matrix of a molecule.

  • get_gasteiger_charges(mol, atom_ids, nIter)

    Compute Gasteiger charges for specified atoms.

  • get_stereogroups(mol)

    Get stereochemistry groups of a molecule.

  • remove_conformer(mol, id)

    Remove a specific conformer.

  • remove_all_conformers(mol)

    Remove all conformers from a molecule.

Atom Methods

  • clear_atomprops(atom)

    Clear all properties from an atom.

  • get_atomic_num(atom)

    Get the atomic number.

  • get_bonds(atom)

    Get bonds connected to an atom.

  • get_degree(atom)

    Get the degree of an atom.

  • get_total_degree(atom)

    Get total degree including hydrogens.

  • get_explicit_valence(atom)

    Get explicit valence.

  • get_implicit_valence(atom)

    Get implicit valence.

  • get_total_valence(atom)

    Get total valence.

  • has_valence_violation(atom)

    Check for valence violations.

  • get_formal_charge(atom)

    Get formal charge.

  • get_hybridisation(atom)

    Get hybridisation state.

  • get_idx(atom)

    Get atom index.

  • get_neighbours(atom, order)

    Get neighbours up to a given order.

  • is_aromatic(atom)

    Check if atom is aromatic.

  • is_in_ring(atom)

    Check if atom is in a ring.

  • is_in_ring_size(atom, size)

    Check if atom is in a ring of a specific size.

  • get_mass(atom)

    Get atomic mass.

  • get_num_explicit_h(atom)

    Get number of explicit hydrogens.

  • get_num_implicit_h(atom)

    Get number of implicit hydrogens.

  • get_tot_h(atom)

    Get total number of hydrogens.

  • get_num_radical_electrons(atom)

    Get number of radical electrons.

  • set_atom_map_num(atom, num)

    Set atom map number.

  • set_formal_charge(atom, charge)

    Set formal charge.

  • set_is_aromatic(atom, decision)

    Set aromaticity.

  • set_num_explicit_h(atom, num)

    Set number of explicit hydrogens.

  • set_num_radical_electrons(atom, num)

    Set number of radical electrons.

Bond Methods

  • get_begin_atom(bond)

    Get the starting atom of a bond.

  • get_begin_atom_idx(bond)

    Get index of the starting atom.

  • get_bond_type(bond)

    Get bond type.

  • get_end_atom(bond)

    Get the ending atom of a bond.

  • get_end_atom_idx(bond)

    Get index of the ending atom.

  • get_idx(bond)

    Get bond index.

  • get_other_atom(bond, atom)

    Get the other atom in a bond.

  • get_other_atom_idx(bond, idx)

    Get the index of the other atom.

  • get_valence_contribution(bond, atom)

    Get valence contribution of a bond.

  • is_aromatic(bond)

    Check if bond is aromatic.

  • is_conjugated(bond)

    Check if bond is conjugated.

  • is_in_ring(bond)

    Check if bond is in a ring.

  • is_in_ring_size(bond, size)

    Check if bond is in a ring of a specific size.

  • set_is_aromatic(bond, decision)

    Set aromaticity of a bond.

Conformation Methods

  • straighten_mol_2d(mol)

    Straighten the 2D depiction of a molecule.

  • add_conformer(mol, conformer, assignId)

    Add a conformer and return its ID.

  • generate_conformers(…)

    Generate conformers for a molecule.

  • display_conformers(conf, size)

    Display conformers in 3D.

  • display_3dmols_overlapped(…)

    Display multiple 3D molecules overlapped.

  • canonicalise_conformer(conf, ignoreHs)

    Canonicalise a conformer.

  • canonicalise_mol_conformers(mol, ignoreHs)

    Canonicalise all conformers.

  • calculate_conformer_energy_from_mol(mol, conf_id, forcefield)

    Calculate energy of a conformer.

  • optimise_conformers(mol, force_field, max_iter)

    Optimise all conformers.

  • optimise_molecule(mol, conf_id, force_field, max_iter)

    Optimise a specific conformer.

  • get_shape_descriptors(conf_or_mol, include_masses, is_3d)

    Calculate shape descriptors.

class Atom

Bases: object

static clear_atomprops(atom: Atom) None

Clear all properties from an atom.

Built on the RDKit method ClearProp(), which removes a single named property.

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object from which to clear all properties.

Return type:

None

static get_atomic_num(atom: Atom) int

Get the atomic number of an atom.

Shortcut for the analogous RDKit method GetAtomicNum().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The atomic number of the atom.

Return type:

int

static get_bonds(atom: Atom) tuple

Get the bonds of an atom.

Shortcut for the analogous RDKit method GetBonds().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

A tuple of RDKit bond objects associated with the atom.

Return type:

tuple

static get_degree(atom: Atom) int

Get the degree of an atom.

Shortcut for the analogous RDKit method GetDegree().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The degree of the atom.

Return type:

int

static get_explicit_valence(atom: Atom) int

Get the explicit valence of an atom.

Shortcut for the analogous RDKit method GetExplicitValence().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The explicit valence of the atom.

Return type:

int

static get_formal_charge(atom: Atom) int

Get the formal charge of an atom.

Shortcut for the analogous RDKit method GetFormalCharge().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The formal charge of the atom.

Return type:

int

static get_hybridisation(atom: Atom) HybridizationType

Get the hybridisation of an atom.

Shortcut for the analogous RDKit method GetHybridization().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The hybridisation of the atom.

Return type:

rdkit.Chem.rdchem.HybridizationType

static get_idx(atom: Atom) int

Get the index of an atom.

Shortcut for the analogous RDKit method GetIdx().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The index of the atom.

Return type:

int

static get_implicit_valence(atom: Atom) int

Get the implicit valence of an atom.

Shortcut for the analogous RDKit method GetImplicitValence().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The implicit valence of the atom.

Return type:

int

static get_mass(atom: Atom) float

Get the mass of an atom.

Shortcut for the analogous RDKit method GetMass().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The mass of the atom.

Return type:

float

static get_neighbours(atom: Atom, order: int = 1) list

Get the neighbours of an atom up to a specified order.

This function recursively finds the neighbours of a given atom up to the specified order. It ensures that atoms are not revisited by checking their map number.

Parameters:
  • atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

  • order (int, optional) – The order of neighbours to find. Default is 1.

Returns:

A list of RDKit atom objects representing the neighbours.

Return type:

list
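
The traversal described above can be sketched on a plain adjacency dictionary. This is a minimal illustration rather than the library's code: it tracks visited atoms with a set instead of atom map numbers, walks level by level instead of recursing, and assumes that the starting atom is excluded from the result. The helper name neighbours_up_to is hypothetical.

```python
def neighbours_up_to(adjacency, start, order=1):
    """Return the atoms reachable from start in at most `order` bonds."""
    visited = {start}          # plays the role of the map-number check
    frontier = [start]
    found = []
    for _ in range(order):
        next_frontier = []
        for atom in frontier:
            for nbr in adjacency[atom]:
                if nbr not in visited:
                    visited.add(nbr)
                    found.append(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return found


# Four-atom chain 0-1-2-3, written as an adjacency dictionary.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
first_shell = neighbours_up_to(chain, 0, order=1)    # [1]
two_shells = neighbours_up_to(chain, 0, order=2)     # [1, 2]
```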

static get_num_explicit_h(atom: Atom) int

Get the number of explicit hydrogen atoms attached to an atom.

Shortcut for the analogous RDKit method GetNumExplicitHs().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The number of explicit hydrogen atoms.

Return type:

int

static get_num_implicit_h(atom: Atom) int

Get the number of implicit hydrogen atoms attached to an atom.

Shortcut for the analogous RDKit method GetNumImplicitHs().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The number of implicit hydrogen atoms.

Return type:

int

static get_num_radical_electrons(atom: Atom) int

Get the number of radical electrons on an atom.

Shortcut for the analogous RDKit method GetNumRadicalElectrons().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The number of radical electrons.

Return type:

int

static get_tot_h(atom: Atom) int

Get the total number of hydrogen atoms attached to an atom.

Shortcut for the analogous RDKit method GetTotalNumHs().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The total number of hydrogen atoms.

Return type:

int

static get_total_degree(atom: Atom) int

Get the total degree of an atom, including hydrogens.

Shortcut for the analogous RDKit method GetTotalDegree().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The total degree of the atom.

Return type:

int

static get_total_valence(atom: Atom) int

Get the total valence of an atom.

Shortcut for the analogous RDKit method GetTotalValence().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

The total valence of the atom.

Return type:

int

static has_valence_violation(atom: Atom) bool

Check if an atom has a valence violation.

Shortcut for the analogous RDKit method HasValenceViolation().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

True if the atom has a valence violation, False otherwise.

Return type:

bool

static is_aromatic(atom: Atom) bool

Check if an atom is aromatic.

Shortcut for the analogous RDKit method GetIsAromatic().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

True if the atom is aromatic, False otherwise.

Return type:

bool

static is_in_ring(atom: Atom) bool

Check if an atom is in a ring.

Shortcut for the analogous RDKit method IsInRing().

Parameters:

atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

Returns:

True if the atom is in a ring, False otherwise.

Return type:

bool

static is_in_ring_size(atom: Atom, size: int) bool

Check if an atom is in a ring of a specific size.

Shortcut for the analogous RDKit method IsInRingSize().

Parameters:
  • atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

  • size (int) – The size of the ring to check.

Returns:

True if the atom is in a ring of the specified size, False otherwise.

Return type:

bool

static set_atom_map_num(atom: Atom, num: int) None

Set the atom map number for an atom.

Shortcut for the analogous RDKit method SetAtomMapNum().

Parameters:
  • atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

  • num (int) – The atom map number to set.

Return type:

None

static set_formal_charge(atom: Atom, charge: int) None

Set the formal charge of an atom.

Shortcut for the analogous RDKit method SetFormalCharge().

Parameters:
  • atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

  • charge (int) – The formal charge to set.

Return type:

None

static set_is_aromatic(atom: Atom, decision: bool) None

Set the aromaticity of an atom.

Shortcut for the analogous RDKit method SetIsAromatic().

Parameters:
  • atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

  • decision (bool) – Whether the atom should be marked as aromatic.

Return type:

None

static set_num_explicit_h(atom: Atom, num: int) None

Set the number of explicit hydrogen atoms attached to an atom.

Shortcut for the analogous RDKit method SetNumExplicitHs().

Parameters:
  • atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

  • num (int) – The number of explicit hydrogen atoms to set.

Return type:

None

static set_num_radical_electrons(atom: Atom, num: int) None

Set the number of radical electrons on an atom.

Shortcut for the analogous RDKit method SetNumRadicalElectrons().

Parameters:
  • atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.

  • num (int) – The number of radical electrons to set.

Return type:

None

class Base

Bases: object

static assign_atom_labels(mol: Mol, prop_values: str | int | float | Iterable | None = None, atom_ids: Iterable[int] = ()) None

Assign labels to atoms in a molecule.

If atom_ids is provided, only those atoms will be labelled. Otherwise, all atoms will be labelled. If prop_values is not provided, atom indices will be used as labels.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • prop_values (str, int, float, Iterable, optional) – Values to assign as labels. If None, atom indices are used.

  • atom_ids (Iterable[int], optional) – Atom indices to assign labels to. Default is all atoms.

Return type:

None
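
The defaulting rules above (all atoms when atom_ids is empty, atom indices when prop_values is None) can be sketched as a small resolver that returns an index-to-label mapping. The helper name resolve_labels is hypothetical, and two behaviours are assumptions rather than documented facts: a scalar value is repeated for every target atom, and labels are stored as strings.

```python
def resolve_labels(n_atoms, prop_values=None, atom_ids=()):
    """Map each target atom index to the label it would receive."""
    # Empty atom_ids means "every atom in the molecule".
    targets = list(atom_ids) or list(range(n_atoms))
    if prop_values is None:
        values = targets                              # fall back to atom indices
    elif isinstance(prop_values, (str, int, float)):
        values = [prop_values] * len(targets)         # assumed: scalar is broadcast
    else:
        values = list(prop_values)
        if len(values) != len(targets):
            raise ValueError("need one value per target atom")
    return {idx: str(value) for idx, value in zip(targets, values)}


all_atoms = resolve_labels(3)                         # indices used as labels
subset = resolve_labels(4, "x", atom_ids=[1, 3])      # same label on two atoms
```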

static assign_atom_mapnumbers(mol: Mol, atom_ids: Iterable[int] = ()) None

Assign map numbers to atoms in a molecule.

If atom_ids is provided, only those atoms will be assigned map numbers. Otherwise, all atoms in the molecule will be assigned map numbers.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • atom_ids (Iterable[int], optional) – Atom indices to assign map numbers to. Default is all atoms.

Return type:

None

static assign_atom_notes(mol: Mol, prop_values: str | int | float | Iterable | None = None, atom_ids: Iterable[int] = ()) None

Assign notes to atoms in a molecule.

If atom_ids is provided, only those atoms will be annotated. Otherwise, all atoms will be annotated. If prop_values is not provided, atom indices will be used as notes.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • prop_values (str, int, float, Iterable, optional) – Values to assign as notes. If None, atom indices are used.

  • atom_ids (Iterable[int], optional) – Atom indices to assign notes to. Default is all atoms.

Return type:

None

static assign_bond_notes(mol: Mol, prop_values: str | int | float | Iterable, bond_ids: Iterable[int] = ()) None

Assign notes to bonds in a molecule.

If bond_ids is provided, only those bonds will be annotated. Otherwise, all bonds will be annotated.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • prop_values (str, int, float, Iterable) – Values to assign as notes.

  • bond_ids (Iterable[int], optional) – Bond indices to assign notes to. Default is all bonds.

Return type:

None

static clear_all_atomprops(mol: Mol) None

Clear all properties from atoms in a molecule.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

Return type:

None

static clear_prop(rdkit_obj: Mol | Atom | Bond | Conformer, prop: str) None

Clear a property from an RDKit object.

Parameters:
  • rdkit_obj (rdkit.Chem.rdchem.Mol or Atom or Bond or Conformer) – The RDKit object from which the property will be removed.

  • prop (str) – The name of the property to remove.

Return type:

None

static get_owning_mol(rdkit_obj: Atom | Bond | Conformer) Mol

Get the molecule that owns the RDKit object.

Shortcut for RDKit’s GetOwningMol() method.

Parameters:

rdkit_obj (rdkit.Chem.rdchem.Atom or Bond or Conformer) – The RDKit object whose parent molecule is to be retrieved.

Returns:

The owning molecule.

Return type:

rdkit.Chem.rdchem.Mol

static get_prop(rdkit_obj: Mol | Atom | Bond | Conformer, prop: str) str

Get the value of a property from an RDKit object.

Shortcut for RDKit’s GetProp() method.

Parameters:
  • rdkit_obj (rdkit.Chem.rdchem.Mol or Atom or Bond or Conformer) – The RDKit object to inspect.

  • prop (str) – The name of the property to retrieve.

Returns:

The value of the specified property.

Return type:

str

static get_prop_names(rdkit_obj: Mol | Atom | Bond | Conformer) list

Get the names of all properties of an RDKit object.

Parameters:

rdkit_obj (rdkit.Chem.rdchem.Mol or Atom or Bond or Conformer) – The RDKit object to inspect.

Returns:

A list of property names.

Return type:

list

static get_props_dict(rdkit_obj: Mol | Atom | Bond | Conformer) dict

Get all properties of an RDKit object as a dictionary.

Parameters:

rdkit_obj (rdkit.Chem.rdchem.Mol or Atom or Bond or Conformer) – The RDKit object to inspect.

Returns:

A dictionary of all properties associated with the object.

Return type:

dict

static set_prop(rdkit_obj: Mol | Atom | Bond | Conformer, prop_name: str, prop_val: str) None

Set a property on an RDKit object.

Shortcut for RDKit’s SetProp() method.

Parameters:
  • rdkit_obj (rdkit.Chem.rdchem.Mol or Atom or Bond or Conformer) – The RDKit object to modify.

  • prop_name (str) – The name of the property to set.

  • prop_val (str) – The value to assign to the property.

Return type:

None

class Bond

Bases: object

static get_begin_atom(bond: Bond) Atom

Get the beginning atom of a bond.

Shortcut for the analogous RDKit method GetBeginAtom().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

The atom at the beginning of the bond.

Return type:

rdkit.Chem.rdchem.Atom

static get_begin_atom_idx(bond: Bond) int

Get the index of the beginning atom of a bond.

Shortcut for the analogous RDKit method GetBeginAtomIdx().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

The index of the beginning atom.

Return type:

int

static get_bond_type(bond: Bond) BondType

Get the type of a bond.

Shortcut for the analogous RDKit method GetBondType().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

The type of the bond.

Return type:

rdkit.Chem.rdchem.BondType

static get_end_atom(bond: Bond) Atom

Get the ending atom of a bond.

Shortcut for the analogous RDKit method GetEndAtom().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

The atom at the end of the bond.

Return type:

rdkit.Chem.rdchem.Atom

static get_end_atom_idx(bond: Bond) int

Get the index of the ending atom of a bond.

Shortcut for the analogous RDKit method GetEndAtomIdx().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

The index of the ending atom.

Return type:

int

static get_idx(bond: Bond) int

Get the index of a bond.

Shortcut for the analogous RDKit method GetIdx().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

The index of the bond.

Return type:

int

static get_other_atom(bond: Bond, atom: Atom) Atom

Given one atom of the bond, get the other atom.

Shortcut for the analogous RDKit method GetOtherAtom().

Parameters:
  • bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

  • atom (rdkit.Chem.rdchem.Atom) – One of the atoms in the bond.

Returns:

The other atom in the bond.

Return type:

rdkit.Chem.rdchem.Atom

static get_other_atom_idx(bond: Bond, idx: int) int

Given the index of one atom in the bond, get the index of the other atom.

Shortcut for the analogous RDKit method GetOtherAtomIdx().

Parameters:
  • bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

  • idx (int) – The index of one atom in the bond.

Returns:

The index of the other atom in the bond.

Return type:

int

static get_valence_contribution(bond: Bond, atom: Atom) float

Get the valence contribution of a bond to an atom.

Shortcut for the analogous RDKit method GetValenceContrib().

Parameters:
  • bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

  • atom (rdkit.Chem.rdchem.Atom) – The atom for which to compute the valence contribution.

Returns:

The valence contribution of the bond to the atom.

Return type:

float

static is_aromatic(bond: Bond) bool

Check if a bond is aromatic.

Shortcut for the analogous RDKit method GetIsAromatic().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

True if the bond is aromatic, False otherwise.

Return type:

bool

static is_conjugated(bond: Bond) bool

Check if a bond is conjugated.

Shortcut for the analogous RDKit method GetIsConjugated().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

True if the bond is conjugated, False otherwise.

Return type:

bool

static is_in_ring(bond: Bond) bool

Check if a bond is in a ring.

Shortcut for the analogous RDKit method IsInRing().

Parameters:

bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

Returns:

True if the bond is in a ring, False otherwise.

Return type:

bool

static is_in_ring_size(bond: Bond, size: int) bool

Check if a bond is in a ring of a specific size.

Shortcut for the analogous RDKit method IsInRingSize().

Parameters:
  • bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

  • size (int) – The size of the ring to check.

Returns:

True if the bond is in a ring of the specified size, False otherwise.

Return type:

bool

static set_is_aromatic(bond: Bond, decision: bool) None

Set the aromaticity of a bond.

Shortcut for the analogous RDKit method SetIsAromatic().

Parameters:
  • bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.

  • decision (bool) – Whether the bond should be marked as aromatic.

Return type:

None

class Conformation

Bases: object

static add_conformer(mol: Mol, conformer: Conformer, assignId: bool = False) int

Add a conformer to a molecule and return its ID.

Shortcut for the analogous RDKit method AddConformer().

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object to which the conformer will be added.

  • conformer (rdkit.Chem.rdchem.Conformer) – The conformer to add to the molecule.

  • assignId (bool, optional) – Whether to assign a new ID to the conformer. Default is False.

Returns:

The ID of the added conformer.

Return type:

int

static calculate_conformer_energy_from_mol(mol: Mol, conf_id: int = -1, forcefield: Literal['UFF', 'MMFF94'] = 'MMFF94') float

Calculate the energy of a specific conformer.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • conf_id (int, optional) – The ID of the conformer. Default is -1.

  • forcefield ({'UFF', 'MMFF94'}, optional) – The force field to use. Default is ‘MMFF94’.

Returns:

The energy of the conformer in kcal/mol.

Return type:

float

static canonicalise_conformer(conf: Conformer, ignoreHs: bool = False) None

Canonicalise a conformer.

Shortcut for the analogous RDKit method CanonicalizeConformer().

Parameters:
  • conf (rdkit.Chem.rdchem.Conformer) – The conformer to canonicalise.

  • ignoreHs (bool, optional) – Whether to ignore hydrogen atoms. Default is False.

Return type:

None

static canonicalise_mol_conformers(mol: Mol, ignoreHs: bool = False) None

Canonicalise all conformers of a molecule.

Shortcut for the analogous RDKit method CanonicalizeMol().

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The molecule whose conformers are to be canonicalised.

  • ignoreHs (bool, optional) – Whether to ignore hydrogen atoms. Default is False.

Return type:

None

static display_3dmols_overlapped(mols: list[Mol], py3dmolviewer=None, size: tuple | list | int = (400, 400), confIds: list[int] | None = None, removeHs: bool = False, colours: list[str] = None) None

Display multiple 3D molecules overlapped in a py3Dmol viewer.

This function displays a list of RDKit molecules in a 3D viewer, with options to customise size, conformer IDs, hydrogen removal, and colour schemes.

Parameters:
  • mols (list of rdkit.Chem.rdchem.Mol) – List of RDKit molecule objects to display.

  • py3dmolviewer (py3Dmol.view, optional) – An existing py3Dmol viewer instance. If None, a new viewer is created.

  • size (int or tuple or list, optional) – Size of the viewer. Can be a single integer or a tuple/list of two integers. Default is (400, 400).

  • confIds (list of int, optional) – List of conformer IDs to display. If None, uses default conformer (-1). Default is None.

  • removeHs (bool, optional) – Whether to remove hydrogen atoms before displaying. Default is False.

  • colours (list of str, optional) – List of colour schemes for the molecules. Default is a predefined list.

Return type:

None

static display_conformers(conf: Conformer | Iterable[Conformer], size: tuple = (300, 300)) None

Display conformers in 3D.

Shortcut for the analogous RDKit method drawMol3D().

Parameters:
  • conf (rdkit.Chem.rdchem.Conformer or Iterable[rdkit.Chem.rdchem.Conformer]) – A single RDKit conformer or an iterable of conformers to display.

  • size (tuple, optional) – The size of the display window. Default is (300, 300).

Return type:

None

static generate_conformers(mol: Mol, n_conf: int, rms_threshold: int = 0, embedding_params: Literal['ETDG', 'ETKDG', 'ETKDGv2', 'ETKDGv3'] = 'ETKDGv3', show: bool = False, size: tuple = (300, 300), force_field: Literal['MMFF94', 'UFF'] = 'MMFF94', optimise: bool = True, max_iter: int = 500, random_seed: int = 61453) tuple

Generate conformers for a molecule.

This method embeds multiple conformers using specified embedding parameters, optionally optimises them using a force field, and returns the conformers along with their energies and optimisation results.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • n_conf (int) – The number of conformers to generate.

  • rms_threshold (int, optional) – RMS threshold for pruning conformers. Default is 0.

  • embedding_params ({'ETDG', 'ETKDG', 'ETKDGv2', 'ETKDGv3'}, optional) – The embedding parameters to use. Default is ‘ETKDGv3’.

  • show (bool, optional) – Whether to display the conformers. Default is False.

  • size (tuple, optional) – The size of the display window. Default is (300, 300).

  • force_field ({'MMFF94', 'UFF'}, optional) – The force field to use for energy calculation. Default is ‘MMFF94’.

  • optimise (bool, optional) – Whether to optimise the conformers. Default is True.

  • max_iter (int, optional) – Maximum number of iterations for optimisation. Default is 500.

  • random_seed (int, optional) – Random seed for conformer generation. Default is 61453 (0xf00d).

Returns:

A tuple containing:
  • A list of RDKit conformer objects.

  • A list of conformer energies.

  • A list of optimisation results (if optimise is True).

Return type:

tuple
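
A common follow-up to conformer generation is picking the lowest-energy conformer from the returned lists. The sketch below assumes that the conformer list and the energy list are index-aligned, which the documentation above does not state explicitly. The helper name lowest_energy_index is hypothetical and the numbers are illustrative placeholders, not computed results.

```python
def lowest_energy_index(energies):
    """Return the position of the smallest energy in a non-empty sequence."""
    if len(energies) == 0:
        raise ValueError("energies must not be empty")
    return min(range(len(energies)), key=lambda i: energies[i])


# Placeholders standing in for the first two items of the returned tuple.
conformer_ids = [0, 1, 2]
energies = [12.4, 9.8, 15.1]      # illustrative numbers only
best = lowest_energy_index(energies)
best_conformer_id = conformer_ids[best]
```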

static get_shape_descriptors(conf_or_mol: Conformer | Mol, include_masses: bool = True, is_3d: bool = True) dict

Calculate shape descriptors for a conformer or molecule.

Parameters:
  • conf_or_mol (rdkit.Chem.rdchem.Conformer or rdkit.Chem.rdchem.Mol) – The conformer or molecule to analyse.

  • include_masses (bool, optional) – Whether to include atomic masses. Default is True.

  • is_3d (bool, optional) – Whether to use 3D coordinates. Default is True.

Returns:

A dictionary of shape descriptors.

Return type:

dict

static optimise_conformers(mol: Mol, force_field: Literal['UFF', 'MMFF94'] = 'MMFF94', max_iter: int = 500) list

Optimise all conformers of a molecule.

Shortcut for the analogous RDKit methods MMFFOptimizeMoleculeConfs() and UFFOptimizeMoleculeConfs().

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The molecule whose conformers are to be optimised.

  • force_field ({'UFF', 'MMFF94'}, optional) – The force field to use. Default is ‘MMFF94’.

  • max_iter (int, optional) – Maximum number of iterations. Default is 500.

Returns:

Optimisation results for each conformer.

Return type:

list of tuple

static optimise_molecule(mol: Mol, conf_id: int = -1, force_field: Literal['UFF', 'MMFF94'] = 'MMFF94', max_iter: int = 500) int

Optimise a specific conformer of a molecule.

Shortcut for the analogous RDKit methods MMFFOptimizeMolecule() and UFFOptimizeMolecule().

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The molecule whose conformer is to be optimised.

  • conf_id (int, optional) – The ID of the conformer. Default is -1. If value is -1, whole molecule will be optimised.

  • force_field ({'UFF', 'MMFF94'}, optional) – The force field to use. Default is ‘MMFF94’.

  • max_iter (int, optional) – Maximum number of iterations. Default is 500.

Returns:

0 if converged, -1 if force field setup failed, 1 if more iterations are needed.

Return type:

int
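
The three return codes listed above can be turned into readable messages with a small lookup. This is a sketch for calling code, with hypothetical names (OPTIMISATION_STATUS, describe_status, needs_more_iterations). Only the code meanings are taken from the documentation.

```python
OPTIMISATION_STATUS = {
    0: "converged",
    1: "more iterations are needed",
    -1: "force field setup failed",
}


def describe_status(code: int) -> str:
    """Translate an optimisation return code into a short message."""
    try:
        return OPTIMISATION_STATUS[code]
    except KeyError:
        raise ValueError(f"unexpected status code: {code}") from None


def needs_more_iterations(code: int) -> bool:
    """True when rerunning with a larger max_iter may help."""
    return code == 1


message = describe_status(1)
```

A caller could use needs_more_iterations to decide whether to repeat the optimisation with a higher max_iter value.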

static straighten_mol_2d(mol: Mol) None

Straighten the 2D depiction of a molecule.

This method computes 2D coordinates and straightens the depiction of the molecule using RDKit’s depiction tools.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

Return type:

None

class Mol

Bases: object

static get_atoms(mol: Mol) list

Get the atoms of a molecule.

Retrieves all atoms from the given RDKit molecule object.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

Returns:

A list of RDKit atom objects.

Return type:

list of rdkit.Chem.rdchem.Atom

static get_atoms_from_idx(mol: Mol, idx: int | Iterable[int]) Atom | list[Atom]

Retrieve atom(s) from a molecule based on index or indices.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • idx (int or Iterable[int]) – The index or indices of the atom(s) to retrieve.

Returns:

A single atom object if idx is an int, otherwise a list of atom objects.

Return type:

rdkit.Chem.rdchem.Atom or list of rdkit.Chem.rdchem.Atom
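
The return-type rule above (a single object for an int, a list for an iterable of ints) can be sketched with a generic selector. The helper name select_by_index is hypothetical, and element symbols stand in for RDKit atom objects.

```python
def select_by_index(items, idx):
    """Return one item for an int index, or a list for an iterable of ints."""
    if isinstance(idx, int):
        return items[idx]
    return [items[i] for i in idx]


atoms = ["C", "C", "O"]                   # stand-ins for atom objects
single = select_by_index(atoms, 2)        # one item
several = select_by_index(atoms, [0, 2])  # list of items
```

Callers that always need a list can wrap a single index in a list before the call.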

static get_bond_between_atoms(mol: Mol, idx1: int, idx2: int) Bond

Retrieve the bond between two atoms in a molecule.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • idx1 (int) – The index of the first atom.

  • idx2 (int) – The index of the second atom.

Returns:

The bond object between the specified atoms.

Return type:

rdkit.Chem.rdchem.Bond

static get_bonds(mol: Mol) list

Get the bonds of a molecule.

Retrieves all bonds from the given RDKit molecule object.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

Returns:

A list of RDKit bond objects.

Return type:

list of rdkit.Chem.rdchem.Bond

static get_bonds_from_idx(mol: Mol, idx: int | Iterable[int]) list

Retrieve bond(s) from a molecule based on index or indices.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • idx (int or Iterable[int]) – The index or indices of the bond(s) to retrieve.

Returns:

A list of RDKit bond objects.

Return type:

list of rdkit.Chem.rdchem.Bond

static get_conf_ids(mol: Mol) list

Get all conformer IDs associated with a molecule.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

Returns:

A list of conformer IDs.

Return type:

list of int

static get_conformer(mol: Mol, id: int = -1) Conformer

Get the conformer associated with a molecule.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • id (int, optional) – The ID of the conformer to retrieve. Default is -1.

Returns:

The conformer object.

Return type:

rdkit.Chem.rdchem.Conformer

static get_conformers(mol: Mol) list

Get all conformers associated with a molecule.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

Returns:

A list of conformer objects.

Return type:

list of rdkit.Chem.rdchem.Conformer

static get_coordinates(conf_or_mol: Conformer | Mol, is_3d: bool = False, canonOrient: bool = True, bondLength: float = -1.0) ndarray

Get the coordinates of a molecule or conformer.

Parameters:
  • conf_or_mol (rdkit.Chem.rdchem.Conformer or rdkit.Chem.rdchem.Mol) – The conformer or molecule object.

  • is_3d (bool, optional) – Whether to retrieve 3D coordinates. Default is False.

  • canonOrient (bool, optional) – Whether to use canonical orientation for 2D coordinates. Default is True.

  • bondLength (float, optional) – Bond length for 2D coordinate generation. Default is -1.0.

Returns:

An array of atomic coordinates.

Return type:

numpy.ndarray

static get_distance_matrix(mol: Mol, is_3d: bool = True) ndarray

Get the distance matrix of a molecule.

Calculates the pairwise distance matrix using atomic coordinates.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • is_3d (bool, optional) – Whether to use 3D coordinates. Default is True.

Returns:

The distance matrix.

Return type:

numpy.ndarray
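
As an illustration of what a coordinate-based pairwise distance matrix contains, here is a dependency-free sketch. It uses plain Python lists rather than the numpy array the function returns, and the function name distance_matrix and the sample coordinates are assumptions made for the example.

```python
import math


def distance_matrix(coords):
    """Return the symmetric matrix of Euclidean distances between points."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            # The matrix is symmetric, so fill both triangles at once.
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Three made-up 3D points standing in for atomic coordinates.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
dm = distance_matrix(coords)
```

The diagonal is zero and entry [i][j] equals entry [j][i], which is what one would expect from any coordinate-based distance matrix.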

static get_gasteiger_charges(mol: Mol, atom_ids: int | Iterable[int] = [], nIter: int = 12) list

Compute Gasteiger charges for specified atoms in a molecule.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • atom_ids (int or Iterable[int], optional) – Atom index or indices to compute charges for. Default is an empty list, which selects all atoms.

  • nIter (int, optional) – Number of iterations for charge computation. Default is 12.

Returns:

Gasteiger charges for the specified atoms.

Return type:

list of float
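
For orientation, this is roughly how Gasteiger charges are computed and read back with RDKit itself. It is a sketch of the underlying RDKit calls, under the assumption that mlchem wraps something similar; the atom_ids filtering shown at the end is illustrative only.

```python
from rdkit import Chem
from rdkit.Chem import rdPartialCharges

mol = Chem.MolFromSmiles("CCO")

# RDKit stores the result on each atom as the "_GasteigerCharge" property.
rdPartialCharges.ComputeGasteigerCharges(mol, nIter=12)
charges = [atom.GetDoubleProp("_GasteigerCharge") for atom in mol.GetAtoms()]

# Selecting a subset of atoms afterwards (illustrative).
atom_ids = [2]
selected = [charges[i] for i in atom_ids]
```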

static get_stereogroups(mol: Mol) list

Get the stereochemistry groups of a molecule.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

Returns:

A list of stereochemistry groups.

Return type:

list of rdkit.Chem.rdchem.StereoGroup

static remove_all_conformers(mol: Mol) None

Remove all conformers from a molecule.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

Return type:

None

static remove_conformer(mol: Mol, id: int) None

Remove a conformer from a molecule.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.

  • id (int) – The ID of the conformer to remove.

Return type:

None

create_molecule(mol_input: str | str_ | Mol, add_hydrogens: bool = False, show: bool = False, solid_sticks: bool = False, is_3d: bool = False, embedding_params: Literal['ETDG', 'ETKDG', 'ETKDGv2', 'ETKDGv3'] = 'ETKDGv3', size: tuple = [300, 300], optimise: bool = True, optimiser: Literal['MMFF94', 'UFF'] = 'MMFF94', random_seed: int = 61453) Mol | None

Create and optionally display a molecule from a SMILES string or RDKit Mol object.

Supports 2D/3D visualisation, hydrogen addition, geometry optimisation, and rendering options.

Parameters:
  • mol_input (str or numpy.str_ or rdkit.Chem.rdchem.Mol) – The molecule input as a SMILES string, numpy string, or RDKit Mol object.

  • add_hydrogens (bool, optional) – Whether to add hydrogens to the molecule. Default is False.

  • show (bool, optional) – Whether to display the molecule. Default is False.

  • solid_sticks (bool, optional) – Whether to render the molecule as solid sticks. Default is False.

  • is_3d (bool, optional) – Whether to generate a 3D representation. Default is False.

  • embedding_params ({'ETDG', 'ETKDG', 'ETKDGv2', 'ETKDGv3'}, optional) – Embedding parameters for 3D generation. Default is ‘ETKDGv3’.

  • size (tuple, optional) – Size of the displayed image. Default is [300, 300].

  • optimise (bool, optional) – Whether to optimise the geometry. Default is True.

  • optimiser ({'MMFF94', 'UFF'}, optional) – Optimisation method. Default is ‘MMFF94’.

  • random_seed (int, optional) – Random seed for embedding. Default is 61453 (0xf00d).

Returns:

The processed molecule object, or None if only visualisation is requested.

Return type:

rdkit.Chem.rdchem.Mol or None
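
The 3D path described by these parameters (add hydrogens, embed with ETKDGv3 using a fixed seed, optimise with MMFF94) can be sketched with plain RDKit calls as below. This is an assumption about the kind of steps involved, not a reproduction of create_molecule, and it omits all display options.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hydrogens are normally added before 3D embedding.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))

# ETKDGv3 embedding parameters with a fixed seed for reproducibility.
params = AllChem.ETKDGv3()
params.randomSeed = 0xF00D  # same value as the documented default, 61453

conf_id = AllChem.EmbedMolecule(mol, params)  # -1 signals embedding failure

# 0 = converged, 1 = more iterations needed, -1 = force field setup failed.
status = AllChem.MMFFOptimizeMolecule(mol)
```

Fixing the seed makes the embedded geometry repeatable between runs, which is presumably why the function exposes random_seed.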

generate_resonance(smi: str, save: bool = False, path_name: str = '') list[bytes]

Generate resonance structures for a SMILES string and optionally save images.

Parameters:
  • smi (str) – A SMILES string representing the molecule.

  • save (bool, optional) – Whether to save the images. Default is False.

  • path_name (str, optional) – Directory path in which to save the images. Default is '' (the current directory).

Returns:

List of binary image data for the resonance structures.

Return type:

list of bytes

kekulise_smiles(smiles: str) str

Convert a SMILES string to its Kekulé form.

Parameters:

smiles (str) – A SMILES string.

Returns:

The Kekulé form of the SMILES string.

Return type:

str
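
A Kekulé SMILES writes aromatic rings with explicit alternating single and double bonds. The sketch below shows one way to obtain it with RDKit directly; whether kekulise_smiles uses exactly these calls is an assumption.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("c1ccccc1")  # benzene, aromatic notation

# Replace aromatic bonds with explicit single/double bonds.
Chem.Kekulize(mol, clearAromaticFlags=True)
kekule = Chem.MolToSmiles(mol, kekuleSmiles=True)

# Parsing the Kekulé string again lets RDKit re-perceive aromaticity.
aromatic = Chem.MolToSmiles(Chem.MolFromSmiles(kekule))
```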

mol_from_string(mol_input: str) Mol

Convert a molecular string (InChI or SMILES) to an RDKit molecule object.

Attempts to interpret the input string as an InChI first, then as a SMILES. Raises a ValueError if both conversions fail.

Parameters:

mol_input (str) – A string representing a molecule in InChI or SMILES format.

Returns:

RDKit molecule object corresponding to the input string.

Return type:

rdkit.Chem.rdchem.Mol

Raises:

ValueError – If the input string is not a valid InChI or SMILES.
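
The InChI-first, SMILES-second fallback described above can be sketched as follows. The function name parse_molecule is hypothetical, the example assumes an RDKit build with InChI support, and RDKit logging is silenced only to keep the failed first attempt quiet.

```python
from rdkit import Chem, RDLogger

RDLogger.DisableLog("rdApp.*")  # suppress parser error messages


def parse_molecule(text):
    """Hypothetical helper: try InChI, then SMILES, else raise ValueError."""
    mol = Chem.MolFromInchi(text)
    if mol is None:
        mol = Chem.MolFromSmiles(text)
    if mol is None:
        raise ValueError(f"Not a valid InChI or SMILES string: {text!r}")
    return mol


from_inchi = parse_molecule("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
from_smiles = parse_molecule("CCO")

try:
    parse_molecule("C1CC(")  # unclosed ring and branch
    failed = False
except ValueError:
    failed = True
```

Both RDKit parsers return None on failure rather than raising, so the explicit ValueError is what turns a silent None into an error the caller can handle.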

mol_to_binary(mol: Mol) bytes

Convert an RDKit molecule object to its binary representation.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – An RDKit molecule object.

Returns:

Binary representation of the molecule.

Return type:

bytes
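
RDKit molecules can be serialised to bytes and rebuilt from them, which is useful for caching or storing molecules in a database. A short round-trip sketch using RDKit directly, assuming mol_to_binary relies on the same mechanism:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCO")

blob = mol.ToBinary()      # serialise to a bytes object
restored = Chem.Mol(blob)  # rebuild a molecule from the bytes
```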

neutralise_mol(mol: Mol) Mol

Neutralise an RDKit molecule by removing formal charges.

Based on RDKit’s neutralisation recipe.

Parameters:

mol (rdkit.Chem.rdchem.Mol) – An RDKit molecule object.

Returns:

The neutralised molecule.

Return type:

rdkit.Chem.rdchem.Mol
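
For reference, the neutralisation recipe published in the RDKit Cookbook looks roughly like this. It is shown as a sketch of the general approach; mlchem's own version may differ in detail.

```python
from rdkit import Chem

# Charged atoms that can be neutralised by adding or removing a hydrogen,
# excluding atoms next to an oppositely charged neighbour (e.g. nitro groups).
PATTERN = Chem.MolFromSmarts(
    "[+1!h0!$([*]~[-1,-2,-3,-4]),-1!$([*]~[+1,+2,+3,+4])]"
)


def neutralize_atoms(mol):
    """Set matching atoms to charge 0 and adjust their hydrogen count."""
    for (at_idx,) in mol.GetSubstructMatches(PATTERN):
        atom = mol.GetAtomWithIdx(at_idx)
        chg = atom.GetFormalCharge()
        hcount = atom.GetTotalNumHs()
        atom.SetFormalCharge(0)
        atom.SetNumExplicitHs(hcount - chg)
        atom.UpdatePropertyCache()
    return mol


acid = Chem.MolToSmiles(neutralize_atoms(Chem.MolFromSmiles("CC(=O)[O-]")))
amine = Chem.MolToSmiles(neutralize_atoms(Chem.MolFromSmiles("C[NH3+]")))
```

Note that the molecule is modified in place and also returned, so callers who need the original should copy it first.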

remove_smarts_pattern(mol: Mol, smarts_string: str) Mol

Remove substructures matching a SMARTS pattern from a molecule.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – An RDKit molecule object.

  • smarts_string (str) – A SMARTS pattern to remove.

Returns:

The molecule with matching substructures removed.

Return type:

rdkit.Chem.rdchem.Mol
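
Removing a SMARTS match is commonly done in RDKit with DeleteSubstructs. The sketch below strips a carboxylic acid group from acetic acid; it assumes remove_smarts_pattern behaves similarly, which the source does not confirm.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)O")           # acetic acid
pattern = Chem.MolFromSmarts("C(=O)[OH]")     # carboxylic acid group

# Delete every atom that belongs to a match of the pattern.
stripped = Chem.DeleteSubstructs(mol, pattern)
remaining = Chem.MolToSmiles(stripped)
```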

smarts_from_string(string: str) str

Convert a SMILES or InChI string into a SMARTS string.

Parameters:

string (str) – A string representing a molecule in SMILES or InChI format.

Returns:

A SMARTS string representing the molecular pattern.

Return type:

str

smiles_to_inchi(smiles: str) str

Convert a SMILES string to an InChI string.

Parameters:

smiles (str) – A string representing a molecule in SMILES format.

Returns:

An InChI string corresponding to the input SMILES.

Return type:

str

Raises:

ValueError – If the SMILES string is invalid.

unkekulise_smiles(smiles: str) str

Convert a Kekulé SMILES string to its canonical form.

Parameters:

smiles (str) – A Kekulé SMILES string.

Returns:

The canonical SMILES string.

Return type:

str