mlchem.chem package¶
Subpackages¶
Submodules¶
mlchem.chem.manipulation module¶
- class MolCleaner¶
Bases: object
A class to clean and process SMILES strings.
The MolCleaner class provides methods to clean and process a list of SMILES strings. The cleaning process includes steps such as initialising SMILES, removing carbon ions, inorganics, organometallics, and mixtures, desalting, neutralising, and performing a final quality check.
- Parameters:
input_smiles_list (list of str) – A list of SMILES strings representing the molecules to be cleaned.
id_list (list of int, optional) – A list of IDs corresponding to the SMILES strings. If not provided, IDs will be generated automatically.
- input_smiles_list¶
The original list of SMILES strings.
- Type:
list of str
- input_id_list¶
The original list of IDs.
- Type:
list of int
- df_input¶
A DataFrame containing the input IDs and SMILES strings.
- Type:
pandas.DataFrame
- n_steps¶
The number of cleaning steps performed.
- Type:
int
- smiles¶
The current list of SMILES strings after cleaning steps.
- Type:
list of str
- ids¶
The current list of IDs after cleaning steps.
- Type:
list of int
- IsIsomeric¶
A flag indicating whether the SMILES strings are isomeric.
- Type:
bool
- IsCanonical¶
A flag indicating whether the SMILES strings are canonical.
- Type:
bool
- IsKekulised¶
A flag indicating whether the SMILES strings are kekulised.
- Type:
bool
- df_accepted¶
A DataFrame to store accepted SMILES strings.
- Type:
pandas.DataFrame
- df_rejected¶
A DataFrame to store rejected SMILES strings and the reason for rejection.
- Type:
pandas.DataFrame
Examples
>>> cleaner = MolCleaner([smiles_1, smiles_2], [id_1, id_2])
>>> cleaner.initialise_smiles()
>>> cleaner.remove_carbon_ions()
>>> cleaner.remove_inorganics()
>>> cleaner.remove_organometallics()
>>> cleaner.remove_mixtures()
>>> cleaner.desalt_smiles(method='largest')
>>> cleaner.neutralise_smiles()
>>> cleaner.quality_checker()
Alternatively:
>>> cleaner.full_clean()
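The step-by-step bookkeeping described above (a running list of SMILES and IDs, plus a log of rejected entries with a reason) can be sketched in plain Python. This is only an illustration under stated assumptions: `TinyCleaner`, `apply_filter` and the crude carbon test are hypothetical stand-ins and are not part of mlchem, which stores its results in pandas DataFrames.

```python
from typing import Callable, Optional


class TinyCleaner:
    """Toy stand-in for the accept/reject bookkeeping; not mlchem code."""

    def __init__(self, smiles: list[str], ids: Optional[list[int]] = None) -> None:
        self.smiles = list(smiles)
        # IDs default to 0..n-1 when none are supplied.
        self.ids = list(ids) if ids is not None else list(range(len(smiles)))
        self.rejected: list[tuple[int, str, str]] = []  # (id, smiles, reason)
        self.n_steps = 0

    def apply_filter(self, keep: Callable[[str], bool], reason: str) -> None:
        """Keep entries for which `keep` is True; log the rest with `reason`."""
        kept_smiles, kept_ids = [], []
        for mol_id, smi in zip(self.ids, self.smiles):
            if keep(smi):
                kept_smiles.append(smi)
                kept_ids.append(mol_id)
            else:
                self.rejected.append((mol_id, smi, reason))
        self.smiles, self.ids = kept_smiles, kept_ids
        self.n_steps += 1


cleaner = TinyCleaner(["CCO", "[Na+].[Cl-]", "c1ccccc1"])
# Hypothetical, deliberately crude filter: keep strings that contain a
# carbon symbol once chlorine ("Cl") has been masked out.
cleaner.apply_filter(lambda s: "C" in s.replace("Cl", "") or "c" in s, "inorganic")
print(cleaner.smiles, cleaner.ids)
print(cleaner.rejected)
```

Each filter shrinks the current lists and appends to the rejection log, so the IDs of surviving molecules stay aligned with their SMILES after every step.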
- __init__(input_smiles_list: list[str], id_list: list[int] | None = None) None¶
Initialise the MolCleaner instance.
This constructor sets up the initial state of the MolCleaner object, including the input SMILES strings, optional IDs, and internal tracking variables for cleaning steps and SMILES processing.
- Parameters:
input_smiles_list (list of str) – A list of SMILES strings representing the molecules to be cleaned.
id_list (list of int, optional) – A list of IDs corresponding to the SMILES strings. If not provided, IDs will be automatically generated as a range from 0 to the number of SMILES strings.
- Return type:
None
- desalt_smiles(method: Literal['chembl', 'rdkit', 'largest'] = 'largest', dehydrate: bool = True, isomeric: bool | None = None, canonical: bool | None = None, kekulise: bool | None = None, verbose: bool = False) None¶
Desalt the SMILES strings using the specified method.
This method processes the input SMILES strings to remove salts using one of the available methods: ‘chembl’, ‘rdkit’, or ‘largest’. It updates the SMILES strings and IDs, and tracks rejected SMILES strings with reasons.
- Parameters:
method ({'chembl', 'rdkit', 'largest'}, optional) – The method to use for desalting. Default is ‘largest’.
dehydrate (bool, optional) – Whether to remove water fragments before desalting. Default is True.
isomeric (bool, optional) – Whether to generate isomeric SMILES. Default is None.
canonical (bool, optional) – Whether to generate canonical SMILES. Default is None.
kekulise (bool, optional) – Whether to kekulise the SMILES. Default is None.
verbose (bool, optional) – Whether to print verbose output. Default is False.
- Return type:
None
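As a rough illustration of what a "keep the largest fragment" strategy with optional dehydration might look like, the sketch below splits a SMILES string on `.` and keeps the fragment with the most atom tokens. It is an assumption about the general idea, not mlchem's implementation: `heavy_atom_count` and `keep_largest_fragment` are hypothetical helpers, and the regular expression is a simplification that a real toolkit such as RDKit would replace with proper parsing.

```python
import re

# One atom token: a bracket atom, a two-letter organic-subset halogen,
# or a single-letter organic-subset atom (aliphatic or aromatic).
ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")

# Water written either implicitly or as a bracket atom.
WATER = {"O", "[OH2]"}


def heavy_atom_count(fragment: str) -> int:
    """Approximate atom count; bracket hydrogens such as [H+] are counted too."""
    return len(ATOM_TOKEN.findall(fragment))


def keep_largest_fragment(smiles: str, dehydrate: bool = True) -> str:
    """Return the dot-separated fragment with the most atom tokens."""
    fragments = smiles.split(".")
    if dehydrate:
        non_water = [f for f in fragments if f not in WATER]
        # If the input is only water, fall back to the original fragments.
        fragments = non_water or fragments
    # max() returns the first fragment in case of a tie.
    return max(fragments, key=heavy_atom_count)


print(keep_largest_fragment("CC(=O)[O-].[Na+]"))
print(keep_largest_fragment("CCN.O.Cl"))
```

The `'chembl'` and `'rdkit'` options presumably rely on curated salt definitions rather than fragment size, so their results can differ from this size-based rule.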
- full_clean(desalting_method: Literal['rdkit', 'chembl', 'largest'] = 'largest') None¶
Perform a full cleaning process on the SMILES strings.
This method sequentially applies all cleaning steps to the input SMILES strings, including initialisation, filtering, desalting, neutralisation, and quality checking.
- Parameters:
desalting_method ({'rdkit', 'chembl', 'largest'}, optional) – The method to use for desalting. Default is ‘largest’.
- Return type:
None
Examples
>>> cleaner = MolCleaner(smiles_list)
>>> cleaner.full_clean(desalting_method='chembl')
- initialise_smiles(isomeric: bool = False, canonical: bool = True, kekulise: bool = True) None¶
Initialise the SMILES strings with specified options.
This method processes the input SMILES strings according to the specified options for isomeric, canonical, and kekulised representations. It updates the SMILES strings and IDs, and tracks rejected SMILES strings with reasons.
- Parameters:
isomeric (bool, optional) – Whether to generate isomeric SMILES. Default is False.
canonical (bool, optional) – Whether to generate canonical SMILES. Default is True.
kekulise (bool, optional) – Whether to kekulise the SMILES. Default is True.
- Return type:
None
- neutralise_smiles(isomeric: bool | None = None, canonical: bool | None = None, kekulise: bool | None = None) None¶
Neutralise the SMILES strings.
This method processes the input SMILES strings to neutralise charged species. It updates the SMILES strings and IDs, and tracks rejected SMILES strings with reasons.
- Parameters:
isomeric (bool, optional) – Whether to generate isomeric SMILES. Default is None.
canonical (bool, optional) – Whether to generate canonical SMILES. Default is None.
kekulise (bool, optional) – Whether to kekulise the SMILES. Default is None.
- Return type:
None
- quality_checker() None¶
Perform quality check on SMILES strings.
This method evaluates the structural integrity of accepted SMILES strings using the ChEMBL structure checker. It assigns a priority score and message to each molecule, indicating the severity of any issues found.
- Return type:
None
Examples
>>> cleaner.quality_checker()
>>> cleaner.df_accepted[['id', 'PRIORITY', 'MESSAGES']]
- remove_carbon_ions() None¶
Remove SMILES strings containing carbon ions.
This method filters out SMILES strings that contain carbon ions. It updates the internal SMILES list and IDs, and logs rejected entries with the reason for rejection.
- Return type:
None
- remove_inorganics() None¶
Remove inorganic SMILES strings.
This method filters out SMILES strings that are classified as inorganic. It updates the internal SMILES list and IDs, and logs rejected entries with the reason for rejection.
- Return type:
None
- remove_mixtures() None¶
Remove SMILES strings that are mixtures.
This method filters out SMILES strings that represent mixtures of multiple components, unless they are simple binary mixtures involving metals or halogens. It updates the internal SMILES list and IDs, and logs rejected entries with the reason for rejection.
- Return type:
None
- remove_organometallics() None¶
Remove SMILES strings containing organometallic compounds.
This method filters out SMILES strings that contain organometallic structures, excluding simple metal salts. It updates the internal SMILES list and IDs, and logs rejected entries with the reason for rejection.
- Return type:
None
- class MolGenerator¶
Bases: object
A class to generate SMILES strings using SELFIES fragments.
The MolGenerator class allows for the generation of new molecules represented as SMILES strings by combining SELFIES fragments either randomly or through substitution into a template molecule.
Examples
>>> generator = MolGenerator()
>>> generator.generate_smiles(template_smiles='c1ccccc1',
...                           n_molecules=20,
...                           n_fragments=5,
...                           n_substitutions=1,
...                           attempt_limit=1000)
>>> cleaner = MolCleaner(generator.smiles_generated)
>>> cleaner.initialise_smiles()
>>> cleaner.neutralise_smiles()
>>> generator.smiles_generated = cleaner.smiles
- __init__(dictionary: dict = None)¶
Initialise the MolGenerator instance.
This constructor sets up the SELFIES fragment dictionary used for molecule generation. If a custom dictionary is provided, it is used to populate the internal fragment bag; otherwise, a default dictionary is used.
- Parameters:
dictionary (dict, optional) – A custom dictionary of SELFIES fragments and their frequencies. If None, a default dictionary is used.
- Return type:
None
- generate_smiles(n_molecules: int, n_fragments: int, template_smiles: str = None, substitution_sites: list = None, n_substitutions: int = None, include_extremities: bool = True, attempt_limit: int = 1000) None¶
Generate SMILES strings using a template or random fragments.
This method generates a specified number of SMILES strings either by randomly combining SELFIES fragments or by substituting fragments into a template molecule at specified positions.
- Parameters:
n_molecules (int) – The number of molecules to generate.
n_fragments (int) – The number of fragments to use for each molecule.
template_smiles (str, optional) – A template SMILES string to use for generating molecules.
substitution_sites (list of int, optional) – Indices in the SELFIES string where substitutions should occur.
n_substitutions (int, optional) – The number of substitutions to make in the template.
include_extremities (bool, optional) – Whether to include the start and end of the SELFIES string as possible substitution sites. Default is True.
attempt_limit (int, optional) – The maximum number of attempts to generate the specified number of molecules. Default is 1000.
- Returns:
None
- Updates:
template_smiles (str) – The template SMILES string used for generation.
smiles_generated (list of str) – The list of generated SMILES strings.
selfies_generated (list of str) – The list of generated SELFIES strings.
mols_generated (list of rdkit.Chem.rdchem.Mol) – The list of RDKit Mol objects corresponding to the generated SMILES.
pattern_atoms (list) – A list of pattern atom matches for each generated molecule.
double_legend (list of str) – A list of strings combining SMILES and SELFIES for each molecule.
Examples
>>> generator.generate_smiles(template_smiles='c1ccccc1',
...                           n_molecules=5,
...                           n_fragments=3,
...                           n_substitutions=1)
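The interplay between `n_molecules`, `n_fragments` and `attempt_limit` can be illustrated with a frequency-weighted fragment bag. The sketch below is an assumption about the general pattern, not mlchem's code: `sample_fragment_strings` is a hypothetical helper, the bag contents are made up, duplicates are discarded by choice, and the resulting SELFIES-style strings are neither decoded to SMILES nor validated.

```python
import random
from typing import Optional


def sample_fragment_strings(
    fragment_counts: dict[str, int],
    n_molecules: int,
    n_fragments: int,
    attempt_limit: int = 1000,
    seed: Optional[int] = None,
) -> list[str]:
    """Draw unique fragment strings until enough are found or attempts run out."""
    rng = random.Random(seed)
    fragments = list(fragment_counts)
    weights = [fragment_counts[f] for f in fragments]  # frequency-weighted bag

    generated: list[str] = []
    seen: set[str] = set()
    attempts = 0
    while len(generated) < n_molecules and attempts < attempt_limit:
        attempts += 1
        candidate = "".join(rng.choices(fragments, weights=weights, k=n_fragments))
        if candidate not in seen:  # skip duplicates (an assumption of this sketch)
            seen.add(candidate)
            generated.append(candidate)
    return generated


# Hypothetical bag of SELFIES-style symbols and their frequencies.
bag = {"[C]": 5, "[O]": 2, "[N]": 1, "[=C]": 2}
result = sample_fragment_strings(bag, n_molecules=5, n_fragments=3, seed=0)
print(result)
```

When the bag cannot produce enough distinct candidates, the loop stops at `attempt_limit` and returns fewer strings than requested, which is why callers may want to check the length of the result.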
- class PatternRecognition¶
Bases: object
A utility class for recognising chemical patterns using SMARTS.
This class provides a reference for common SMARTS-based pattern matching used in cheminformatics, particularly with RDKit. It includes a vocabulary of generic chemical groupings and links to external resources for further information on SMARTS syntax and usage.
References
https://daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html
https://www.daylight.com/dayhtml_tutorials/languages/smarts/index.html
https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html
https://www.labcognition.com/onlinehelp/en/smiles_and_smarts_nomenclature.htm
SMARTS Vocabulary¶
Alkyl (ALK): alkyl side chains (not an H atom)
AlkylH (ALH): alkyl side chains or an H atom
Alkenyl (AEL): alkenyl side chains
AlkenylH (AEH): alkenyl side chains or an H atom
Alkynyl (AYL): alkynyl side chains
AlkynylH (AYH): alkynyl side chains or an H atom
Alkoxy (AOX): alkoxy side chains
AlkoxyH (AOH): alkoxy side chains or an H atom
Carbocyclic (CBC): carbocyclic side chains
CarbocyclicH (CBH): carbocyclic side chains or an H atom
Carbocycloalkyl (CAL): cycloalkyl side chains
CarbocycloalkylH (CAH): cycloalkyl side chains or an H atom
Carbocycloalkenyl (CEL): cycloalkenyl side chains
CarbocycloalkenylH (CEH): cycloalkenyl side chains or an H atom
Carboaryl (ARY): all-carbon aryl side chains
CarboarylH (ARH): all-carbon aryl side chains or an H atom
Cyclic (CYC): cyclic side chains
CyclicH (CYH): cyclic side chains or an H atom
Acyclic (ACY): acyclic side chains (not an H atom)
AcyclicH (ACH): acyclic side chains or an H atom
Carboacyclic (ABC): all-carbon acyclic side chains
CarboacyclicH (ABH): all-carbon acyclic side chains or an H atom
Heteroacyclic (AHC): acyclic side chains with at least one heteroatom
HeteroacyclicH (AHH): acyclic side chains with at least one heteroatom or an H atom
Heterocyclic (CHC): cyclic side chains with at least one heteroatom
HeterocyclicH (CHH): cyclic side chains with at least one heteroatom or an H atom
Heteroaryl (HAR): aryl side chains with at least one heteroatom
HeteroarylH (HAH): aryl side chains with at least one heteroatom or an H atom
NoCarbonRing (CXX): ring containing no carbon atoms
NoCarbonRingH (CXH): ring containing no carbon atoms or an H atom
Group (G): any group (not H atom)
GroupH (GH): any group (including H atom)
Group* (G*): any group with a ring closure
GroupH* (GH*): any group with a ring closure or an H atom
- class Atoms¶
Bases: object
- static get_ring_size(atom: Atom) int¶
Get the size of the ring an atom belongs to.
This method checks whether the given RDKit atom is part of a ring, and if so, determines the size of the smallest ring it is in. If the atom is not in any ring, the method returns 0.
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The atom whose ring membership and size is to be evaluated.
- Returns:
The size of the smallest ring the atom is part of, or 0 if the atom is not in a ring.
- Return type:
int
Examples
>>> atom = mol.GetAtomWithIdx(3)
>>> PatternRecognition.Atoms.get_ring_size(atom)
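To make the "smallest ring containing an atom, or 0" rule concrete without any toolkit, the following sketch works on a plain bond list. It is an illustrative breadth-first search, not the library's implementation: `build_adjacency` and `smallest_ring_size` are hypothetical helpers, and the bond list encodes butylbenzene (`CCCCc1ccccc1`) by hand.

```python
from collections import deque


def build_adjacency(n_atoms: int, bonds: list[tuple[int, int]]) -> dict[int, list[int]]:
    """Turn a list of (i, j) bonds into an undirected adjacency mapping."""
    adjacency: dict[int, list[int]] = {i: [] for i in range(n_atoms)}
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)
    return adjacency


def smallest_ring_size(adjacency: dict[int, list[int]], atom: int) -> int:
    """Return the size of the smallest ring containing `atom`, or 0 if none."""
    best = 0
    for start in adjacency[atom]:
        # Shortest path from `start` back to `atom` that avoids the direct
        # atom-start bond; that path plus the skipped bond closes a ring.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in adjacency[current]:
                if current == start and nxt == atom:
                    continue  # do not reuse the bond we started from
                if nxt not in distance:
                    distance[nxt] = distance[current] + 1
                    queue.append(nxt)
        if atom in distance:
            ring = distance[atom] + 1  # number of atoms in the closed ring
            best = ring if best == 0 else min(best, ring)
    return best


# Butylbenzene: atoms 0-3 form the chain, atoms 4-9 the benzene ring.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4),
         (4, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 4)]
adjacency = build_adjacency(10, bonds)
print(smallest_ring_size(adjacency, 3))  # chain atom
print(smallest_ring_size(adjacency, 4))  # ring atom
```

With RDKit available, the ring information attached to the molecule would normally be queried instead of searching the graph by hand.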
- static is_SP(atom: Atom) int¶
Check if an atom is SP-hybridised.
This method evaluates whether the given RDKit atom is SP-hybridised (i.e., has linear geometry with two electron domains).
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The atom to check.
- Returns:
1 if the atom is SP-hybridised, 0 otherwise.
- Return type:
int
Examples
>>> atom = mol.GetAtomWithIdx(0)
>>> PatternRecognition.Atoms.is_SP(atom)
- static is_SP2(atom: Atom) int¶
Check if an atom is SP2-hybridised.
This method evaluates whether the given RDKit atom is SP2-hybridised, which typically corresponds to trigonal planar geometry with three electron domains.
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The atom to check.
- Returns:
1 if the atom is SP2-hybridised, 0 otherwise.
- Return type:
int
Examples
>>> atom = mol.GetAtomWithIdx(1)
>>> PatternRecognition.Atoms.is_SP2(atom)
- static is_SP3(atom: Atom) int¶
Check if an atom is SP3-hybridised.
This method determines whether the given RDKit atom is SP3-hybridised, which typically corresponds to tetrahedral geometry with four electron domains.
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The atom to check.
- Returns:
1 if the atom is SP3-hybridised, 0 otherwise.
- Return type:
int
Examples
>>> atom = mol.GetAtomWithIdx(2)
>>> PatternRecognition.Atoms.is_SP3(atom)
- class Base¶
Bases: object
- static check_smarts_pattern(target: str | Mol, smarts_pattern: str, generic_keywords: list = []) tuple[bool, ndarray[int], str]¶
Check if a given SMARTS pattern matches a target molecule, optionally using generic keywords.
This function supports both standard SMARTS syntax and generic keywords for more intuitive pattern matching.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or RDKit Mol object representing the target molecule.
smarts_pattern (str) – A SMARTS pattern to match against the target.
generic_keywords (list of str, optional) – Generic keywords to substitute into the SMARTS pattern.
- Returns:
A tuple containing:
- A boolean indicating whether the pattern was found.
- An array of atom indices where the pattern matches.
- The final SMARTS pattern used for matching.
- Return type:
tuple of (bool, numpy.ndarray of int, str)
Examples
>>> from mlchem.chem.manipulation import PatternRecognition as pr
>>> pr.Base.check_smarts_pattern('CCCCc1ccccc1', smarts_pattern='CC*', generic_keywords=['CYC'])
(True, array([3, 4, 5]), 'CC* |$;;CYC$|')
>>> pr.Base.check_smarts_pattern('CCCCc1ccccc1', smarts_pattern='[R]')
(True, array([4, 5, 6, 7, 8, 9]), '[R]')
>>> pr.Base.check_smarts_pattern('CCCCc1ccccc1', smarts_pattern='[*]', generic_keywords=['CYC'])
(True, array([4, 5, 6, 7, 8, 9]), '* |$CYC$|')
- static check_smiles_pattern(target: str | Mol, smiles_pattern: str) tuple[bool, list[int]]¶
Check if a given SMILES pattern matches a target molecule.
This function checks whether a SMILES pattern matches any substructure within the target molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or RDKit Mol object representing the target molecule.
smiles_pattern (str) – A SMILES pattern to match against the target.
- Returns:
A tuple containing:
- A boolean indicating whether the pattern was found.
- A list of atom indices where the pattern matches.
- Return type:
tuple of (bool, list of int)
- static count_atoms(target: str | Mol) int¶
Count the number of atoms in a target molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – The target molecule.
- Returns:
The number of atoms in the molecule.
- Return type:
int
- static count_bonds(target: str | Mol) int¶
Count the number of bonds in a target molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – The target molecule.
- Returns:
The number of bonds in the molecule.
- Return type:
int
- static get_MCS(input1: str | Mol, input2: str | Mol, threshold: float = 0.0, completeAromaticRings: bool = False, similarity_type: Literal['tanimoto', 'johnson'] = 'tanimoto') tuple[bool, str, list[int], list[int], float]¶
Find the Maximum Common Substructure (MCS) between two molecules.
This function identifies the MCS between two molecules, which can be provided as SMILES strings or RDKit molecule objects. It returns the SMARTS string of the MCS, atom indices in both molecules, and a similarity score.
More information: https://greglandrum.github.io/rdkit-blog/posts/2023-11-08-introducingrascalmces.html
- Parameters:
input1 (str or rdkit.Chem.rdchem.Mol) – The first molecule in SMILES format or as an RDKit object.
input2 (str or rdkit.Chem.rdchem.Mol) – The second molecule in SMILES format or as an RDKit object.
threshold (float, optional) – Similarity threshold for MCS detection. Must be in the interval [0, 1). Default is 0.0.
completeAromaticRings (bool, optional) – Whether to require complete aromatic ring matches. Default is False.
similarity_type ({'tanimoto', 'johnson'}, optional) – The similarity metric to use. Default is ‘tanimoto’.
- Returns:
A tuple containing:
- bool : Whether an MCS was found.
- str : SMARTS string of the MCS.
- list of int : Atom indices in the first molecule.
- list of int : Atom indices in the second molecule.
- float : Similarity score.
- Return type:
tuple
- static get_atoms(target: str | Mol) list¶
Retrieve a list of atoms from a target molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – The target molecule.
- Returns:
A list of atom objects in the molecule.
- Return type:
list of rdkit.Chem.rdchem.Atom
- static get_bonds(target: str | Mol) list¶
Retrieve a list of bonds from a target molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – The target molecule.
- Returns:
A list of bond objects in the molecule.
- Return type:
list of rdkit.Chem.rdchem.Bond
- static get_stereoisomers(target: str | Mol, drawer=None) tuple[list[Mol], list]¶
Retrieve stereoisomers and their images for a target molecule.
This function takes a target molecule, which can be either a SMILES string or an RDKit molecule object, and returns a list of stereoisomers and their corresponding images. An optional MolDrawer instance can be provided to customise drawing options.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
drawer (MolDrawer, optional) – An instance of MolDrawer to preserve drawing options. Default is None.
- Returns:
A tuple containing a list of stereoisomer molecules and their corresponding images.
- Return type:
tuple of (list of rdkit.Chem.rdchem.Mol, list)
- static get_tautomers(target: str | Mol) list[str]¶
Retrieve a list of tautomers for a target molecule.
This function takes a target molecule, which can be either a SMILES string or an RDKit molecule object, and returns a list of SMILES strings representing the tautomers of the molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
A list of SMILES strings representing the tautomers of the target molecule.
- Return type:
list of str
- static has_carbon_ion(target: str | Mol) bool¶
Detect the presence of carbon ions in a target molecule.
This function checks if a molecule contains charged carbon atoms (carbocations or carbanions). Carbanions that are part of nitrile groups are excluded unless carbocations are also present.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
True if the molecule contains carbon ions, False otherwise.
- Return type:
bool
- static has_metal_salt(target: str | Mol, custom_metals: list | None = None) bool¶
Determine whether a target molecule contains a metal salt.
This function checks for the presence of metal salts in a molecule. A custom list of metal elements can be provided, otherwise a default list is used.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
custom_metals (list of str, optional) – A custom list of metal elements to check for. Default is None.
- Returns:
True if the molecule contains a metal salt, False otherwise.
- Return type:
bool
- static is_organic(target: str | Mol) bool¶
Determine whether a target molecule is organic.
This function checks if a molecule contains carbon atoms and is not classified as a carbonic acid. A molecule is considered inorganic if the only carbon atoms present belong to carbonic acid groups.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
True if the molecule is organic, False otherwise.
- Return type:
bool
- static pattern_abs_fraction_greater_than(target: str | Mol, func, threshold: float, hidden_pattern_function=None) bool¶
Determine if the fraction of atoms belonging to a pattern exceeds a given threshold.
This function calculates the fraction of atoms in a target molecule that match a given pattern. The pattern is defined by a function (e.g. a SMARTS matcher). An optional hidden pattern function can be provided to refine the numerator (e.g. to count only aromatic carbon atoms).
The denominator is always the total number of atoms in the molecule.
Use this when you want to know:
“Does this pattern make up more than X% of the molecule?”
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – The target molecule, as a SMILES string or RDKit Mol object.
func (callable) – A function that returns a tuple like (True, [atom_indices]) for the pattern.
threshold (float) – The minimum fraction of atoms that must match the pattern.
hidden_pattern_function (callable, optional) – A secondary function to refine the atom subset in the numerator.
- Returns:
True if the fraction of matching atoms exceeds the threshold.
- Return type:
bool
Examples
>>> from mlchem.chem.manipulation import PatternRecognition as pr
>>> # Define a pattern function to find carbon atoms
>>> def check_carbon(target):
...     return pr.Base.check_smarts_pattern(target, smarts_pattern='[C]')
>>> # Check if more than 60% of atoms are carbon
>>> pr.Base.pattern_abs_fraction_greater_than('CCCC(=O)O', check_carbon, threshold=0.6)
True
>>> # Example with a hidden pattern function (e.g. aromatic carbon among all atoms)
>>> def check_aromatic(target, pattern_function):
...     return pr.Base.check_smarts_pattern(target, smarts_pattern='[a]')
>>> pr.Base.pattern_abs_fraction_greater_than('OCCc1ccccc1', check_aromatic,
...                                           threshold=0.5, hidden_pattern_function=check_carbon)
True
Use Cases¶
“Are more than 30% of all atoms aromatic carbon?”
“Do heteroatoms make up more than 20% of the molecule?”
Notes
This method always uses the total number of atoms in the molecule as the denominator. To compare two patterns directly, use pattern_rel_fraction_greater_than.
>>> # Using abs method with hidden pattern
>>> pr.Base.pattern_abs_fraction_greater_than(
...     target,
...     func=check_pattern_aromatic,
...     threshold=0.5,
...     hidden_pattern_function=check_carbon)
- static pattern_rel_fraction_greater_than(target: str | Mol, func1, func2, threshold: float, hidden_pattern_function=None) bool¶
Determine if the fraction of atoms belonging to one pattern exceeds a given threshold relative to another pattern.
This function compares the number of atoms matching a primary pattern to those matching a secondary pattern. An optional hidden pattern function can be passed if the primary function requires two arguments.
Use this when you want to know:
“Does pattern A make up more than X% of pattern B?”
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – The target molecule, as a SMILES string or RDKit Mol object.
func1 (callable) – Function identifying the primary pattern (numerator).
func2 (callable) – Function identifying the reference pattern (denominator).
threshold (float) – The minimum relative fraction required (e.g. 0.5 means 50% of func2 atoms must match func1).
hidden_pattern_function (callable, optional) – A secondary function to pass into func1 if it requires it.
- Returns:
True if the relative fraction exceeds the threshold.
- Return type:
bool
Examples
>>> from mlchem.chem.manipulation import PatternRecognition as pr
>>> # Define pattern functions
>>> def check_carbon(target):
...     return pr.Base.check_smarts_pattern(target, smarts_pattern='[C]')
>>> def check_alkyl_carbon(target):
...     return pr.Base.check_smarts_pattern(target, smarts_pattern='[CX4]')
>>> # Check if more than 30% of carbon atoms are alkyl
>>> pr.Base.pattern_rel_fraction_greater_than('CC(C)C(=O)O', check_alkyl_carbon, check_carbon, threshold=0.3)
True
Use Cases¶
“Are more than 30% of carbon atoms alkyl?”
“Are more than 50% of ring atoms aromatic?”
Notes
This method uses the number of atoms matched by func2 as the denominator. If func1 requires a second argument (e.g. a filtering function), it will be passed hidden_pattern_function.
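The difference between the absolute and relative variants comes down to the denominator. The sketch below works through the arithmetic for isobutyric acid (`CC(C)C(=O)O`) using hand-written atom index lists, so it runs without RDKit or mlchem. `abs_fraction` and `rel_fraction` are hypothetical helpers, and the assumption that the numerator is simply the count of atoms matched by the primary pattern follows the description above rather than the library's source.

```python
def abs_fraction(pattern_atoms: list[int], n_atoms: int) -> float:
    """Matched atoms divided by all atoms in the molecule."""
    return len(set(pattern_atoms)) / n_atoms


def rel_fraction(primary_atoms: list[int], reference_atoms: list[int]) -> float:
    """Atoms matched by the primary pattern divided by atoms matched by the reference."""
    reference = set(reference_atoms)
    if not reference:
        return 0.0  # avoid dividing by zero when the reference pattern is absent
    return len(set(primary_atoms)) / len(reference)


# Isobutyric acid, CC(C)C(=O)O: heavy atoms 0-3 are carbon, 4-5 are oxygen.
n_atoms = 6
carbon_atoms = [0, 1, 2, 3]
sp3_carbon_atoms = [0, 1, 2]  # the carboxyl carbon (index 3) is excluded

carbon_share = abs_fraction(carbon_atoms, n_atoms)            # 4 / 6
alkyl_share = rel_fraction(sp3_carbon_atoms, carbon_atoms)    # 3 / 4

print(carbon_share > 0.6)
print(alkyl_share > 0.3)
```

The same index lists give different answers depending on the denominator: sp3 carbons are half of all atoms (3 of 6) but three quarters of the carbon atoms (3 of 4).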
- class Bonds¶
Bases: object
- static check_aromatic_bonds(target: str | Mol) tuple[bool, list[int], str]¶
Check for aromatic bonds in a molecule.
This function takes a molecule and performs an aromatic bond pattern check using the SMARTS pattern *:*.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
A tuple containing:
- bool : Whether the pattern was found.
- list of int : Atom indices that match the pattern.
- str : The SMARTS string representing the matched pattern.
- Return type:
tuple
- static check_bonds(target: str | Mol) tuple[bool, list[int], str]¶
Check the bonds in a molecule.
This function takes a molecule and performs a generic bond pattern check using the SMARTS pattern *~*.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
A tuple containing:
- bool : Whether the pattern was found.
- list of int : Atom indices that match the pattern.
- str : The SMARTS string representing the matched pattern.
- Return type:
tuple
- static check_cyclic_bonds(target: str | Mol) tuple[bool, list[int], str]¶
Check for cyclic bonds in a molecule.
This function takes a molecule and performs a cyclic bond pattern check using the SMARTS pattern *@*.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
A tuple containing:
- bool : Whether the pattern was found.
- list of int : Atom indices that match the pattern.
- str : The SMARTS string representing the matched pattern.
- Return type:
tuple
- static check_double_bonds(target: str | Mol) tuple[bool, list[int], str]¶
Check for double bonds in a molecule.
This function takes a molecule and performs a double bond pattern check using the SMARTS pattern *=*.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
A tuple containing:
- bool : Whether the pattern was found.
- list of int : Atom indices that match the pattern.
- str : The SMARTS string representing the matched pattern.
- Return type:
tuple
- static check_rotatable_bonds(target: str | Mol) tuple[bool, list[int], str]¶
Check for rotatable bonds in a molecule.
This function takes a molecule and performs a rotatable bond pattern check using the SMARTS pattern: [!$(*#*)&!D1]-!@[!$(*#*)&!D1].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
A tuple containing:
- bool : Whether the pattern was found.
- list of int : Atom indices that match the pattern.
- str : The SMARTS string representing the matched pattern.
- Return type:
tuple
- static check_single_bonds(target: str | Mol) tuple[bool, list[int], str]¶
Check for single bonds in a molecule.
This function takes a molecule and performs a single bond pattern check using the SMARTS pattern *-*.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
A tuple containing:
- bool : Whether the pattern was found.
- list of int : Atom indices that match the pattern.
- str : The SMARTS string representing the matched pattern.
- Return type:
tuple
- static check_triple_bonds(target: str | Mol) tuple[bool, list[int], str]¶
Check for triple bonds in a molecule.
This function takes a molecule and performs a triple bond pattern check using the SMARTS pattern *#*.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A molecule in SMILES format or an RDKit molecule object.
- Returns:
A tuple containing:
- bool : Whether the pattern was found.
- list of int : Atom indices that match the pattern.
- str : The SMARTS string representing the matched pattern.
- Return type:
tuple
- static is_dative_bond(bond: Bond) int¶
Check if a bond is a dative bond.
This function takes an RDKit bond object and checks whether it is a dative bond.
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – An RDKit bond object.
- Returns:
1 if the bond is a dative bond, 0 otherwise.
- Return type:
int
- static is_double_bond(bond: Bond) int¶
Check if a bond is a double bond.
This function takes an RDKit bond object and checks whether it is a double bond.
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – An RDKit bond object.
- Returns:
1 if the bond is a double bond, 0 otherwise.
- Return type:
int
- static is_single_bond(bond: Bond) int¶
Check if a bond is a single bond.
This function takes an RDKit bond object and checks whether it is a single bond.
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – An RDKit bond object.
- Returns:
1 if the bond is a single bond, 0 otherwise.
- Return type:
int
- static is_triple_bond(bond: Bond) int¶
Check if a bond is a triple bond.
This function takes an RDKit bond object and checks whether it is a triple bond.
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – An RDKit bond object.
- Returns:
1 if the bond is a triple bond, 0 otherwise.
- Return type:
int
- class MolPatterns¶
Bases: object
- static alpha_nitroalkane(target: str | Mol) tuple[bool, list[int], str]¶
Check for alpha-nitroalkane groups in a molecule.
SMARTS pattern used: [CX4H1,H2,H3]#7D3[#8]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing:
- A boolean indicating whether the pattern was found.
- A list of atom indices matching the pattern.
- A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.alpha_nitroalkane("CC(N(=O)=O)C")
- static check_acetylenic_carbon(target: str | Mol) tuple[bool, list[int], str]¶
Check for acetylenic carbon atoms in a molecule.
The SMARTS pattern used to identify acetylenic carbon atoms is [$([CX2]#C)].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing:
- A boolean indicating whether the pattern was found.
- A list of atom indices matching the pattern.
- A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_acetylenic_carbon("CC#C")
- static check_acyl_halide(target: str | Mol) tuple[bool, list[int], str]¶
Check for acyl halides in a molecule.
The SMARTS pattern used to identify acyl halides is CX3[F,Cl,Br,I].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_acyl_halide("CC(=O)Cl")
- static check_alanine(target: str | Mol) tuple[bool, list[int], str]¶
Check for alanine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4HCX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_alanine("CC(C(=O)O)N")
- static check_alcohol(target: str | Mol) tuple[bool, list[int], str]¶
Check for alcohol groups in a molecule.
SMARTS pattern used: [#6][OX2H]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_alcohol("CCO")
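Every `check_*` method documented here returns the same three-part tuple, so a call is usually unpacked straight into named variables. The following is a minimal sketch of that unpacking. `describe_match` is a hypothetical helper, and `sample_result` is a hand-written placeholder shaped like the documented return value; its atom indices are illustrative and are not real library output.

```python
def describe_match(label: str, result: tuple) -> str:
    """Format a (found, atom_indices, smarts) tuple as a one-line summary."""
    found, atom_indices, smarts = result
    if not found:
        return f"{label}: no match for {smarts}"
    return f"{label}: atoms {atom_indices} match {smarts}"


# Placeholder shaped like tuple[bool, list[int], str]; indices are made up.
sample_result = (True, [1, 2], "[#6][OX2H]")
summary = describe_match("alcohol", sample_result)
print(summary)
```

In real use, `sample_result` would be replaced by the value returned from a call such as `MolPatterns.check_alcohol("CCO")`.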
- static check_aldehyde(target: str | Mol) tuple[bool, list[int], str]¶
Check for aldehyde groups in a molecule.
The SMARTS pattern used to identify aldehyde groups is CX3H1[#6].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_aldehyde("CC=O")
- static check_alkali_metals(target: str | Mol) tuple[bool, list[int], str]¶
Check for alkali metals in a molecule.
SMARTS pattern used: [Li,Na,K,Rb,Cs,Fr]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any alkali metal was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_alkali_metals("[Na+]")
- static check_alkaline_earth_metals(target: str | Mol) tuple[bool, list[int], str]¶
Check for alkaline earth metals in a molecule.
SMARTS pattern used: [Be,Mg,Ca,Sr,Ba,Ra]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any alkaline earth metal was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_alkaline_earth_metals("[Mg++]")
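The element-group checks in this class use SMARTS atom lists, that is, a bracketed, comma-separated set of element symbols in which the comma means "or". The sketch below parses such lists with plain string handling to show which symbols each group covers. It copies four of the atom lists quoted in this reference; `ELEMENT_GROUPS`, `symbols_in`, and `group_of` are hypothetical names for illustration, and no SMARTS engine is involved.

```python
from typing import Optional

# Atom-list SMARTS quoted from the corresponding check_* entries.
ELEMENT_GROUPS = {
    "alkali metals": "[Li,Na,K,Rb,Cs,Fr]",
    "alkaline earth metals": "[Be,Mg,Ca,Sr,Ba,Ra]",
    "carbon group elements": "[Si,Ge,Sn,Pb,Fl]",
    "chalcogens": "[Se,Te,Po,Lv]",
}


def symbols_in(atom_list_smarts: str) -> frozenset:
    """Split a simple '[A,B,C]' atom list into its element symbols."""
    return frozenset(atom_list_smarts.strip("[]").split(","))


def group_of(symbol: str) -> Optional[str]:
    """Return the first group whose atom list contains the symbol, else None."""
    for group, smarts in ELEMENT_GROUPS.items():
        if symbol in symbols_in(smarts):
            return group
    return None


print(group_of("Na"))
print(group_of("Se"))
print(group_of("C"))
```

Carbon and oxygen return `None` here because the carbon-group and chalcogen lists deliberately exclude them.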
- static check_alkyl_carbon(target: str | Mol) tuple[bool, list[int], str]¶
Check for alkyl carbon atoms in a molecule.
The SMARTS pattern used to identify alkyl carbon atoms is [CX4].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_alkyl_carbon("CC")
- static check_allenic_carbon(target: str | Mol) tuple[bool, list[int], str]¶
Check for allenic carbon atoms in a molecule.
The SMARTS pattern used to identify allenic carbon atoms is $([CX2=C)].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_allenic_carbon("C=C=C")
- static check_alpha_dicarbonyl(target: str | Mol) tuple[bool, list[int], str]¶
Check for alpha-dicarbonyl groups in a molecule.
The SMARTS pattern used to identify alpha-dicarbonyl groups is O=[#6][#6]=O.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_alpha_dicarbonyl("O=CC=O")
- static check_alpha_diketone(target: str | Mol) tuple[bool, list[int], str]¶
Check for alpha-diketone groups in a molecule.
The SMARTS pattern used to identify alpha-diketone groups is O=#6D3#6D3=O.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_alpha_diketone("CC(=O)C(=O)C")
- static check_amide(target: str | Mol) tuple[bool, list[int], str]¶
Check for amide groups in a molecule.
The SMARTS pattern used to identify amide groups is [#7X3]#6X3[#6].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_amide("CC(=O)NC")
- static check_amine(target: str | Mol) tuple[bool, list[int], tuple[str, str, str, str]]¶
Check for amine groups in a molecule.
This function checks for primary, secondary, tertiary, and quaternary amine groups in the target molecule. The results are combined and returned as a single tuple.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any amine group was found. - A list of atom indices matching any of the amine group patterns. - A tuple of SMARTS strings representing the matched patterns for each type of amine group.
- Return type:
tuple[bool, list[int], tuple[str, str, str, str]]
Examples
>>> MolPatterns.check_amine("CN(C)C")
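The description above says that the results for the four amine classes are combined into a single tuple. One plausible way to do that is sketched below. The combination rule (logical OR of the flags, a sorted union of the atom indices, one SMARTS string per class) is an assumption, as are the name `combine_checks` and the placeholder inputs; the library's actual ordering and de-duplication may differ.

```python
from typing import Sequence


def combine_checks(results: Sequence[tuple]) -> tuple:
    """Merge several (found, atom_indices, smarts) tuples into one."""
    found = any(hit for hit, _, _ in results)
    # Sorted union, so an atom matched by two patterns is listed once.
    atom_indices = sorted({i for _, indices, _ in results for i in indices})
    patterns = tuple(smarts for _, _, smarts in results)
    return found, atom_indices, patterns


# Placeholder per-class results for trimethylamine, "CN(C)C".
# Indices and pattern labels are illustrative, not real library output.
per_class = [
    (False, [], "<primary amine SMARTS>"),
    (False, [], "<secondary amine SMARTS>"),
    (True, [0, 1, 2, 3], "<tertiary amine SMARTS>"),
    (False, [], "<quaternary amine SMARTS>"),
]
combined = combine_checks(per_class)
print(combined)
```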
- static check_amine_primary(target: str | Mol) tuple[bool, list[int], tuple[str, str]]¶
Check for primary amine groups in a molecule.
SMARTS patterns used:
- Nitrogen: [#7]
- Amine: [#6][#7D1&!$(NC=O)]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> MolPatterns.check_amine_primary("CCN")
- static check_amine_quaternary(target: str | Mol) tuple[bool, list[int], tuple[str, str]]¶
Check for quaternary amine groups in a molecule.
SMARTS patterns used:
- Nitrogen: [#7]
- Amine: [#6][#7D4+&!$NC=O)([#6])([#6])
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> MolPatterns.check_amine_quaternary("C[N+](C)(C)C")
- static check_amine_secondary(target: str | Mol) tuple[bool, list[int], tuple[str, str]]¶
Check for secondary amine groups in a molecule.
SMARTS patterns used:
- Nitrogen: [#7]
- Amine: [#6][#7D2&!$(NC=O)][#6]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> MolPatterns.check_amine_secondary("CCNC")
- static check_amine_tertiary(target: str | Mol) tuple[bool, list[int], tuple[str, str]]¶
Check for tertiary amine groups in a molecule.
SMARTS patterns used:
- Nitrogen: [#7]
- Amine: [#6]#7D3&!$(NC=O)[#6]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> MolPatterns.check_amine_tertiary("CN(C)C")
- static check_aminoacid(target: str | Mol) tuple[bool, list[int], str]¶
Check for generic amino acid residues in a molecule, including proline and glycine.
SMARTS patterns used:
- Generic amino acid: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4HCX3[OX2H,OX1-,N]
- Glycine: [$([$([NX3H2,NX4H3+]),$(NX3H(C))][CX4H2]CX3[OX2H,OX1-,N])]
- Proline: [$([NX3H,NX4H2+]),$(NX3(C)(C))]1CX4HCX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any of the patterns were found. - A list of atom indices matching the first found pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_aminoacid("C(C(=O)O)N")
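The return description above refers to the first found pattern, which suggests the patterns are tried in order and the search stops at the first hit. The sketch below shows that control flow with stand-in checkers. `first_found`, the `fake_*` functions, the trial order (generic, glycine, proline), and the value returned when nothing matches are all assumptions made for illustration.

```python
from typing import Callable, Iterable

Check = Callable[[str], tuple]


def first_found(target: str, checks: Iterable[Check]) -> tuple:
    """Return the result of the first check that reports a match."""
    for check in checks:
        found, atom_indices, smarts = check(target)
        if found:
            return True, atom_indices, smarts
    # Assumed fallback when no pattern matches.
    return False, [], ""


def fake_generic(target: str) -> tuple:
    # Stand-in: pretend the generic pattern never matches.
    return False, [], "<generic amino acid SMARTS>"


def fake_glycine(target: str) -> tuple:
    # Stand-in: recognise one literal SMILES string; indices are made up.
    hit = target == "C(C(=O)O)N"
    return hit, ([0, 1, 2, 3, 4] if hit else []), "<glycine SMARTS>"


def fake_proline(target: str) -> tuple:
    # Stand-in: pretend the proline pattern never matches.
    return False, [], "<proline SMARTS>"


checks = [fake_generic, fake_glycine, fake_proline]
result = first_found("C(C(=O)O)N", checks)
print(result)
```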
- static check_anhydride(target: str | Mol) tuple[bool, list[int], str]¶
Check for anhydride groups in a molecule.
The SMARTS pattern used to identify anhydride groups is CX3[OX2][CX3].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_anhydride("CC(=O)OC(=O)C")
- static check_arginine(target: str | Mol) tuple[bool, list[int], str]¶
Check for arginine residues in a molecule.
SMARTS pattern used: [CX3[OX2])CH1X4[CH2X4][CH2X4][CH2X4][ND2]=CD3[NX3]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_arginine("NC(CCCNC(N)=N)C(=O)O")
- static check_asparagine(target: str | Mol) tuple[bool, list[int], str]¶
Check for asparagine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$([NXH(C))]CX4H[NX3H2])CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_asparagine("NC(CC(=O)N)C(=O)O")
- static check_aspartate(target: str | Mol) tuple[bool, list[int], str]¶
Check for aspartate residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))][CX4H]H0-,OH])CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_aspartate("NC(CC(=O)O)C(=O)O")
- static check_azide(target: str | Mol) tuple[bool, list[int], str]¶
Check for azide groups in a molecule.
SMARTS pattern used: [$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_azide("CCN=[N+]=[N-]")
- static check_azo(target: str | Mol) tuple[bool, list[int], str]¶
Check for azo groups in a molecule.
SMARTS pattern used: [#6][#7D2]=[#7D2][#6]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_azo("C1=CC=C(C=C1)N=NC2=CC=CC=C2")
- static check_azoxy(target: str | Mol) tuple[bool, list[int], str]¶
Check for azoxy groups in a molecule.
SMARTS pattern used: [$([NX2]=NX3+[#6]),$([NX2]=NX3+0[#6])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_azoxy("CC1=NN(O)=CC=C1")
- static check_beta_dicarbonyl(target: str | Mol) tuple[bool, list[int], str]¶
Check for beta-dicarbonyl groups in a molecule.
The SMARTS pattern used to identify beta-dicarbonyl groups is O=[#6][#6][#6]=O.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_beta_dicarbonyl("O=CCC=O")
- static check_beta_diketone(target: str | Mol) tuple[bool, list[int], str]¶
Check for beta-diketone groups in a molecule.
The SMARTS pattern used to identify beta-diketone groups is O=#6D3[#6]#6D3=O.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_beta_diketone("CC(=O)CC(=O)C")
- static check_boron_group_elements(target: str | Mol) tuple[bool, list[int], str]¶
Check for boron group elements in a molecule.
SMARTS pattern used: [B,Al,Ga,In,Tl,Nh]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any boron group element was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_boron_group_elements("B(Cl)(Cl)Cl")
- static check_bromine(target: str | Mol) tuple[bool, list[int], str]¶
Check for bromine atoms in a molecule.
SMARTS pattern used: [Br]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether bromine was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_bromine("CCBr")
- static check_carbamate(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbamate groups in a molecule.
SMARTS pattern used: [NX3,NX4+]CX3[OX2,OX1-]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbamate("CN(C)C(=O)OC")
- static check_carbanion(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbanions in a molecule.
The SMARTS pattern used to identify carbanions is [#6-].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbanion("[CH2-]C")
- static check_carbocation(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbocations in a molecule.
The SMARTS pattern used to identify carbocations is [#6+].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbocation("[CH3+]")
- static check_carbon(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbon atoms in a molecule.
The SMARTS pattern used to identify carbon atoms is [#6].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbon("CCO")
- static check_carbon_group_elements(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbon group elements in a molecule (excluding carbon).
SMARTS pattern used: [Si,Ge,Sn,Pb,Fl]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any carbon group element was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbon_group_elements("[Si](Cl)(Cl)(Cl)Cl")
- static check_carbonate_ester(target: str | Mol) tuple[bool, list[int], tuple[str, str]]¶
Check for carbonate esters in a molecule.
This function identifies mono- and diesters of carbonic acid in the target molecule.
SMARTS patterns used:
- Monoester: CX3([OX2H0])[OX2H,OX1H0-1]
- Diester: CX3([OX2H0])[OX2H0]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> MolPatterns.check_carbonate_ester("COC(=O)OC")
- static check_carbonic_acid(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbonic acid groups in a molecule.
The SMARTS pattern used to identify carbonic acid groups is CX3([OX2])[OX2H,OX1H0-1]. This pattern matches both the acid and its conjugate base, but not carbonic acid diesters.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbonic_acid("O=C(O)O")
- static check_carbonyl(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbonyl groups in a molecule.
The SMARTS pattern used to identify carbonyl groups is [$([CX3]=[OX1]),$([CX3+]-[OX1-])].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbonyl("CC(=O)O")
- static check_carbosulphone(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbosulphone groups in a molecule.
SMARTS pattern used: $([#16X4(=[OX1])([#6])[#6]), $(#16X4+2([OX1-])([#6])[#6])]
Carbosulphones are sulphones with two carbon substituents.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbosulphone("CCS(=O)(=O)C")
- static check_carbosulphoxide(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbosulphoxide groups in a molecule.
SMARTS pattern used: $([#16X3([#6])[#6]), $(#16X3+([#6])[#6])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carbosulphoxide("CCS(=O)C")
- static check_carboxyl(target: str | Mol) tuple[bool, list[int], str]¶
Check for carboxyl groups in a molecule.
The SMARTS pattern used to identify carboxyl groups is CX3[OX1H0-,OX2H1].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_carboxyl("CC(=O)O")
- static check_chalcogens(target: str | Mol) tuple[bool, list[int], str]¶
Check for chalcogens in a molecule (excluding oxygen and sulphur).
SMARTS pattern used: [Se,Te,Po,Lv]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any chalcogen was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_chalcogens("[Se]=C")
- static check_chlorine(target: str | Mol) tuple[bool, list[int], str]¶
Check for chlorine atoms in a molecule.
SMARTS pattern used: [Cl]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether chlorine was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_chlorine("CCCl")
- static check_cyanamide(target: str | Mol) tuple[bool, list[int], str]¶
Check for cyanamide groups in a molecule.
SMARTS pattern used: [NX3][CX2]#[NX1]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_cyanamide("NC#N")
- static check_cyanate(target: str | Mol) tuple[bool, list[int], str]¶
Check for cyanate groups in a molecule.
SMARTS pattern used: [#8D2][#6D2]#[#7D1]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_cyanate("OC#N")
- static check_cysteine(target: str | Mol) tuple[bool, list[int], str]¶
Check for cysteine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))][CX4H]([CH2X4][SX2H,SX1H0-])CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_cysteine("C(C(C(=O)O)N)S")
- static check_delta_dicarbonyl(target: str | Mol) tuple[bool, list[int], str]¶
Check for delta-dicarbonyl groups in a molecule.
The SMARTS pattern used to identify delta-dicarbonyl groups is O=[#6][#6][#6][#6][#6]=O.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_delta_dicarbonyl("O=CCCCC=O")
- static check_diazo(target: str | Mol) tuple[bool, list[int], str]¶
Check for diazo groups in a molecule.
SMARTS pattern used: [$([#6]=[N+]=[N-]),$([#6-]-[N+]#[N])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_diazo("C=[N+]=[N-]")
- static check_disulphide(target: str | Mol) tuple[bool, list[int], str]¶
Check for disulphide groups in a molecule.
SMARTS pattern used: [#16X2H0][#16X2H0]
Disulphides contain an S-S bond, commonly found in biological systems such as cystine.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_disulphide("CSSC")
- static check_enamine(target: str | Mol) tuple[bool, list[int], str]¶
Check for enamine groups in a molecule.
The SMARTS pattern used to identify enamine groups is [NX3][CX3]=[CX3].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_enamine("C=CN")
- static check_enol(target: str | Mol) tuple[bool, list[int], str]¶
Check for enol groups in a molecule. Matches both enol and enolate forms.
SMARTS pattern used: [$([OX2H][#6X3]=[#6]),$([OX1-][#6X3]=[#6])]
Enols are vinylic alcohols with the structure HOCR’=CR₂, tautomeric with aldehydes or ketones.
Reference: https://doi.org/10.1351/goldbook.E02124
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_enol("C=C(O)C")
- static check_ester(target: str | Mol) tuple[bool, list[int], str]¶
Check for ester groups in a molecule.
The SMARTS pattern used to identify ester groups is [#6]CX3[OX2H0][#6].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_ester("CC(=O)OC")
- static check_ether(target: str | Mol) tuple[bool, list[int], str]¶
Check for ether groups in a molecule.
The SMARTS pattern used to identify ether groups is OD2[#6].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_ether("COC")
- static check_fluorine(target: str | Mol) tuple[bool, list[int], str]¶
Check for fluorine atoms in a molecule.
SMARTS pattern used: [F]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether fluorine was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_fluorine("CCF")
- static check_gamma_dicarbonyl(target: str | Mol) tuple[bool, list[int], str]¶
Check for gamma-dicarbonyl groups in a molecule.
The SMARTS pattern used to identify gamma-dicarbonyl groups is O=[#6][#6][#6][#6]=O.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_gamma_dicarbonyl("O=CCCC=O")
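The alpha-, beta-, gamma- and delta-dicarbonyl patterns documented in this class differ only in the number of carbon atoms in the chain that carries the two carbonyl oxygens: two, three, four and five respectively. The sketch below regenerates the four documented SMARTS strings from that count. `dicarbonyl_smarts` and `DICARBONYL_PATTERNS` are hypothetical names for illustration; the library may simply store the strings as literals.

```python
def dicarbonyl_smarts(n_carbons: int) -> str:
    """Build 'O=' + n carbon atoms + '=O'; the end carbons are the carbonyls."""
    if n_carbons < 2:
        raise ValueError("a dicarbonyl needs at least two carbon atoms")
    return "O=" + "[#6]" * n_carbons + "=O"


# Greek-letter prefix mapped to the number of carbons in the chain.
DICARBONYL_PATTERNS = {
    prefix: dicarbonyl_smarts(n)
    for prefix, n in (("alpha", 2), ("beta", 3), ("gamma", 4), ("delta", 5))
}

for prefix, smarts in DICARBONYL_PATTERNS.items():
    print(f"{prefix}-dicarbonyl: {smarts}")
```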
- static check_gamma_diketone(target: str | Mol) tuple[bool, list[int], str]¶
Check for gamma-diketone groups in a molecule.
The SMARTS pattern used to identify gamma-diketone groups is O=#6D3[#6][#6]#6D3=O.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_gamma_diketone("CC(=O)CCC(=O)C")
- static check_glutamate(target: str | Mol) tuple[bool, list[int], str]¶
Check for glutamate residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4H[OH0-,OH])CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_glutamate("NC(CCC(=O)O)C(=O)O")
- static check_glycine(target: str | Mol) tuple[bool, list[int], str]¶
Check for glycine residues in a molecule.
SMARTS pattern used: [$([$([NX3H2,NX4H3+]),$(NX3H(C))][CX4H2]CX3[OX2H,OX1-,N])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_glycine("C(C(=O)O)N")
- static check_haloalkane(target: str | Mol) tuple[bool, list[int], str]¶
Check for haloalkane groups in a molecule.
SMARTS pattern used: [CX4]-[F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_haloalkane("CCl")
- static check_haloalkane_primary(target: str | Mol) tuple[bool, list[int], str]¶
Check for primary haloalkane groups in a molecule.
SMARTS pattern used: [CX4H3,CX4H2]-[F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_haloalkane_primary("CCl")
- static check_haloalkane_secondary(target: str | Mol) tuple[bool, list[int], str]¶
Check for secondary haloalkane groups in a molecule.
SMARTS pattern used: [CX4H1]-[F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_haloalkane_secondary("CC(Cl)CC")
- static check_haloalkane_tertiary(target: str | Mol) tuple[bool, list[int], str]¶
Check for tertiary haloalkane groups in a molecule.
SMARTS pattern used: [CX4H0]-[F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_haloalkane_tertiary("C(C)(C)(Cl)C")
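The primary, secondary, and tertiary haloalkane patterns differ only in the hydrogen count they require on the halogen-bearing sp3 carbon. The following is a minimal sketch of that rule in plain Python, not the library's implementation; the helper `classify_halo_carbon` is hypothetical, and the hydrogen counts are entered by hand rather than computed from the SMILES.

```python
def classify_halo_carbon(total_h: int) -> str:
    """Classify a halogen-bearing sp3 carbon by its hydrogen count.

    Mirrors the hydrogen constraints in the three SMARTS patterns;
    this helper is illustrative and not part of mlchem.
    """
    if total_h in (2, 3):
        return "primary"    # [CX4H3,CX4H2]
    if total_h == 1:
        return "secondary"  # [CX4H1]
    if total_h == 0:
        return "tertiary"   # [CX4H0]
    raise ValueError(f"unexpected hydrogen count for a halo carbon: {total_h}")


# Hydrogen counts on the carbon bonded to chlorine, entered by hand.
labels = {
    "CCl": classify_halo_carbon(3),         # chloromethane
    "CC(Cl)CC": classify_halo_carbon(1),    # 2-chlorobutane
    "CC(C)(C)Cl": classify_halo_carbon(0),  # tert-butyl chloride
}
print(labels)
```

Because the patterns look only at hydrogen count, a halomethane carbon (three hydrogens) falls under the primary pattern.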
- static check_haloalkene(target: str | Mol) tuple[bool, list[int], str]¶
Check for haloalkene groups in a molecule.
SMARTS pattern used: [C&!c]=[C&!c][F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_haloalkene("C=CCl")
- static check_halogen(target: str | Mol) tuple[bool, list[int], str]¶
Check for halogen atoms in a molecule.
SMARTS pattern used: [F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any halogen atom was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_halogen("CCCl")
- static check_halogen_carbon(target: str | Mol) tuple[bool, list[int], str]¶
Check for carbon atoms connected to halogens in a molecule.
SMARTS pattern used: [#6]~[F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_halogen_carbon("CCCl")
- static check_halogen_nitrogen(target: str | Mol) tuple[bool, list[int], str]¶
Check for nitrogen atoms connected to halogens in a molecule.
SMARTS pattern used: [#7]~[F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_halogen_nitrogen("NCl")
- static check_halogen_oxygen(target: str | Mol) tuple[bool, list[int], str]¶
Check for oxygen atoms connected to halogens in a molecule.
SMARTS pattern used: [#8]~[F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_halogen_oxygen("OCl")
- static check_hbond_acceptors(target: str | Mol) tuple[bool, list[int], str]¶
Check for hydrogen bond acceptors in a molecule.
SMARTS pattern used (H-bond acceptor): [!$([#6,F,Cl,Br,I,o,s,nX3,#7v5,#15v5,#16v4,#16v6,*+1,*+2,*+3])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_hbond_acceptors("CC(=O)O")
- static check_hbond_acceptors_higher_than(target: str | Mol, n: int) tuple[bool, list[int], str]¶
Check for a number of hydrogen bond acceptors strictly higher than a threshold.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
n (int) – Minimum number of hydrogen bond acceptors (exclusive).
- Returns:
A tuple containing: - A boolean indicating whether the number of acceptors is greater than n. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_hbond_acceptors_higher_than("CC(=O)O", 1)
- static check_hbond_acceptors_lower_than(target: str | Mol, n: int) tuple[bool, list[int], str]¶
Check for a number of hydrogen bond acceptors strictly lower than a threshold.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
n (int) – Maximum number of hydrogen bond acceptors (exclusive).
- Returns:
A tuple containing: - A boolean indicating whether the number of acceptors is less than n. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_hbond_acceptors_lower_than("CC(=O)O", 3)
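Both threshold checks are documented as strict comparisons, so a count equal to n passes neither. A minimal sketch of that logic in plain Python, assuming the count is simply the number of matched atom indices; the helper names and the index list are hypothetical and not part of mlchem:

```python
def count_higher_than(atom_indices: list[int], n: int) -> bool:
    """Return True only when strictly more than n atoms matched."""
    return len(atom_indices) > n


def count_lower_than(atom_indices: list[int], n: int) -> bool:
    """Return True only when strictly fewer than n atoms matched."""
    return len(atom_indices) < n


# Hypothetical acceptor indices for a molecule with two matched atoms.
acceptors = [2, 3]

higher_1 = count_higher_than(acceptors, 1)  # 2 > 1
higher_2 = count_higher_than(acceptors, 2)  # 2 > 2 fails: bound is exclusive
lower_3 = count_lower_than(acceptors, 3)    # 2 < 3
lower_2 = count_lower_than(acceptors, 2)    # 2 < 2 fails: bound is exclusive
print(higher_1, higher_2, lower_3, lower_2)
```

To express "at least n" with an exclusive bound, pass n - 1 to the higher-than check; for "at most n", pass n + 1 to the lower-than check.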
- static check_hbond_donors(target: str | Mol) tuple[bool, list[int], str]¶
Check for hydrogen bond donors in a molecule.
SMARTS pattern used (H-bond donor): [!$([#6,H0,-,-2,-3])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_hbond_donors("CC(O)N")
- static check_hbond_donors_higher_than(target: str | Mol, n: int) tuple[bool, list[int], str]¶
Check for a number of hydrogen bond donors strictly higher than a threshold.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
n (int) – Minimum number of hydrogen bond donors (exclusive).
- Returns:
A tuple containing: - A boolean indicating whether the number of donors is greater than n. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_hbond_donors_higher_than("CC(O)N", 1)
- static check_hbond_donors_lower_than(target: str | Mol, n: int) tuple[bool, list[int], str]¶
Check for a number of hydrogen bond donors strictly lower than a threshold.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
n (int) – Maximum number of hydrogen bond donors (exclusive).
- Returns:
A tuple containing: - A boolean indicating whether the number of donors is less than n. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_hbond_donors_lower_than("CC(O)N", 3)
- static check_histidine(target: str | Mol) tuple[bool, list[int], str]¶
Check for histidine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4H,$([#7X3H])]:[#6X3H]:[$([#7X3H+,#7X2H0+0]:[#6X3H]:[#7X3H]),$([#7X3H])]:[#6X3H]1)CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_histidine("NC(Cc1c[nH]cn1)C(=O)O")
- static check_hydrazine(target: str | Mol) tuple[bool, list[int], str]¶
Check for hydrazine groups in a molecule.
SMARTS pattern used: [NX3][NX3]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_hydrazine("NN")
- static check_hydrazone(target: str | Mol) tuple[bool, list[int], str]¶
Check for hydrazone groups in a molecule.
Hydrazones are compounds with the structure R₂C=NNR₂, derived from aldehydes or ketones by replacing =O with =NNH₂ or analogues. Reference: https://doi.org/10.1351/goldbook.H02884
SMARTS pattern used: [#7X3][#7D2]=#6D3[#6]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_hydrazone("CC(C)=NN(C)C")
- static check_imide(target: str | Mol) tuple[bool, list[int], str]¶
Check for imide groups in a molecule.
SMARTS pattern used: [#6][#6D3#7X3]#6D3[#6]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_imide("O=C1NC(=O)CC1")
- static check_imine(target: str | Mol) tuple[bool, list[int], str]¶
Check for imine groups in a molecule.
SMARTS pattern used: $([CX3[#6]),$([CX3H][#6])]=[$([NX2][#6]),$([NX2H])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_imine("CC=NC")
- static check_iminium(target: str | Mol) tuple[bool, list[int], str]¶
Check for iminium groups in a molecule.
SMARTS pattern used: [NX3+]=[CX3]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_iminium("C=[N+](C)C")
- static check_iodine(target: str | Mol) tuple[bool, list[int], str]¶
Check for iodine atoms in a molecule.
SMARTS pattern used: [I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether iodine was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_iodine("CCI")
- static check_isocyanate(target: str | Mol) tuple[bool, list[int], str]¶
Check for isocyanate groups in a molecule.
SMARTS pattern used: [#7D2]=[#6D2]=[#8D1]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_isocyanate("CN=C=O")
- static check_isoleucine(target: str | Mol) tuple[bool, list[int], str]¶
Check for isoleucine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4H[CH2X4][CH3X4])CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_isoleucine("CCC(C)C(N)C(=O)O")
- static check_isonitrile(target: str | Mol) tuple[bool, list[int], str]¶
Check for isonitrile groups in a molecule.
SMARTS pattern used: [CX1-]#[NX2+]
Isonitriles are isomeric forms of hydrocyanic acid and its derivatives (RN≡C). Reference: https://doi.org/10.1351/goldbook.I03270
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_isonitrile("C[N+]#[C-]")
- static check_isothiocyanate(target: str | Mol) tuple[bool, list[int], str]¶
Check for isothiocyanate groups in a molecule.
SMARTS pattern used: [#7D2]=[#6]=[#16D1]
Isothiocyanates are sulphur analogues of isocyanates (RN=C=S). Reference: https://doi.org/10.1351/goldbook.I03320
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_isothiocyanate("CN=C=S")
- static check_ketone(target: str | Mol) tuple[bool, list[int], str]¶
Check for ketone groups in a molecule.
The SMARTS pattern used to identify ketone groups is [#6]CX3[#6].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_ketone("CC(=O)C")
- static check_leucine(target: str | Mol) tuple[bool, list[int], str]¶
Check for leucine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4H[CH3X4])CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_leucine("CC(C)CC(N)C(=O)O")
- static check_lysine(target: str | Mol) tuple[bool, list[int], str]¶
Check for lysine residues in a molecule.
SMARTS pattern used: CX3([OX2])CH1X4[CH2X4][CH2X4][CH2X4]CD3[CD1]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_lysine("NCCCCC(N)C(=O)O")
- static check_methionine(target: str | Mol) tuple[bool, list[int], str]¶
Check for methionine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$([NX3H]))]CX4HCX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_methionine("CSCCC(N)C(=O)O")
- static check_n_oxide(target: str | Mol) tuple[bool, list[int], str]¶
Check for N-oxide groups in a molecule.
SMARTS pattern used: [$([#7X3H1,#7X3&!#7X3H2,#7X3H0,#7X4+][#8]); !$(#7~[O]); !$([#7]=[#7])]
N-oxides are derived from tertiary amines by attachment of an oxygen atom to the nitrogen. Reference: https://doi.org/10.1351/goldbook.A00273
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_n_oxide("C[N+](C)(C)[O-]")
- static check_neg_charge_1(target: str | Mol) tuple[bool, list[int], str]¶
Check for negatively charged atoms in a molecule.
SMARTS pattern used (negative charge): [-]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_neg_charge_1("[O-]C(=O)C")
- static check_neg_charge_2(target: str | Mol) tuple[bool, list[int], str]¶
Check for two negatively charged atoms in a molecule.
SMARTS pattern used (two negative charges): [-].[-]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_neg_charge_2("[O-].[O-]C(=O)C")
- static check_neg_charge_3(target: str | Mol) tuple[bool, list[int], str]¶
Check for three negatively charged atoms in a molecule.
SMARTS pattern used (three negative charges): [-].[-].[-]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_neg_charge_3("[O-].[O-].[O-]C(=O)C")
- static check_nitrate(target: str | Mol) tuple[bool, list[int], str]¶
Check for nitrate groups in a molecule.
SMARTS pattern used: $([NX3(=[OX1])O),$(NX3+(=[OX1])O)]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_nitrate("CO[N+](=O)[O-]")
- static check_nitrile(target: str | Mol) tuple[bool, list[int], str]¶
Check for nitrile groups in a molecule.
SMARTS pattern used: [NX1]#[CX2]
Nitriles are compounds with the structure RC≡N, i.e., C-substituted derivatives of hydrocyanic acid. Reference: https://doi.org/10.1351/goldbook.N04151
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_nitrile("CC#N")
- static check_nitro(target: str | Mol) tuple[bool, list[int], str]¶
Check for nitro groups in a molecule.
SMARTS pattern used: $(NX3=O),$([NX3+[O-])][!#8]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_nitro("CC(=O)[N+](=O)[O-]")
- static check_nitrogen(target: str | Mol) tuple[bool, list[int], str]¶
Check for nitrogen atoms in a molecule.
The SMARTS pattern used to identify nitrogen atoms is [#7].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_nitrogen("CN")
- static check_nitrogen_group_elements(target: str | Mol) tuple[bool, list[int], str]¶
Check for nitrogen group elements in a molecule (excluding nitrogen and phosphorus).
SMARTS pattern used: [As,Sb,Bi]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any nitrogen group element was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_nitrogen_group_elements("[As]")
- static check_nitroso(target: str | Mol) tuple[bool, list[int], str]¶
Check for nitroso groups in a molecule.
SMARTS pattern used: [NX2]=[OX1]
Matches nitroso groups (-NO) attached to carbon or another element. Reference: https://doi.org/10.1351/goldbook.N04169
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_nitroso("C1=CC=CC=C1N=O")
- static check_noble_gases(target: str | Mol) tuple[bool, list[int], str]¶
Check for noble gases in a molecule.
SMARTS pattern used: [He,Ne,Ar,Kr,Xe,Rn]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any noble gas was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_noble_gases("[Ar]")
- static check_oxohalide(target: str | Mol) tuple[bool, list[int], str]¶
Check for oxohalide groups in a molecule.
SMARTS pattern used: [#8]=[*H0]~[F,Cl,Br,I]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_oxohalide("CC(=O)Cl")
- static check_oxygen(target: str | Mol) tuple[bool, list[int], str]¶
Check for oxygen atoms in a molecule.
SMARTS pattern used: [#8]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_oxygen("CCO")
- static check_pattern_aliphatic(target: str | Mol, pattern_function) tuple[bool, list[int], tuple[str, str]]¶
Check if a pattern is aliphatic in a molecule.
This function takes a molecule and a pattern function, and checks whether the pattern is aliphatic in the molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
pattern_function (Callable) – A function that takes a molecule and returns a tuple of (bool, list of atom indices, SMARTS string).
- Returns:
A tuple containing: - A boolean indicating whether the pattern is aliphatic. - A list of atom indices matching the aliphatic pattern. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> MolPatterns.check_pattern_aliphatic("CC", some_function)
- static check_pattern_aromatic(target: str | Mol, pattern_function) tuple[bool, list[int], tuple[str, str]]¶
Check if a pattern is aromatic in a molecule.
This function takes a molecule and a pattern function, and checks whether the pattern is aromatic in the molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
pattern_function (Callable) – A function that takes a molecule and returns a tuple of (bool, list of atom indices, SMARTS string).
- Returns:
A tuple containing: - A boolean indicating whether the pattern is aromatic. - A list of atom indices matching the aromatic pattern. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> MolPatterns.check_pattern_aromatic("c1ccccc1", some_function)
- static check_pattern_aromatic_substituent(target: str | Mol, pattern_function) tuple[bool, list[int], tuple[str, str]]¶
Check if a pattern is an aromatic substituent in a molecule.
This function takes a molecule and a pattern function, and checks whether the pattern is an aromatic substituent in the molecule.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
pattern_function (Callable) – A function that takes a molecule and returns a tuple of (bool, list of atom indices, SMARTS string).
- Returns:
A tuple containing: - A boolean indicating whether the pattern is an aromatic substituent. - A list of atom indices matching the substituent pattern. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> MolPatterns.check_pattern_aromatic_substituent("c1ccccc1C", some_function)
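The `some_function` placeholder in the three examples above stands for any callable with the documented shape: it accepts a target and returns a tuple of (bool, list of atom indices, SMARTS string). The sketch below illustrates that contract without RDKit. The type aliases, `is_pattern_result`, and `toy_check_chlorine` are all hypothetical names introduced for illustration, and the toy function reports string offsets rather than real atom indices.

```python
from typing import Callable

# Documented result shape: (found, atom indices, SMARTS string).
PatternResult = tuple[bool, list[int], str]
PatternFunction = Callable[[str], PatternResult]


def is_pattern_result(value: object) -> bool:
    """Check that a value has the (bool, list[int], str) shape."""
    if not (isinstance(value, tuple) and len(value) == 3):
        return False
    found, indices, smarts = value
    return (
        isinstance(found, bool)
        and isinstance(indices, list)
        and all(isinstance(i, int) for i in indices)
        and isinstance(smarts, str)
    )


def toy_check_chlorine(target: str) -> PatternResult:
    """Stand-in pattern function that locates 'Cl' by string position.

    Real checks return RDKit atom indices; string offsets are used here
    only so that the sketch runs without RDKit.
    """
    positions = [i for i in range(len(target)) if target.startswith("Cl", i)]
    return bool(positions), positions, "[Cl]"


check: PatternFunction = toy_check_chlorine
result = check("c1ccccc1Cl")
print(result, is_pattern_result(result))
```

In practice the second argument would more likely be one of the documented checks, such as `MolPatterns.check_halogen`, passed as a function object without calling it.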
- static check_phenylalanine(target: str | Mol) tuple[bool, list[int], str]¶
Check for phenylalanine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))][CX4H]([OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_phenylalanine("NC(CC1=CC=CC=C1)C(=O)O")
- static check_phosphoric_acid(target: str | Mol) tuple[bool, list[int], str]¶
Check for phosphoric acid groups in a molecule.
SMARTS pattern used: [$(P(=[OX1])([$([OX2H]),$([OX1-]),$([OX2]P)])([$([OX2H]),$([OX1-]),$([OX2]P)])[$([OX2H]),$([OX1-]),$([OX2]P)]), $(P+([$([OX2H]),$([OX1-]),$([OX2]P)])([$([OX2H]),$([OX1-]),$([OX2]P)])[$([OX2H]),$([OX1-]),$([OX2]P)])]
This pattern matches orthophosphoric acid and polyphosphoric acid anhydrides, but not mono- or di-esters of monophosphoric acid.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_phosphoric_acid("OP(=O)(O)O")
- static check_phosphoric_ester(target: str | Mol) tuple[bool, list[int], str]¶
Check for phosphoric ester groups in a molecule.
SMARTS pattern used: [$(P(=[OX1])([OX2][#6])([$([OX2H]),$([OX1-]),$([OX2][#6])])[$([OX2H]),$([OX1-]),$([OX2][#6]),$([OX2]P)]), $(P+([OX2][#6])([$([OX2H]),$([OX1-]),$([OX2][#6])])[$([OX2H]),$([OX1-]),$([OX2][#6]),$([OX2]P)])]
This pattern matches both neutral and charged forms of phosphoric esters, but not non-ester phosphoric acids.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_phosphoric_ester("COP(=O)(OC)OC")
- static check_phosphorus(target: str | Mol) tuple[bool, list[int], str]¶
Check for phosphorus atoms in a molecule.
SMARTS pattern used: [P]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_phosphorus("CP(=O)(O)O")
- static check_pos_charge_1(target: str | Mol) tuple[bool, list[int], str]¶
Check for positively charged atoms in a molecule.
SMARTS pattern used: [+]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any positive charge was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_pos_charge_1("C[N+](C)(C)C")
- static check_pos_charge_2(target: str | Mol) tuple[bool, list[int], str]¶
Check for two positively charged atoms in a molecule.
SMARTS pattern used: [+].[+]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether two positive charges were found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_pos_charge_2("[Na+].[Na+]")
- static check_pos_charge_3(target: str | Mol) tuple[bool, list[int], str]¶
Check for three positively charged atoms in a molecule.
SMARTS pattern used: [+].[+].[+]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether three positive charges were found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_pos_charge_3("[Na+].[Na+].[K+]")
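In SMARTS, a dot separates fragments that need not be bonded to each other, so [+].[+] asks for two distinct atoms that each match [+]. The sketch below builds the dot-separated query for any count and runs it with RDKit directly. The helper name has_n_positive_atoms is hypothetical and is not part of mlchem.

```python
from rdkit import Chem


def has_n_positive_atoms(smiles, n):
    """Return True if at least n distinct atoms match the [+] primitive."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # "[+].[+]" for n=2, "[+].[+].[+]" for n=3, and so on.
    query = Chem.MolFromSmarts(".".join(["[+]"] * n))
    return mol.HasSubstructMatch(query)


one_cation = has_n_positive_atoms("C[N+](C)(C)C", 2)
two_cations = has_n_positive_atoms("[Na+].[Na+]", 2)
# Both charged atoms sit in the same molecule here; the query still matches
# because no component-level grouping is used in the pattern.
same_molecule = has_n_positive_atoms("C[N+](C)(C)CC[N+](C)(C)C", 2)
three_cations = has_n_positive_atoms("[Na+].[Na+].[K+]", 3)
```

Because each query atom must map to a different target atom, a molecule with a single cation does not satisfy the two-atom pattern.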
- static check_proline(target: str | Mol) tuple[bool, list[int], str]¶
Check for proline residues in a molecule.
SMARTS pattern used: [$([NX3H,NX4H2+]),$(NX3(C)(C))]1CX4HCX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_proline("C1CC(NC1)C(=O)O")
- static check_serine(target: str | Mol) tuple[bool, list[int], str]¶
Check for serine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4HCX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_serine("NC(CO)C(=O)O")
- static check_sulphamic_acid(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphamic acid groups in a molecule.
SMARTS pattern used: $([#16X4(=[OX1])(=[OX1])[OX2H,OX1H0-]), $(#16X4+2([OX1-])([OX1-])[OX2H,OX1H0-])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphamic_acid("NS(=O)(=O)O")
- static check_sulphamic_ester(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphamic ester groups in a molecule.
SMARTS pattern used: $([#16X4(=[OX1])(=[OX1])[OX2][#6]), $(#16X4+2([OX1-])([OX1-])[OX2][#6])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphamic_ester("NS(=O)(=O)OC")
- static check_sulphenic_acid(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphenic acid groups in a molecule.
SMARTS pattern used: [#16X2][OX2H,OX1H0-]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphenic_acid("CSO")
- static check_sulphenic_ester(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphenic ester groups in a molecule.
SMARTS pattern used: [#16X2][OX2H0][#6]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphenic_ester("CSOC")
- static check_sulphide(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphide groups in a molecule.
SMARTS pattern used: [#6][#16D2][#6]
Sulphides are compounds with the structure R-S-R’ (R ≠ H), also known as thioethers.
Reference¶
https://doi.org/10.1351/goldbook.S06102
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphide("CCSC")
- static check_sulphinic_acid(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphinic acid groups in a molecule.
SMARTS pattern used: [$([#6]#16X3[OX2H,OX1H0-]), $([#6]#16X3+[OX2H,OX1H0-])]
Sulphinic acids (RS(=O)OH) and their conjugate bases (sulphinates) are included.
Reference¶
https://doi.org/10.1351/goldbook.S06109
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphinic_acid("CS(=O)O")
- static check_sulphinic_ester(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphinic ester groups in a molecule.
SMARTS pattern used: [$([#6]#16X3[OX2][#6]), $([#6]#16X3+[OX2][#6])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphinic_ester("CS(=O)OC")
- static check_sulphonamide(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphonamide groups in a molecule.
SMARTS pattern used: $([SX4(=[OX1])([!O])[NX3]), $(SX4+2([OX1-])([!O])[NX3])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphonamide("CS(=O)(=O)N")
- static check_sulphone(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphone groups in a molecule.
SMARTS pattern used: $([#16X4=[OX1]), $(#16X4+2[OX1-])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphone("CS(=O)(=O)C")
- static check_sulphonic_acid(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphonic acid groups in a molecule.
SMARTS pattern used: $([#16X4(=[OX1])[OX2H,OX1H0-]), $([#16X42([OX1-])[OX2H,OX1H0-])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphonic_acid("CS(=O)(=O)O")
- static check_sulphonic_ester(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphonic ester groups in a molecule.
SMARTS pattern used: $([#16X4(=[OX1])[OX2H0]), $(#16X4+2([OX1-])[OX2H0])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphonic_ester("CS(=O)(=O)OC")
- static check_sulphoxide(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphoxide groups in a molecule.
SMARTS pattern used: [$([#16X3]=[OX1]), $([#16X3+][OX1-])]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphoxide("CS(=O)C")
- static check_sulphur(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphur atoms in a molecule.
SMARTS pattern used: [#16]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphur("CCS")
- static check_sulphuric_acid(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphuric acid groups in a molecule.
SMARTS pattern used: $([SX4(=[OX1])([OX2H1,OX1-])[OX2H1,OX1-])]
Matches both acid and conjugate base forms.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphuric_acid("OS(=O)(=O)O")
- static check_sulphuric_ester(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphuric ester groups in a molecule.
SMARTS pattern used: $([SX4(=[OX1])([OX2H1])[OX2H0][#6]), $(SX4(=[OX1])([OX2H0][#6])[OX2H0][#6])]
Matches both mono- and di-esters of sulphuric acid.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_sulphuric_ester("COS(=O)(=O)OC")
- static check_thioaldehyde(target: str | Mol) tuple[bool, list[int], str]¶
Check for thioaldehyde groups in a molecule.
SMARTS pattern used: [#6][#6X3H1]=[#16X1]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thioaldehyde("CC=S")
- static check_thioanhydride(target: str | Mol) tuple[bool, list[int], str]¶
Check for thioanhydride groups in a molecule.
SMARTS pattern used: CX3[SX2]CX3
Thioanhydrides are compounds with the structure acyl-S-acyl, also called diacylsulfanes.
Reference¶
https://doi.org/10.1351/goldbook.T06351
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thioanhydride("CC(=O)SC(=O)C")
- static check_thiocarbamate(target: str | Mol) tuple[bool, list[int], str]¶
Check for thiocarbamate groups in a molecule.
SMARTS pattern used: [$([#6][#8D2]CD3[#7X3,#7X4+]), $([#6][#16D2]CD3[#7X3,#7X4+])]
Matches both O- and S-organyl thiocarbamates and their conjugate bases.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thiocarbamate("COC(=S)NC")
- static check_thiocarbonyl(target: str | Mol) tuple[bool, list[int], str]¶
Check for thiocarbonyl groups in a molecule.
SMARTS pattern used: [#6X3]=[#16X1]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thiocarbonyl("CC(=S)C")
- static check_thiocarboxylic(target: str | Mol) tuple[bool, list[int], str]¶
Check for thiocarboxylic groups in a molecule.
SMARTS pattern used: $([$([CX3[OX2H1]),$(CX3[OX1-])]), $($([CX3[SX2H1]),$(CX3[SX1-])]), $($([CX3[SX2H1]),$(CX3[SX1-])])]
Thiocarboxylic acids are compounds where one or both oxygens of a carboxy group are replaced by sulphur.
Reference¶
https://doi.org/10.1351/goldbook.T06352
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thiocarboxylic("CC(=S)S")
- static check_thiocyanate(target: str | Mol) tuple[bool, list[int], str]¶
Check for thiocyanate groups in a molecule.
SMARTS pattern used: [#16D2]#[#7]
Thiocyanates are salts and esters of thiocyanic acid (HSC≡N), e.g. methyl thiocyanate (CH₃SC≡N).
Reference¶
https://doi.org/10.1351/goldbook.T06353
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thiocyanate("CSC#N")
- static check_thioester(target: str | Mol) tuple[bool, list[int], str]¶
Check for thioester groups in a molecule.
SMARTS pattern used: [$(S([#6])CX3),$(O([#6])CX3),$(#16CX3)]
Matches mono- and di-thioesters.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thioester("CC(=O)SC")
- static check_thioketone(target: str | Mol) tuple[bool, list[int], str]¶
Check for thioketone groups in a molecule.
SMARTS pattern used: [#6]#6D3=[#16X1]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thioketone("CC(=S)C")
- static check_thiol(target: str | Mol) tuple[bool, list[int], str]¶
Check for thiol groups in a molecule.
SMARTS pattern used: [#16X2H]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_thiol("CCS")
- static check_threonine(target: str | Mol) tuple[bool, list[int], str]¶
Check for threonine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4H[OX2H])CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_threonine("CC(O)C(N)C(=O)O")
- static check_transition_metals(target: str | Mol) tuple[bool, list[int], str]¶
Check for transition metals in a molecule.
SMARTS pattern used: [Sc,Ti,V,Cr,Mn,Fe,Co,Ni,Cu,Zn,Y,Zr,Nb,Mo,Tc,Ru,Rh,Pd,Ag,Cd,La,Ce,Pr,Nd,Pm,Sm,Eu,Gd,Tb,Dy,Ho,Er,Tm,Yb,Lu,Ac,Th,Pa,U,Np,Pu,Am,Cm,Bk,Cf,Es,Fm,Md,No,Lr,Rf,Db,Sg,Bh,Hs,Mt,Ds,Rg,Cn]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether any transition metal was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_transition_metals("[Fe++]")
- static check_tryptophan(target: str | Mol) tuple[bool, list[int], str]¶
Check for tryptophan residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4HCX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_tryptophan("NC(Cc1c[nH]c2ccccc12)C(=O)O")
- static check_tyrosine(target: str | Mol) tuple[bool, list[int], str]¶
Check for tyrosine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))]CX4H[cX3H][cX3H]1)CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_tyrosine("NC(Cc1ccc(O)cc1)C(=O)O")
- static check_unbranched_rotatable_carbons(target: str | Mol, n_units: int) tuple[bool, list[int], str]¶
Check for unbranched rotatable carbon chains in a molecule.
SMARTS pattern used: [R0;CD2]- repeated n_units times (unbranched rotatable carbon).
Matches: specifically carbon atoms (C) that are non-cyclic (R0) and have two connections (D2).
Use case: more specific; it only detects unbranched chains made of aliphatic carbon atoms.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
n_units (int) – Number of repeated carbon rotatable units.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_unbranched_rotatable_carbons("CCCC", 3)
- static check_unbranched_rotatable_chain(target: str | Mol, n_units: int) tuple[bool, list[int], str]¶
Check for unbranched rotatable chains in a molecule.
SMARTS pattern used: [R0;D2]- repeated n_units times (unbranched rotatable chain).
Matches: any non-cyclic, aliphatic atom with two connections (degree 2), regardless of element type.
Use case: more general; it detects unbranched chains of any atoms (not just carbon) that are rotatable and not part of a ring.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
n_units (int) – Number of repeated rotatable units.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_unbranched_rotatable_chain("CCCC", 3)
- static check_unbranched_structure(target: str | Mol, n_units: int) tuple[bool, list[int], str]¶
Check for unbranched chains in a molecule.
SMARTS pattern used: [R0;D2]~ repeated n_units times (unbranched chain).
Matches: any non-cyclic atom with two connections (degree 2), connected by any bond type (~ matches single, double, or triple bonds), forming a linear, unbranched chain.
Use case: more general; it detects unbranched chains regardless of bond type (e.g. C=C-C≡C) and is not limited to rotatable bonds.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
n_units (int) – Number of repeated unbranched units.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_unbranched_structure("CC=CCCC", 4)
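The three unbranched-chain checks differ only in the atom primitive and the bond symbol that is repeated n_units times. A minimal sketch of how such a query string could be assembled is shown below. The helper repeated_smarts is hypothetical, and joining the units without a trailing bond symbol is an assumption, since the exact string mlchem builds is not documented here.

```python
def repeated_smarts(unit, bond, n_units):
    """Join n_units copies of an atom primitive with a bond symbol.

    Hypothetical helper for illustration; not taken from mlchem.
    """
    if n_units < 1:
        raise ValueError("n_units must be a positive integer")
    return bond.join([unit] * n_units)


# Carbon-only rotatable chain, any-element rotatable chain, and any-bond chain.
rotatable_carbons = repeated_smarts("[R0;CD2]", "-", 3)
rotatable_chain = repeated_smarts("[R0;D2]", "-", 3)
unbranched = repeated_smarts("[R0;D2]", "~", 4)
```

Larger n_units values make the query stricter, because every repeated unit must map to a separate degree-2 atom in the target.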
- static check_valine(target: str | Mol) tuple[bool, list[int], str]¶
Check for valine residues in a molecule.
SMARTS pattern used: [$([NX3H2,NX4H3+]),$(NX3H(C))][CX4H](3X4])CX3[OX2H,OX1-,N]
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_valine("CC(C)C(N)C(=O)O")
- static check_vinylic_carbon(target: str | Mol) tuple[bool, list[int], str]¶
Check for vinylic carbon atoms in a molecule.
The SMARTS pattern used to identify vinylic carbon atoms is [$([CX3]=[CX3])].
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_vinylic_carbon("C=CC")
- static check_zwitterion(target: str | Mol) tuple[bool, list[int], str]¶
Check for zwitterions in a molecule.
SMARTS patterns used: multiple patterns that match oppositely charged atoms separated by up to ten bonds.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> MolPatterns.check_zwitterion("C[N+](C)(C)CC(=O)[O-]")
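The idea of opposite charges within a fixed number of bonds can also be expressed without SMARTS. The sketch below swaps in a different technique, RDKit's topological distance matrix combined with formal charges, purely to illustrate the criterion. It is not how mlchem implements check_zwitterion, and the function name opposite_charge_pairs is hypothetical.

```python
from rdkit import Chem


def opposite_charge_pairs(smiles, max_bonds=10):
    """List (cation_idx, anion_idx, n_bonds) for opposite charges within max_bonds."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # Entry [i][j] is the number of bonds on the shortest path between atoms i and j.
    distances = Chem.GetDistanceMatrix(mol)
    cations = [a.GetIdx() for a in mol.GetAtoms() if a.GetFormalCharge() > 0]
    anions = [a.GetIdx() for a in mol.GetAtoms() if a.GetFormalCharge() < 0]
    return [
        (c, a, int(distances[c][a]))
        for c in cations
        for a in anions
        if distances[c][a] <= max_bonds
    ]


# Glycine zwitterion: the ammonium nitrogen and the carboxylate oxygen.
glycine = opposite_charge_pairs("[NH3+]CC(=O)[O-]")
# Ethanol carries no formal charges, so no pair is reported.
ethanol = opposite_charge_pairs("CCO")
```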
- class Rings¶
Bases: object
- static check_heterocycle(target: str | Mol) tuple[bool, list[int], str]¶
Check for heterocycles in a molecule.
This method uses RDKit’s generic matcher shortcut CHC to identify any heterocyclic ring system.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether a heterocycle was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_heterocycle("c1ccncc1")
- static check_heterocycle_N(target: str | Mol) tuple[bool, list[int], str]¶
Check for nitrogen-containing heterocycles in a molecule.
SMARTS pattern used: [#7R] (nitrogen in ring)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether a nitrogen heterocycle was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_heterocycle_N("c1ccncc1")
- static check_heterocycle_O(target: str | Mol) tuple[bool, list[int], str]¶
Check for oxygen-containing heterocycles in a molecule.
SMARTS pattern used: [#8R] (oxygen in ring)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether an oxygen heterocycle was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_heterocycle_O("C1COC1")
- static check_heterocycle_S(target: str | Mol) tuple[bool, list[int], str]¶
Check for sulphur-containing heterocycles in a molecule.
SMARTS pattern used: [#16R] (sulphur in ring)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether a sulphur heterocycle was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_heterocycle_S("C1CSC1")
- static check_macrocycle(target: str | Mol) tuple[bool, list[int], str]¶
Check for macrocycles in a molecule.
SMARTS pattern used: [r;!r3;!r4;!r5;!r6;!r7] (macrocycle)
This pattern matches atoms in rings larger than 7 members, excluding common small rings (3-7 atoms).
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_macrocycle("C1CCCCCCCCCCCC1")
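The macrocycle pattern can be tried directly in RDKit, as sketched below. The helper macrocycle_atoms is hypothetical and not part of mlchem. The SMILES strings are generated programmatically so that the ring sizes are explicit.

```python
from rdkit import Chem

# Pattern quoted from the documentation above.
MACROCYCLE_SMARTS = "[r;!r3;!r4;!r5;!r6;!r7]"


def macrocycle_atoms(smiles):
    """Return indices of atoms matching the macrocycle pattern (illustrative helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    query = Chem.MolFromSmarts(MACROCYCLE_SMARTS)
    # The query has a single atom, so each match is a 1-tuple.
    return [match[0] for match in mol.GetSubstructMatches(query)]


thirteen_ring = "C1" + "C" * 12 + "1"  # 13-membered carbocycle
six_ring = "C1" + "C" * 5 + "1"        # cyclohexane

large_ring_atoms = macrocycle_atoms(thirteen_ring)
small_ring_atoms = macrocycle_atoms(six_ring)
```

Since the r primitive refers to the smallest ring an atom belongs to, an atom shared between a large ring and a small fused ring would likely be excluded by this pattern.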
- static check_meta_substituted_aromatic_r6(target: str | Mol) tuple[bool, list[int], str]¶
Check for meta-substituted aromatic 6-membered rings.
SMARTS pattern used: a1(-[*&!#1&!a&!R])aa(-[*&!#1&!a&!R])aaa1 (meta substitution)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_meta_substituted_aromatic_r6("c1(C)cc(C)ccc1")
- static check_ortho_substituted_aromatic_r6(target: str | Mol) tuple[bool, list[int], str]¶
Check for ortho-substituted aromatic 6-membered rings.
SMARTS pattern used: a1(-[*&!#1&!a&!R])a(-[*&!#1&!a&!R])aaaa1 (ortho substitution)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_ortho_substituted_aromatic_r6("c1(C)c(C)cccc1")
- static check_para_substituted_aromatic_r6(target: str | Mol) tuple[bool, list[int], str]¶
Check for para-substituted aromatic 6-membered rings.
SMARTS pattern used: a1(-[*&!#1&!a&!R])aaa(-[*&!#1&!a&!R])aa1 (para substitution)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_para_substituted_aromatic_r6("c1(C)ccc(C)cc1")
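The ortho, meta, and para patterns differ only in how many ring atoms separate the two substituted positions. The sketch below runs the three documented SMARTS strings against the xylene isomers with RDKit directly. The helper substitution_patterns is hypothetical and not part of mlchem.

```python
from rdkit import Chem

# SMARTS strings quoted from the three methods documented above.
PATTERNS = {
    "ortho": "a1(-[*&!#1&!a&!R])a(-[*&!#1&!a&!R])aaaa1",
    "meta": "a1(-[*&!#1&!a&!R])aa(-[*&!#1&!a&!R])aaa1",
    "para": "a1(-[*&!#1&!a&!R])aaa(-[*&!#1&!a&!R])aa1",
}


def substitution_patterns(smiles):
    """Return the names of the substitution patterns found in a molecule."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return [
        name
        for name, smarts in PATTERNS.items()
        if mol.HasSubstructMatch(Chem.MolFromSmarts(smarts))
    ]


o_xylene = substitution_patterns("Cc1ccccc1C")
m_xylene = substitution_patterns("Cc1cccc(C)c1")
p_xylene = substitution_patterns("Cc1ccc(C)cc1")
# 1,2,4-Trimethylbenzene contains all three relationships at once.
trimethyl = substitution_patterns("Cc1ccc(C)c(C)c1")
```

A molecule with three or more substituents can satisfy several of these patterns simultaneously, so the checks are not mutually exclusive.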
- static check_pattern_cyclic(target: str | Mol, pattern_function: Callable) tuple[bool, list[int], tuple[str, str]]¶
Check whether a given pattern overlaps with ring atoms in a molecule.
This method checks if the atoms matched by a custom pattern function intersect with atoms that are part of a ring.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
pattern_function (Callable) – A function that returns a SMARTS match result for a specific pattern.
- Returns:
A tuple containing: - A boolean indicating whether the pattern overlaps with ring atoms. - A list of atom indices in the intersection. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> Rings.check_pattern_cyclic("C1CCCOC1", MolPatterns.check_oxygen)
- static check_pattern_cyclic_substituent(target: str | Mol, pattern_function: Callable) tuple[bool, list[int], tuple[str, str]]¶
Check whether a given pattern overlaps with ring atoms that are connected to non-ring atoms.
This identifies ring atoms that are part of a substituent group (i.e. connected to atoms outside the ring via non-ring bonds).
SMARTS pattern used: [R]!@[*] (ring atom with non-ring bond)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
pattern_function (Callable) – A function that returns a SMARTS match result for a specific pattern.
- Returns:
A tuple containing: - A boolean indicating whether the pattern overlaps with ring substituents. - A list of atom indices in the intersection. - A tuple of SMARTS strings representing the matched patterns.
- Return type:
tuple[bool, list[int], tuple[str, str]]
Examples
>>> Rings.check_pattern_cyclic_substituent("c1ccccc1C(=O)O", MolPatterns.check_carboxylic_acid)
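Both overlap checks reduce to intersecting two sets of atom indices: those matched by the supplied pattern function and those matched by a ring query. The sketch below shows that intersection step on its own, using plain Python. The function overlap_with_ring is hypothetical, the input tuples are hand-derived from the SMILES rather than produced by mlchem, and both the [#8] oxygen pattern and the order of the returned SMARTS pair are assumptions.

```python
def overlap_with_ring(pattern_result, ring_result):
    """Intersect two (found, atom_ids, smarts) tuples (illustrative helper)."""
    _, pattern_ids, pattern_smarts = pattern_result
    _, ring_ids, ring_smarts = ring_result
    shared = sorted(set(pattern_ids) & set(ring_ids))
    # Assumption: the pattern SMARTS comes first, the ring SMARTS second.
    return bool(shared), shared, (pattern_smarts, ring_smarts)


# Tetrahydropyran "C1CCCOC1": atoms 0-5 are all ring atoms, atom 4 is the oxygen.
cyclic_oxygen = overlap_with_ring(
    (True, [4], "[#8]"),
    (True, [0, 1, 2, 3, 4, 5], "[R]"),
)

# Diethyl ether "CCOCC": atom 2 is the oxygen and there are no ring atoms.
acyclic_oxygen = overlap_with_ring(
    (True, [2], "[#8]"),
    (False, [], "[R]"),
)
```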
- static check_ring(target: str | Mol) tuple[bool, list[int], str]¶
Check for ring atoms in a molecule.
SMARTS pattern used: [R] (ring atom)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether the pattern was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_ring("c1ccccc1")
- static check_ring_fusion(target: str | Mol) tuple[bool, list[int], str]¶
Check for fused ring systems in a molecule.
SMARTS pattern used: [#6R2,#6R3,#6R4] (fused rings)
This pattern matches carbon atoms that are part of two or more rings, indicating ring fusion.
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
- Returns:
A tuple containing: - A boolean indicating whether fused rings were found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_ring_fusion("c1ccc2ccccc2c1")
- static check_ring_size(target: str | Mol, size: int) tuple[bool, list[int], str]¶
Check for rings of a specific size in a molecule.
SMARTS pattern used: [rN] (ring of size N)
- Parameters:
target (str or rdkit.Chem.rdchem.Mol) – A SMILES string or an RDKit molecule object.
size (int) – The size of the ring to detect.
- Returns:
A tuple containing: - A boolean indicating whether a ring of the specified size was found. - A list of atom indices matching the pattern. - A SMARTS string representing the matched pattern.
- Return type:
tuple[bool, list[int], str]
Examples
>>> Rings.check_ring_size("C1CCCCC1", 6)
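The ring-fusion and ring-size patterns can be explored on naphthalene with RDKit directly, as in the sketch below. The helper matched_atoms is hypothetical and not part of mlchem, and substituting the size into [rN] with an f-string is an assumption about how the query is built.

```python
from rdkit import Chem


def matched_atoms(smiles, smarts):
    """Return the sorted, unique atom indices matched by a SMARTS query."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    query = Chem.MolFromSmarts(smarts)
    return sorted({idx for match in mol.GetSubstructMatches(query) for idx in match})


def ring_size_smarts(size):
    """Fill the ring size into the [rN] template (assumed construction)."""
    return f"[r{size}]"


naphthalene = "c1ccc2ccccc2c1"

# The two carbons shared by both six-membered rings.
fusion_atoms = matched_atoms(naphthalene, "[#6R2,#6R3,#6R4]")
six_ring_atoms = matched_atoms(naphthalene, ring_size_smarts(6))
five_ring_atoms = matched_atoms(naphthalene, ring_size_smarts(5))
```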
- class PropManager¶
Bases: object
Manage molecular properties using RDKit.
The PropManager class provides a structured interface for manipulating molecular properties via RDKit. It is organised into five logical subsections: Base, Mol, Atom, Bond, and Conformation, each containing methods relevant to their respective domains.
Original RDKit reference: https://www.rdkit.org/docs/source/rdkit.Chem.rdchem.html#
Base Methods¶
- assign_atom_mapnumbers(mol, atom_ids)
Assign map numbers to atoms in a molecule.
- assign_atom_labels(mol, prop_values, atom_ids)
Assign labels to atoms in a molecule.
- assign_atom_notes(mol, prop_values, atom_ids)
Assign notes to atoms in a molecule.
- assign_bond_notes(mol, prop_values, bond_ids)
Assign notes to bonds in a molecule.
- clear_all_atomprops(mol)
Clear all properties from atoms in a molecule.
- clear_prop(rdkit_obj, prop)
Clear a property from an RDKit object.
- get_props_dict(rdkit_obj)
Get all properties of an RDKit object as a dictionary.
- get_prop_names(rdkit_obj)
Get the names of all properties of an RDKit object.
- get_prop(rdkit_obj, prop)
Get the value of a property from an RDKit object.
- set_prop(rdkit_obj, prop_name, prop_val)
Set a property on an RDKit object.
- get_owning_mol(rdkit_obj)
Get the molecule that owns the RDKit object.
Mol Methods¶
- get_atoms(mol)
Get the atoms of a molecule.
- get_atoms_from_idx(mol, idx)
Retrieve atom(s) from a molecule by index.
- get_bonds(mol)
Get the bonds of a molecule.
- get_bonds_from_idx(mol, idx)
Retrieve bond(s) from a molecule by index.
- get_bond_between_atoms(mol, idx1, idx2)
Retrieve the bond between two atoms.
- get_coordinates(conf_or_mol, is_3d, canonOrient, bondLength)
Get coordinates of a molecule or conformer.
- get_conformer(mol, id)
Get a specific conformer from a molecule.
- get_conformers(mol)
Get all conformers from a molecule.
- get_conf_ids(mol)
Get all conformer IDs from a molecule.
- get_distance_matrix(mol, is_3d)
Get the distance matrix of a molecule.
- get_gasteiger_charges(mol, atom_ids, nIter)
Compute Gasteiger charges for specified atoms.
- get_stereogroups(mol)
Get stereochemistry groups of a molecule.
- remove_conformer(mol, id)
Remove a specific conformer.
- remove_all_conformers(mol)
Remove all conformers from a molecule.
Atom Methods¶
- clear_atomprops(atom)
Clear all properties from an atom.
- get_atomic_num(atom)
Get the atomic number.
- get_bonds(atom)
Get bonds connected to an atom.
- get_degree(atom)
Get the degree of an atom.
- get_total_degree(atom)
Get total degree including hydrogens.
- get_explicit_valence(atom)
Get explicit valence.
- get_implicit_valence(atom)
Get implicit valence.
- get_total_valence(atom)
Get total valence.
- has_valence_violation(atom)
Check for valence violations.
- get_formal_charge(atom)
Get formal charge.
- get_hybridisation(atom)
Get hybridisation state.
- get_idx(atom)
Get atom index.
- get_neighbours(atom, order)
Get neighbours up to a given order.
- is_aromatic(atom)
Check if atom is aromatic.
- is_in_ring(atom)
Check if atom is in a ring.
- is_in_ring_size(atom, size)
Check if atom is in a ring of a specific size.
- get_mass(atom)
Get atomic mass.
- get_num_explicit_h(atom)
Get number of explicit hydrogens.
- get_num_implicit_h(atom)
Get number of implicit hydrogens.
- get_tot_h(atom)
Get total number of hydrogens.
- get_num_radical_electrons(atom)
Get number of radical electrons.
- set_atom_map_num(atom, num)
Set atom map number.
- set_formal_charge(atom, charge)
Set formal charge.
- set_is_aromatic(atom, decision)
Set aromaticity.
- set_num_explicit_h(atom, num)
Set number of explicit hydrogens.
- set_num_radical_electrons(atom, num)
Set number of radical electrons.
Bond Methods¶
- get_begin_atom(bond)
Get the starting atom of a bond.
- get_begin_atom_idx(bond)
Get index of the starting atom.
- get_bond_type(bond)
Get bond type.
- get_end_atom(bond)
Get the ending atom of a bond.
- get_end_atom_idx(bond)
Get index of the ending atom.
- get_idx(bond)
Get bond index.
- get_other_atom(bond, atom)
Get the other atom in a bond.
- get_other_atom_idx(bond, idx)
Get the index of the other atom.
- get_valence_contribution(bond, atom)
Get valence contribution of a bond.
- is_aromatic(bond)
Check if bond is aromatic.
- is_conjugated(bond)
Check if bond is conjugated.
- is_in_ring(bond)
Check if bond is in a ring.
- is_in_ring_size(bond, size)
Check if bond is in a ring of a specific size.
- set_is_aromatic(bond, decision)
Set aromaticity of a bond.
Conformation Methods¶
- straighten_mol_2d(mol)
Straighten the 2D depiction of a molecule.
- add_conformer(mol, conformer, assignId)
Add a conformer and return its ID.
- generate_conformers(…)
Generate conformers for a molecule.
- display_conformers(conf, size)
Display conformers in 3D.
- display_3dmols_overlapped(…)
Display multiple 3D molecules overlapped.
- canonicalise_conformer(conf, ignoreHs)
Canonicalise a conformer.
- canonicalise_mol_conformers(mol, ignoreHs)
Canonicalise all conformers.
- calculate_conformer_energy_from_mol(mol, conf_id, forcefield)
Calculate energy of a conformer.
- optimise_conformers(mol, force_field, max_iter)
Optimise all conformers.
- optimise_molecule(mol, conf_id, force_field, max_iter)
Optimise a specific conformer.
- get_shape_descriptors(conf_or_mol, include_masses, is_3d)
Calculate shape descriptors.
- class Atom¶
Bases: object
- static clear_atomprops(atom: Atom) None¶
Clear all properties from an atom.
Shortcut for the analogous RDKit method ClearProp().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object from which to clear all properties.
- Return type:
None
- static get_atomic_num(atom: Atom) int¶
Get the atomic number of an atom.
Shortcut for the analogous RDKit method GetAtomicNum().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The atomic number of the atom.
- Return type:
int
- static get_bonds(atom: Atom) tuple¶
Get the bonds of an atom.
Shortcut for the analogous RDKit method GetBonds().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
A tuple of RDKit bond objects associated with the atom.
- Return type:
tuple
- static get_degree(atom: Atom) int¶
Get the degree of an atom.
Shortcut for the analogous RDKit method GetDegree().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The degree of the atom.
- Return type:
int
- static get_explicit_valence(atom: Atom) int¶
Get the explicit valence of an atom.
Shortcut for the analogous RDKit method GetExplicitValence().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The explicit valence of the atom.
- Return type:
int
- static get_formal_charge(atom: Atom) int¶
Get the formal charge of an atom.
Shortcut for the analogous RDKit method GetFormalCharge().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The formal charge of the atom.
- Return type:
int
- static get_hybridisation(atom: Atom) HybridizationType¶
Get the hybridisation of an atom.
Shortcut for the analogous RDKit method GetHybridization().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The hybridisation of the atom.
- Return type:
rdkit.Chem.rdchem.HybridizationType
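For orientation, this is what the underlying RDKit call returns for the first carbon of ethane, ethene, and ethyne. The sketch uses RDKit directly rather than the wrapper.

```python
from rdkit import Chem
from rdkit.Chem.rdchem import HybridizationType

# Hybridisation of atom 0 in ethane, ethene and ethyne.
hybridisations = {
    smiles: Chem.MolFromSmiles(smiles).GetAtomWithIdx(0).GetHybridization()
    for smiles in ("CC", "C=C", "C#C")
}
```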
- static get_idx(atom: Atom) int¶
Get the index of an atom.
Shortcut for the analogous RDKit method GetIdx().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The index of the atom.
- Return type:
int
- static get_implicit_valence(atom: Atom) int¶
Get the implicit valence of an atom.
Shortcut for the analogous RDKit method GetImplicitValence().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The implicit valence of the atom.
- Return type:
int
- static get_mass(atom: Atom) float¶
Get the mass of an atom.
Shortcut for the analogous RDKit method GetMass().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The mass of the atom.
- Return type:
float
- static get_neighbours(atom: Atom, order: int = 1) list¶
Get the neighbours of an atom up to a specified order.
This function recursively finds the neighbours of a given atom up to the specified order. It ensures that atoms are not revisited by checking their map number.
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
order (int, optional) – The order of neighbours to find. Default is 1.
- Returns:
A list of RDKit atom objects representing the neighbours.
- Return type:
list
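One way to collect neighbours up to a given order is a breadth-first walk. The sketch below is not the mlchem implementation: it tracks visited atoms with a set of indices instead of map numbers, it assumes the result is cumulative and excludes the starting atom, and the name `neighbours_up_to` is hypothetical.

```python
from rdkit import Chem


def neighbours_up_to(atom, order=1):
    """Hypothetical helper: atoms within `order` bonds of `atom`, excluding it."""
    visited = {atom.GetIdx()}
    frontier = [atom]
    found = []
    for _ in range(order):
        next_frontier = []
        for current in frontier:
            for nbr in current.GetNeighbors():
                # Skip atoms already reached at a lower order.
                if nbr.GetIdx() not in visited:
                    visited.add(nbr.GetIdx())
                    found.append(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return found


mol = Chem.MolFromSmiles("CCCCC")  # pentane, atoms 0..4 in a chain
centre = mol.GetAtomWithIdx(2)
first = sorted(a.GetIdx() for a in neighbours_up_to(centre, 1))
second = sorted(a.GetIdx() for a in neighbours_up_to(centre, 2))
```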
- static get_num_explicit_h(atom: Atom) int¶
Get the number of explicit hydrogen atoms attached to an atom.
Shortcut for the analogous RDKit method GetNumExplicitHs().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The number of explicit hydrogen atoms.
- Return type:
int
- static get_num_implicit_h(atom: Atom) int¶
Get the number of implicit hydrogen atoms attached to an atom.
Shortcut for the analogous RDKit method GetNumImplicitHs().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The number of implicit hydrogen atoms.
- Return type:
int
- static get_num_radical_electrons(atom: Atom) int¶
Get the number of radical electrons on an atom.
Shortcut for the analogous RDKit method GetNumRadicalElectrons().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The number of radical electrons.
- Return type:
int
- static get_tot_h(atom: Atom) int¶
Get the total number of hydrogen atoms attached to an atom.
Shortcut for the analogous RDKit method GetTotalNumHs().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The total number of hydrogen atoms.
- Return type:
int
- static get_total_degree(atom: Atom) int¶
Get the total degree of an atom, including hydrogens.
Shortcut for the analogous RDKit method GetTotalDegree().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The total degree of the atom.
- Return type:
int
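The difference between degree and total degree, and between explicit and implicit hydrogens, is easiest to see on a small molecule. The sketch below queries the oxygen of ethanol with the RDKit methods these wrappers are documented to call.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCO")  # ethanol; hydrogens are implicit
oxygen = mol.GetAtomWithIdx(2)

summary = {
    "degree": oxygen.GetDegree(),             # bonds to explicit (heavy) atoms
    "total_degree": oxygen.GetTotalDegree(),  # degree plus attached hydrogens
    "explicit_h": oxygen.GetNumExplicitHs(),
    "implicit_h": oxygen.GetNumImplicitHs(),
    "total_h": oxygen.GetTotalNumHs(),
}
```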
- static get_total_valence(atom: Atom) int¶
Get the total valence of an atom.
Shortcut for the analogous RDKit method GetTotalValence().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
The total valence of the atom.
- Return type:
int
- static has_valence_violation(atom: Atom) bool¶
Check if an atom has a valence violation.
Shortcut for the analogous RDKit method HasValenceViolation().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
True if the atom has a valence violation, False otherwise.
- Return type:
bool
- static is_aromatic(atom: Atom) bool¶
Check if an atom is aromatic.
Shortcut for the analogous RDKit method GetIsAromatic().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
True if the atom is aromatic, False otherwise.
- Return type:
bool
- static is_in_ring(atom: Atom) bool¶
Check if an atom is in a ring.
Shortcut for the analogous RDKit method IsInRing().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
- Returns:
True if the atom is in a ring, False otherwise.
- Return type:
bool
- static is_in_ring_size(atom: Atom, size: int) bool¶
Check if an atom is in a ring of a specific size.
Shortcut for the analogous RDKit method IsInRingSize().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
size (int) – The size of the ring to check.
- Returns:
True if the atom is in a ring of the specified size, False otherwise.
- Return type:
bool
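A short illustration of the ring and aromaticity queries on toluene, using the RDKit methods named above. The dictionary layout is only for display.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("Cc1ccccc1")  # toluene
methyl = mol.GetAtomWithIdx(0)
ring_carbon = mol.GetAtomWithIdx(1)

# (in any ring, in a six-membered ring, aromatic)
flags = {
    "methyl": (methyl.IsInRing(), methyl.IsInRingSize(6), methyl.GetIsAromatic()),
    "ring": (ring_carbon.IsInRing(), ring_carbon.IsInRingSize(6), ring_carbon.GetIsAromatic()),
}
```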
- static set_atom_map_num(atom: Atom, num: int) None¶
Set the atom map number for an atom.
Shortcut for the analogous RDKit method SetAtomMapNum().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
num (int) – The atom map number to set.
- Return type:
None
- static set_formal_charge(atom: Atom, charge: int) None¶
Set the formal charge of an atom.
Shortcut for the analogous RDKit method SetFormalCharge().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
charge (int) – The formal charge to set.
- Return type:
None
- static set_is_aromatic(atom: Atom, decision: bool) None¶
Set the aromaticity of an atom.
Shortcut for the analogous RDKit method SetIsAromatic().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
decision (bool) – Whether the atom should be marked as aromatic.
- Return type:
None
- static set_num_explicit_h(atom: Atom, num: int) None¶
Set the number of explicit hydrogen atoms attached to an atom.
Shortcut for the analogous RDKit method SetNumExplicitHs().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
num (int) – The number of explicit hydrogen atoms to set.
- Return type:
None
- static set_num_radical_electrons(atom: Atom, num: int) None¶
Set the number of radical electrons on an atom.
Shortcut for the analogous RDKit method SetNumRadicalElectrons().
- Parameters:
atom (rdkit.Chem.rdchem.Atom) – The RDKit atom object.
num (int) – The number of radical electrons to set.
- Return type:
None
- class Base¶
Bases: object
- static assign_atom_labels(mol: Mol, prop_values: str | int | float | Iterable | None = None, atom_ids: Iterable[int] = ()) None¶
Assign labels to atoms in a molecule.
If atom_ids is provided, only those atoms will be labelled. Otherwise, all atoms will be labelled. If prop_values is not provided, atom indices will be used as labels.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
prop_values (str, int, float, Iterable, optional) – Values to assign as labels. If None, atom indices are used.
atom_ids (Iterable[int], optional) – Atom indices to assign labels to. Default is all atoms.
- Return type:
None
- static assign_atom_mapnumbers(mol: Mol, atom_ids: Iterable[int] = ()) None¶
Assign map numbers to atoms in a molecule.
If atom_ids is provided, only those atoms will be assigned map numbers. Otherwise, all atoms in the molecule will be assigned map numbers.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
atom_ids (Iterable[int], optional) – Atom indices to assign map numbers to. Default is all atoms.
- Return type:
None
- static assign_atom_notes(mol: Mol, prop_values: str | int | float | Iterable | None = None, atom_ids: Iterable[int] = ()) None¶
Assign notes to atoms in a molecule.
If atom_ids is provided, only those atoms will be annotated. Otherwise, all atoms will be annotated. If prop_values is not provided, atom indices will be used as notes.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
prop_values (str, int, float, Iterable, optional) – Values to assign as notes. If None, atom indices are used.
atom_ids (Iterable[int], optional) – Atom indices to assign notes to. Default is all atoms.
- Return type:
None
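The defaulting rules described above can be sketched as follows. This is an assumption about behaviour, not the mlchem source: the helper `annotate_atoms` is hypothetical, and it stores notes in the `atomNote` property, which is the property RDKit's drawing code renders next to atoms.

```python
from rdkit import Chem


def annotate_atoms(mol, values=None, atom_ids=()):
    """Hypothetical helper mirroring the documented defaults."""
    # No atom_ids given: annotate every atom.
    ids = list(atom_ids) or list(range(mol.GetNumAtoms()))
    if values is None:
        notes = [str(i) for i in ids]          # fall back to atom indices
    elif isinstance(values, (str, int, float)):
        notes = [str(values)] * len(ids)       # one value for all selected atoms
    else:
        notes = [str(v) for v in values]       # one value per selected atom
    for idx, note in zip(ids, notes):
        mol.GetAtomWithIdx(idx).SetProp("atomNote", note)


mol = Chem.MolFromSmiles("CCO")
annotate_atoms(mol)                 # notes become "0", "1", "2"
annotate_atoms(mol, "x", [0])       # overwrite the note on atom 0 only
notes = [atom.GetProp("atomNote") for atom in mol.GetAtoms()]
```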
- static assign_bond_notes(mol: Mol, prop_values: str | int | float | Iterable, bond_ids: Iterable[int] = ()) None¶
Assign notes to bonds in a molecule.
If bond_ids is provided, only those bonds will be annotated. Otherwise, all bonds will be annotated.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
prop_values (str, int, float, Iterable) – Values to assign as notes.
bond_ids (Iterable[int], optional) – Bond indices to assign notes to. Default is all bonds.
- Return type:
None
- static clear_all_atomprops(mol: Mol) None¶
Clear all properties from atoms in a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
- Return type:
None
- static clear_prop(rdkit_obj: Mol | Atom | Bond | Conformer, prop: str) None¶
Clear a property from an RDKit object.
- static get_owning_mol(rdkit_obj: Atom | Bond | Conformer) Mol¶
Get the molecule that owns the RDKit object.
Shortcut for RDKit’s GetOwningMol() method.
- Parameters:
rdkit_obj (rdkit.Chem.rdchem.Atom or Bond or Conformer) – The RDKit object whose parent molecule is to be retrieved.
- Returns:
The owning molecule.
- Return type:
rdkit.Chem.rdchem.Mol
- static get_prop(rdkit_obj: Mol | Atom | Bond | Conformer, prop: str) str¶
Get the value of a property from an RDKit object.
Shortcut for RDKit’s GetProp() method.
- static get_prop_names(rdkit_obj: Mol | Atom | Bond | Conformer) list¶
Get the names of all properties of an RDKit object.
- static get_props_dict(rdkit_obj: Mol | Atom | Bond | Conformer) dict¶
Get all properties of an RDKit object as a dictionary.
- static set_prop(rdkit_obj: Mol | Atom | Bond | Conformer, prop_name: str, prop_val: str) None¶
Set a property on an RDKit object.
Shortcut for RDKit’s SetProp() method.
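The property helpers in this class map onto a small set of RDKit calls. A minimal round trip on a molecule, using RDKit directly, looks like this:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCO")

mol.SetProp("name", "ethanol")           # properties set this way are strings
value = mol.GetProp("name")
names = list(mol.GetPropNames())         # public property names
props = mol.GetPropsAsDict()             # all public properties as a dict

mol.ClearProp("name")
still_set = bool(mol.HasProp("name"))
```

The same calls are available on atoms and bonds, which is presumably why the wrappers accept several RDKit object types.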
- class Bond¶
Bases: object
- static get_begin_atom(bond: Bond) Atom¶
Get the beginning atom of a bond.
Shortcut for the analogous RDKit method GetBeginAtom().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
The atom at the beginning of the bond.
- Return type:
rdkit.Chem.rdchem.Atom
- static get_begin_atom_idx(bond: Bond) int¶
Get the index of the beginning atom of a bond.
Shortcut for the analogous RDKit method GetBeginAtomIdx().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
The index of the beginning atom.
- Return type:
int
- static get_bond_type(bond: Bond) BondType¶
Get the type of a bond.
Shortcut for the analogous RDKit method GetBondType().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
The type of the bond.
- Return type:
rdkit.Chem.rdchem.BondType
- static get_end_atom(bond: Bond) Atom¶
Get the ending atom of a bond.
Shortcut for the analogous RDKit method GetEndAtom().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
The atom at the end of the bond.
- Return type:
rdkit.Chem.rdchem.Atom
- static get_end_atom_idx(bond: Bond) int¶
Get the index of the ending atom of a bond.
Shortcut for the analogous RDKit method GetEndAtomIdx().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
The index of the ending atom.
- Return type:
int
- static get_idx(bond: Bond) int¶
Get the index of a bond.
Shortcut for the analogous RDKit method GetIdx().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
The index of the bond.
- Return type:
int
- static get_other_atom(bond: Bond, atom: Atom) Atom¶
Given one atom of the bond, get the other atom.
Shortcut for the analogous RDKit method GetOtherAtom().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
atom (rdkit.Chem.rdchem.Atom) – One of the atoms in the bond.
- Returns:
The other atom in the bond.
- Return type:
rdkit.Chem.rdchem.Atom
- static get_other_atom_idx(bond: Bond, idx: int) int¶
Given the index of one atom in the bond, get the index of the other atom.
Shortcut for the analogous RDKit method GetOtherAtomIdx().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
idx (int) – The index of one atom in the bond.
- Returns:
The index of the other atom in the bond.
- Return type:
int
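The bond accessors can be tried out on acetaldehyde. The sketch below calls the RDKit methods that these shortcuts are documented to wrap; variable names are illustrative.

```python
from rdkit import Chem
from rdkit.Chem.rdchem import BondType

mol = Chem.MolFromSmiles("CC=O")  # acetaldehyde
single = mol.GetBondBetweenAtoms(0, 1)
double = mol.GetBondBetweenAtoms(1, 2)

# Starting from the carbonyl carbon (index 1), walk across the C=O bond.
other_idx = double.GetOtherAtomIdx(1)
other_atom = double.GetOtherAtom(mol.GetAtomWithIdx(1))
ends = {double.GetBeginAtomIdx(), double.GetEndAtomIdx()}
```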
- static get_valence_contribution(bond: Bond, atom: Atom) float¶
Get the valence contribution of a bond to an atom.
Shortcut for the analogous RDKit method GetValenceContrib().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
atom (rdkit.Chem.rdchem.Atom) – The atom for which to compute the valence contribution.
- Returns:
The valence contribution of the bond to the atom.
- Return type:
float
- static is_aromatic(bond: Bond) bool¶
Check if a bond is aromatic.
Shortcut for the analogous RDKit method GetIsAromatic().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
True if the bond is aromatic, False otherwise.
- Return type:
bool
- static is_conjugated(bond: Bond) bool¶
Check if a bond is conjugated.
Shortcut for the analogous RDKit method GetIsConjugated().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
True if the bond is conjugated, False otherwise.
- Return type:
bool
- static is_in_ring(bond: Bond) bool¶
Check if a bond is in a ring.
Shortcut for the analogous RDKit method IsInRing().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
- Returns:
True if the bond is in a ring, False otherwise.
- Return type:
bool
- static is_in_ring_size(bond: Bond, size: int) bool¶
Check if a bond is in a ring of a specific size.
Shortcut for the analogous RDKit method IsInRingSize().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
size (int) – The size of the ring to check.
- Returns:
True if the bond is in a ring of the specified size, False otherwise.
- Return type:
bool
- static set_is_aromatic(bond: Bond, decision: bool) None¶
Set the aromaticity of a bond.
Shortcut for the analogous RDKit method SetIsAromatic().
- Parameters:
bond (rdkit.Chem.rdchem.Bond) – The RDKit bond object.
decision (bool) – Whether the bond should be marked as aromatic.
- Return type:
None
- class Conformation¶
Bases: object
- static add_conformer(mol: Mol, conformer: Conformer, assignId: bool = False) int¶
Add a conformer to a molecule and return its ID.
Shortcut for the analogous RDKit method AddConformer().
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object to which the conformer will be added.
conformer (rdkit.Chem.rdchem.Conformer) – The conformer to add to the molecule.
assignId (bool, optional) – Whether to assign a new ID to the conformer. Default is False.
- Returns:
The ID of the added conformer.
- Return type:
int
- static calculate_conformer_energy_from_mol(mol: Mol, conf_id: int = -1, forcefield: Literal['UFF', 'MMFF94'] = 'MMFF94') float¶
Calculate the energy of a specific conformer.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
conf_id (int, optional) – The ID of the conformer. Default is -1.
forcefield ({'UFF', 'MMFF94'}, optional) – The force field to use. Default is ‘MMFF94’.
- Returns:
The energy of the conformer in kcal/mol.
- Return type:
float
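A conformer energy of this kind can be obtained from RDKit's MMFF94 force field. The sketch below is an illustration of that route under the assumption that the wrapper does something similar; it embeds a single conformer first so that there is geometry to evaluate.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Explicit hydrogens are needed for a meaningful force-field energy.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))

params = AllChem.ETKDGv3()
params.randomSeed = 0xF00D
conf_id = AllChem.EmbedMolecule(mol, params)  # 0 on success, -1 on failure

# Build the MMFF94 force field for that conformer and evaluate it.
mmff_props = AllChem.MMFFGetMoleculeProperties(mol, mmffVariant="MMFF94")
force_field = AllChem.MMFFGetMoleculeForceField(mol, mmff_props, confId=conf_id)
energy = force_field.CalcEnergy()  # kcal/mol
```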
- static canonicalise_conformer(conf: Conformer, ignoreHs: bool = False) None¶
Canonicalise a conformer.
Shortcut for the analogous RDKit method CanonicalizeConformer().
- Parameters:
conf (rdkit.Chem.rdchem.Conformer) – The conformer to canonicalise.
ignoreHs (bool, optional) – Whether to ignore hydrogen atoms. Default is False.
- Return type:
None
- static canonicalise_mol_conformers(mol: Mol, ignoreHs: bool = False) None¶
Canonicalise all conformers of a molecule.
Shortcut for the analogous RDKit method CanonicalizeMol().
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The molecule whose conformers are to be canonicalised.
ignoreHs (bool, optional) – Whether to ignore hydrogen atoms. Default is False.
- Return type:
None
- static display_3dmols_overlapped(mols: list[Mol], py3dmolviewer=None, size: tuple | list | int = (400, 400), confIds: list[int] | None = None, removeHs: bool = False, colours: list[str] = None) None¶
Display multiple 3D molecules overlapped in a py3Dmol viewer.
This function displays a list of RDKit molecules in a 3D viewer, with options to customise size, conformer IDs, hydrogen removal, and colour schemes.
- Parameters:
mols (list of rdkit.Chem.rdchem.Mol) – List of RDKit molecule objects to display.
py3dmolviewer (py3Dmol.view, optional) – An existing py3Dmol viewer instance. If None, a new viewer is created.
size (int or tuple or list, optional) – Size of the viewer. Can be a single integer or a tuple/list of two integers. Default is (400, 400).
confIds (list of int, optional) – List of conformer IDs to display. If None, uses default conformer (-1). Default is None.
removeHs (bool, optional) – Whether to remove hydrogen atoms before displaying. Default is False.
colours (list of str, optional) – List of colour schemes for the molecules. Default is a predefined list.
- Return type:
None
- static display_conformers(conf: Conformer | Iterable[Conformer], size: tuple = (300, 300)) None¶
Display conformers in 3D.
Shortcut for the analogous RDKit method drawMol3D().
- Parameters:
conf (rdkit.Chem.rdchem.Conformer or Iterable[rdkit.Chem.rdchem.Conformer]) – A single RDKit conformer or an iterable of conformers to display.
size (tuple, optional) – The size of the display window. Default is (300, 300).
- Return type:
None
- static generate_conformers(mol: Mol, n_conf: int, rms_threshold: int = 0, embedding_params: Literal['ETDG', 'ETKDG', 'ETKDGv2', 'ETKDGv3'] = 'ETKDGv3', show: bool = False, size: tuple = (300, 300), force_field: Literal['MMFF94', 'UFF'] = 'MMFF94', optimise: bool = True, max_iter: int = 500, random_seed: int = 61453) tuple¶
Generate conformers for a molecule.
This method embeds multiple conformers using specified embedding parameters, optionally optimises them using a force field, and returns the conformers along with their energies and optimisation results.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
n_conf (int) – The number of conformers to generate.
rms_threshold (int, optional) – RMS threshold for pruning conformers. Default is 0.
embedding_params ({'ETDG', 'ETKDG', 'ETKDGv2', 'ETKDGv3'}, optional) – The embedding parameters to use. Default is ‘ETKDGv3’.
show (bool, optional) – Whether to display the conformers. Default is False.
size (tuple, optional) – The size of the display window. Default is (300, 300).
force_field ({'MMFF94', 'UFF'}, optional) – The force field to use for energy calculation. Default is ‘MMFF94’.
optimise (bool, optional) – Whether to optimise the conformers. Default is True.
max_iter (int, optional) – Maximum number of iterations for optimisation. Default is 500.
random_seed (int, optional) – Random seed for conformer generation. Default is 0xf00d (61453).
- Returns:
A tuple containing:
- A list of RDKit conformer objects.
- A list of conformer energies.
- A list of optimisation results (if optimise is True).
- Return type:
tuple
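The embed-then-optimise workflow described above corresponds roughly to the following RDKit calls. This is a sketch under the assumption that the wrapper follows the usual RDKit recipe; it does not reproduce the display options or the exact return tuple.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.AddHs(Chem.MolFromSmiles("CCCC"))  # butane with explicit hydrogens

# ETKDGv3 embedding with a fixed seed for reproducibility.
params = AllChem.ETKDGv3()
params.randomSeed = 0xF00D
conf_ids = list(AllChem.EmbedMultipleConfs(mol, 5, params))

# Optimise every conformer with MMFF94; each result is (not_converged, energy).
results = AllChem.MMFFOptimizeMoleculeConfs(mol, maxIters=500)
energies = [energy for _, energy in results]
conformers = list(mol.GetConformers())
```

A first element of 0 in a result tuple indicates that the optimisation converged within `maxIters`.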
- static get_shape_descriptors(conf_or_mol: Conformer | Mol, include_masses: bool = True, is_3d: bool = True) dict¶
Calculate shape descriptors for a conformer or molecule.
- Parameters:
conf_or_mol (rdkit.Chem.rdchem.Conformer or rdkit.Chem.rdchem.Mol) – The conformer or molecule to analyse.
include_masses (bool, optional) – Whether to include atomic masses. Default is True.
is_3d (bool, optional) – Whether to use 3D coordinates. Default is True.
- Returns:
A dictionary of shape descriptors.
- Return type:
dict
- static optimise_conformers(mol: Mol, force_field: Literal['UFF', 'MMFF94'] = 'MMFF94', max_iter: int = 500) list¶
Optimise all conformers of a molecule.
Shortcut for the analogous RDKit methods MMFFOptimizeMoleculeConfs() and UFFOptimizeMoleculeConfs().
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The molecule whose conformers are to be optimised.
force_field ({'UFF', 'MMFF94'}, optional) – The force field to use. Default is ‘MMFF94’.
max_iter (int, optional) – Maximum number of iterations. Default is 500.
- Returns:
Optimisation results for each conformer.
- Return type:
list of tuple
- static optimise_molecule(mol: Mol, conf_id: int = -1, force_field: Literal['UFF', 'MMFF94'] = 'MMFF94', max_iter: int = 500) int¶
Optimise a specific conformer of a molecule.
Shortcut for the analogous RDKit methods MMFFOptimizeMolecule() and UFFOptimizeMolecule().
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The molecule whose conformer is to be optimised.
conf_id (int, optional) – The ID of the conformer. Default is -1, in which case the whole molecule will be optimised.
force_field ({'UFF', 'MMFF94'}, optional) – The force field to use. Default is ‘MMFF94’.
max_iter (int, optional) – Maximum number of iterations. Default is 500.
- Returns:
0 if converged, -1 if force field setup failed, 1 if more iterations are needed.
- Return type:
int
- static straighten_mol_2d(mol: Mol) None¶
Straighten the 2D depiction of a molecule.
This method computes 2D coordinates and straightens the depiction of the molecule using RDKit’s depiction tools.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
- Return type:
None
- class Mol¶
Bases: object
- static get_atoms(mol: Mol) list¶
Get the atoms of a molecule.
Retrieves all atoms from the given RDKit molecule object.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
- Returns:
A list of RDKit atom objects.
- Return type:
list of rdkit.Chem.rdchem.Atom
- static get_atoms_from_idx(mol: Mol, idx: int | Iterable[int]) Atom | list[Atom]¶
Retrieve atom(s) from a molecule based on index or indices.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
idx (int or Iterable[int]) – The index or indices of the atom(s) to retrieve.
- Returns:
A single atom object if idx is an int, otherwise a list of atom objects.
- Return type:
rdkit.Chem.rdchem.Atom or list of rdkit.Chem.rdchem.Atom
- static get_bond_between_atoms(mol: Mol, idx1: int, idx2: int) Bond¶
Retrieve the bond between two atoms in a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
idx1 (int) – The index of the first atom.
idx2 (int) – The index of the second atom.
- Returns:
The bond object between the specified atoms.
- Return type:
rdkit.Chem.rdchem.Bond
- static get_bonds(mol: Mol) list¶
Get the bonds of a molecule.
Retrieves all bonds from the given RDKit molecule object.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
- Returns:
A list of RDKit bond objects.
- Return type:
list of rdkit.Chem.rdchem.Bond
- static get_bonds_from_idx(mol: Mol, idx: int | Iterable[int]) list¶
Retrieve bond(s) from a molecule based on index or indices.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
idx (int or Iterable[int]) – The index or indices of the bond(s) to retrieve.
- Returns:
A list of RDKit bond objects.
- Return type:
list of rdkit.Chem.rdchem.Bond
- static get_conf_ids(mol: Mol) list¶
Get all conformer IDs associated with a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
- Returns:
A list of conformer IDs.
- Return type:
list of int
- static get_conformer(mol: Mol, id: int = -1) Conformer¶
Get the conformer associated with a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
id (int, optional) – The ID of the conformer to retrieve. Default is -1.
- Returns:
The conformer object.
- Return type:
rdkit.Chem.rdchem.Conformer
- static get_conformers(mol: Mol) list¶
Get all conformers associated with a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
- Returns:
A list of conformer objects.
- Return type:
list of rdkit.Chem.rdchem.Conformer
- static get_coordinates(conf_or_mol: Conformer | Mol, is_3d: bool = False, canonOrient: bool = True, bondLength: float = -1.0) ndarray¶
Get the coordinates of a molecule or conformer.
- Parameters:
conf_or_mol (rdkit.Chem.rdchem.Conformer or rdkit.Chem.rdchem.Mol) – The conformer or molecule object.
is_3d (bool, optional) – Whether to retrieve 3D coordinates. Default is False.
canonOrient (bool, optional) – Whether to use canonical orientation for 2D coordinates. Default is True.
bondLength (float, optional) – Bond length for 2D coordinate generation. Default is -1.0.
- Returns:
An array of atomic coordinates.
- Return type:
numpy.ndarray
- static get_distance_matrix(mol: Mol, is_3d: bool = True) ndarray¶
Get the distance matrix of a molecule.
Calculates the pairwise distance matrix using atomic coordinates.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
is_3d (bool, optional) – Whether to use 3D coordinates. Default is True.
- Returns:
The distance matrix.
- Return type:
numpy.ndarray
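A coordinate-based distance matrix can be computed from a conformer's positions with NumPy broadcasting. The sketch below uses 2D depiction coordinates so that it is deterministic; how mlchem chooses between 2D and 3D coordinates internally is not shown here.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("CCO")
AllChem.Compute2DCoords(mol)  # adds one conformer with z = 0

coords = mol.GetConformer().GetPositions()  # array of shape (n_atoms, 3)

# Pairwise Euclidean distances via broadcasting.
diff = coords[:, None, :] - coords[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))
```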
- static get_gasteiger_charges(mol: Mol, atom_ids: int | Iterable[int] = [], nIter: int = 12) list¶
Compute Gasteiger charges for specified atoms in a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
atom_ids (int or Iterable[int], optional) – Atom indices to compute charges for. Default is all atoms.
nIter (int, optional) – Number of iterations for charge computation. Default is 12.
- Returns:
Gasteiger charges for the specified atoms.
- Return type:
list of float
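In RDKit, Gasteiger charges are computed for the whole molecule and stored on each atom under the `_GasteigerCharge` property. A minimal sketch of reading them back, assuming the wrapper follows the same route:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("CCO")

# Charges are written to each atom as the "_GasteigerCharge" property.
AllChem.ComputeGasteigerCharges(mol, nIter=12)
charges = [atom.GetDoubleProp("_GasteigerCharge") for atom in mol.GetAtoms()]
```

For ethanol the oxygen carries the most negative partial charge of the three heavy atoms.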
- static get_stereogroups(mol: Mol) list¶
Get the stereochemistry groups of a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
- Returns:
A list of stereochemistry groups.
- Return type:
list of rdkit.Chem.rdchem.StereoGroup
- static remove_all_conformers(mol: Mol) None¶
Remove all conformers from a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
- Return type:
None
- static remove_conformer(mol: Mol, id: int) None¶
Remove a conformer from a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
id (int) – The ID of the conformer to remove.
- Return type:
None
- create_molecule(mol_input: str | str_ | Mol, add_hydrogens: bool = False, show: bool = False, solid_sticks: bool = False, is_3d: bool = False, embedding_params: Literal['ETDG', 'ETKDG', 'ETKDGv2', 'ETKDGv3'] = 'ETKDGv3', size: tuple = [300, 300], optimise: bool = True, optimiser: Literal['MMFF94', 'UFF'] = 'MMFF94', random_seed: int = 61453) Mol | None¶
Create and optionally display a molecule from a SMILES string or RDKit Mol object.
Supports 2D/3D visualisation, hydrogen addition, geometry optimisation, and rendering options.
- Parameters:
mol_input (str or numpy.str_ or rdkit.Chem.rdchem.Mol) – The molecule input as a SMILES string, numpy string, or RDKit Mol object.
add_hydrogens (bool, optional) – Whether to add hydrogens to the molecule. Default is False.
show (bool, optional) – Whether to display the molecule. Default is False.
solid_sticks (bool, optional) – Whether to render the molecule as solid sticks. Default is False.
is_3d (bool, optional) – Whether to generate a 3D representation. Default is False.
embedding_params ({'ETDG', 'ETKDG', 'ETKDGv2', 'ETKDGv3'}, optional) – Embedding parameters for 3D generation. Default is ‘ETKDGv3’.
size (tuple, optional) – Size of the displayed image. Default is (300, 300).
optimise (bool, optional) – Whether to optimise the geometry. Default is True.
optimiser ({'MMFF94', 'UFF'}, optional) – Optimisation method. Default is ‘MMFF94’.
random_seed (int, optional) – Random seed for embedding. Default is 61453 (0xf00d).
- Returns:
The processed molecule object, or None if only visualisation is requested.
- Return type:
rdkit.Chem.rdchem.Mol or None
- generate_resonance(smi: str, save: bool = False, path_name: str = '') list[bytes]¶
Generate resonance structures for a SMILES string and optionally save images.
- Parameters:
smi (str) – A SMILES string representing the molecule.
save (bool, optional) – Whether to save the images. Default is False.
path_name (str, optional) – Directory in which to save the images. Default is an empty string, meaning the current directory.
- Returns:
List of binary image data for the resonance structures.
- Return type:
list of bytes
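Because the return value is a plain list of bytes objects, the images can also be written to disk by the caller. The sketch below is a minimal example of that, assuming the bytes are PNG data and using a made-up file naming scheme. `save_images`, `stem` and `suffix` are hypothetical names, and the byte strings in the demonstration are placeholders rather than real images.

```python
import tempfile
from pathlib import Path


def save_images(
    images: list[bytes],
    path_name: str = "",
    stem: str = "resonance",
    suffix: str = ".png",  # assumption: the image format is not stated in the docs
) -> list[Path]:
    """Write each image to <path_name>/<stem>_<i><suffix> and return the paths."""
    # An empty path_name falls back to the current working directory.
    out_dir = Path(path_name) if path_name else Path.cwd()
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, data in enumerate(images):
        target = out_dir / f"{stem}_{i}{suffix}"
        target.write_bytes(data)
        paths.append(target)
    return paths


# Demonstration with placeholder bytes in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    written = save_images([b"first", b"second"], tmp)
    names = [p.name for p in written]
    sizes = [p.stat().st_size for p in written]
```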
- kekulise_smiles(smiles: str) str¶
Convert a SMILES string to its Kekulé form.
- Parameters:
smiles (str) – A SMILES string.
- Returns:
The Kekulé form of the SMILES string.
- Return type:
str
- mol_from_string(mol_input: str) Mol¶
Convert a molecular string (InChI or SMILES) to an RDKit molecule object.
Attempts to interpret the input string as an InChI first, then as a SMILES. Raises a ValueError if both conversions fail.
- Parameters:
mol_input (str) – A string representing a molecule in InChI or SMILES format.
- Returns:
RDKit molecule object corresponding to the input string.
- Return type:
rdkit.Chem.rdchem.Mol
- Raises:
ValueError – If the input string is not a valid InChI or SMILES.
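The InChI-first, SMILES-second behaviour described above is a parser fallback chain. The sketch below shows the pattern without depending on RDKit: parsers are tried in order, and a ValueError is raised only when all of them fail. `parse_with_fallback` and the two stand-in parsers are hypothetical. With RDKit installed, `Chem.MolFromInchi` and `Chem.MolFromSmiles`, which return None for input they cannot parse, should fit the same interface.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def parse_with_fallback(
    text: str,
    parsers: Sequence[Callable[[str], Optional[T]]],
) -> T:
    """Return the first non-None parser result; raise ValueError if all fail."""
    for parser in parsers:
        result = parser(text)
        if result is not None:
            return result
    raise ValueError(f"Not a valid InChI or SMILES string: {text!r}")


# Stand-in parsers so the sketch runs without RDKit (for illustration only).
def fake_inchi_parser(text: str) -> Optional[tuple]:
    return ("inchi", text) if text.startswith("InChI=") else None


def fake_smiles_parser(text: str) -> Optional[tuple]:
    return ("smiles", text) if text and " " not in text else None


parsers = [fake_inchi_parser, fake_smiles_parser]  # InChI is tried first
from_inchi = parse_with_fallback("InChI=1S/CH4/h1H4", parsers)
from_smiles = parse_with_fallback("CCO", parsers)
```

The order of the list determines priority, so an input that both parsers could accept is resolved by the first one.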
- mol_to_binary(mol: Mol) bytes¶
Convert an RDKit molecule object to its binary representation.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – An RDKit molecule object.
- Returns:
Binary representation of the molecule.
- Return type:
bytes
- neutralise_mol(mol: Mol) Mol¶
Neutralise an RDKit molecule by removing formal charges.
Based on RDKit’s neutralisation recipe.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – An RDKit molecule object.
- Returns:
The neutralised molecule.
- Return type:
rdkit.Chem.rdchem.Mol
- remove_smarts_pattern(mol: Mol, smarts_string: str) Mol¶
Remove substructures matching a SMARTS pattern from a molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – An RDKit molecule object.
smarts_string (str) – A SMARTS pattern to remove.
- Returns:
The molecule with matching substructures removed.
- Return type:
rdkit.Chem.rdchem.Mol
- smarts_from_string(string: str) str¶
Convert a SMILES or InChI string into a SMARTS string.
- Parameters:
string (str) – A string representing a molecule in SMILES or InChI format.
- Returns:
A SMARTS string representing the molecular pattern.
- Return type:
str
- smiles_to_inchi(smiles: str) str¶
Convert a SMILES string to an InChI string.
- Parameters:
smiles (str) – A string representing a molecule in SMILES format.
- Returns:
An InChI string corresponding to the input SMILES.
- Return type:
str
- Raises:
ValueError – If the SMILES string is invalid.
- unkekulise_smiles(smiles: str) str¶
Convert a Kekulé SMILES string to its canonical aromatic form.
- Parameters:
smiles (str) – A Kekulé SMILES string.
- Returns:
The canonical SMILES string.
- Return type:
str