rdmc.mol

This module provides class and methods for dealing with RDKit RWMol, Mol.

class rdmc.mol.RDKitMol(mol: Union[Mol, RWMol], keepAtomMap: bool = True)

Bases: object

A helpful wrapper for rdchem.Mol. The method nomenclature follows the Camel style to be consistent with RDKit. It keeps almost all of the orignal method of Chem.rdchem.Mol/RWMol, but add few useful shortcuts, so that a user doesn’t need to refer to other RDKit modules.

AddNullConformer(confId: Optional[int] = None, random: bool = True)

Embed null conformer to existing RDKit mol.

Parameters:
  • confId (int, optional) – Which ID to set for the conformer (will add as last conformer by default).

  • random (bool, optional) – Whether set coordinates to random numbers. Otherwise, set to all-zero coordinates. Defaults to True.

AddRedundantBonds(bonds: Iterable)

Add redundant bonds (not originally exist in the molecule) for the convenience for some molecule operation or analysis. This function will only generate a copy of the molecule and no change is conducted inplace.

Parameters:

bonds – a list of length-2 Iterables containing the index of the ended atoms.

AlignMol(prbMol: Union[RDKitMol, RWMol, Mol] = None, refMol: Union[RDKitMol, RWMol, Mol] = None, prbCid: int = 0, refCid: int = 0, reflect: bool = False, atomMaps: Optional[list] = None, maxIters: int = 1000, weights: list = []) float

Align molecules based on a reference molecule. This function will also return the RMSD value for the best alignment. When leaving both prbMol and refMol blank, then the function will match the current molecules’ conformers, and PrbCid or refCid must be provided.

Parameters:
  • refMol (Mol) – RDKit molecule as a reference. Should not be provided with prbMol.

  • prbMol (Mol) – RDKit molecules to align to the current molecule. Should not be provided with refMol.

  • prbCid (int, optional) – The conformer id to be aligned. Defaults to 0.

  • refCid (int, optional) – The id of reference conformer. Defaults to 0.

  • reflect (bool, optional) – Whether to reflect the conformation of the probe molecule. Defaults to False.

  • atomMap (list, optional) – a vector of pairs of atom IDs (probe AtomId, ref AtomId) used to compute the alignments. If this mapping is not specified an attempt is made to generate on by substructure matching.

  • maxIters (int, optional) – maximum number of iterations used in mimizing the RMSD. Defaults to 1000.

Returns:

RMSD value.

Return type:

float

AssignStereochemistryFrom3D(confId: int = 0)

Assign the chiraltype to a molecule’s atoms.

CalcRMSD(prbMol: RDKitMol, prbCid: int = 0, refCid: int = 0, reflect: bool = False, atomMaps: Optional[list] = None, weights: list = [])

Calculate the RMSD for between conformers of two molecules. Note this function will not align molecule, thus molecules geometries in the calculation are not translated or rotated. You can expect a larger number compared to the RMSD from AlignMol.

Parameters:
  • prbMol ('RDKitMol') – The other molecule to compare with. This can be the instance as the current molecule.

  • prbCid (int, optional) – The conformer ID of the current molecule to calculate RMSD. Defaults to 0.

  • refCid (int, optional) – The conformer ID of the other molecule to calculate RMSD. Defaults to 0.

  • reflect (bool, optional) – Whether to reflect the conformation of the prbMol. Defaults to False.

  • atomMaps (list, optional) – Provide an atom mapping to calculate the RMSD. By default, prbMol and current molecule will be assumed to have the same atom order.

  • weights (list, optional) – Specify weights to each atom pairs. E.g., use atom weights to highlight the importance of heavy atoms.

CombineMol(molFrag: Union[RDKitMol, Mol], offset: Union[list, tuple, float, ndarray] = 0, c_product: bool = False) RDKitMol

A function to combine the current molecule with the given molFrag (another molecule or fragment). It will return a new RDKitMol instance without changing the input molecules.

Parameters:
  • molFrag (RDKitMol, Mol) – the molecule or fragment to be combined into the current one.

  • offset

    • (list, tuple): A 3D vector used to define the offset.

    • (float): Distance in Angstrom between the current mol and the molFrag along the x axis.

  • c_product (bool, optional) – If True, generate conformers for every possible combination between the current molecule and the molFrag. E.g., (1,1), (1,2), … (1,n), (2,1), …(m,1), … (m,n) Defaults to False, meaning it generates conformers pairwise. E.g., (1,1), (2,2), … When c_product=True, N(conformer) = m x n. When c_product=False, if the current molecule has 0 conformer, N(conformer) will be equal to the N(conformer) of molFrag; Otherwise, N(conformer) will be equal to the N(conformer) of the current molecule object. Some coordinates may be filled by zeros if the current molecule and molFrag have different N(conformer).

Returns:

The combined molecule.

Return type:

RDKitMol

Copy(quickCopy: bool = False, confId: int = - 1, copy_attrs: Optional[list] = None) RDKitMol

Make a copy of the RDKitMol.

Parameters:
  • quickCopy (bool, optional) – Use the quick copy mode without copying conformers. Defaults to False.

  • confId (int, optional) – The conformer ID to be copied. Defaults to -1, meaning all conformers.

  • copy_attrs (list, optional) – copy specific attributes to the new mol

Returns:

a copied molecule

Return type:

RDKitMol

EmbedConformer(embed_null: bool = True, **kwargs)

Embed a conformer to the RDKitMol. This will overwrite current conformers. By default, it will first try embeding a 3D conformer; if failed, it will then try to compute 2D coordinates and use that for the conformer structure; if both approaches fail and allow embedding a null conformer, then a conformer with all zero coordinates will be embedded. The last one is still helpful if you have the coordinates, so you can use SetPositions to set them, or if you want to optimize the geometry using things like forcefields.

Parameters:

embed_null (bool) – If Embed 3D and 2D coordinates fails, whether to embed a conformer with all null coordinates: (0, 0, 0) for each atom. Defaults to True.

EmbedMultipleConfs(n: int = 1, embed_null: bool = True, **kwargs)

Embed conformers to the RDKitMol. This will overwrite current conformers. By default, it will first try embeding 3D conformers; if failed, it will then try to compute 2D coordinates and use that for the conformer structures; if both approaches fail and allow embedding null conformers, then conformers with all zero coordinates will be embedded. The last one is still helpful if you have the coordinates, so you can use SetPositions to set them, or you want to optimize the geometry using things like forcefields.

Parameters:
  • n (int) – The number of conformers to be embedded. The default is 1.

  • embed_null (bool) – If embeding fails, whether to embed null conformers. Defaults to True.

EmbedMultipleNullConfs(n: int = 10, random: bool = True)

Embed conformers with meaningless atom coordinates. This helps the cases where a conformer can not be successfully embedded. You can choose to generate all zero coordinates or random coordinates. You can set to all-zero coordinates, if you will set coordinates later; You should set to random coordinates, if you want to optimize this molecule by forcefield (RDKit force field cannot optimize all-zero coordinates).

Parameters:
  • n (int) – The number of conformers to be embedded. Defaults to 10.

  • random (bool, optional) – Whether set coordinates to random numbers. Otherwise, set to all-zero coordinates. Defaults to True.

EmbedNullConformer(random: bool = True)

Embed a conformer with meaningless atom coordinates. This helps the cases where a conformer can not be successfully embeded. You can choose to generate all zero coordinates or random coordinates. You can set to all-zero coordinates, if you will set coordinates later; You should set to random coordinates, if you want to optimize this molecule by forcefield (RDKit force field cannot optimize all-zero coordinates).

Parameters:

random (bool, optional) – Whether set coordinates to random numbers. Otherwise, set to all-zero coordinates. Defaults to True.

classmethod FromFile(path: str, backend: str = 'openbabel', header: bool = True, correctCO: bool = True, removeHs: bool = False, sanitize: bool = True, sameMol: bool = False, **kwargs) RDKitMol

Read RDKitMol from file.

Parameters:
  • path (str) – File path to data.

  • backend (str) – The backend used to perceive molecule. Defaults to openbabel. Currently, we only support openbabel and jensen.

  • header (bool, optional) – If lines of the number of atoms and title are included. Defaults to True.

  • removeHs (bool) – Whether or not to remove hydrogens from the input (defaults to False)

  • sanitize (bool) – Whether or not to use RDKit’s sanitization algorithm to clean input; helpful to set this to False when reading TS files (defaults to True)

  • sameMol (bool) – Whether or not all the conformers in the (sdf) file are for the same mol, in which case we will copy conformers directly to the mol (defaults to False)

Returns:

An RDKit molecule object corresponding to the file.

Return type:

RDKitMol

classmethod FromInchi(inchi: str, removeHs: bool = False, addHs: bool = True, sanitize: bool = True)

Construct a molecule from a InChI string

Parameters:
  • inchi (str) – a InChI string. https://en.wikipedia.org/wiki/International_Chemical_Identifier

  • removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule, Due to RDKit implementation, only effective when sanitize is True as well. True to remove.

  • addHs (bool, optional) – Whether to add explicit hydrogen atoms to the molecule. True to add. Only functioning when removeHs is False.

  • sanitize (bool, optional) – Whether to sanitize the RDKit molecule, True to sanitize.

Returns:

An RDKit molecule object corresponding to the InChI.

Return type:

RDKitMol

classmethod FromMol(mol: Union[Mol, RWMol], keepAtomMap: bool = True) RDKitMol

Convert a RDKit Chem.rdchem.Mol molecule to RDKitMol Molecule.

Parameters:
  • rdmol (Union[Mol, RWMol]) – The RDKit Chem.rdchem.Mol / RWMol molecule to be converted.

  • keepAtomMap (bool, optional) – Whether keep the original atom mapping. Defaults to True. If no atom mapping is stored in the molecule, atom mapping will be created based on atom indexes.

Returns:

An RDKitMol molecule.

Return type:

RDKitMol

classmethod FromOBMol(obMol: openbabel.OBMol, removeHs: bool = False, sanitize: bool = True, embed: bool = True) RDKitMol

Convert a OpenBabel Mol to an RDKitMol object.

Parameters:
  • obMol (Molecule) – An OpenBabel Molecule object for the conversion.

  • removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule, Defaults to False.

  • sanitize (bool, optional) – Whether to sanitize the RDKit molecule. Defaults to True.

  • embed (bool, optional) – Whether to embeb 3D conformer from OBMol. Defaults to True.

Returns:

An RDKit molecule object corresponding to the input OpenBabel Molecule object.

Return type:

RDKitMol

classmethod FromRMGMol(rmgMol: rmgpy.molecule.Molecule, removeHs: bool = False, sanitize: bool = True) RDKitMol

Convert an RMG Molecule to an RDkitMol object.

Parameters:
  • rmgMol ('rmg.molecule.Molecule') – An RMG Molecule instance.

  • removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule, True to remove.

  • sanitize (bool, optional) – Whether to sanitize the RDKit molecule, True to sanitize.

Returns:

An RDKit molecule object corresponding to the RMG Molecule.

Return type:

RDKitMol

classmethod FromSDF(sdf: str, removeHs: bool = False, sanitize: bool = True) RDKitMol

Convert xyz string to RDKitMol.

Parameters:
  • sdf (str) – An SDF string.

  • removeHs (bool) – Whether or not to remove hydrogens from the input (defaults to False)

  • sanitize (bool) – Whether or not to use RDKit’s sanitization algorithm to clean input; helpful to set this to False when reading TS files (defaults to True)

Returns:

An RDKit molecule object corresponding to the sdf.

Return type:

RDKitMol

classmethod FromSmarts(smarts)

Convert a SMARTS to an RDKitMol object.

Parameters:

smarts (str) – A SMARTS string of the molecule

classmethod FromSmiles(smiles: str, removeHs: bool = False, addHs: bool = True, sanitize: bool = True, allowCXSMILES: bool = True, keepAtomMap: bool = True) RDKitMol

Convert a SMILES to an RDkitMol object.

Parameters:
  • smiles (str) – A SMILES representation of the molecule.

  • removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule, True to remove.

  • addHs (bool, optional) – Whether to add explicit hydrogen atoms to the molecule. True to add. Only functioning when removeHs is False.

  • sanitize (bool, optional) – Whether to sanitize the RDKit molecule, True to sanitize.

  • allowCXSMILES (bool, optional) – Whether to recognize and parse CXSMILES. Defaults to True.

  • keepAtomMap (bool, optional) – Whether to keep the Atom mapping contained in the SMILES. Defaults Defaults to True.

Returns:

An RDKit molecule object corresponding to the SMILES.

Return type:

RDKitMol

classmethod FromXYZ(xyz: str, backend: str = 'openbabel', header: bool = True, correctCO: bool = True, sanitize: bool = True, embed_chiral: bool = False, **kwargs)

Convert xyz string to RDKitMol.

Parameters:
  • xyz (str) – A XYZ String.

  • backend (str) – The backend used to perceive molecule. Defaults to openbabel. Currently, we only support openbabel and jensen.

  • header (bool, optional) – If lines of the number of atoms and title are included. Defaults to True.

  • sanitize (bool) – Sanitize the RDKit mol created using openbabel or not. Helpful to set this to False when reading in TSs. Defaults to True.

  • embed_chiralTrue to embed chiral information. Defaults to True.

  • kwargs (supported) –

    jensen:
    • charge: The charge of the species. Defaults to 0.

    • allow_charged_fragments: True for charged fragment, False for radical. Defaults to False.

    • use_graph: True to use networkx module for accelerate. Defaults to True.

    • use_huckel: True to use extended Huckel bond orders to locate bonds. Defaults to False.

    • forced_rdmc: Defaults to False. In rare case, we may hope to use a tailored

      version of the Jensen XYZ parser, other than the one available in RDKit. Set this argument to True to force use RDMC’s implementation, which user’s may have some flexibility to modify.

Returns:

An RDKit molecule object corresponding to the xyz.

Return type:

RDKitMol

GetAdjacencyMatrix()

Get the adjacency matrix of the molecule.

Returns:

A square adjacency matrix of the molecule, where a “1” indicates that atoms are bonded and a “0” indicates that atoms aren’t bonded

Return type:

numpy.ndarray

GetAllConformers() List[RDKitConf]

Get all of the embedded conformers.

Returns:

A list all of conformers.

Return type:

List[‘RDKitConf’]

GetAtomMapNumbers() tuple

Get the atom mapping.

Returns:

atom mapping numbers in the sequence of atom index.

Return type:

tuple

GetAtomMasses()

Get the mass of each atom. The order is consistent with the atom indexes.

Returns:

A list of atom masses.

Return type:

list

GetAtomicNumbers()

Get the Atomic numbers of the molecules. The atomic numbers are sorted by the atom indexes.

Returns:

A list of atomic numbers.

Return type:

list

GetBestAlign(refMol, prbCid: int = 0, refCid: int = 0, atomMaps: Optional[list] = None, maxIters: int = 1000, keepBestConformer: bool = True)

This is a wrapper function for calling AlignMol twice, with reflect to True and False, respectively.

Parameters:
  • refMol (Mol) – RDKit molecule as a reference.

  • prbCid (int, optional) – The conformer id to be aligned. Defaults to 0.

  • refCid (int, optional) – The id of reference conformer. Defaults to 0.

  • reflect (bool, optional) – Whether to reflect the conformation of the probe molecule. Defaults to False.

  • atomMap (list, optional) – a vector of pairs of atom IDs (probe AtomId, ref AtomId) used to compute the alignments. If this mapping is not specified an attempt is made to generate on by substructure matching.

  • maxIters (int, optional) – maximum number of iterations used in mimizing the RMSD. Defaults to 1000.

  • keepBestConformer (bool, optional) – Whether to keep the best Conformer structure. Defaults to True. This is less helpful when you are comparing different atom mappings.

Returns:

RMSD value. bool: if reflected conformer gives a better result.

Return type:

float

GetBondsAsTuples()

Generate a list of length-2 sets indicating the bonding atoms in the molecule

GetConformer(id: int = 0) RDKitConf

Get the embedded conformer according to ID.

Parameters:

id (int) – The ID of the conformer to be obtained. The default is 0.

Raises:

ValueError – Bad id assigned.

Returns:

A conformer corresponding to the ID.

Return type:

RDKitConf

GetConformers(ids: Union[list, tuple] = [0]) List[RDKitConf]

Get the embedded conformers according to IDs.

Parameters:

ids (Union[list, tuple]) – The ids of the conformer to be obtained. The default is [0].

Raises:

ValueError – Bad id assigned.

Returns:

A list of conformers corresponding to the IDs.

Return type:

List[RDKitConf]

GetDistanceMatrix(id: int = 0) ndarray
GetElementSymbols()

Get the element symbols of the molecules. The element symbols are sorted by the atom indexes.

Returns:

A list of element symbols.

Return type:

list

GetFormalCharge() int

Get formal charge of the molecule.

Returns:

Formal charge.

Return type:

int

GetInternalCoordinates(nonredundant: bool = True) list

Get internal coordinates of the molecule.

GetMolFrags(asMols: bool = False, sanitize: bool = True, frags: Optional[list] = None, fragsMolAtomMapping: Optional[list] = None) tuple

Finds the disconnected fragments from a molecule. For example, for the molecule ‘CC(=O)[O-].[NH3+]C’, this function will split the molecules into a list of ‘CC(=O)[O-]’ and ‘[NH3+]C’. By defaults, this function will return a list of atom mapping, but options are available for getting mols.

Parameters:
  • asMols (bool, optional) – Whether the fragments will be returned as molecules instead of atom ids. Defaults to True.

  • sanitize (bool, optional) – Whether the fragments molecules will be sanitized before returning them. Defaults to True.

  • frags (list, optional) – If this is provided as an empty list, the result will be mol.GetNumAtoms() long on return and will contain the fragment assignment for each Atom.

  • fragsMolAtomMapping (list, optional) – If this is provided as an empty list, the result will be a numFrags long list on return, and each entry will contain the indices of the Atoms in that fragment: [(0, 1, 2, 3), (4, 5)].values

Returns:

a tuple of atom mapping or a tuple of split molecules (RDKitMol).

Return type:

tuple

GetPositions(id: int = 0) ndarray

Get atom positions of the embeded conformer.

Parameters:

id (int, optional) – The conformer ID to extract atom positions from. Defaults to 0.

Returns:

a 3 x N matrix containing atom coordinates.

Return type:

np.ndarray

GetSpinMultiplicity() int

Get spin multiplicity of a molecule. The spin multiplicity is calculated using Hund’s rule of maximum multiplicity defined as 2S + 1.

Returns:

Spin multiplicity.

Return type:

int

GetSubstructMatch(query: Union[RDKitMol, RWMol, Mol], useChirality: bool = False, useQueryQueryMatches: bool = False) tuple

Returns the indices of the molecule’s atoms that match a substructure query.

Parameters:
  • query (Mol) – a RDkit Molecule.

  • useChirality (bool, optional) – enables the use of stereochemistry in the matching. Defaults to False.

  • useQueryQueryMatches (bool, optional) – use query-query matching logic. Defaults to False.

Returns:

A tuple of matched indices.

Return type:

tuple

GetSubstructMatches(query: Union[RDKitMol, RWMol, Mol], uniquify: bool = True, useChirality: bool = False, useQueryQueryMatches: bool = False, maxMatches: int = 1000) tuple

Returns tuples of the indices of the molecule’s atoms that match a substructure query.

Parameters:
  • query (Mol) – a Molecule.

  • uniquify (bool, optional) – determines whether or not the matches are uniquified. Defaults to True.

  • useChirality (bool, optional) – enables the use of stereochemistry in the matching. Defaults to False.

  • useQueryQueryMatches (bool, optional) – use query-query matching logic. Defaults to False.

  • maxMatches – The maximum number of matches that will be returned to prevent a combinatorial explosion.

Returns:

A tuple of tuples of matched indices.

Return type:

tuple

GetSymmSSSR()

Get a symmetrized SSSR for a molecule.

Returns:

a sequence of sequences containing the rings found as atom ids

Return type:

tuple

GetTorsionTops(torsion: Iterable, allowNonbondPivots: bool = False) tuple

Generate tops for the given torsion. Top atoms are defined as atoms on one side of the torsion. The mol should be in one-piece when using this function, otherwise, the results will be misleading.

Parameters:
  • torsion (Iterable) – An iterable with four elements and the 2nd and 3rd are the pivot of the torsion.

  • allowNonbondPivots (bool, optional) – Allow non-bonding pivots

Returns:

  • one of the top of the torsion

  • the other top of the torsion

GetTorsionalModes(excludeMethyl: bool = False, includeRings: bool = False) list

Get all of the torsional modes (rotors) from the molecule.

Parameters:
  • excludeMethyl (bool) – Whether exclude the torsions with methyl groups. Defaults to False.

  • includeRings (bool) – Whether or not to include ring torsions. Defaults to False.

Returns:

A list of four-atom-indice to indicating the torsional modes.

Return type:

list

GetVdwMatrix(threshold=0.4) Optional[ndarray]

Get the derived Van der Waals matrix, which can be used to analyze the collision of atoms. More information can be found from generate_vdw_mat.

Parameters:

threshold – float indicating the threshold to use in the vdw matrix

Returns:

A 2D array of the derived Van der Waals Matrix, if the

the matrix exists, otherwise None.

Return type:

Optional[np.ndarray]

HasCollidingAtoms(threshold=0.4) bool
Parameters:

threshold – float indicating the threshold to use in the vdw matrix

HasSameConnectivity(confId: int = 0, backend: str = 'openbabel', **kwargs) bool

Check whether the conformer of the molecule (defined by its spacial coordinates) as the same connectivity as the molecule.

Parameters:
  • confId (int, optional) – The conformer ID. Defaults to 0.

  • backend (str, optional) – The backend to use for the comparison. Defaults to openbabel.

  • **kwargs – The keyword arguments to pass to the backend.

Returns:

Whether the conformer has the same connectivity as the molecule.

Return type:

bool

Kekulize(clearAromaticFlags: bool = False)

Kekulizes the molecule.

Parameters:

clearAromaticFlags (optional) – if True, all atoms and bonds in the molecule will be marked non-aromatic following the kekulization. Defaults to False.

PrepareOutputMol(removeHs: bool = False, sanitize: bool = True) Mol

Generate a RDKit Mol instance for output purpose, to ensure that the original molecule is not modified.

Parameters:
  • removeHs (bool, optional) –

    Remove less useful explicity H atoms. E.g., When output SMILES, H atoms, if explicitly added, will be included and reduce the readablity. Defaults to False. Note, following Hs are not removed:

    1. H which aren’t connected to a heavy atom. E.g.,[H][H].

    2. Labelled H. E.g., atoms with atomic number=1, but isotope > 1.

    3. Two coordinate Hs. E.g., central H in C[H-]C.

    4. Hs connected to dummy atoms

    5. Hs that are part of the definition of double bond Stereochemistry.

    6. Hs that are not connected to anything else.

  • sanitize (bool, optional) – whether to sanitize the molecule. Defaults to True.

Returns:

A Mol instance used for output purpose.

Return type:

Mol

Reflect(id: int = 0)

Reflect the atom coordinates of a molecule, and therefore its mirror image.

Parameters:

id (int, optional) – The conformer id to reflect.

RemoveHs(sanitize: bool = True)

Remove H atoms. Useful when trying to match heavy atoms.py

Parameters:

sanitize (bool, optional) – Whether to sanitize the molecule. Defaults to True.

RenumberAtoms(newOrder: Optional[list] = None, updateAtomMap: bool = True) RDKitMol

Return a new copy of RDKitMol that has atom (index) reordered.

Parameters:
  • newOrder (list, optional) – the new ordering the atoms (should be numAtoms long). E.g, if newOrder is [3,2,0,1], then atom 3 in the original molecule will be atom 0 in the new one. If no value provided, then the molecule will be renumbered based on the current atom map numbers. The latter is helpful when the sequence of atom map numbers and atom indexes are inconsistent.

  • updateAtomMap (bool) – Whether to update the atom map number based on the new order.

Returns:

Molecule with reordered atoms.

Return type:

RDKitMol

Sanitize(sanitizeOps: Optional[Union[int, SanitizeFlags]] = rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_ALL)

Sanitize the molecule.

Parameters:

sanitizeOps (int or str, optional) – Sanitize operations to be carried out. Defaults to SanitizeFlags.SANITIZE_ALL. More details can be found at https://www.rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=sanitize#rdkit.Chem.rdmolops.SanitizeFlags.

SaturateBiradicalSites12(multiplicity: int, verbose: bool = True)

A method help to saturate 1,2 biradicals to match the given molecule spin multiplicity. E.g.,

C - C => C = C

In the current implementation, no error will be raised, if the function doesn’t achieve the goal. This function has not been been tested on nitrogenate.

Parameters:
  • multiplicity (int) – The target multiplicity.

  • verbose (bool) – Whether to print additional information. Defaults to True.

SaturateBiradicalSitesCDB(multiplicity: int, chain_length: int = 8, verbose: bool = True)

A method help to saturate biradicals that have conjugated double bond in between to match the given molecule spin multiplicity. E.g, 1,4 biradicals can be saturated if there is a unsaturated bond between them:

C - C = C - C => C = C - C = C

In the current implementation, no error will be raised, if the function doesn’t achieve the goal. This function has not been been tested on nitrogenate.

Parameters:
  • multiplicity (int) – The target multiplicity.

  • chain_length (int) – How long the conjugated double bond chain is. A larger value will result in longer time. defaults to 8.

  • verbose (int) – Whether to print additional information. Defaults to True.

SaturateCarbene(multiplicity: int, verbose: bool = True)

A method help to saturate carbenes and nitrenes to match the given molecule spin multiplicity:

-C- (triplet) => C-(**) (singlet)

In the current implementation, no error will be raised, if the function doesn’t achieve the goal. This function has not been been tested on nitrogenate.

Parameters:
  • multiplicity (int) – The target multiplicity.

  • verbose (int) – Whether to print additional information. Defaults to True.

SaturateMol(multiplicity: int, chain_length: int = 8, verbose: bool = False)

A method help to saturate the molecule to match the given molecule spin multiplicity. This is just a wrapper to call both SaturateBiradicalSites12, SaturateBiradicalSitesCDB, and SaturateCarbene:

C - C => C = C C - C = C - C => C = C - C = C -C- (triplet) => C-(**) (singlet)

In the current implementation, no error will be raised, if the function doesn’t achieve the goal. This function has not been been tested on nitrogenate.

Parameters:
  • multiplicity (int) – The target multiplicity.

  • chain_length (int) – How long the conjugated double bond chain is. A larger value will result in longer time.

  • verbose (bool) – Whether to print intermediate information. Defaults to False.

SetAtomMapNumbers(atomMap: Optional[Sequence[int]] = None)

Set the atom mapping number. By defaults, atom indexes are used. It can be helpful when plotting the molecule in a 2D graph.

Parameters:

atomMap (list, tuple, optional) – A sequence of integers for atom mapping.

SetPositions(coords: Union[Sequence, str], id: int = 0, header: bool = False)

Set the atom positions to one of the conformer.

Parameters:
  • coords (sequence) – A tuple/list/ndarray containing atom positions; or a string with the typical XYZ formating.

  • id (int, optional) – Conformer ID to assign the Positions to. Defaults to 1.

  • header (bool) – When the XYZ string has an header. Defaults to False.

SetVdwMatrix(threshold: float = 0.4, vdw_radii: dict = {1: 1.2, 2: 1.4, 3: 2.2, 4: 1.9, 5: 1.8, 6: 1.7, 7: 1.6, 8: 1.55, 9: 1.5, 10: 1.54, 11: 2.4, 12: 2.2, 13: 2.1, 14: 2.1, 15: 1.95, 16: 1.8, 17: 1.8, 18: 1.88, 19: 2.8, 20: 2.4, 21: 2.3, 22: 2.15, 23: 2.05, 24: 2.05, 25: 2.05, 26: 2.05, 27: 2.0, 28: 2.0, 29: 2.0, 30: 2.1, 31: 2.1, 32: 2.1, 33: 2.05, 34: 1.9, 35: 1.9})

Set the derived Van der Waals matrix, which is an upper triangle matrix calculated from a threshold usually around 0.4 of the Van der Waals Radii. Its diagonal elements are all zeros. The element (i, j) is calculated by threshold * sum( R(atom i) + R(atom j) ). If two atoms are bonded, the value is set to be zero. When threshold = 0.4, the value is close to the covalent bond length.

Parameters:
  • threshold (float) – The threshold used to calculate the derived Van der Waals matrix. A larger value results in a matrix with larger values; When compared with distance matrix, it may overestiate the overlapping between atoms. The default value is 0.4.

  • vdw_radii (dict) – A dict stores the Van der Waals radii of different elements.

Raises:

ValueError – Invalid threshold is supplied.

ToAtoms(confId: int = 0) Atoms

Convert RDKitMol to the ase.Atoms object.

Parameters:

confId (int) – The conformer ID to be exported.

ToGraph(keep_bond_order: bool = False) Graph

Convert RDKitMol to a networkx graph.

Parameters:

keep_bond_order (bool) – Whether to keep bond order information. Defaults to False, meaning treat all bonds as single bonds.

Returns:

A networkx graph representing the molecule.

Return type:

nx.Graph

ToInchi(options: str = '') str

Convert the RDKitMol to a InChI string using RDKit builtin converter.

Parameters:

options (str, optional) – The InChI generation options. Options should be prefixed with either a - or a / Available options are explained in the InChI technical FAQ: https://www.inchi-trust.org/technical-faq/#15.14 and https://www.inchi-trust.org/?s=user+guide. Defaults to “”.

ToMolBlock(confId: int = - 1) str

Convert RDKitMol to a mol block string.

Parameters:

confId (int) – The conformer ID to be exported.

Returns:

The mol block of the molecule.

Return type:

str

ToOBMol() openbabel.OBMol

Convert RDKitMol to a OBMol.

Returns:

The corresponding openbabel OBMol.

Return type:

OBMol

ToRWMol() RWMol

Convert the RDKitMol Molecule back to a RDKit Chem.rdchem.RWMol.

Returns:

A RDKit Chem.rdchem.RWMol molecule.

Return type:

RWMol

ToSDFFile(path: str, confId: int = - 1)

Write molecule information to .sdf file.

Parameters:

path (str) – The path to save the .sdf file.

ToSmiles(stereo: bool = True, kekule: bool = False, canonical: bool = True, removeAtomMap: bool = True, removeHs: bool = True) str

Convert RDKitMol to a SMILES string.

Parameters:
  • stereo (bool, optional) – Whether keep stereochemistry information. Defaults to True.

  • kekule (bool, optional) – Whether use Kekule form. Defaults to False.

  • canonical (bool, optional) – Whether generate a canonical SMILES. Defaults to True.

  • removeAtomMap (bool, optional) – Whether to remove map id information in the SMILES. Defaults to True.

  • removeHs (bool, optional) – Whether to remove H atoms to make obtained SMILES clean. Defaults to True.

Returns:

The smiles string of the molecule.

Return type:

str

ToXYZ(confId: int = - 1, header: bool = True, comment: str = '') str

Convert RDKitMol to a xyz string.

Parameters:
  • confId (int) – The conformer ID to be exported.

  • header (bool, optional) – Whether to include header (first two lines). Defaults to True.

Returns:

The xyz of the molecule.

Return type:

str

rdmc.mol.generate_radical_resonance_structures(mol: RDKitMol, unique: bool = True, consider_atommap: bool = True, kekulize: bool = False)

Generate resonance structures for a radical molecule. RDKit by design doesn’t work for radical resonance. The approach is a temporary workaround by replacing radical electrons by positive charges and generating resonance structures by RDKit ResonanceMolSupplier. Currently, this function only works for neutral radicals.

Parameters:
  • mol (RDKitMol) – A radical molecule.

  • unique (bool, optional) – Filter out duplicate resonance structures from the list. Defaults to True.

  • consider_atommap (bool, atommap) – If consider atom map numbers in filtration duplicates. Only effective when uniquify=True. Defaults to False.

  • kekulize (bool, optional) – Whether to kekulize the molecule. Defaults to False. When True, uniquifying process will be skipped.

Returns:

a list of molecules with resonance structures.

Return type:

list

rdmc.mol.generate_vdw_mat(rd_mol, threshold: float = 0.4, vdw_radii: dict = {1: 1.2, 2: 1.4, 3: 2.2, 4: 1.9, 5: 1.8, 6: 1.7, 7: 1.6, 8: 1.55, 9: 1.5, 10: 1.54, 11: 2.4, 12: 2.2, 13: 2.1, 14: 2.1, 15: 1.95, 16: 1.8, 17: 1.8, 18: 1.88, 19: 2.8, 20: 2.4, 21: 2.3, 22: 2.15, 23: 2.05, 24: 2.05, 25: 2.05, 26: 2.05, 27: 2.0, 28: 2.0, 29: 2.0, 30: 2.1, 31: 2.1, 32: 2.1, 33: 2.05, 34: 1.9, 35: 1.9})

Generate a derived Van der Waals matrix, which is an upper triangle matrix calculated from a threshold usually around 0.4 of the Van der Waals Radii. Its diagonal elements are all zeros. The element (i, j) is calculated by threshold * sum( R(atom i) + R(atom j) ). If two atoms are bonded, the value is set to be zero. When threshold = 0.4, the value is close to the covalent bond length.

Parameters:
  • threshold (float) – The threshold used to calculate the derived Van der Waals matrix. A larger value results in a matrix with larger values; When compared with distance matrix, it may overestiate the overlapping between atoms. The default value is 0.4.

  • vdw_radii (dict) – A dict stores the Van der Waals radii of different elements.

Raises:

ValueError – Invalid threshold is supplied.

rdmc.mol.get_unique_mols(mols: List[RDKitMol], consider_atommap: bool = False, same_formula: bool = False)

Find the unique molecules from a list of molecules.

Parameters:
  • mols (list) – The molecules to be processed.

  • consider_atommap (bool, optional) – If treat chemically equivalent molecules with different atommap numbers as different molecules. Defaults to False.

  • same_formula (bool, opional) – If the mols has the same formula you may set it to True to save computational time. Defaults to False.

Returns:

A list of unique molecules.

Return type:

list

rdmc.mol.has_matched_mol(mol: RDKitMol, mols: List[RDKitMol], consider_atommap: bool = False)

Check if a molecule has a structure match in a list of molecules.

Parameters:
  • mol (RDKitMol) – The target molecule.

  • mols (List[RDKitMol]) – The list of molecules to be processed.

  • consider_atommap (bool, optional) – If treat chemically equivalent molecules with different atommap numbers as different molecules. Defaults to False.

Returns:

if a matched molecules if found.

Return type:

bool

rdmc.mol.parse_xyz_or_smiles_list(mol_list, with_3d_info: bool = False, **kwargs)

A helper function to parse xyz and smiles and list if the conformational information is provided.

Parameters:
  • mol_list (list) – a list of smiles or xyzs or tuples of (string, multiplicity) to specify desired multiplicity. E.g., [‘CCC’, ‘H 0 0 0’, (‘[CH2]’, 1)]

  • with_3d_info (bool) – Whether to indicate which entries are from 3D representations. Defaults to False.