sanitize_complex

sanitize_complex(mol, verbose=False, value_missing_coord=0, add_hydrogens=False, add_atom='I', sanitize=True)

Sanitize the ligands of a transition metal complex, identify their X-type and L-type connectors, and return the sanitized complex together with the metal's oxidation state, electron count, and formal charge.

Note that if coordinates are present in a conformer, bonds are detected to find all metal interaction points.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – RDKit molecule representing the transition metal complex.

  • verbose (bool, optional, default=False) – If True, print updates during processing.

  • value_missing_coord (float, optional, default=0) – Value used to detect missing coordinates (e.g., 0 for (0,0,0)).

  • add_hydrogens (bool, optional, default=False) – If True, add explicit hydrogens to the structure if needed.

  • add_atom (str, optional, default='I') – Element symbol of the "dummy atom" used in cleave_mol_from_index().

  • sanitize (bool, optional, default=True) – If True, the final complex will be sanitized with sanitize_molecule().
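
The value_missing_coord parameter can be pictured with a small, library-independent sketch. It assumes that a position counts as missing when all three of its components equal the sentinel value. That reading, and the helper name has_missing_coords, are illustrative assumptions rather than the library's actual check.

```python
from math import isclose
from typing import Iterable, Tuple

Position = Tuple[float, float, float]


def has_missing_coords(positions: Iterable[Position], value_missing_coord: float = 0) -> bool:
    """Return True if any position has all components equal to the sentinel.

    Hypothetical helper: the real detection logic may differ.
    """
    return any(
        all(isclose(c, value_missing_coord, abs_tol=1e-8) for c in pos)
        for pos in positions
    )


# Placeholder coordinates, not taken from a real structure.
placed = [(0.0, 0.0, 1.9), (1.2, -0.4, 0.0)]
unplaced = [(0.0, 0.0, 1.9), (0.0, 0.0, 0.0)]

print(has_missing_coords(placed))
print(has_missing_coords(unplaced))
```

Under this reading, an atom genuinely placed at the origin would be indistinguishable from a missing one, so a different sentinel may be preferable for inputs where that can happen.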

Raises:

ValueError – If the molecule does not contain a transition metal.
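
Because a ValueError is raised for molecules without a transition metal, batch processing may want to skip such inputs rather than stop. The sketch below takes the function as an argument so that it runs without the library installed. The names try_sanitize and fake_sanitize are hypothetical; in real use, sanitize_complex itself would be passed in place of the stand-in.

```python
from typing import Any, Callable, Optional


def try_sanitize(sanitize_fn: Callable[..., dict], mol: Any, **kwargs: Any) -> Optional[dict]:
    """Call sanitize_fn(mol, **kwargs); return None if it raises ValueError."""
    try:
        return sanitize_fn(mol, **kwargs)
    except ValueError as err:
        print(f"Skipping molecule: {err}")
        return None


def fake_sanitize(mol: Any, **kwargs: Any) -> dict:
    """Stand-in for sanitize_complex, used only to make this sketch runnable."""
    if mol == "no-metal":
        raise ValueError("molecule does not contain a transition metal")
    return {"metal_info": {}, "ligand_info": [], "complex_info": {}}


# The strings below are placeholders for RDKit molecules.
results = [try_sanitize(fake_sanitize, m, verbose=False) for m in ("has-metal", "no-metal")]
```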

Returns:

Dictionary containing:
  • "metal_info": dict with keys:
    • "rdmol": RDKit molecule of the transition metal center.

    • "oxidation_state": Oxidation state of the metal center.

    • "total_charge": Formal charge of the metal center.

    • "number_electrons": Electron count for the metal center.

  • "ligand_info": list of dict
    List of ligand information dictionaries, each with keys:
    • "smiles": Canonical explicit-hydrogen SMILES string generated from the RDKit molecule of the complex. The dummy atom (add_atom, 'I' by default) is present to denote where the metal attaches.

    • "rdmol": RDKit molecule of the ligand.

    • "total_charge": Total charge of the ligand.

    • "hanging_bonds": Number of unused valencies.

    • "charged_atoms": Atom charge information (see tmos.build_rdmol.assess_atoms()).

    • "L-type connectors": List of original atom indices for L-type connectors.

    • "X-type connectors": List of original atom indices for X-type connectors.

  • "complex_info": dict with keys:
    • "smiles": Canonical explicit-hydrogen SMILES string generated from the RDKit molecule of the complex.

    • "rdmol": RDKit molecule of the reformed transition metal complex.

    • "oxidation_state": Oxidation state of the metal center.

    • "total_charge": Overall charge of the complex.

    • "geometry": Geometry information of the complex.

Return type:

dict
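
The nested return value can be navigated as in the following sketch. The dictionary is hand-built with arbitrary placeholder values purely to mirror the documented keys. It is not output from sanitize_complex, several documented entries (such as "rdmol" and "geometry") are omitted, and summarize_result is a hypothetical helper.

```python
from typing import Any, Dict


def summarize_result(result: Dict[str, Any]) -> str:
    """Build a short text summary from the documented keys of the result."""
    metal = result["metal_info"]
    cplx = result["complex_info"]
    lines = [
        f"complex: {cplx['smiles']} (charge {cplx['total_charge']})",
        f"metal: oxidation state {metal['oxidation_state']}, "
        f"{metal['number_electrons']} electrons",
    ]
    for i, lig in enumerate(result["ligand_info"]):
        n_l = len(lig["L-type connectors"])
        n_x = len(lig["X-type connectors"])
        lines.append(f"ligand {i}: {lig['smiles']} (L-type: {n_l}, X-type: {n_x})")
    return "\n".join(lines)


# Placeholder data only: the values are arbitrary, not a real result.
placeholder = {
    "metal_info": {"oxidation_state": 2, "total_charge": 2, "number_electrons": 16},
    "ligand_info": [
        {
            "smiles": "<ligand SMILES>",
            "total_charge": -1,
            "L-type connectors": [],
            "X-type connectors": [3],
        },
        {
            "smiles": "<ligand SMILES>",
            "total_charge": 0,
            "L-type connectors": [5, 9],
            "X-type connectors": [],
        },
    ],
    "complex_info": {"smiles": "<complex SMILES>", "oxidation_state": 2, "total_charge": 0},
}

summary = summarize_result(placeholder)
print(summary)
```

Accessing keys directly, rather than with dict.get(), makes a missing or renamed key fail loudly, which may be preferable when checking results against this documented layout.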