================================
Welcome to mlchem documentation!
================================

**mlchem** is a Python cheminformatics library designed for the scientific community. It provides a comprehensive set of tools for data handling, molecule manipulation, drawing, machine learning, and plotting.

The library has been tested on Python 3.11, 3.12 and 3.13.

The GitHub repository is available at https://github.com/seacunilever/mlchem

============
Installation
============

To install **mlchem**, open your command prompt and run the following command:

.. code-block:: bash

   pip install git+https://github.com/seacunilever/mlchem.git

Alternatively, install the latest release from PyPI (not available yet):

.. code-block:: bash

   pip install mlchem

For a development installation, to modify the code or contribute changes:

.. code-block:: bash

   # Clone the repository
   git clone https://github.com/seacunilever/mlchem
   cd mlchem

   # (Optional: create a virtual environment)
   python3 -m venv _venv
   . ./_venv/bin/activate
   # On Windows: .\_venv\Scripts\activate

   # Make an editable install of mlchem from the source tree
   pip install -e .

   # and install requirements
   pip install -r requirements.txt

========
Features
========

- **Data Handling**: Efficiently manage and process chemical data, including loading, cleaning, and transforming datasets.
- **Molecule Manipulation**: Tools for manipulating molecular structures, such as adding or removing atoms, modifying bonds, and generating molecular conformations.
- **Pattern Recognition**: An extensive list of functions to search for specific structural patterns.
- **Molecule Drawing**: Visualise molecules with customisable drawing options, creating high-quality images for presentations and publications.
- **Machine Learning**: Implement machine learning models for cheminformatics, including training, evaluating, and deploying models to predict chemical properties and activities.
- **Feature Analysis and Interpretation**: Interpret model features and provide insightful plots.

Here's a basic example of how to use **mlchem** (this calculates RDKit descriptors for two molecules)::

    from mlchem.chem.manipulation import create_molecule
    from mlchem.chem.calculator import descriptors

    mol1 = create_molecule('c1ccccc1CCCO')
    mol2 = create_molecule('CCCCCN')
    desc_df = descriptors.get_rdkitDesc([mol1, mol2], include_3D=True)

More examples are available in the **examples** folder.

============
Contributing
============

We welcome contributions to **mlchem**. Users are free to propose new functionality, report bugs, fix existing bugs, and open pull requests. Please consult the guide in the repository (work in progress) on how to submit pull requests properly.

=======
License
=======

This project is licensed under the BSD-3 License.

================
Acknowledgements
================

Special thanks to the Safety, Environmental & Regulatory Science (SERS) Department at Unilever.

.. note::

   This project is under active development.

This project includes components licensed under the Apache License 2.0 (e.g., the SELFIES package), as well as source code taken and adapted from the RDKit library.