Getting Started
Welcome to the ChemXploreML documentation! ChemXploreML is a modular desktop application designed to simplify the process of predicting molecular properties using advanced machine learning techniques. This guide will help you quickly set up and start using ChemXploreML.
Download and Installation
ChemXploreML is available as a downloadable application for macOS (Intel and Apple Silicon), Windows (64-bit), and Linux (64-bit; .deb, .rpm, and .AppImage).
Initial Setup
- Download the latest release from the GitHub releases page.
- Extract the downloaded archive to your preferred location.
- Run the ChemXploreML executable:
  - On macOS, open the .dmg file and drag ChemXploreML into your Applications folder.
  - On Windows, run the installer (.exe) and follow the installation instructions.
  - On Linux, first make the .AppImage file executable and then run it.
```bash
# On Linux
chmod +x ChemXploreML-*.AppImage
./ChemXploreML-*.AppImage
```
Quick Start Guide
ChemXploreML provides an intuitive user interface for data preparation, molecular embedding, and machine learning model training. Follow these steps to get started:
1. Load Your Data
- Launch ChemXploreML.
- Go to the LOAD FILE tab and browse to your .csv file to load it.
- Supported file formats include CSV (preferred), JSON, and HDF5.
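Before loading a CSV file, it can help to confirm that the molecule column is present and has no empty cells. The following is a minimal sketch using only the Python standard library. The column name SMILES and the sample rows are assumptions made for illustration; this guide does not specify the column names ChemXploreML expects.

```python
import csv
import io


def check_smiles_column(csv_text, column="SMILES"):
    """Return (row_count, indices of data rows whose `column` cell is empty).

    `column` is a hypothetical column name; adjust it to match your file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if column not in (reader.fieldnames or []):
        raise ValueError(f"column {column!r} not found; header is {reader.fieldnames}")
    empty_rows = []
    row_count = 0
    for index, row in enumerate(reader):
        row_count += 1
        # A short row yields None for missing cells, so fall back to "".
        if not (row[column] or "").strip():
            empty_rows.append(index)
    return row_count, empty_rows


# Illustrative sample only: three data rows, the second has no SMILES string.
sample = "SMILES,boiling_point\nCCO,78.4\n,100.0\nc1ccccc1,80.1\n"
print(check_smiles_column(sample))  # prints (3, [1])
```

For a file on disk, the same function can be called with the file's contents, for example `check_smiles_column(open("data.csv", newline="").read())`, where `data.csv` is a placeholder name.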
2. Vectorize Molecules
- Go to the VECTORIZE MOLECULES tab.
- Select the molecular embedding model you want to use.
- Click the Compute button to start the vectorization process.
- The vectorized molecules are saved to the /embedded_vectors/<embedder_name>.npy file.
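To inspect the saved vectors outside the application, something like the sketch below may work. It assumes that the leading slash in the path above is relative to an output root directory of your own (called `root` here, a hypothetical name), and that the file is a standard NumPy array that `numpy.load` can read. NumPy is a third-party package, so it is imported only inside the loading function.

```python
from pathlib import Path


def embedding_path(root, embedder_name):
    """Build <root>/embedded_vectors/<embedder_name>.npy.

    `root` is an assumed output directory; it is not named in this guide.
    """
    return Path(root) / "embedded_vectors" / f"{embedder_name}.npy"


def load_embeddings(root, embedder_name):
    """Load the saved vectors, assuming a plain .npy array file."""
    import numpy as np  # third-party dependency, needed only for loading

    return np.load(embedding_path(root, embedder_name))


# Placeholder names; substitute your own directory and embedder.
print(embedding_path("my_project", "my_embedder").as_posix())
# prints my_project/embedded_vectors/my_embedder.npy
```

If the file loads, the array's `shape` attribute is a quick way to see how many molecules were vectorized and the size of each vector.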
3. Train Your Machine Learning Model
- Go to the ML Training tab.
- Select ML Model from the sidebar.
- (Optional) In the Control Panel, configure cross-validation (CV), data split, scaling, cleaning, and other settings.
- Select the machine learning model you want to use.
- Click the Begin training button to start the training process.
- The trained model is saved as <model_name>_<embedder_name>_embeddings_pretrained_model_<mode>.pkl in the /pretrained_models/<model_name>/<embedder_name>_embeddings/<mode>/ directory.
- The model performance plots are saved in the /pretrained_models/<model_name>/<embedder_name>_embeddings/<mode>/figures/ directory.
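The naming scheme above can be reproduced in a few lines, which is handy when locating trained models from a script. This is a sketch only: `root` is an assumed output directory, and the values passed for the model, embedder, and mode are placeholders rather than names taken from ChemXploreML.

```python
from pathlib import Path


def model_output_paths(root, model_name, embedder_name, mode):
    """Return (model_file, figures_dir) following the naming scheme above.

    `root` is a hypothetical output directory; it is not named in this guide.
    """
    model_dir = (
        Path(root)
        / "pretrained_models"
        / model_name
        / f"{embedder_name}_embeddings"
        / mode
    )
    model_file = model_dir / (
        f"{model_name}_{embedder_name}_embeddings_pretrained_model_{mode}.pkl"
    )
    figures_dir = model_dir / "figures"
    return model_file, figures_dir


# Placeholder values for illustration.
model_file, figures_dir = model_output_paths(
    "my_project", "my_model", "my_embedder", "my_mode"
)
print(model_file.as_posix())
print(figures_dir.as_posix())
```

Because .pkl files are Python pickles, unpickling one executes code stored in the file, so only load models that you trained yourself or that come from a trusted source.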
4. Predict Molecular Properties
- In the ML Prediction tab, select ML Prediction from the sidebar.
- Choose model -> embedder -> pre-trained model from the dropdown menus.
- Enter a SMILES string in the SMILES text box.
- Click the Compute button to start the prediction process.
- The predicted property value is displayed in the Predicted value text box.
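Typing errors in a SMILES string are easy to miss, so a rough pre-check before pasting can save a failed prediction. The sketch below is not a SMILES validator and is not part of ChemXploreML; the function name `looks_like_smiles` is hypothetical. It only rejects empty input, embedded whitespace, and unbalanced parentheses or square brackets. For example, CCO (ethanol), c1ccccc1 (benzene), and CC(=O)O (acetic acid) pass the check.

```python
def looks_like_smiles(text):
    """Rough sanity check: non-empty, no inner whitespace, balanced () and []."""
    s = text.strip()
    if not s or any(ch.isspace() for ch in s):
        return False
    closers = {")": "(", "]": "["}
    stack = []
    for ch in s:
        if ch in "([":
            stack.append(ch)
        elif ch in closers:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != closers[ch]:
                return False
    return not stack


for candidate in ["CCO", "c1ccccc1", "CC(=O)O", "C(C", "C C"]:
    print(candidate, looks_like_smiles(candidate))
```

A string that passes this check can still be chemically invalid; a cheminformatics toolkit is needed for a real parse.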
What's Next?
Enjoy exploring your molecular datasets with ChemXploreML!