CDD has recently developed, as part of CDD Visualization, the ability to build machine learning models.
These can be used to make predictions for properties of molecules that may not have been tested. Once a scientist chooses a “good” set of compounds, perhaps with a desired bioactivity, and a “bad” set, the inactives, they can build a machine learning model with a single click. In a few seconds the model is built and cross-validation is performed. The model can then be used to score other compounds, such as those from a vendor library, the FDA-approved drugs in CDD Public, or molecules that are stored securely in the user's own private CDD Vault.
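To make the workflow concrete, here is a minimal sketch of the same three steps: train on a “good” and a “bad” set, cross-validate, then score new compounds. It is not the CDD Vault implementation. The post does not specify the algorithm or the molecular descriptors, so the sketch assumes a simple Laplace-smoothed log-odds score over structural features. The feature names, the toy compounds, and the functions `train`, `score`, and `cross_validate` are all hypothetical.

```python
import math
from collections import Counter

def train(good, bad):
    """Learn one log-odds weight per feature from two sets of compounds.

    Each compound is a set of feature names (hypothetical substructure keys).
    Add-one smoothing keeps every ratio finite, even for unseen features.
    """
    good_counts = Counter(f for cpd in good for f in cpd)
    bad_counts = Counter(f for cpd in bad for f in cpd)
    weights = {}
    for f in set(good_counts) | set(bad_counts):
        p_good = (good_counts[f] + 1) / (len(good) + 2)
        p_bad = (bad_counts[f] + 1) / (len(bad) + 2)
        weights[f] = math.log(p_good / p_bad)
    return weights

def score(weights, compound):
    """Sum the weights of the features present; higher means more 'good'-like."""
    return sum(weights.get(f, 0.0) for f in compound)

def cross_validate(good, bad, k=3):
    """Return k-fold accuracy, calling a compound 'good' when its score is > 0."""
    labeled = [(c, True) for c in good] + [(c, False) for c in bad]
    correct = 0
    for fold in range(k):
        held_out = labeled[fold::k]
        kept = [x for i, x in enumerate(labeled) if i % k != fold]
        w = train([c for c, y in kept if y], [c for c, y in kept if not y])
        correct += sum((score(w, c) > 0) == y for c, y in held_out)
    return correct / len(labeled)

# Toy data: invented feature sets, for illustration only.
good_set = [{"amide", "pyridine"}, {"amide", "piperidine"}, {"pyridine", "piperidine"}]
bad_set = [{"quinone", "michael_acceptor"}, {"rhodanine", "michael_acceptor"}, {"quinone", "amide"}]

model = train(good_set, bad_set)
accuracy = cross_validate(good_set, bad_set)
library = {"cpd_A": {"amide", "pyridine"}, "cpd_B": {"quinone", "michael_acceptor"}}
ranked = sorted(library, key=lambda name: score(model, library[name]), reverse=True)
print(accuracy, ranked)
```

In practice the feature sets would come from a fingerprinting step on real structures, and the cross-validation statistic would usually be something more informative than plain accuracy, such as ROC AUC.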
In discussions with our advisory board member Dr. Christopher Lipinski, we began to wonder what other properties the machine learning models could predict. An expert medicinal chemist’s appraisal of the MLPCN probe compounds was high on the list. Dr. Lipinski had previously been involved in a study in which 11 experts gave their opinions on the 64 NIH Probes. It has been over 5 years since that study, and there are now more than 300 NIH Probes. Dr. Lipinski has evaluated them as he would for a drug development program, through an exhaustive, manual, iterative process. He ultimately determined whether each probe was desirable or not, considering the literature and chemical reactivity. We have now included Dr. Lipinski’s evaluations alongside the NIH Probes available on CDD Public. From his decisions, we defined a “good” and a “bad” set and built machine learning models that could predict his scores! You can read the details in a recent paper published in J Chem Inf Model.
This heat map shows a comparison of our "Expert Model" with other druglikeness metrics for the probes labeled as undesirable by Dr. Lipinski. Red corresponds to a less druglike value for each metric.
This work still leaves some open questions – would a machine learning model be more predictive with evaluations from more than one expert? Can a scientist build a model that works based on their own evaluations? If you'd like to see how your compounds would score in our “Expert Model,” please contact us at info@collaborativedrug.com.
This blog is authored by members of the CDD Vault community. CDD Vault is a hosted drug discovery informatics platform that securely manages both private and external biological and chemical data. It provides core functionality including chemical registration, structure-activity relationship analysis, chemical inventory, and electronic lab notebook capabilities.