Written by Abe Wang | Apr 14, 2023 7:00:00 AM
Molecule Clustering in Visualization
The CDD Vault Visualization tool now has a new “plot type” for clustering. Clicking the Cluster analysis tool will:
- give you the option to set a Tanimoto similarity threshold
- run the clustering algorithm across all structures
- add a new “Cluster ID” column to the data table
- make Cluster ID available as a parameter that can be used in all plots
For example, you could build a scatter plot of Inhibition scores versus a chosen chemical property, with the dots colored by Cluster ID.
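To make the role of the similarity threshold concrete, here is a minimal Python sketch. It assumes each structure is represented as a set of fingerprint "on" bits and groups structures with a simple greedy leader-style pass. The fingerprints and clustering algorithm that CDD Vault actually uses are not described here, so the function names and the grouping strategy are illustrative assumptions only.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared bits divided by total distinct bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def assign_cluster_ids(fingerprints: List[Set[int]], threshold: float) -> List[int]:
    """Greedy leader clustering (an illustrative choice, not CDD Vault's algorithm).

    Each structure joins the first cluster whose leader is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    leaders: List[Set[int]] = []
    cluster_ids: List[int] = []
    for fp in fingerprints:
        for cluster_id, leader in enumerate(leaders, start=1):
            if tanimoto(fp, leader) >= threshold:
                cluster_ids.append(cluster_id)
                break
        else:
            # No existing leader was similar enough: open a new cluster.
            leaders.append(fp)
            cluster_ids.append(len(leaders))
    return cluster_ids


# Toy fingerprints: the first two share most bits, the third is unrelated.
toy = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}]
print(assign_cluster_ids(toy, threshold=0.5))  # [1, 1, 2]
print(assign_cluster_ids(toy, threshold=0.7))  # [1, 2, 3]
```

Raising the threshold splits the first two structures apart, which is the effect you would expect to see in the “Cluster ID” column when a stricter threshold is chosen.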
Substructures Available as Plot Parameters
Substructures can now be used in histograms and as plot parameters. When configuring a plot, the “Substructure” option is available for the Color, Shape, Size, and Split plot by parameters, and it is also available when choosing the axis for Histograms. Select from the automatically generated fragments or draw your own substructure. By default, the top 25 most frequent automatically generated fragments are selected.
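As a rough illustration of what "top 25 most frequent fragments" means, the sketch below counts how many molecules contain each fragment and keeps the most common ones. How CDD Vault generates and ranks fragments is not specified here; the function name, the input layout, and the fragment strings are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def top_fragments(fragments_by_molecule: Dict[str, List[str]], n: int = 25) -> List[str]:
    """Return the n fragments that occur in the most molecules."""
    counts: Counter = Counter()
    for fragments in fragments_by_molecule.values():
        counts.update(set(fragments))  # count each fragment once per molecule
    return [fragment for fragment, _ in counts.most_common(n)]


# Hypothetical fragment assignments, written as SMILES strings.
example = {
    "mol-1": ["c1ccccc1", "C(=O)O"],
    "mol-2": ["c1ccccc1"],
    "mol-3": ["c1ccccc1", "C(=O)O", "N"],
}
print(top_fragments(example, n=2))  # ['c1ccccc1', 'C(=O)O']
```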
Enhanced Searching in Visualization Drop-Down Menus
Whenever you use a drop-down to select a parameter, a search and type-ahead feature now helps you quickly locate the value, project, or collection of interest.
New Tautomer Detection Parameter for POST Batches API Call
To allow API workflows to mimic the new Tautomer Detection Compound Registration Workflow recently deployed within the CDD Vault GUI, a new "tautomer_resolution" parameter was added to the POST Batches API call. The options available when including this parameter are:
- first ("tautomer_resolution":"first"): a new Batch is registered for the first Molecule detected as a potential tautomer
- new ("tautomer_resolution":"new"): a new Molecule is registered
- prompt ("tautomer_resolution":"prompt"): nothing is registered, and the existing tautomer Molecule IDs are returned
This new "tautomer_resolution" parameter goes WITHIN the "molecule" section of your JSON:
{
  "molecule": {
    "registration_type": "CHEMICAL_STRUCTURE",
    "smiles": "O=CC(C=O)O",
    "tautomer_resolution": "prompt"
  },
  "projects": ["Internal Data"]
}
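If you assemble this request body in a script, a small helper can keep the parameter in the right place and reject misspelled values. The sketch below only builds the JSON body; it does not call the API. The helper name and its validation are assumptions for illustration, while the keys and the three allowed values come from the description above.

```python
import json
from typing import Any, Dict, List, Optional

# The three values documented for "tautomer_resolution".
TAUTOMER_RESOLUTIONS = ("first", "new", "prompt")


def build_batch_payload(
    smiles: str,
    projects: List[str],
    tautomer_resolution: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a POST Batches body with the parameter inside "molecule"."""
    if tautomer_resolution is not None and tautomer_resolution not in TAUTOMER_RESOLUTIONS:
        raise ValueError(
            f"tautomer_resolution must be one of {TAUTOMER_RESOLUTIONS}, "
            f"got {tautomer_resolution!r}"
        )
    molecule: Dict[str, Any] = {
        "registration_type": "CHEMICAL_STRUCTURE",
        "smiles": smiles,
    }
    if tautomer_resolution is not None:
        # The parameter belongs inside "molecule", not at the top level.
        molecule["tautomer_resolution"] = tautomer_resolution
    return {"molecule": molecule, "projects": list(projects)}


payload = build_batch_payload("O=CC(C=O)O", ["Internal Data"], "prompt")
print(json.dumps(payload, indent=2))
```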
Helpful hints:
- The intent of the "tautomer_resolution":"prompt" parameter is to give you the Molecule IDs of the matching tautomers. You can then use POST Batches without the structure, matching on the Molecule ID or Name, to register a new Batch of an existing Molecule. POST Batches with JSON like this: {"molecule": { "Registration_type":"chemical_structure", "name":"CWS-0005669" }, "projects": ["CW Test"]}
- "tautomer_resolution":"first" is the default. If you leave out the "tautomer_resolution" parameter altogether, "tautomer_resolution":"first" is used and a new Batch is registered for the first Molecule detected as a potential tautomer. (You are not warned that tautomers were detected.)
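Because the default registers a Batch without any warning, a script may want to check which resolution a payload will actually use before sending it, and to build the follow-up request that matches on Name. The sketch below covers only those two steps. Both function names are hypothetical, nothing here calls the API, and the shape of the "prompt" response is deliberately not modeled because it is not described in this post.

```python
from typing import Any, Dict, List

# "first" is the documented default when the parameter is omitted.
DEFAULT_TAUTOMER_RESOLUTION = "first"


def effective_resolution(payload: Dict[str, Any]) -> str:
    """Return the tautomer_resolution the request will be treated as using."""
    molecule = payload.get("molecule", {})
    return molecule.get("tautomer_resolution", DEFAULT_TAUTOMER_RESOLUTION)


def follow_up_payload(name: str, projects: List[str]) -> Dict[str, Any]:
    """Build a POST Batches body that matches an existing Molecule by Name.

    Key and value casing are copied verbatim from the example in the hints.
    No structure is included, so the Batch attaches to the named Molecule.
    """
    return {
        "molecule": {"Registration_type": "chemical_structure", "name": name},
        "projects": list(projects),
    }


silent = {"molecule": {"smiles": "O=CC(C=O)O"}, "projects": ["Internal Data"]}
if effective_resolution(silent) == "first":
    print("No tautomer_resolution set: a Batch will be registered without a warning.")

print(follow_up_payload("CWS-0005669", ["CW Test"]))
```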