"I think the biggest impact was people in academia that were not used to databases and with the training and the use, they came to realize how powerful it is to have a permanent central repository, where you can access the data very fast and correlate the most salient patterns. I think that is the most important impact by providing this new capability that was even useful for people that were just not used to databases. It really speaks to the ease of use and overall design of CDD Vault."
Enrique Michelotti, Ph.D.
NIMH/NIH,
Program Director
Molecular Libraries and Blueprint Neurotherapeutics Program, Division of Neuroscience and Basic Behavioral Science (DNBBS)
Dr. Enrique Michelotti manages trans-institute projects at the NIH, including the Blueprint Neurotherapeutics project which spans across 15 NIH institutes. Dr. Michelotti has years of international experience in academia and industry, working in large and small companies. His pharmaceutical industry experience includes working as a Research Fellow at Rohm and Haas, as well as in management as the Director of Medicinal Chemistry at Locus Discovery.
Interviewed by Barry Bunin, PhD, Collaborative Drug Discovery, Inc.
Barry Bunin
I thought I would start out with a simple question, just to explain what is the Blueprint.
Enrique Michelotti
There are two different things. One is the Blueprint and the other thing is the Blueprint Neurotherapeutics. Let's call it a BP, Blueprint, and a BPN, Blueprint Neurotherapeutics. The Blueprint is a large trans-institute project from NIH. The Blueprint Neurotherapeutics is part of the Blueprint and focused on projects related to what is called the Brain Institutes, anything related to mental health, neurological diseases, eye diseases, hearing problems, and so on. Now the BPN in which we are using CDD is a large project that tries to develop, or to bring hits to leads, or to translate hits all the way to phase one clinical candidates. The BPN is a complex project because it involves biological problems ranging from eye diseases all the way to psychiatric diseases, and currently we are using CDD as the cheminformatics database where we put all the compound structures generated in the project, plus the biological activity generated with those compounds.
Barry Bunin
Great, that's a wonderful basic introduction for folks. And just talk a little bit about your role at the NIH in this project and how CDD supports the goals of the BPN, as you called it.
Enrique Michelotti
The way that the BPN works is based on a virtual Pharma model. There are complementary groups participating. One is the principal investigator, the PI that proposes the project, which is peer reviewed, and if it's accepted, the PI will be a critical part in the biology development of the project. The second group are consultants paid for by the NIH. These consultants have a really strong industrial pharmaceutical background, most of them from big pharma, covering expertise from assay development, ADME, and medicinal chemistry to clinical expertise. And then there are the CROs, the companies actually making the compounds or running the in-vivo tests. We have a chemistry CRO, and also an in-vivo/PK CRO, as well as GMP synthesis and formulation CROs. And CDD is helping us to craft a common data repository, so that everyone on the team, with not only diverse backgrounds but also diverse geographies, where not everyone is in the same “room”, can access the data very fast after it's generated, and so everyone has access to the same data.
Barry Bunin
I think one key point, I would just add is that everyone has access to whatever data they have permission to view and so collaborative partitioning…the selective sharing component is critical, in terms of who has permissions to view which data.
Enrique Michelotti
Okay, let me explain that. That is one thing that for this project was a strength with CDD. We started with seven projects, then evolved to now having eight, and soon we will have eleven, and each project is independent, so it is really important that there is no cross-talk between the different projects, because each project’s IP is completely different. We'll go to one university or to the other, and CDD allows us to create formally siloed processes, in which all the people in one project can see only the data related to that project, and not the other ones. Some people, like the administrator or perhaps a chemist that is involved in more than one project, can see more than one project, according to what we decide and allow people to see. The data partitioning permissions rigorously control what we allow these people to see.
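[Editorial sketch] The per-project partitioning described above can be illustrated with a small Python model. This is not CDD Vault's API or implementation; the project labels, user names, and the visible_records helper are all hypothetical, and the only idea taken from the interview is that a user sees a record only if they have been granted access to that record's project.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    project: str       # hypothetical project label, e.g. "BPN-A"
    compound_id: str   # hypothetical compound identifier
    value: float       # some measured assay value


@dataclass
class User:
    name: str
    # Projects this user has been granted access to (empty by default).
    projects: frozenset = field(default_factory=frozenset)


def visible_records(user, records):
    """Return only the records whose project the user may view."""
    return [r for r in records if r.project in user.projects]


# Illustrative data only; none of these values come from the BPN.
records = [
    Record("BPN-A", "A-001", 12.5),
    Record("BPN-B", "B-001", 3.2),
    Record("BPN-B", "B-002", 48.0),
]

pi_a = User("pi_project_a", frozenset({"BPN-A"}))
shared_chemist = User("shared_chemist", frozenset({"BPN-A", "BPN-B"}))

for user in (pi_a, shared_chemist):
    ids = [r.compound_id for r in visible_records(user, records)]
    print(user.name, ids)
```

In this sketch a user with no grants sees nothing, so access is opt-in per project, which matches the "no cross-talk" requirement described in the answer.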
Barry Bunin
Excellent, and maybe just talk about what was involved because people were all in different locations. Can you talk about how CDD and the NIH were able to get up and running in just a month with these different projects from a technical, but also from just a process perspective?
Enrique Michelotti
In the process we have two very different stages. The first stage involved all the historical data that each PI’s team had generated before accessing the BPN resources. All that data had to be placed in CDD. And the second stage was for all newly generated data. The newly generated data was relatively straightforward because we decided upfront how to structure the templates. The templates rigorously determined how the data would be put in the database and, not only that, what sort of data and which conditions to capture with the data. The historical data is different, and I think that's always the case, because obviously we couldn't change it after the fact, so the database had to be flexible enough to allow the uploading of this previously generated data in a way that was relevant not only to the project at that moment, but also, after the new data was uploaded, most useful when mined all together moving forward. And this was all accomplished really flawlessly in CDD with a lot of expert help from the CDD personnel. Help in both aspects, in uploading and also in how to best generate the Excel files with SMILES or the SD files needed to upload the data into CDD in a standardized format useful for mining. And, as you mention, it was accomplished surprisingly quickly, within a month across all seven projects.
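[Editorial sketch] The idea of a template that fixes which fields must accompany each data point can be illustrated with a short Python check on a tabular upload file. The column names below are assumptions made for illustration, not the BPN templates or a CDD Vault format, and the check only tests that required fields are present and non-empty; it does not validate the chemistry of a SMILES string.

```python
import csv
import io

# Hypothetical required columns; the real templates are not described in detail.
REQUIRED_COLUMNS = ("compound_id", "smiles", "assay", "value", "units")


def check_rows(text, required=REQUIRED_COLUMNS):
    """Return a list of problems found in CSV text, empty if none."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in required if c not in header]
    if missing:
        return [f"missing column: {c}" for c in missing]

    problems = []
    # Data rows start on line 2, after the header line.
    for line_no, row in enumerate(reader, start=2):
        for column in required:
            if not (row[column] or "").strip():
                problems.append(f"line {line_no}: empty {column}")
    return problems


# Illustrative rows only; the second data row lacks a structure.
sample = (
    "compound_id,smiles,assay,value,units\n"
    "A-001,CCO,IC50,12.5,nM\n"
    "A-002,,IC50,3.2,nM\n"
)
print(check_rows(sample))
```

Running a check like this before upload is one way historical spreadsheets could be brought into the same shape as newly generated data.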
Barry Bunin
I can imagine other people might be surprised that our virtual process could be managed that effectively, so what were some of the things, maybe from your industry background, that impacted the effective rollout as well as some of the continuous improvements that have been done to date? So there's the initial startup period, and then once that is set up and you've gotten your first level of baseline efficiency, how do you get to your second level of effectiveness, especially when dealing with the inherent complexity of working collaboratively with different groups?
Enrique Michelotti
After the projects were up and running, there were two different things that we did. First, the SAR data with all the primary and secondary assays were uploaded. And then we also uploaded in vivo PK data into CDD. To make certain that there was nothing missing, in addition to uploading key PK parameters to the CDD database, we also uploaded the full assay report PDF, and that was really powerful because people can access not just the few parameters of the in vivo PK, but the full report, and that was really one of the things that has been really important for the latest stages of the project.
The second part is that when we started the project, the BPN project, we also polled the members of the different teams to see what improvements they thought would make the use of CDD easier for them, or certain new things that CDD was not really supporting at that moment. And after that survey, a number of improvements were selected, and then in discussion with CDD personnel, we prioritized based on three things. First, the needs of the BPN. Second, the needs of CDD, what was more important for CDD, given your insights into the broader market needs and developments. And third, based on how fast or how cost efficiently it could be done. And during the last year we had at least 20 ideas. We worked together on a number of improvements for making the database easier to use for this broad group of people that go from chemists to ADME specialists to biologists.
Barry Bunin
So one of the things that, at least to me, has been interesting is the range, the breadth of CNS science involved in these drug discovery screening projects. There are projects working on Alzheimer's and spinal cord injury, as well as another project where the company co-founders recently won the Nobel Prize for the GPCR research. Can you talk about balancing the innovative biology aspects with the actual applied drug discovery aspects and a little bit about how to integrate complementary expertise?
Of course just for what's been already published and publicly released science, can you share how you're working on translating those into new drug candidates and ultimately, as you mentioned, into the clinic…
Enrique Michelotti
The Blueprint Neurotherapeutics is focused on projects related to the brain, and also on projects that tend to be riskier because of the biology, that tend to be more novel than what a company traditionally will work on, and for that reason the biology involved in most of the projects related to the BPN is very novel. The novel biological assays make them riskier. And as you mentioned, we have projects that go from Alzheimer's, Parkinson's disease, depression, macular degeneration in the eye, hearing loss, and so on. All those projects, we tend to support them because the novel biology can change the way that the disease is being seen and, as I mentioned before, they can be riskier than traditional projects in the pharmaceutical industry. We certainly don't compete with them, and the intention is that after the projects have been completely or partially de-risked, they will be picked up by somebody in the private sector.
Barry Bunin
So one of the things that I've noticed is that the NIH seems to have more people with industry experience involved in the program. In the Blueprint we work with you, with industry experience both at Locus Pharmaceuticals plus Rohm and Haas, as well as Charles Cywin, who rose through the ranks to the Director level at Boehringer Ingelheim, and many others at NIMH with deep industry experience. At NCATS, there's Christopher Austin, the director of NCATS, who is from Merck, as well as Jim Inglese and others at NCATS who are also from Merck and had experience running HTS at Pharmacopeia.
Maybe you can talk a little bit about how your industry experience influences and impacts your work both at the NIH and in these collaborations. We had a previous spotlight with one of your coworkers from back when you were in industry at Locus, so I am curious to hear about the transition from industry to NIH and what you learned in industry and how that's impacting your work at the NIH now.
Enrique Michelotti
I’ve been at the NIH now for three and a half years, working always in what NIH calls Trans-Institute Projects, first in the Molecular Libraries, which is a Common Fund Project involving any of the 27 institutes in the NIH. The Blueprint Neurotherapeutics, too, now involves 15 different institutes. As a general medicinal chemist, let's call it, I have not had a disease specific focus, that is, I'm not tied to one disease the way biology can be more specialized, and that versatility has helped me a lot in dealing with so many different projects with different backgrounds, as I needed to do in the Molecular Libraries project and the Blueprint. The industry experience also generally brings some rigor, in a way, to advancing projects, which is really important, especially when working with people that have not had that first-hand drug discovery expertise. Sometimes at the NIH there is a very important push to de-risk the projects from the point of view of the biology, so that when you get to phase one, once there is a good target proof of concept, you can demonstrate what is really happening. That is a little bit different from even what industry has done in the past. Industry now is moving strongly in that direction too. The second thing that the industry experience brings is being used to working in groups, in which no single person can make the project successful; you need every single area to click to have something in the clinic and something in the market, from the chemistry, the biology, and the clinical to all the IT expertise and so on, and that was really important in these large Trans-Institute Projects that I work on.
Barry Bunin
Can you talk a little bit about what you've liked working with CDD, what's gone well and where things could go in the future as well?
Enrique Michelotti
Let's talk first not about the database itself, but about dealing with the CDD company. The personnel have been extremely willing and flexible to help us in modifying or adapting things for the well-being of the project. Very flexible, so I really appreciated that. It really helped us in getting things in a way that everyone will use. Our users go from people that are extremely expert in databases, say people that are used to very sophisticated drug discovery databases from industry, that deal not only with databases but with complex, multi-parameter, multidimensional SAR searches, and then on the other extreme, to people that have never done anything with databases, which makes training a challenge, and optimally serving such a broad diversity of expertise or level of understanding is a fundamental challenge. And I think all the training sessions that we have set up, and there have been quite a number of them tailored to different people, have been really helpful, and CDD has been very willing to help us in doing that.
About the database, what I appreciate is the flexibility that allows us to have all the information from projects – actually from a wide variety of projects, all put in the database and the corresponding ability to not only search by compounds, biology, and IC50s, but also being able to search by the BPN numbers for time-bound reports and metrics. These reports show specific or overall progress of the projects. These are not only the PK reports but other biology reports that are traditionally very hard to specifically track in databases.
In summary, I think those two things: the willingness of the CDD personnel to really work with us, that has been one plus, with the second plus being the inherent flexibility of the database.
Barry Bunin
That's very consistent with what we've been seeing on our side in terms of the challenges and what's needed. I'm curious, before you started using CDD, what was it like without the database in terms of managing the data or collaborating and before you knew it was going to work out this well, what was sort of involved in thinking about how to evaluate CDD? Can you talk about how it was before and how it is after when managing a collaborative project?
Enrique Michelotti
Well, when we started the Blueprint Neurotherapeutics, we immediately realized that we needed a database and that we would have a central permanent repository of data. When we started the project in 2010, early 2010, we had in mind to have a database system in place and out of the many that we evaluated, we decided to go with CDD. So the BPN started with the database with CDD as the provider. In the other projects like the Molecular Libraries, for example, it was completely different. PubChem was the repository of the data and the searches were run in PubChem. That is a public system, it is a different type of system with different types of functionality (such as direct integration with other NIH databases and as a public repository) in which the search is not as developed as you can see in CDD or other leading commercial databases. So pre or post CDD, I cannot comment in the BPN because we started with CDD.
Barry Bunin
Let me ask the last CDD question, and then I'll move to more just general science questions. So when you look at the impact of working with CDD in a collaborative project, can you talk about the impact on the different constituents, with the biologists running screens at the different places, the chemists working on the molecules, and then the NIH managing the project. Can you talk about what the tangible impact has been for each of the different types of users?
Enrique Michelotti
The impact was especially in just having the appropriate access, right away, to make meaningful decisions for the projects. The way that we operate in BPN, there are scheduled meetings or scheduled teleconferences where the entire team calls in and there is a discussion of the data and then a discussion on how to move forward based on that data. These teleconferences are held biweekly, so every other week there is a deadline, specifically two days before the teleconference, by which all the new data is deposited in CDD, so people have time to really evaluate the data and discuss it in the teleconference with knowledge. And so one of the positive impacts of working with CDD, or a similar database, is that it really forces everyone to be on schedule, not only to read whatever is uploaded but to be on a consistent schedule in uploading data. That is really important in virtual Pharma projects, and we wanted the same rigor working across distributed cultures and geographies. When folks are in, let's call it, a traditional brick and mortar pharma, everyone is in one place, so the data is there, while that would not be the case for us if we did not have a collaborative database that allows everyone to upload data with secure, selective, appropriate data access viewing privileges. So this collaborative CDD Vault database has really made the biggest impact when combined with a process to have this biweekly uploading of data in a rigorous way and a discussion of the data every other week.
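[Editorial sketch] The cadence described above, a teleconference every other week with new data deposited two days beforehand, can be written as a small Python schedule helper. The start date is made up for illustration, and the function name and parameters are hypothetical; only the 14-day interval and the 2-day lead time come from the interview.

```python
from datetime import date, timedelta


def upload_schedule(first_call, n_calls, interval_days=14, lead_days=2):
    """Return (upload_deadline, teleconference_date) pairs.

    Each teleconference falls interval_days after the previous one, and
    each upload deadline falls lead_days before its teleconference.
    """
    schedule = []
    for i in range(n_calls):
        call_day = first_call + timedelta(days=interval_days * i)
        deadline = call_day - timedelta(days=lead_days)
        schedule.append((deadline, call_day))
    return schedule


# Hypothetical first teleconference date, chosen only for the example.
for deadline, call_day in upload_schedule(date(2013, 1, 10), 3):
    print(f"upload by {deadline.isoformat()}, discuss on {call_day.isoformat()}")
```

Keeping the lead time as a parameter makes it easy to see how the review window would change if a team wanted more or less time with the data before each call.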
Barry Bunin
It was more just in terms of the benefits for the different constituents... Like the biologist perspective who maybe has or hasn’t used the database before, or industry chemists who have, and managers with different roles and responsibilities…
Enrique Michelotti
I think the biggest impact was on people in academia that were not used to databases, and with the training and the use, they came to realize how powerful it is to have a permanent central repository, where you can access the data very fast and correlate the most salient patterns. I think that is the most important impact, providing this new capability that was even useful for people that were just not used to databases. It really speaks to the ease of use and overall design of CDD. People from industry, they were used to databases. They just blended in very fast, and actually one of the things that was interesting is the different types of suggestions to improve the database. Many of them came from people that knew databases, but many of them, especially in the ease of use category, came from people that were relatively naïve users of databases, which was really interesting to observe. But again, the biggest impact was being helpful for a diversity of collaborators with different levels of expertise, but mostly for people that have little or no expertise in databases. In a sense they had the most to gain, given they were going from nothing to something useful when moving from doing drug discovery without any database to working with CDD.
Barry Bunin
So switching gears a bit here, one of the things I like to do in these Spotlight interviews, since we are all initially trained as scientists independent of the different things we end up doing over our careers, is just ask about either a formative scientific experience or memorable interaction you've had with another brilliant scientist in your career. It can be something before the NIH or something in some of the other projects you've done at the NIH or this one, but just something where there was something memorable, some "ah-ha", some insights. I know you've been parts of many, either directly or managing them. Can you share something interesting that's happened in the various scientific projects?
Enrique Michelotti
I started early on in fungicides, in one of the main projects, where now there is a commercial compound, and where it was critical to differentiate the toxicity from the activity. And that was my first experience with an SAR project. There it was so important to have the right computational tools, not only databases, which were most critical to be able to correlate and differentiate the level of phytotoxicity or cytotoxicity and the activity. Later on, I moved to other medicinal chemistry projects using combinatorial chemistry and parallel synthesis in which, again, it was extremely important to have the database, not only to use and correlate data when it's produced, but also to plan and organize tests. To do the whole process holistically and efficiently, it is critical to have a database. And then the one very good and transformative experience was when I was at Locus Discovery, later Locus Pharmaceuticals. They were a structure-based company in which not only was the design of the molecules run computationally by a Monte Carlo approach, but also each target was then evaluated with predictors to see a priori how good they were in principle, before making the compounds. It was all tracked by a database to finally see the results in the test, and correlate not only biology and chemistry but also the predictions. So there was an immediate feedback loop to assess the computational tools. And that really was an interesting process of evolving from being a chemist, a synthetic chemist, to being able to understand all of these computational approaches and more of the overall drug discovery process.
This blog is authored by members of the CDD Vault community. CDD Vault is a hosted drug discovery informatics platform that securely manages both private and external biological and chemical data. It provides core functionality including chemical registration, structure-activity relationship, chemical inventory, and electronic lab notebook capabilities.