"Certainly the biologist and the biology team more than appreciate CDD Vault. I think that just about everybody on the biology team finds CDD Vault incredibly useful to us, I mean, I cannot stress that enough. If we can just do this all in one space it helps our whole team. It’s 18 months into the biology and it’s only growing, CDD Vault provides a consistent, shared, secure environment. This is really important as we generate more data."
Ronnett Seldon
Screening Technician for H3-D in the MMRU at University of Cape Town, South Africa
Contributor to Multiple TB Research and Drug Discovery Consortia
On Wednesday, the 24th of June 2015, Ronnett Seldon, a screening technician based in the Institute of Infectious Disease and Molecular Medicine (IDM) at the University of Cape Town (UCT) in South Africa spoke with us about her use of the Collaborative Drug Discovery CDD Vault platform.
Ronnett is employed by the H3-D Drug Discovery and Development Centre (Kelly Chibale, Director), but her work with Mycobacterium tuberculosis, the bacterium which causes tuberculosis (TB), means that she spends her time in the Molecular Mycobacteriology Research Unit (MMRU), which is directed by Professor Valerie Mizrahi. The goal of the MMRU is to address tuberculosis drug resistance, persistence, and metabolic vulnerability by applying an integrated genetic, biochemical and physiological approach to research on the physiology and metabolism of TB. As principal screening technician, Ronnett is at the frontline of TB drug discovery at UCT, facilitating research under several major consortia funded by grants from the Bill & Melinda Gates Foundation under the TB Drug Accelerator program (HIT-TB), the Seventh Framework Program of the European Union (MM4TB), TIA (Technology Innovation Agency), and the South African Medical Research Council (through the Strategic Health Innovation Partnerships program).
CDD supports the development of robust, medium-throughput, small-molecule infectious disease biological assay screening processes for TB drug discovery. Ronnett’s close work with CDD’s Anna Coulon Spektor to streamline data upload and to develop and implement customized curve calculation macros provides a model “playbook” for other consortia members to follow. They can use the carefully crafted and saved templates not only to upload their data to the consortia database but also to make their collaboration experience as seamless as possible. Customizable macros and CDD technical support are available each step of the way to facilitate collaborations across multiple continents.
Interviewed by Collaborative Drug Discovery, Inc.
When you joined H3-D were they already using CDD Vault?
Yes, CDD Vault is a central database so everybody has access to the data. I have been with the project for 18 months now, and I run a dose-response inhibition assay which generates a significant amount of data analysis. I had intended to use GraphPad Prism or Stata for my analysis, in which I have to define controls and use a 4-parameter logistic regression for the curve generation. For large data sets, this takes a lot of time. I have experience of clinical microbiology research studies, and so I have a fair amount of experience with large datasets.
While uploading data to CDD, I noticed in the help section of CDD Vault that the platform has a customized 4-parameter logistic regression that is used to process and analyze the data. This has changed, a whole lot and for the better, the amount of time I need to upload and process the data. CDD Vault is not just a database to which I bulk upload data, but I actually use it to analyze and process all my data.
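For readers unfamiliar with the model, a 4-parameter logistic (4PL) curve can be sketched as below. This is the common Hill-type form and not CDD Vault's own implementation, which the interview does not describe. The function and parameter names (`four_pl`, `conc_at_inhibition`, `top`, `bottom`, `ic50`, `hill`) are illustrative assumptions.

```python
def four_pl(conc, top, bottom, ic50, hill):
    """Response predicted by a 4-parameter logistic (Hill-type) curve.

    Assumed parameterization (illustrative, not taken from CDD Vault):
    top    - response with no inhibition (conc -> 0)
    bottom - response at full inhibition (conc -> infinity)
    ic50   - concentration giving the midpoint response
    hill   - slope factor (> 0 for an inhibition curve)
    """
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def conc_at_inhibition(fraction, ic50, hill):
    """Concentration at which a given fraction (0 < fraction < 1) of the
    top-to-bottom span is inhibited, obtained by inverting four_pl."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return ic50 * (fraction / (1.0 - fraction)) ** (1.0 / hill)
```

In practice the four parameters are estimated from plate data by nonlinear least squares; the sketch covers only the model and its inverse, for example `conc_at_inhibition(0.9, ic50, hill)` for an IC90-style value.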
Before using CDD Vault to analyze your data, were you using GraphPad or something else?
The assay had just been optimized, we were looking at the data in Excel, using that analysis but I realized the resulting calculations were inaccurate because we weren’t taking into consideration the data from the controls and so it was just an estimate, really, just plotting an Excel curve. My intention then was to use GraphPad but I never went to GraphPad because I found CDD Vault. It would probably take me an hour per analysis on GraphPad because I would have to do all the annotation myself, whereas that’s all facilitated with the CDD algorithm.
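The role of the controls can be illustrated with a small sketch. It assumes, hypothetically, that each plate carries untreated wells (0% inhibition) and fully inhibited wells (100% inhibition). The function name and the example readings are made up for illustration and do not come from the H3-D assay.

```python
from statistics import mean


def percent_inhibition(raw, untreated_wells, inhibited_wells):
    """Scale a raw well reading to percent inhibition using plate controls.

    untreated_wells: readings from wells with no compound (assumed 0% inhibition)
    inhibited_wells: readings from fully inhibited wells (assumed 100% inhibition)
    """
    zero = mean(untreated_wells)
    full = mean(inhibited_wells)
    if zero == full:
        raise ValueError("controls are indistinguishable; cannot normalize")
    return 100.0 * (zero - raw) / (zero - full)


# Made-up readings, for illustration only.
untreated = [980.0, 1020.0]  # mean 1000
inhibited = [90.0, 110.0]    # mean 100
print(percent_inhibition(550.0, untreated, inhibited))  # 50.0
```

Plotting raw readings without this step, as in a plain spreadsheet curve, leaves the curve's upper and lower limits undefined, which is why the result is only an estimate.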
How long does it take you to do the same amount of analysis using CDD Vault?
Because CDD has facilitated the entire uploading process via the macro, running the analyses in batch mode is all automated. At the moment I do spend a fair amount of time still processing for the registration, but that is because we have not yet standardized our registration format, we are in the process of doing that and then that will be automated too once we’ve defined the process. We are busy creating a web interface where a chemist will upload their drugs with their compound details in a format that is transferable to the macro, but once I have it loaded into the macro, it takes only 2-3 minutes. Most of my time is spent processing the raw data to get it to be uploadable to the macro in terms of the registering of the compounds but from then on, let’s say 10 minutes a microplate of raw data, it would take at least 10x that using an application that has not been customized to this data output.
How long did it take you to learn how to use curve generator? Did you have support from the CDD Vault technical staff for learning how to use it? Did they make custom scripts or macros for your data?
Yes, absolutely, they were very well prepared. I contacted Anna Coulon Spektor by email and wrote to her initially just with some technical questions to make sure my understanding of the output was correct and, from the online description, was valid for what I needed. We had a couple of emails back and forth and confirmed it was applicable to my data as I had imagined, and we then had an online meeting and the macro was already in place so I think that within 1 or 2 online meetings with Anna we were uploading and analyzing data with it and generating reports. Her support has been significant and still is, it is amazing that the complex, batch data analysis is so fast and easy and routinely efficient with minimal, focused interactions. It helps that she wrote the online help materials.
Are there things you can do with the data that you couldn’t have done using GraphPad? Are there additional functionalities that you are using?
I haven’t explored any other software yet because the curve generating algorithm applied to my primary assay from CDD Vault is literally giving me all that I need for now. But we have contacted Anna for another assay we are busy optimizing that will have totally different data format. We are basically waiting on sample data so that I can meet online with her and play with that. I don’t know where that may lead because it may not work for us because this assay is a reporter gene assay but it also reports on gene expression and regulation kinetics that would have to be unpacked before a final analysis. But Anna has indicated that she is certainly keen to assess what CDD Vault can do for us for these types of data too, we’ll explore it.
Who else looks at the data that you generate?
The medicinal chemists. They use the Vault to share compound data and to search the database. We have an upcoming online meeting with Anna to address some requests on how they would like to see the data returned from a query to make their use of it more satisfying. There are challenges concerning compound screening run dates but, in discussions with Anna, it seems the issue is in our definition of a protocol, and not in the platform itself and so we will have an online meeting to resolve the way we define the protocols, essentially defining one protocol with many different conditions instead of many different protocols to streamline analyses.
How often do you have data that fails the curve generation macro and how does the system notify you of suspect data records? What means do you have within CDD Vault to annotate that data individually?
The only time I have data that presents an atypical curve, which we have been experiencing recently, is for a specific chemical series or compound. It happens very rarely that I flag outliers and so I have not really had the need to do that kind of troubleshooting. For the recent compounds I mentioned, I simply flag those compounds in the final report in a comment section. It’s been pretty straightforward.
When you upload data, does CDD Vault give you any notice if there is an unusual curve generated or can you only find that by visual inspection?
Because the data are preprocessed before uploading, you can generally tell if something went wrong. What CDD Vault thankfully does tell you is when the points don’t allow a curve to be generated: it will say something like ‘Not Able to Generate an IC99 Curve’ or ‘Insufficient Data to Generate an IC99 value’, but in almost all cases it’s not the data that has issues but an error on my part in preprocessing the data.
Do you use CDD Vault on a daily basis, a weekly basis?
It depends on the rhythm of the lab work. I’ve been online every second day these past two weeks. Generally I have data to process and analyze at least every other week. But lately we’ve had a fairly rapid increase in our numbers of compounds screened… in the last 9 months we’ve gone from a 100% increase to a 400% increase. During the last 9 months our workload has increased by 400%. This means I run the assay for two straight weeks collecting data, then I spend two weeks analyzing data. During this time I am on CDD Vault at least every other day.
Do you have the ability to generate reports over the lifetime of the assay?
I do but I don’t need to. It is a 14 day assay and I read it at 7 days and 14 days so that I can compare, but I only need to process, analyze and report on the 14 day assay results.
What about over life of the project, to see if any aspect of the assay is changing over time? Could you analyze for trends?
I could, yes, I could, though it’s not been necessary. I could easily generate a report on a compound that had been run 5 times over the last few months and identify if there were any trends. I think CDD Vault does quite well at that kind of report generating.
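A trend check of the kind described could look like the following sketch. It works on exported values rather than on CDD Vault itself. The record layout (run date, IC50) and the numbers are hypothetical.

```python
from datetime import date
from math import exp, log


def summarize_runs(runs):
    """Summarize repeated potency measurements for one compound.

    runs: list of (run_date, ic50) tuples; ic50 values must be positive.
    Returns the geometric mean IC50 and the fold change from the earliest
    run to the latest run.
    """
    if not runs:
        raise ValueError("need at least one run")
    ordered = sorted(runs, key=lambda r: r[0])
    values = [ic50 for _, ic50 in ordered]
    geo_mean = exp(sum(log(v) for v in values) / len(values))
    fold_change = values[-1] / values[0]
    return geo_mean, fold_change


# Hypothetical data: one compound screened on three dates.
runs = [
    (date(2015, 3, 2), 1.0),
    (date(2015, 1, 5), 4.0),
    (date(2015, 5, 11), 2.0),
]
geo_mean, fold_change = summarize_runs(runs)
```

The geometric mean is used here because potency values are commonly averaged on a log scale; a fold change far from 1 between the first and last run would be a prompt to look at the assay more closely.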
Have you looked at all at CDD Vision to see what kinds of analyses can be done with the data?
Yes, we have scheduled a meeting with Anna to revise report processing and to get a tutorial on CDD Vision, and it is my understanding that the call will be recorded so that we can play it back for the rest of the team. It’s really hard to get us all together at the same time because we are spread across many campuses but that is happening shortly.
Are you involved at all in the analysis of the data beyond the generation of the IC50 curves? Are you involved at all with the medicinal chemists in analyzing the SAR of the data generated?
Yes, but it’s not my lead role. We look at all the data together in CDD Vault and have discussions around formal hit assessment, hit to lead assessment.
What do you use currently to look at those analyses in? What visualization platform or application is used?
I think that some of the members of the consortia have their individual databases and data analyses and report generating platforms and so they can analyze whatever they need to using other platforms and still in addition to that, data is also loaded on the CDD Vault. They are currently looking at creating some form of uniformity in their approach. Having seen the type of graphical approach they have presented on the SAR, to me that’s very exciting that CDD Vision can do that and that we can do it all in one place, so I am very keen for them to be introduced to CDD Vision. It would be great to have it all in one place.
For your targets and your types of assays, have you looked at the public data sets in CDD to see the relevance of potential public data sets?
That happens frequently but again is an output of the medicinal chemistry team. I know from the discussions at our group meetings that they do use them.
How are other people in the consortia using CDD Vault? Do they access the CDD Vault database live in team meetings?
We’ve seen CDD Vault live as a whole group when we met with Anna for the demos for compound registration and assay protocol registration.
Are the reports you generate on the assay runs used in the team meetings?
I circulate the data frequently among the team members as I analyze it and generate it in CDD Vault. When we are asked to give feedback and for a report back on particular compounds, we matriculate another worksheet so I am keen to learn from Anna how to generate reports where, even if it’s not live, we can use that report in the meeting and it gives us everything, apart from the SAR that the medicinal chemists will generate in the future, but all that we have biologically for that compound is there. I am hoping to share that soon.
Are there other groups in the consortia that are also generating biological data using CDD Vault?
Yes, there are a couple of groups that have recently started. Our collaborators in the US have just recently started uploading the biological data they have generated onto CDD Vault. The assay used to generate that data doesn’t require statistical analysis or a formula, it is a visual assay that uses a cutoff. But we have a couple of other assays that also produce vast data sets and will need a similar, if not more intense, type of algorithm in the future.
Did they interact with you to learn how to upload the data and produce curves or did they interact directly with Anna?
They interacted with Anna, from our last big meeting. Anna is our “go to” person for most things for our teams. As with the technology, consistency across groups is helpful.
Are you able to search through that data or is it kept separate?
I am able to search through and see that data.
Does that help facilitate discussions? Do you talk about the data you are generating and they are generating?
It’s a little bit complicated. It’s a fairly large consortium and so the data that we generate are not necessarily on identical groups of compounds. Certainly, all in all, what is happening is discussed for the overall picture but up to now there has not been, except for very recently, any discrepancies that the medicinal chemists would pick up from the reports.
I think that, at the moment, the use of CDD Vault is significantly different for the medicinal chemists and the biologists, so it is nice that it is useful for each of their complementary needs.
How often does the consortium get together and discuss the data?
The entire consortium, we have 2 big meetings per year. And it’s really for two different consortia, it’s a massive effort. There are consortia within the big consortia. If you were to bring up the projects currently listed on my dashboard, some of them are pharma later stage drug development data, some are earlier stage research data.
Do all the members of the consortia use CDD Vault and upload their data to the same CDD Vault database in that manner?
As far as I know, yes, that is the intention.
Have you used CDD Vault for several different biological assays?
To date, we have a triage of assays, they include two primary data types that I upload to the database. We have a set of follow-up screens, counter-screens of known TB mutations that has a colorimetric readout but there is also a fluorescent molecule we have begun to use and I have a first screen where I have a fluorescent readout. I have designed the plate layout to be identical to my first screen and I have compared the data and it looks good, it looks like the values correlate so I will be using CDD Vault to process that data as well.
Have you used it for a variety of types of assays and has CDD Vault performed seamlessly across those different data types?
Yes, for now. The LUX assays, a Luciferase reporter strain, because the manner in which it is fused is linked very closely to upregulation of another gene and there is all the kinetics to unpack it would probably need more to an algorithm than what we have now but I certainly would hope to do it all in the same place. I think that just about everybody on the biology team finds CDD Vault incredibly useful to us, I mean, I cannot stress that enough. If we can just do this all in one space it helps our whole team. It’s 18 months into the biology and it’s only growing, CDD Vault provides a consistent, shared, secure environment. This is really important as we generate more data.
If you could think of one functionality in CDD Vault that would make your job of uploading, analysis and report generation more easy, what would that be?
Right now I export each data set that I’ve analyzed into an Excel format that CDD Vault puts out and then I email and circulate that file, so I am amassing quite a large number of data files and it’s completely unnecessary since all the data are available from the database. So anything CDD Vault could do to facilitate that with the Message Board. Then it’d be easy for folks to go on CDD Vault to find all their data in the same place. I think that’s what we will be working on with Anna at our future meeting with her.
Is there any aspect of CDD Vault that you feel is the most valuable to you in your work?
Yes the online customized analysis is the most useful for me. It is not an uncommon analysis, certainly there are other platforms and applications that I could do the analysis in but that it’s easily customized for me for this output is a huge benefit. As long as our assay has been designed to our macro there is really nothing for us to do, I literally upload that and upload the CSV that has been generated from the macro, in comparison with having to do that in another statistical analysis program, that is an enormous benefit to me, I currently assay about 250 compounds a week and if I had to analyze that individually in a platform that had not been customized for me, I would never get back to the lab.
Do you make use of any of the notification functionality in CDD Vault? To notify the medicinal chemists when the data is uploaded?
No, not currently because I haven’t the time. If I am working with a team and that team has 10 medicinal chemists and each chemist gives me a data sheet of compounds to run, that is 10 projects, so right now it is easier to just send them back a data sheet with just their compounds on it. Ideally, however, it’d all be communicated back and forth efficiently within the database. People are naturally used to looking at Excel sheets.
When you run a compound multiple times is it easy enough for you to see all the previous run values for that compound?
Yes it is, when I search for a compound I can see all the run dates. It’s more useful to the chemists than for me, since they are always interested in comparing the latest results to previous results.
Lastly – do you foresee yourself using CDD Vision if you get that module?
Absolutely. I am impressed with what Anna showed me in terms of the report generating functions, and from the presentations I have seen from the chemists it would be useful for them as well.
Because of the manner in which the projects are growing, because of the increased workload, and the research environment, multiple funders, we are getting to that stage where we are going to be asked for annual reports and that type of report generating functionality is incredibly valuable and will save lots of time.
This blog is authored by members of the CDD Vault community. CDD Vault is a hosted drug discovery informatics platform that securely manages both private and external biological and chemical data. It provides core functionality including chemical registration, structure activity relationship, chemical inventory, and electronic lab notebook capabilities!