Blog | January 13, 2022

Software and Science Go Hand in Hand at X‑Chem: Informatics Tools for a Scalable and Effective DEL Platform

From the very beginning of X-Chem, we recognized the need to invest in software capabilities. DNA-encoded library (DEL) technology generates a large amount of raw data that needs to be translated back to chemical structures. One experiment can yield 250 million or more sequences, and sometimes as many as 10 billion. We knew that our platform would only work if we had the right informatics tools to support our scientists. We set out to create software tools that would leverage the richness of our data: from something as straightforward as tracking the building blocks that make up the molecules in our libraries to something as complex as determining why one molecule should be prioritized over another for follow-up and assay.

Our software tools travel with our scientists throughout the workday. We capture what was done at the bench: from the building blocks and tags that go into making each library to the exact conditions that targets are exposed to in each DEL screen. We capture the results of the screens: from the structural clusters enriched in a screen to the compounds resynthesized for follow-up with their confirmation assay results. 

How does this help our scientists? The tools are a one-stop shop where repetitive and mundane tasks are automated, leaving our scientists to innovate and focus on delivering for our clients. Our tools are built to address the questions and bottlenecks that arise as we continue to develop and scale our platform. Since X-Chem takes pride in continuously enhancing its technology, our tools are also continuously evolving.    

While some off-the-shelf software tools seem to be set in stone and require extensive effort to customize, this is not an issue at X-Chem. It all boils down to our approach: a flexible data model, a scientific team invested in enhancing the software tools, and an integrated informatics team. We have continued to adapt our software tools to changing needs as our platform grows and improves. We automate, we add new visualizations, we enhance existing analyses, we capture more and more data, and we add new functionality, all to ease the burden of data capture and analysis and free our scientists to innovate. And thanks to the flexibility built into our data model, we can continually augment and merge it with new data as we implement new functionality. We always assume that while we know what we want to do today, we will want to do more in the future, so we need to build the system to support change.

The key to our software tools’ success has been the recognition that the informatics team should be embedded with the science teams at X-Chem. Communication lines are always open. Our scientific users and informatics staff are in direct contact, which promotes an understanding of needs on both sides. One great example of this interplay between the informatics and scientific teams is the story of the “frequent hitters”. These are compounds that appear to bind across seemingly unrelated targets and should be flagged as promiscuous binders during data analysis. In the early days of X-Chem, our scientists had the list of frequent hitters memorized. As we added diversity to our DEL collection and ran more and more screens, it quickly became obvious that relying on memory was neither sustainable nor objective. We needed a more systematic way to identify frequent hitters. That’s when the embedded informatics team, understanding the scientists’ workflows and cognizant of the X-Chem data model, started to score the molecules in our library based on how frequently they were enriched across all our screens. These scores are linked to all our DEL screens to automatically eliminate frequent hitters from our data sets and provide the best recommendations for follow-up molecules to our clients.
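To make the idea of frequency-based scoring concrete, here is a minimal sketch in Python. It is an illustration only, not X-Chem's actual implementation: it assumes the score is simply the fraction of screens in which a compound was enriched, and the function names, compound and screen identifiers, and the 0.75 cutoff are all hypothetical.

```python
from collections import Counter


def frequent_hitter_scores(enriched_by_screen):
    """Return, per compound, the fraction of screens in which it was enriched.

    `enriched_by_screen` maps a screen ID to the compound IDs enriched in
    that screen. This scoring rule is an assumption made for illustration.
    """
    n_screens = len(enriched_by_screen)
    if n_screens == 0:
        return {}
    counts = Counter()
    for compounds in enriched_by_screen.values():
        counts.update(set(compounds))  # count each compound once per screen
    return {cid: n / n_screens for cid, n in counts.items()}


def remove_frequent_hitters(enriched, scores, threshold=0.75):
    """Keep only compounds whose score is below the (hypothetical) cutoff."""
    return [cid for cid in enriched if scores.get(cid, 0.0) < threshold]


# Toy data: cpd_1 is enriched in every screen, regardless of target.
screens = {
    "screen_A": ["cpd_1", "cpd_2", "cpd_3"],
    "screen_B": ["cpd_1", "cpd_4"],
    "screen_C": ["cpd_1", "cpd_2"],
    "screen_D": ["cpd_1", "cpd_5"],
}

scores = frequent_hitter_scores(screens)
filtered = remove_frequent_hitters(screens["screen_A"], scores)
print(filtered)
```

In this toy data set, cpd_1 scores 1.0 and is dropped from screen_A's results, while cpd_2 (0.5) and cpd_3 (0.25) are kept. A production scoring scheme would likely also account for library membership, target class, and enrichment strength, none of which are modeled here.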

Our next growth spurt has already begun. Another illustration of the malleability of X-Chem’s informatics platform was our ability to quickly pivot to support our new machine learning offering. With the addition of Glamorous AI to our team, we needed to implement functionality to present our selection data in a format suitable for the machine learning model-building pipeline. Although machine learning was not planned when we originally built the DEL pipeline, our flexible data model allowed us to meet this emerging need without reworking the platform.

At X-Chem, we take pride in the impact we can have on new drug discoveries and always strive to grow. The concept behind our software tools is simple: enable our scientists to analyze our data in depth and efficiently, to provide the most value for our partners.
