Blog | February 21, 2024

X‑Chem and the SGC are pioneering crowd-sourced AI advancements by making DEL screening data public. 

Artificial intelligence (AI) is now recognized as an indispensable component of the modern drug discovery toolkit, applied at every stage from initial hit finding to lead optimization and beyond. It has been repeatedly shown that pairing high-quality data with advanced AI can dramatically enhance the hit-finding process.1,2,3 But as the benefits of AI in drug discovery continue to mount, we must remember that maximizing this industry’s potential means aiding key contributors in their efforts.

In short, if we want the AI community to continue supporting the drug discovery community, we must support the AI community too.  

Think of it this way: AlphaFold can now predict 3-D protein structures with greater accuracy than ever.4 Why? Because training data were made publicly available. So, how do we further improve AI algorithms? We share data, empowering data scientists to explore new approaches to solving drug discovery challenges.  

Unfortunately, most companies do not share their proprietary data, resulting in public datasets that are small and lacking in diversity. Additionally, machine learning (ML) requires both positive and negative data for effective models, but public (e.g., literature) datasets often only include positive results. This will not do.  

To help correct the issue, X-Chem is taking steps to lead by example and be part of the solution: In collaboration with the Structural Genomics Consortium (SGC), X-Chem will be publicly sharing its DNA-encoded library (DEL) screening data. AI practitioners will have open access to these data and be able to use them to test novel AI approaches. Further details are available in this press release.

Just as AI has already enhanced various stages of drug discovery, we hope that X-Chem’s high-throughput, high-quality data will in turn drive advancements in AI. With these new public datasets, we will also promote target validation via probe identification and explore previously uncharted avenues of drug discovery.

Our mission is to help our clients cure disease and share our innovations with the world. We’re already putting our technology in the hands of drug discovery experts, and now we aim to do the same for the AI community by putting more data in their hands. X-Chem’s DEL platform provides an unprecedented depth and breadth of clean and diverse data. And now these data will be available to the public. 


  1. McCloskey, K., Sigel, E.A., Kearnes, S., et al. 2020. Machine learning on DNA-encoded libraries: a new paradigm for hit finding. Journal of Medicinal Chemistry, 63(16), pp.8857–8866.
  2. Ahmad, S., Xu, J., Feng, J.A., et al. 2023. Discovery of a First-in-Class Small-Molecule Ligand for WDR91 Using DNA-Encoded Chemical Library Selection Followed by Machine Learning. Journal of Medicinal Chemistry, 66(23), pp.16051–16061.
  3. Li, A.S.M., Kimani, S., Wilson, B., et al. 2023. Discovery of Nanomolar DCAF1 Small Molecule Ligands. Journal of Medicinal Chemistry, 66(7), pp.5041–5060.
  4. Bordin, N., Dallago, C., Heinzinger, M., et al. 2023. Novel machine learning approaches revolutionize protein knowledge. Trends in Biochemical Sciences, 48(4), pp.345–359.
