DEL Screen Artifacts: How to Find and Avoid Them
With over twelve years of experience in screening DNA-encoded chemical libraries (DELs), X-Chem has extensive insight into common DEL artifacts and is well-versed in techniques for identifying false positives. This allows us to prioritize for synthesis the DEL compounds that are most likely to become confirmed hits.
Here, we provide a summary of several artifacts that can show up in a DEL screen and describe how to avoid them:
- Promiscuous binders: Compounds that bind non-specifically to protein targets (for example, to denatured proteins or their affinity tags) are easily identified in our DEL screen outputs. We achieve this by automatically compiling and storing enrichment information for every building block and building block combination, across every target they have ever been screened against. The enriched compounds observed in our screen outputs come labeled with a promiscuity metric so that we can confidently filter out these undesirable compounds.
- Matrix binders: Matrix binders will show enrichments against both target and no-target control selections. The straightforward solution is to always have a no-target control selection condition for each type of matrix used. During the data analysis process, enriched compounds that bind in the no-target controls are automatically tagged, allowing them to be easily filtered.
- Monosynthon features: We generally define a monosynthon as an on-DNA chemical entity with (a) an enrichment signature that is driven by a single building block at one cycle of chemistry and (b) no co-enrichment of any other building blocks at any other cycles. Monosynthon features may result from binding to a specific DNA tag, from failed reactions, or from a non-specific target-driven or matrix-driven event. Testing of compounds designed from monosynthon features has confirmed that they are usually artifacts. During data analysis, we recommend filtering out monosynthon features, as these are less likely to lead to productive binders.
- Truncates: Truncates — and monosynthon features from failed reactions, for that matter — are rarely observed at X-Chem. (This topic is further discussed in X-Chem’s May 11, 2021, blog entry.) Ninety-four percent of X-Chem licensed compounds are fully elaborated; in other words, they correspond exactly to the intended chemical scheme and the indicated enriched building blocks at all cycles of chemistry. X-Chem designs high-quality libraries without chemical ambiguity, and we validate every building block for high reaction yields before committing it to the DEL synthesis process. We do occasionally observe selection output signatures that correspond to possible truncates in 3-cycle libraries (for example, when we observe enriched features with one cycle that has many building blocks with weak structural similarity). In these cases, we may recommend the off-DNA synthesis of the full-length and truncated analogs.
- Weak structural similarity within clusters: Our general practice is to group the enriched compounds observed in DEL selection outputs into clusters of structurally similar compounds. After the full enumeration of structures, structural similarity and structure-enrichment relationships can occasionally be poor among the compounds in a cluster. Generally, we find that clusters with logical and consistent structure-enrichment relationships are more likely to lead to the successful confirmation of resynthesized hits, and we recommend prioritizing these.
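To make the tagging-and-filtering idea concrete, here is a minimal Python sketch. It is not X-Chem's analysis pipeline: the field names, the two cutoffs, and the 0-to-1 promiscuity scale are all hypothetical, chosen only to illustrate how matrix binders, promiscuous binders, and monosynthon features could be tagged and then excluded before ranking.

```python
from dataclasses import dataclass, field

# Hypothetical cutoffs for illustration; real values depend on the screen and metric.
ENRICHMENT_CUTOFF = 2.0    # enrichment at or above this counts as "enriched"
PROMISCUITY_CUTOFF = 0.5   # assumed 0..1 score; higher means enriched across more targets


@dataclass
class EnrichedCompound:
    compound_id: str
    target_enrichment: float      # enrichment in the target selection
    no_target_enrichment: float   # enrichment in the matching no-target control
    promiscuity: float            # assumed promiscuity metric on a 0..1 scale
    enriched_cycles: int          # chemistry cycles with an enriched building block
    tags: set = field(default_factory=set)


def tag_artifacts(compound):
    """Attach artifact tags to the compound in place and return it."""
    if compound.no_target_enrichment >= ENRICHMENT_CUTOFF:
        compound.tags.add("matrix_binder")
    if compound.promiscuity >= PROMISCUITY_CUTOFF:
        compound.tags.add("promiscuous")
    if compound.enriched_cycles == 1:
        compound.tags.add("monosynthon")
    return compound


def prioritize(compounds):
    """Return untagged compounds, highest target enrichment first."""
    tagged = [tag_artifacts(c) for c in compounds]
    clean = [c for c in tagged if not c.tags]
    return sorted(clean, key=lambda c: c.target_enrichment, reverse=True)


if __name__ == "__main__":
    hits = [
        EnrichedCompound("A", 10.0, 0.5, 0.1, 3),
        EnrichedCompound("B", 8.0, 6.0, 0.1, 3),   # also enriched without target
        EnrichedCompound("C", 12.0, 0.2, 0.9, 2),  # high promiscuity score
        EnrichedCompound("D", 9.0, 0.1, 0.0, 1),   # single enriched cycle
    ]
    for c in prioritize(hits):
        print(c.compound_id, c.target_enrichment)
```

Tagging rather than deleting keeps the flagged compounds available for review, which matters for borderline cases such as possible truncates, where follow-up synthesis may still be worthwhile.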
The ability to identify compounds that exhibit these categories of artifacts is built into our analysis process. This helps with synthesis prioritization and avoids wasting limited resources on the synthesis of false positives. We can also apply physicochemical property constraints as filters to support the discovery of drug-like or lead-like hits. At X-Chem, we take an extremely proactive approach to library design, and most library members will necessarily have drug-like properties as a result. Consequently, we generally do not need to apply property filters unless especially stringent requirements are requested.
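As an illustration of what a physicochemical property filter can look like, the sketch below counts violations of Lipinski's rule of five on precomputed properties. The choice of this particular rule, the `Properties` type, and the `max_violations` default are assumptions made for the example; the constraints applied in a given project may differ.

```python
from typing import NamedTuple


class Properties(NamedTuple):
    """Precomputed properties for one compound (hypothetical container)."""
    mol_weight: float      # molecular weight in Da
    logp: float            # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(p):
    """Count how many of Lipinski's four criteria the compound exceeds."""
    checks = [
        p.mol_weight > 500,
        p.logp > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_property_filter(p, max_violations=1):
    """True if the compound has at most `max_violations` violations."""
    return rule_of_five_violations(p) <= max_violations


if __name__ == "__main__":
    small = Properties(mol_weight=350.0, logp=2.5, h_bond_donors=2, h_bond_acceptors=5)
    large = Properties(mol_weight=720.0, logp=6.2, h_bond_donors=3, h_bond_acceptors=9)
    print(passes_property_filter(small))  # no criteria exceeded
    print(passes_property_filter(large))  # exceeds weight and logP limits
```

Lowering `max_violations` to 0 gives a stricter filter for cases where especially stringent requirements are requested.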
In summary, paying close attention to these artifact categories helps us to prioritize compounds that have the highest chances of being successful. This is one of the many reasons for X-Chem’s high success rate of 80% in the identification of hits from individual DEL screens — including for difficult and even unprecedented targets.