Unstructured Protein Targets Successfully Hit by Language-Based Model in Drug Development
A new artificial intelligence (AI) method named PepMLM has been unveiled, offering hope for drug development against historically "undruggable" targets. The study, titled "Target sequence-conditioned design of peptide binders using masked language modeling," was published in Nature Biotechnology.
The authors of the study include Pranam Chatterjee, an assistant professor of bioengineering at the University of Pennsylvania, and Ray Truant, a professor at McMaster University and a Huntington's disease expert. The collaboration between Chatterjee and Truant began in 2018, when Chatterjee was a graduate student at MIT, designing Cas9 enzymes to base edit the repeat region of the HTT gene.
PepMLM, trained on approximately 10,000 peptide-protein sequence pairs sourced from PepNN and Propedia, has demonstrated impressive results. It achieved nanomolar binding affinity on disease-related receptor targets that could not be hit by RFdiffusion, the current gold standard model for de novo protein design. In fact, PepMLM outperformed RFdiffusion with a higher hit rate of 38% compared to 29%.
One of the key features of PepMLM is its ability to tune degradation efficacy, a crucial aspect for drug development. Additionally, PepMLM allows researchers to modulate protein levels without impacting mRNA, offering a powerful tool to investigate diseases with RNA pathologies, such as Huntington's disease-like (HDL) syndromes.
Chatterjee argues that protein language models, using only amino acid sequences with no structural information, are key to drugging historically "undruggable" targets. This approach allows models to effectively expand to targets for which known structures do not exist.
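To make the sequence-only idea concrete, here is a minimal sketch of how a masked language modeling setup for binder design could be framed: the target's amino acid sequence is followed by a run of masked positions, and a model fills those positions in to propose a peptide. This is an illustration under stated assumptions, not PepMLM's actual code. The mask symbol, the function names, and the toy predictor are all hypothetical; a trained protein language model would take the predictor's place.

```python
# Illustrative sketch only: MASK, build_masked_input, fill_masks and
# toy_predict are hypothetical names, not part of PepMLM.
MASK = "<mask>"  # assumed placeholder; a real tokenizer defines its own mask token


def build_masked_input(target_seq, binder_len):
    """Return per-residue tokens for the target followed by binder_len masks."""
    if binder_len <= 0:
        raise ValueError("binder_len must be positive")
    return list(target_seq) + [MASK] * binder_len


def fill_masks(tokens, predict):
    """Replace each mask, left to right, with the residue chosen by predict."""
    filled = list(tokens)
    for i, tok in enumerate(filled):
        if tok == MASK:
            # predict sees the target plus any residues already filled in
            filled[i] = predict(filled, i)
    return filled


def toy_predict(tokens, position):
    """Stand-in for a trained model: always proposes alanine."""
    return "A"


target = "MKTAYIAK"  # made-up target sequence, for illustration only
tokens = build_masked_input(target, binder_len=4)
designed = fill_masks(tokens, toy_predict)
binder = "".join(designed[len(target):])
print(binder)
```

Note that no structural input appears anywhere in the sketch: the only thing the predictor is conditioned on is the target's sequence, which is the property Chatterjee highlights.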
The next steps of the work aim to adapt the model to account for post-translational modifications, to support motif-specific binding, and to tailor specificity to avoid off-target effects. Truant's team is particularly interested in tethering kinase activity to PepMLM peptides to address hypophosphorylated sites and restore function in the dysregulated huntingtin protein.
Since its public release last year, PepMLM has seen wide uptake from the biology community, averaging approximately 600 downloads per month. Huntington's disease, a monogenic disorder that affects more than 1 in 10,000 adults and is primarily caused by an expanded CAG repeat in exon 1 of the HTT gene, is just one example of the potential applications of this AI method.
Chatterjee draws a comparison with ChatGPT, which, he notes, effectively learns the basis of language without being trained on the metadata of the text. With the development of PepMLM, the future of drug development and AI-assisted research seems promising.