Issue #67: How to make the most of your masked language model for protein engineering
Protein Design Digest - 2026-03-13 - Discovery of a Hematopoietic Manifold in scGPT Yields a Method for Extracting Performant Algorithms from Biological Foundation Model Internals

Daily curated signals from arXiv, PubMed, and bioRxiv.
Signal of the Day
How to make the most of your masked language model for protein engineering
A plethora of protein language models have been released in recent years. Yet comparatively little work has addressed how to best sample from them to optimize desired biological properties. We fill this gap by proposing a flexible, effective sampling method for masked language models (MLMs), and by systematically evaluating models and methods both in silico and in vitro on actual antibody therapeutics campaigns. Firstly, we propose sampling with stochastic beam search, exploiting the fact that MLMs are remarkably efficient at evaluating the pseudo-perplexity of the entire 1-edit neighborhood of a sequence. Reframing generation in terms of entire-sequence evaluation enables flexible guidance with multiple optimization objectives. Secondly, we report results from our extensive in vitro head-to-head evaluation for the antibody engineering setting. This reveals that choice of sampling method is at least as impactful as the model used, motivating future research into this under-explored area.
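The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one plausible reading: keep a beam of sequences, expand each to its 1-edit (single-substitution) neighborhood, score whole sequences by pseudo-log-likelihood plus an extra objective, and sample the survivors without replacement using Gumbel-perturbed scores. Everything named here is an assumption for illustration: `toy_masked_log_probs` is a hypothetical stand-in for a real masked language model, `guided_score` uses a made-up net-charge penalty as the second objective, and candidates are scored naively one at a time rather than with the efficient neighborhood evaluation the authors describe.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_masked_log_probs(seq, pos):
    """Hypothetical stand-in for an MLM: log p(residue at pos | rest of seq).

    A real model would run a forward pass with position `pos` masked.
    This toy version merely down-weights repeating the left neighbor.
    """
    left = seq[pos - 1] if pos > 0 else None
    weights = [0.5 if aa == left else 1.0 for aa in ALPHABET]
    total = sum(weights)
    return {aa: math.log(w / total) for aa, w in zip(ALPHABET, weights)}


def pseudo_log_likelihood(seq, masked_log_probs=toy_masked_log_probs):
    """Sum over positions of log p(x_i | all other residues)."""
    return sum(masked_log_probs(seq, i)[seq[i]] for i in range(len(seq)))


def pseudo_perplexity(seq, masked_log_probs=toy_masked_log_probs):
    """exp of the negative mean per-position pseudo-log-likelihood."""
    return math.exp(-pseudo_log_likelihood(seq, masked_log_probs) / len(seq))


def one_edit_neighbors(seq):
    """All sequences that differ from seq by exactly one substitution."""
    for i, old in enumerate(seq):
        for aa in ALPHABET:
            if aa != old:
                yield seq[:i] + aa + seq[i + 1:]


def stochastic_beam_search(start, score, beam_width=4, steps=3,
                           temperature=1.0, seed=0):
    """Keep `beam_width` sequences per step, sampled via Gumbel-top-k."""
    rng = random.Random(seed)
    beam = [start]
    for _ in range(steps):
        # Candidates: the current beam plus every 1-edit neighbor of it.
        candidates = set(beam)
        for seq in beam:
            candidates.update(one_edit_neighbors(seq))
        perturbed = []
        for cand in sorted(candidates):  # sorted for reproducibility
            u = max(rng.random(), 1e-12)  # guard against log(0)
            gumbel = -math.log(-math.log(u))
            perturbed.append((score(cand) / temperature + gumbel, cand))
        # Top-k of Gumbel-perturbed scores samples k candidates without
        # replacement from softmax(score / temperature).
        perturbed.sort(reverse=True)
        beam = [cand for _, cand in perturbed[:beam_width]]
    return sorted(beam, key=score, reverse=True)


def guided_score(seq, weight=1.0):
    """Whole-sequence score: model term plus a hypothetical second objective
    (here, a penalty on absolute net charge, chosen only for illustration)."""
    charge = sum(seq.count(aa) for aa in "KR") - sum(seq.count(aa) for aa in "DE")
    return pseudo_log_likelihood(seq) - weight * abs(charge)


designs = stochastic_beam_search("ACDKLM", guided_score)
for seq in designs:
    print(seq, round(pseudo_perplexity(seq), 2))
```

Because the score is defined on entire sequences, further objectives can be added to `guided_score` without touching the search loop. Raising `temperature` makes the sampling more exploratory, and lowering it moves the procedure toward ordinary deterministic beam search.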
Also Worth Reading
Binding interactions of Trametes villosa and Trametes lactinea laccases with 4-nonylphenol and its intermediates: molecular docking and molecular dynamics approaches.
Emerging pollutants such as 4-nonylphenol (4-NP) act as endocrine disruptors and have been associated with reproductive toxicity in humans and wildlife, as well as with physiological disturbances in aquatic, terrestrial, and plant organisms. Laccases are oxidoreductases with notable biotechnological relevance and the ability to oxidize phenolic pollutants, making them attractive candidates for biodegradation strategies. This study investigated the interactions between laccases from Trametes villosa and Trametes lactinea and 4-NP and its degradation intermediates via molecular docking and molecular dynamics simulations (MDS). Ligands were geometrically optimized using the PM7 semiempirical method, and their global reactivity descriptors were computed to explore correlations between electronic properties and laccase binding affinity. Docking revealed favorable binding energies (ΔG_bind ≈ -6 kcal·mol⁻¹) and recurrent interactions with key amino acid residues, including Ala, Glu, Leu, Phe, Pro, Ser, Val, and His, mainly through hydrogen bonding and hydrophobic contacts. The MDS confirmed the stability of the enzyme-ligand complexes, as indicated by low root mean square deviation (RMSD) and root mean square fluctuation (RMSF) values, along with consistent radius of gyration and solvent-accessible surface areas throughout the trajectories. Binding free energy calculations using the Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) method indicated stronger binding affinity under solvation, with ΔG_bind values of -26.45 and -17.73 kcal·mol⁻¹ for T. villosa and T. lactinea, respectively, highlighting hydrophobic and van der Waals contributions as the primary stabilizing forces. Overall, these results provide computational evidence that laccases from T. villosa and T. lactinea have potential for application in the oxidative biodegradation of 4-NP.
These findings advance the molecular understanding of fungal laccase‒pollutant interactions and support future in vitro validation and protein engineering strategies aimed at enhancing biodegradation efficiency.
Artificial intelligence driven protein design and sustainable nanomedicine for advanced theranostics.
The integration of artificial intelligence, protein engineering, and sustainable nanomedicine is driving a paradigm shift in theranostics by enabling highly precise disease diagnosis and targeted therapy. AI-driven methodologies, including machine learning and deep learning, facilitate the rapid analysis of complex biological and chemical datasets, accelerating protein structure prediction, molecular docking, and structure-activity relationship modeling. These capabilities support the rational design of proteins and peptides with enhanced specificity, therapeutic efficacy, and safety, while enabling personalized treatment strategies tailored to individual molecular profiles. In parallel, sustainable nanomedicine focuses on the development of biodegradable, biocompatible, and environmentally benign nanomaterials to improve drug bioavailability, stability, and controlled release. AI-assisted optimization further refines nanocarrier design by balancing therapeutic performance with safety and environmental impact. Advanced intelligent nanocarriers capable of real-time monitoring, adaptive drug release, and degradation into non-toxic by-products represent a significant advancement over conventional static systems. The theranostic paradigm has become central to precision medicine, particularly in oncology, where AI-designed nanoplatforms enable targeted delivery of imaging agents and therapeutics to tumors while allowing continuous treatment monitoring and minimizing off-target effects. Emerging applications in neurological, infectious, and cardiovascular diseases further highlight the broad clinical potential of this approach.
Accordingly, this review summarizes AI-driven protein design strategies, sustainable nanocarrier engineering, and their convergence in next-generation theranostic systems, critically discussing mechanistic insights, translational challenges, and design principles required for developing safe, scalable, and clinically adaptable intelligent nanomedicines.
In silico prediction, molecular docking and simulation of natural flavonoid apigenin and xanthoangelol E against human metapneumovirus.
Human metapneumovirus (hMPV) is one of the potential pandemic pathogens, and it is a concern for elderly subjects and immunocompromised patients. There is no vaccine or specific antiviral available for hMPV. We conducted an in silico study to predict initial antiviral candidates against human metapneumovirus. Our methodology included protein modeling, stability assessment, molecular docking, molecular simulation, analysis of non-covalent interactions, bioavailability, carcinogenicity, and pharmacokinetic profiling. We pinpointed four plant-derived bio-compounds as antiviral candidates. Among the compounds, apigenin showed the highest binding affinity, with values of -8.0 kcal/mol for the hMPV-F protein and -7.6 kcal/mol for the hMPV-N protein. Molecular dynamics simulations and further analyses confirmed that the protein-ligand docked complexes exhibited acceptable stability compared to two standard antiviral drugs. Additionally, these four compounds yielded satisfactory outcomes in bioavailability, drug-likeness, ADME-Tox (absorption, distribution, metabolism, excretion, and toxicity), and STopTox analyses. This study highlights the potential of apigenin and xanthoangelol E as initial antiviral candidates, underscoring the necessity for wet-lab evaluation and for preclinical and clinical trials against human metapneumovirus infection. Supplementary information: the online version contains supplementary material available at 10.1007/s40203-025-00539-7.
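To put the reported kcal/mol values on a more familiar scale, the standard relation ΔG = RT ln(Kd), with a 1 M standard state, converts a binding free energy into an implied dissociation constant. The sketch below applies it to the two apigenin values quoted above. The temperature of 298.15 K and the function name `kd_from_delta_g` are assumptions for illustration. Docking scores are scoring-function estimates rather than measured free energies, so the resulting numbers are order-of-magnitude intuition at best.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant (molar) implied by dG = RT ln(Kd)."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))


# Apigenin docking values quoted in the abstract above.
for target, delta_g in [("hMPV-F", -8.0), ("hMPV-N", -7.6)]:
    kd = kd_from_delta_g(delta_g)
    print(f"{target}: dG = {delta_g} kcal/mol -> implied Kd ~ {kd:.1e} M")
```

Under these assumptions both values correspond to roughly micromolar affinity, and each additional -1.36 kcal/mol or so tightens the implied Kd by about a factor of ten at room temperature.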
Research & AI Updates
- AI-Enabled Quantum Refinement Advances Protein Structure Determination for Structural Biology - Lab Manager
- Reverse vaccinology 3.0 - Nature
- Hassabis marks 10 years since AlphaGo, touts move 37 and push to AGI - Chosunbiz
- Keynote Presentation: The Development and Application of Engineering Principles Governing Binding Protein Interactions w/ Live Q&A - Labroots
- A Tenfold Increase in a Decade? What Makes This Small Biotech Company So Special - NAI500
- Sovereign AI Launches £500M Venture Fund To Support AI Founders - Quantum Zeitgeist
- AI drug discovery model surpasses AlphaFold by predicting if drugs work - Nanowerk
From the Industry
- Cancer drug developer Theriva inks SYN-020 deal up to $38M - Stock Titan
- Decoy Therapeutics Enters Strategic Partnership with Quantori to Deploy Google Cloud-native Integrated AI-Driven Peptide Design and Molecular Simulation Platform - marketscreener.com
- When Ambition Meets Ambiguity: The Trends and Sentiments Shaping Biotech in 2026 - Pharmaceutical Executive
- Decoy Therapeutics Enters Strategic Partnership with Quantori to Deploy Google Cloud-native Integrated AI-Driven Peptide Design and Molecular Simulation Platform - PR Newswire
- New biotech partnership aims to accelerate stem cell therapies for heart disease - News-Medical
- Samsung Biologics Announces Collaboration with Lilly to Establish New Gateway Labs Site in Korea - BioSpace
- Tsingke Biotech and iGeneTech Forge Strategic Partnership to Lead Synthetic Biology Advancements - PR Newswire
Quick Reads
Establishing FDA-approved oncology drugs as GPR176 inhibitor through homology modelling, molecular docking, MMGBSA, DFT, and molecular dynamics simulation.
Needle-in-a-haystack approach: rapid screening of PDE1C inhibitors through the combination of machine learning, molecular docking, molecular dynamics simulations and experimental validation.
Structure-based computational screening and molecular dynamics reveal potential inhibitors of Norovirus VP1 and RdRp Proteins: an in-silico study
Norovirus is recognized as a pathogen with pandemic potential, exhibiting a higher fatality rate in low-income countries, particularly affecting young children.
Protein Counterfactuals via Diffusion-Guided Latent Optimization
Deep learning models can predict protein properties with unprecedented accuracy but rarely offer mechanistic insight or actionable guidance for engineering improved variants.
aGPCR-HEK: A Stable High-Expression Inducible Mammalian Cell Expression System for Adhesion GPCR Structural Biology Applications.
ADGRL4 is an adhesion G protein-coupled receptor (aGPCR) implicated in multiple tumours.
Sulfonic Acid Group Docking Synthesis of Platinum Clusters in MOFs Cavity Enables Low-Temperature Stable Selective CO2 Hydrogenation to Methanol.
Platinum (Pt) nanoparticles favor the hydrogenation of CO2 to CO, presenting a significant challenge for value-added methanol synthesis.
A Modified Paraspinal Approach for Full-Endoscopic Discectomy for Far Lateral Disc Herniations: Docking at the Caudal Level Transverse Process.
The use of least invasive full-endoscopic spine systems has decreased the amount of tissue dissection, blood loss, and duration of post-operative recovery after intervention for far-lateral disc herniations (FLDH).
Pipeline Tip
Check for missing residues in PDB files using PDBFixer before simulation.
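As a quick, dependency-free pre-check alongside PDBFixer, jumps in residue numbering within a chain are often the first visible sign of unresolved residues. The sketch below is not PDBFixer and does not replace it. It only reads the fixed-column ATOM records (chain ID in column 22, residue number in columns 23 to 26) and reports numbering jumps. The helper `ca_line` and the function `numbering_gaps` are hypothetical names, the demo records are stubs with coordinates omitted, and insertion codes are ignored. A jump can also be legitimate author numbering, so treat hits as prompts to inspect the file.

```python
from collections import defaultdict


def numbering_gaps(pdb_text):
    """Return {chain_id: [(last_seen, next_seen), ...]} for numbering jumps."""
    residues = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, TER, and other records
        chain = line[21]           # column 22: chain identifier
        resseq = int(line[22:26])  # columns 23-26: residue sequence number
        # Record each residue once, in file order.
        if not residues[chain] or residues[chain][-1] != resseq:
            residues[chain].append(resseq)
    gaps = {}
    for chain, numbers in residues.items():
        jumps = [(a, b) for a, b in zip(numbers, numbers[1:]) if b - a > 1]
        if jumps:
            gaps[chain] = jumps
    return gaps


def ca_line(serial, resname, chain, resseq):
    """Build a stub ATOM record (coordinates omitted) for the demo below."""
    return f"ATOM  {serial:>5}  CA  {resname} {chain}{resseq:>4}"


# Demo: chain A is numbered 1, 2, 3, 7, 8, so residues 4-6 look unresolved.
demo = "\n".join(
    ca_line(i + 1, "ALA", "A", n) for i, n in enumerate([1, 2, 3, 7, 8])
)
print(numbering_gaps(demo))
```

For real structures, pass the contents of the PDB file as `pdb_text`, then hand any flagged file to PDBFixer to confirm and repair the gaps before simulation.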
Resources & Tools
- Dataset: BioLiP - Verified biologically relevant ligand-protein interactions.
- Dataset: SIFTS - Residue-level mapping between PDB, UniProt, and other resources.
- Tool: OmegaFold - Structure prediction from single sequences with rapid inference.
- Tool: Foldseek - Ultra-fast structural search and clustering engine.
- Event: Protein Design Hub (LinkedIn Group) (Ongoing)
- Event: Structural Biology Events (Open)
- Job: Bioinformatics Data Analyst II (Indeed)
- Job: Bioinformatics Software Engineer (Indeed)
The protein structure is the language of life; design is its poetry. — Recep Adiyaman