Issue #128: Prediction of Antibody Non-Specificity using Protein Language Models and Biophysical Parameters
Protein Design Digest #128: Computational Assessment of Phytochemical Inhibitors of Cytochrome P450 …

Daily curated signals from arXiv, PubMed, and BioRxiv.
Signal of the Day
Prediction of Antibody Non-Specificity using Protein Language Models and Biophysical Parameters
The development of therapeutic antibodies requires optimizing target binding affinity and pharmacodynamics, while ensuring high developability potential, including minimizing non-specific binding. In this study, we address this problem by predicting antibody non-specificity using two complementary approaches: (i) antibody sequence embeddings from protein language models (PLMs), and (ii) a comprehensive set of sequence-based biophysical descriptors. These models were trained on human and mouse antibody data from Boughter et al. (2020) and tested on three public datasets: Jain et al. (2017), Shehata et al. (2019) and Harvey et al. (2022). We show that non-specificity is best predicted from the heavy variable domain and the heavy-chain complementarity-determining regions (CDRs). The top-performing PLM, a heavy variable domain-based ESM 1v LogisticReg model, achieved a 10-fold cross-validation accuracy of up to 71%. Our biophysical descriptor-based analysis identified the isoelectric point as a key driver of non-specificity. Our findings underscore the importance of biophysical properties in predicting antibody non-specificity and highlight the potential of protein language models for the development of antibody-based therapeutics. To illustrate the use of our approach in the development of lead candidates with high developability potential, we show that it can be extended to therapeutic antibodies and nanobodies.
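Since the abstract singles out the isoelectric point as a key sequence-based descriptor, here is a minimal sketch of how a pI can be estimated from sequence alone. It is not the authors' descriptor pipeline: the pKa table is one illustrative set (published sets differ and shift the result by a few tenths of a pH unit), and the two peptides in the usage lines are made-up examples.

```python
# Illustrative pKa values (an assumption; other published sets exist).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # deprotonated C-terminus
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tol: float = 1e-4) -> float:
    """Find the pH of zero net charge by bisection on [0, 14].

    Net charge decreases monotonically with pH, so bisection converges.
    """
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical peptides, chosen only to show the direction of the effect.
basic_pi = isoelectric_point("GKRKSGKR")
acidic_pi = isoelectric_point("GDEDSGDE")
print(f"basic peptide pI ~ {basic_pi:.2f}, acidic peptide pI ~ {acidic_pi:.2f}")
```

In practice the same function would be applied to a heavy variable domain or CDR sequence and the resulting value used as one feature among many.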
Why this matters:
Also Worth Reading
Tocotrienol as a multi-target inhibitor of ICAM-1, VCAM-1, and E-selectin: Comparison using AutoDock and GNINA docking with molecular dynamics simulation.
Atherosclerosis is a chronic inflammatory disease characterized by endothelial dysfunction and leukocyte adhesion, mediated by cell adhesion molecules such as E-selectin, intercellular adhesion molecule-1 (ICAM-1), and vascular cell adhesion molecule-1 (VCAM-1). Tocotrienols, a subgroup of vitamin E, exhibit potent antioxidant and anti-inflammatory properties, suggesting a potential role in attenuating atherosclerosis. This study compared the binding affinities and molecular interaction profiles of the α-, β-, γ-, and δ-tocotrienol isomers toward E-selectin, ICAM-1, and VCAM-1 using molecular docking, followed by molecular dynamics simulation to assess the stability of the top-ranked protein-ligand complexes. Docking was performed using MolModa, an automated molecular docking platform based on AutoDock Vina, and GNINA, an AI-assisted docking tool based on a convolutional neural network (CNN). With the conventional tool AutoDock Vina, all tocotrienol isomers showed their strongest average binding affinities toward VCAM-1. Among the isomers, α-tocotrienol displayed the strongest affinity toward E-selectin (-6.69 ± 0.00 kcal/mol) and ICAM-1 (-6.79 ± 0.00 kcal/mol), whereas β-tocotrienol exhibited the strongest affinity toward VCAM-1 (-7.59 ± 0.00 kcal/mol). In contrast, GNINA, which leverages deep learning, demonstrated a more accurate and consistent affinity profile, identifying β-tocotrienol as the most favorable binder toward both E-selectin (-6.91 ± 0.01 kcal/mol) and ICAM-1 (-7.08 ± 0.90 kcal/mol), with binding characterized by hydrogen bonding, hydrophobic interactions, and extensive van der Waals forces that are crucial for a lipid-soluble ligand. GNINA docking results for VCAM-1 were not generated due to structural limitations of the receptor model. Molecular dynamics (MD) simulations over 200 ns demonstrated a significant stabilizing interaction with GLU87, whereas the hydrogen bonding at ASP178 was found to be intermittent and contributory throughout the trajectory. This study provides the first comprehensive computational evidence differentiating the multi-target potency of tocotrienol isomers in targeting inflammatory and vascular-related pathways. Further experimental validation is warranted to confirm these in silico predictions and explore their biological significance.
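To put the reported docking scores on a more intuitive scale, the sketch below converts a binding free energy into an apparent dissociation constant using ΔG = RT ln(Kd). This is a rough reading aid rather than part of the study: docking scores are only approximate free energies, and the 298.15 K temperature and 1 M standard state are assumptions.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def dissociation_constant(delta_g_kcal_per_mol: float,
                          temperature_k: float = 298.15) -> float:
    """Apparent Kd in mol/L from delta_G = R * T * ln(Kd), 1 M standard state."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Scores quoted in the abstract above (kcal/mol).
reported_scores = {
    "Vina, alpha-tocotrienol / E-selectin": -6.69,
    "Vina, alpha-tocotrienol / ICAM-1": -6.79,
    "Vina, beta-tocotrienol / VCAM-1": -7.59,
    "GNINA, beta-tocotrienol / E-selectin": -6.91,
    "GNINA, beta-tocotrienol / ICAM-1": -7.08,
}

for label, delta_g in reported_scores.items():
    kd_micromolar = dissociation_constant(delta_g) * 1e6
    print(f"{label}: {delta_g:.2f} kcal/mol -> apparent Kd ~ {kd_micromolar:.1f} uM")
```

At room temperature each 1.36 kcal/mol corresponds to roughly a tenfold change in Kd, so differences of a few tenths of a kcal/mol between isomers translate into modest differences in predicted affinity.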
Heuristic multi-site optimization for protein sequence design using Masked Protein Language Models.
Protein sequence design for tailored functional properties is a fundamental task in protein engineering, with critical applications in drug discovery and therapeutic development. Efficient navigation of the combinatorial vastness of protein sequence space to identify functional variants remains a formidable challenge. Conventional approaches, which predominantly rely on template-based local search or single-residue mutagenesis, are constrained by their susceptibility to local optima and by the risk of destabilizing the native structure. In this study, we introduce ProtHMSO, a heuristic multi-site optimization framework leveraging masked protein language models (ProtLMs) for context-aware sequence exploration. ProtHMSO mimics natural evolutionary mechanisms by employing ProtLM-derived substitution probabilities to guide heuristic searches for synergistic mutations, thereby constraining combinatorial search spaces through evolutionary and biophysical priors. ProtHMSO is further applied to replace the exploration strategies in genetic algorithms (GAs) and Monte Carlo tree search (MCTS) to improve their convergence efficiency. Benchmark experiments demonstrate that protein sequences generated by ProtHMSO exhibit superior functional performance and closer alignment with the natural sequence distribution than those from state-of-the-art methods. These advancements highlight that ProtHMSO has strong potential and compatibility to accelerate functional protein discovery, offering a robust framework for efficient and context-aware exploration of protein sequence space.
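The general idea of proposing several substitutions at once, each drawn from model-derived probabilities, can be sketched as a simple hill climb. This is not the ProtHMSO algorithm, whose details are not given in the abstract. Both `toy_substitution_probs` (standing in for a masked protein language model) and `toy_fitness` (standing in for a functional score) are hypothetical placeholders, as is the starting sequence.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVW")


def toy_substitution_probs(sequence, position):
    """Hypothetical stand-in for a masked PLM: P(residue | masked position).

    It simply favours residues of the same class as the left neighbour.
    """
    neighbour = sequence[position - 1] if position > 0 else sequence[position]
    weights = {
        aa: 3.0 if (aa in HYDROPHOBIC) == (neighbour in HYDROPHOBIC) else 1.0
        for aa in AMINO_ACIDS
    }
    total = sum(weights.values())
    return {aa: weight / total for aa, weight in weights.items()}


def toy_fitness(sequence):
    """Hypothetical fitness oracle: fraction of hydrophobic residues."""
    return sum(aa in HYDROPHOBIC for aa in sequence) / len(sequence)


def propose_multi_site(sequence, n_sites, rng):
    """Mutate n_sites positions at once, sampling each from the model."""
    residues = list(sequence)
    for position in rng.sample(range(len(sequence)), n_sites):
        probs = toy_substitution_probs(sequence, position)
        residues[position] = rng.choices(list(probs), weights=list(probs.values()))[0]
    return "".join(residues)


def hill_climb(sequence, rounds=200, n_sites=2, seed=0):
    """Keep a multi-site proposal only when it improves the fitness score."""
    rng = random.Random(seed)
    best, best_score = sequence, toy_fitness(sequence)
    history = [best_score]
    for _ in range(rounds):
        candidate = propose_multi_site(best, n_sites, rng)
        score = toy_fitness(candidate)
        if score > best_score:
            best, best_score = candidate, score
        history.append(best_score)
    return best, history


start = "MKTAYIAKQR"  # made-up starting sequence
best_sequence, score_history = hill_climb(start)
print(start, "->", best_sequence, f"(score {score_history[-1]:.2f})")
```

Swapping the two toy functions for a real masked-model call and a real scoring function is where the substance lies; the loop itself stays the same.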
Integrating network pharmacology, molecular docking and experimental verification to explore the therapeutic effect of piceatannol on rheumatoid arthritis.
Piceatannol (PIC) exhibits antioxidant and anti-inflammatory activities. This study integrates network pharmacology and experimental validation to investigate its potential role in rheumatoid arthritis (RA). PIC targets were predicted using public databases. RA-related differentially expressed genes (DEGs) were identified from Gene Expression Omnibus (GEO) datasets (|logFC| ≥ 1 and P-value < 0.05). Intersection genes were analyzed via protein-protein interaction (PPI) network analysis (hub gene selection), molecular docking (binding affinity < -5.0 kcal/mol as threshold), ConnectivityMap, and molecular dynamics simulation. Experimental validation included CCK8, flow cytometry, real-time quantitative PCR (RT-qPCR), Western blotting, and an adjuvant-induced arthritis (AIA) rat model. Thirty-five intersecting genes were identified, from which 6 hub genes (SYK, CXCL8, TNF, NFKB1, PPARG, and CASP8) were selected. PIC showed stable binding to all hub targets (affinities: -5.6 to -7.8 kcal/mol). ConnectivityMap suggested a regulatory relationship between PIC and SYK. Molecular dynamics simulations demonstrated that the PIC-SYK complex maintains stable structural integrity. Experimental validation showed that PIC reduced MH7A cell viability, induced G2/M arrest and apoptosis, and downregulated mRNA levels of SYK, NFKB1, and CASP8, consistent with predictions. In vivo, PIC alleviated AIA severity. These preliminary findings suggest that PIC exerts therapeutic effects in RA models, potentially via SYK/NFKB1/CASP8. The study provides a theoretical basis for further evaluation of PIC in RA, while acknowledging the exploratory nature of network pharmacology and preclinical models.
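The filtering logic in this abstract (DEG cutoffs, intersection with predicted compound targets, then a docking threshold) is easy to express in a few lines. The sketch below uses the thresholds stated above, but the gene labels, statistics, and affinities are toy values invented for illustration, not data from the study.

```python
def select_degs(expression_stats, min_abs_logfc=1.0, max_p_value=0.05):
    """Keep genes with |logFC| >= min_abs_logfc and P-value < max_p_value."""
    return {
        gene
        for gene, (log_fc, p_value) in expression_stats.items()
        if abs(log_fc) >= min_abs_logfc and p_value < max_p_value
    }


def passes_docking(affinity_kcal_per_mol, threshold=-5.0):
    """More negative than the threshold counts as a hit."""
    return affinity_kcal_per_mol < threshold


# Toy inputs (hypothetical): gene -> (logFC, P-value).
toy_stats = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (-1.2, 0.03),
    "GENE_C": (0.4, 0.0001),   # fails the fold-change cutoff
    "GENE_D": (2.5, 0.20),     # fails the P-value cutoff
    "GENE_E": (-1.0, 0.049),   # |logFC| exactly 1 passes (>=)
}
toy_compound_targets = {"GENE_A", "GENE_C", "GENE_E", "GENE_F"}
toy_affinities = {"GENE_A": -6.3, "GENE_E": -4.7}  # kcal/mol

degs = select_degs(toy_stats)
candidates = sorted(degs & toy_compound_targets)
docked = [gene for gene in candidates if passes_docking(toy_affinities[gene])]
print("DEGs:", sorted(degs))
print("Intersection with targets:", candidates)
print("Pass docking threshold:", docked)
```

Note that the fold-change cutoff is inclusive while the P-value and docking cutoffs are strict, matching the inequalities as written in the abstract.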
Research & AI Updates
- AGI is coming within four years and could start a new human era, says Google DeepMind CEO - India Today
- Opinion: How AI is already improving lives - East Bay Times
- Vertu’s Alphafold foldable phone has launched, and you’ve got to see its obnoxious price - MSN
From the Industry
- Parabilis sets a record with a $670M biotech IPO - BioPharma Dive
- How the Chai Discovery–Eli Lilly Collaboration Could Advance the Future of AI-Driven Biologics Development - Pharmacy Times
- Applied Biologics Presents Interim CAMPX Clinical Trial - GlobeNewswire
- Parabilis Medicines sets biotech record with $670m IPO - Yahoo Finance
- Update: Parabilis raises industry-record $670M IPO - BioSpace
- Cancer-focused Parabilis’ upsized $670M IPO breaks new record for biotechs - Fierce Biotech
- Parabilis Medicines targets $2.3 billion valuation in upsized US IPO - Reuters
Quick Reads
Exploring potential targets and mechanisms of male reproductive toxicity induced by the emerging PFAS GenX and F-53B via network toxicology, molecular docking, and in vivo validation.
Epidemiological studies have reported a progressive decline in parameters of male reproductive health.
Evaluation of phytochemical components, cytotoxicity, and molecular docking of Anacyclus pyrethrum and Commiphora myrrha for formulating a herbal topical anaesthetic gel development.
Synthetic intraoral topical anesthetics, such as lignocaine and benzocaine, can cause adverse effects in pediatric dentistry, creating a need for safer plant-based alternatives.
HIPIF: Hierarchical Planning and Information Folding for Long-Horizon LLM Agent Learning
While Large Language Models (LLMs) have demonstrated strong capabilities as autonomous agents across a wide range of tasks, their performance often degrades in multi-turn long-horizon agentic tasks.
Identification of ursolic acid from Wumei as a syk-targeting anti-allergic agent using a piezoresistive cantilever biosensor.
Wumei (WM), a historical food and medicine homology fruit in China, is reported to have anti-allergic effects, yet its active components and mechanisms remain unclear.
Computational prediction of potential aggravating mechanisms of polyethylene terephthalate microplastics in diabetic foot ulcers: An integrated in silico approach combining network toxicology, bioinformatics, machine learning, and molecular dynamics simulations.
The increasing incidence of diabetic foot ulcer (DFU) and growing recognition of environmental pollutants have highlighted polyethylene terephthalate microplastics (PET-MP) as a potential metabolic disease trigger.
Network Pharmacological Analysis of Lentinan in Regulating Intervertebral Disc Degeneration: Combined with Machine Learning-Based Screening and Molecular Dynamics Validation.
Lentinan (LNT) is a polysaccharide with antioxidant and anti-inflammatory properties; however, its link to intervertebral disc degeneration (IVDD) remains unclear.
Quantum computing applications in drug discovery.
In early drug discovery, virtual screening based on deep learning, virtual screening based on molecular docking, and molecular dynamics are three widely used computational strategies, but they always face a trade-off between throughput, search stability, and physical fidelity.
Pipeline Tip
Check for missing residues in PDB files using PDBFixer before simulation.
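As a quick pre-check before reaching for PDBFixer, jumps in ATOM residue numbering can be spotted with the standard library alone. This sketch is not PDBFixer and is far less thorough (PDBFixer's `findMissingResidues()` is the tool for the real job). It assumes fixed-column PDB formatting, ignores insertion codes, and the tiny structure it parses is generated by a hypothetical helper purely for demonstration. A numbering jump is only a hint, since some numbering schemes skip values on purpose.

```python
def find_numbering_gaps(pdb_text):
    """Return {chain_id: [(last_seen, next_seen), ...]} for numbering jumps.

    Reads fixed-column ATOM records: chain ID in column 22 and the residue
    sequence number in columns 23-26 (1-based), per the PDB file format.
    """
    gaps = {}
    last_seen = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21]
        res_seq = int(line[22:26])
        previous = last_seen.get(chain_id)
        if previous is not None and res_seq - previous > 1:
            gaps.setdefault(chain_id, []).append((previous, res_seq))
        last_seen[chain_id] = res_seq
    return gaps


def toy_atom_line(serial, res_name, chain_id, res_seq):
    """Hypothetical helper: one CA-only ATOM record with dummy coordinates."""
    return (
        f"ATOM  {serial:>5}  CA  {res_name} {chain_id}{res_seq:>4}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{0.0:6.2f}           C"
    )


# Made-up structure: chain A is missing residues 3-4, chain B is contiguous.
toy_pdb = "\n".join(
    [toy_atom_line(i + 1, "ALA", "A", n) for i, n in enumerate([1, 2, 5, 6])]
    + [toy_atom_line(i + 5, "GLY", "B", n) for i, n in enumerate([1, 2])]
)

numbering_gaps = find_numbering_gaps(toy_pdb)
print(numbering_gaps)
```

If this reports gaps, that is the cue to run the structure through PDBFixer and decide whether to model the missing residues or cap the chain breaks before simulation.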
Resources & Tools
- Dataset: CATH - Hierarchical protein domain classification for structure and function.
- Dataset: SCOPe - Curated structural classification of proteins for fold analysis.
- Tool: FunFOLD5 - Automated system for protein ligand-binding site prediction and function annotation.
- Tool: MultiFOLD/IntFOLD - High-performance protein structure prediction and quality assessment server.
- Event: Protein Design Hub (LinkedIn Group) (Ongoing)
- Event: Structural Biology Events (Open)
- Job: Amgen hiring Senior Scientist - Experimental Protein Design in San Francisco Bay Area - LinkedIn at Bioinformatics Careers
- Job: Labcorp hiring Lead Bioinformatics Scientist - Sample to Answer in San Diego, CA - LinkedIn at Bioinformatics Careers
The protein structure is the language of life; design is its poetry. — Recep Adiyaman