Issue #79: TurboESM: Ultra-Efficient 3-Bit KV Cache Quantization for Protein Language Models with Orthogonal Rotation and QJL Correction
Protein Design Digest - 2026-03-31 - Enhancing CYP450-Ligand Binding Predictions: A Comparative Analysis of Ligand-Based and Hybrid Machine Learning Models.

Daily curated signals from arXiv, PubMed, and bioRxiv.
Signal of the Day
TurboESM: Ultra-Efficient 3-Bit KV Cache Quantization for Protein Language Models with Orthogonal Rotation and QJL Correction
The rapid scaling of Protein Language Models (PLMs) has unlocked unprecedented accuracy in protein structure prediction and design, but the quadratic memory growth of the Key-Value (KV) cache during inference remains a prohibitive barrier for single-GPU deployment and high-throughput generation. While 8-bit quantization is now standard, 3-bit quantization remains elusive due to severe numerical outliers in activations. This paper presents TurboESM, an adaptation of Google’s TurboQuant to the PLM domain. We solve the fundamental incompatibility between Rotary Position Embeddings (RoPE) and orthogonal transformations by deriving a RoPE-first rotation pipeline. We introduce a head-wise SVD calibration method tailored to the amino acid activation manifold, a dual look-up table (LUT) strategy for asymmetric K/V distributions, and a 1-bit Quantized Johnson-Lindenstrauss (QJL) residual correction. All experiments are conducted on ESM-2 650M, where our implementation achieves a 7.1x memory reduction (330 MB to 47 MB) while maintaining cosine similarity > 0.96 in autoregressive decoding across diverse protein families, including short peptides, transmembrane helices, enzyme active site fragments, and intrinsically disordered regions. We further implement a Triton-based fused decode attention kernel that eliminates intermediate dequantization memory allocations, achieving a 1.96x speedup over the PyTorch two-step path for the KV fetch operation alone; however, TurboESM incurs a prefill overhead of 21-27 ms relative to the original model due to KV quantization and packing, making it most suitable for memory-bound scenarios rather than latency-critical short-sequence workloads. Analysis reveals that PLMs exhibit sharper outlier profiles than large language models (LLMs) due to amino acid vocabulary sparsity, and our method effectively addresses these distributions.
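The abstract's point about applying RoPE before the orthogonal rotation can be illustrated with a small, self-contained sketch. It is not the TurboESM implementation: a normalized Walsh-Hadamard transform stands in for the paper's SVD-calibrated rotation, the interleaved-pair RoPE convention is one common choice, and the vectors, positions, and outlier value are made up for illustration. The sketch checks three things: rotating after RoPE leaves the attention logit unchanged, rotating before RoPE does not, and the rotation spreads a single outlier channel across all dimensions, which is what makes low-bit quantization easier.

```python
import math
import random


def rope(vec, pos, base=10000.0):
    """Rotary position embedding on interleaved pairs (one common convention)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def hadamard(vec):
    """Orthonormal Walsh-Hadamard transform; length must be a power of two.

    Used here only as a stand-in for a calibrated orthogonal rotation.
    """
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


random.seed(0)
d = 64  # hypothetical head dimension
q = [random.gauss(0, 1) for _ in range(d)]
k = [random.gauss(0, 1) for _ in range(d)]
k[5] = 40.0  # synthetic outlier channel, purely illustrative

# RoPE first, then the same orthogonal rotation on both query and key.
q_r, k_r = rope(q, pos=7), rope(k, pos=3)
logit_plain = dot(q_r, k_r)
logit_rotated = dot(hadamard(q_r), hadamard(k_r))

# Rotation first, then RoPE: the two operations do not commute.
logit_swapped = dot(rope(hadamard(q), pos=7), rope(hadamard(k), pos=3))

# Largest absolute key entry before and after the rotation.
peak_before = max(abs(v) for v in k_r)
peak_after = max(abs(v) for v in hadamard(k_r))

print(f"logit (RoPE only):          {logit_plain:.6f}")
print(f"logit (RoPE then rotation): {logit_rotated:.6f}")
print(f"logit (rotation then RoPE): {logit_swapped:.6f}")
print(f"peak |k| before / after:    {peak_before:.2f} / {peak_after:.2f}")
```

Because an orthogonal matrix preserves inner products, the first two logits agree up to floating-point error, while the swapped order generally gives a different value. The smaller peak after rotation means a 3-bit grid has to cover a narrower range.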
Also Worth Reading
Comprehensive Molecular Docking and Molecular Dynamics Reveal Inhibitors of HER2 L755S, T798I, and T798M based on a Large Database of Curcumin Derivatives.
Objective: This study presents a methodology employing virtual screening to identify curcumin derivatives with selective affinity for the HER2 mutations L755S, T798I, and T798M. Methods: Curcumin derivatives were retrieved from the ChEMBL database and filtered using KNIME. HER2 mutations were modeled in silico using MOE software with PDB ID 3RCD. Molecular docking and dynamics simulations were conducted to screen high-affinity compounds and evaluate binding interactions. Results: From 505 curcumin derivatives, the RDKit module implemented in KNIME successfully filtered 317 compounds. Subsequent molecular docking against wild-type HER2 identified 100 curcumin derivatives with low docking scores, among which the top 20 compounds exhibited better binding affinities than Lapatinib. Further molecular docking screening against the three HER2 mutations identified five lead compounds with the lowest docking scores. Molecular docking and molecular dynamics simulation revealed critical binding interactions with residues essential for kinase domain stability. Chemical structural analysis revealed key modifications, such as geranyl and tripeptide modifications. CHEMBL3758656 and CHEMBL3827366, two curcumin derivatives, demonstrated consistent binding across HER2 mutations and a favorable ADMET profile. Conclusion: This study successfully identified CHEMBL3758656 and CHEMBL3827366 as promising HER2 inhibitors through comprehensive virtual screening. Their high binding affinity against L755S, T798I, and T798M mutations and favorable ADME and toxicity properties underscore their potential as alternative therapeutics for HER2-positive breast cancer.
Generative Modeling in Protein Design: Neural Representations, Conditional Generation, and Evaluation Standards
Generative modeling has become a central paradigm in protein research, extending machine learning beyond structure prediction toward sequence design, backbone generation, inverse folding, and biomolecular interaction modeling. However, the literature remains fragmented across representations, model classes, and task formulations, making it difficult to compare methods or identify appropriate evaluation standards. This survey provides a systematic synthesis of generative AI in protein research, organized around (i) foundational representations spanning sequence, geometric, and multimodal encodings; (ii) generative architectures including $\mathrm{SE}(3)$-equivariant diffusion, flow matching, and hybrid predictor-generator systems; and (iii) task settings from structure prediction and de novo design to protein-ligand and protein-protein interactions. Beyond cataloging methods, we compare assumptions, conditioning mechanisms, and controllability, and we synthesize evaluation best practices that emphasize leakage-aware splits, physical validity checks, and function-oriented benchmarks. We conclude with critical open challenges: modeling conformational dynamics and intrinsically disordered regions, scaling to large assemblies while maintaining efficiency, and developing robust safety frameworks for dual-use biosecurity risks. By unifying architectural advances with practical evaluation standards and responsible development considerations, this survey aims to accelerate the transition from predictive modeling to reliable, function-driven protein engineering.
Investigation of in vitro anticancer and antioxidant activities of various extracts of Bayramiç Beyazı nectarine, and molecular docking, molecular dynamics simulation, and protein-protein interaction network
Nectarine (Prunus persica var. nucipersica), due to its high phenolic content and antioxidant properties, holds significance for human health. The aim of this study was to evaluate the in vitro anticancer and antioxidant effects of the extracts obtained from the fruit and kernel of “Bayramiç Beyazı” nectarine, a geographically indicated fruit grown in Bayramiç district of Çanakkale. The anticancer effects of the methanol and aqueous ethanol extracts were evaluated on breast and colon cancer cell lines. Apoptotic fragmentation and mitochondrial membrane potential of fruit and kernel extracts were examined under fluorescence microscopy. Antioxidant activity and phenolic content were determined using DPPH, ABTS, and Folin-Ciocalteu (F-C) methods, respectively. Kernel extract has the highest antioxidant activity (DPPH IC₅₀= 0.15 ± 0.001 mg/mL). The fruit methanol, aqueous ethanol, and kernel aqueous ethanol extracts significantly reduced the fluorescent intensity of the cells. A combination study was conducted between the extracts and doxorubicin. Molecular docking and molecular dynamics (MD) simulation studies of some of the identified components were performed using the Glide/SP and Desmond against a drug target PRK1. The highest binding affinity with quercetin for targeting PRK1 was calculated as -8.789 kcal/mol. The average RMSD values were calculated between 3.43 ± 0.31 and 2.22 ± 0.30 Å throughout 500 ns MD simulations. A protein-protein interaction network analysis was performed for PRK1 using a systems biology approach to identify the highest scoring predicted proteins such as RHOA, MAP2K3, and MEFV. The investigation of the in vitro anticancer effects of “Bayramiç Beyazı” extracts and combined in silico analyses were carried out for the first time, and the outcomes of this study have promising potential for future studies.
Research & AI Updates
- Identifying the limits of protein evolution - EurekAlert!
- Zuckerberg lost DeepMind over dinner test - MSN
- How an Australian man used artificial intelligence to fight dog’s cancer - Business Standard
- One man, his dog, and ChatGPT: Australia’s AI vaccine saga - France 24
From the Industry
- Sen. McCormick tours Pittsburgh biotech facilities to highlight growth through NIH funding - Pittsburgh Post-Gazette
- IPO Tracker 2026: Biotech Mega-Rounder Kailera Seeks IPO - BioSpace
- Pharmalittle: We’re reading about Trump pushing drug pricing policy, Chinese biotech licensing, and more - statnews.com
- Obesity-focused biotech Kailera joins IPO queue in US - Pharmaphorum
- Insilico Medicine and Tenacia Biotech expand AI-driven CNS collaboration worth $94.75 M - BioSpectrum Asia
- Arzeda and MANE Join Forces to Bring a Better Stevia at Scale - SynBioBeta
- BIO-Europe Spring 2026: partnership event brings funding hope to biotechs - Labiotech.eu
Quick Reads
The Effect of Viniferin on Liver Cancer: Research Based on Network Pharmacology, Molecular Docking and Molecular Dynamics Simulation.
Background/Objectives: Hepatocellular carcinoma (HCC) is a primary malignancy often driven by metabolic syndrome, fatty liver disease, and chronic hepatitis.
Next-generation Janus kinase inhibitors: Integrating synthetic innovation, structural biology, and computational design for precision drug discovery.
Janus kinase (JAK) dysregulation plays a central role in the pathogenesis of inflammatory, autoimmune, and malignant disorders, making the JAK family an essential therapeutic target across multiple disease domains.
Integrated DFT, molecular docking, and molecular dynamics investigation of some novel 2-thiohydantoin analogues as potent CDK2 inhibitors for anticancer therapy.
Cancer progression is driven by dysregulation of cyclin-dependent kinase 2 (CDK2), a critical cell cycle regulator.
Investigating the impact of aspartame on Alzheimer’s disease through network toxicology and molecular docking.
Introduction: Alzheimer’s disease (AD) is a prevalent neurodegenerative disorder, and the relationship between its pathogenesis and environmental factors has garnered increasing scholarly interest.
Integrative molecular simulations reveal NeuroAid II mechanisms in ischemic stroke through network pharmacology, molecular dynamics, and pharmacophore modeling.
Ischemic stroke remains a major health challenge with limited treatment options.
Assessment of miRNAs as transcriptional regulators in respiratory syncytial virus infection through computational analysis and molecular docking studies.
Globally, RSV is a major contributor to severe lower respiratory tract infections among children.
Exploring the Toxicological Effects of Acetyl Tributyl Citrate Exposure on Osteoarthritis Based on Machine Learning, Network Toxicology and Molecular Docking Analysis.
This study investigates the potential toxicological effects of acetyl tributyl citrate (ATBC) on osteoarthritis (OA) and elucidates the underlying mechanisms using bioinformatics, machine learning, and network toxicology.
Pipeline Tip
Index your BigWig files before visualization to save memory.
Resources & Tools
- Dataset: UniProt Knowledgebase - The world’s most comprehensive resource for protein sequence and annotation.
- Dataset: PDB-REDO - Optimized protein structure database with refined models.
- Tool: RFdiffusion - State-of-the-art generative model for de novo protein design.
- Tool: ProteinMPNN - High-speed sequence design optimized for fixed-backbone folding.
- Event: Protein Design Hub (LinkedIn Group) (Ongoing)
- Event: Structural Biology Events (Open)
The protein structure is the language of life; design is its poetry. — Recep Adiyaman