Issue #15: SMARTDock: A Toolkit for the Automated Development of Target-Specific Scoring Functions Using Bioactivity Data.
Protein Design Digest - 2026-01-07 - Discovery of PPARγ Partial Agonists for Treatment of Type 2 Diabetes Based on an Integrated Virtual Screening Strategy that Combines Fragment Molecular Orbital Calculations, Machine Learning, Molecular Docking, Interaction Fingerprint Filtering, and Molecular Dynamics Simulations.

Daily curated signals from arXiv, PubMed, and bioRxiv.
Signal of the Day
SMARTDock: A Toolkit for the Automated Development of Target-Specific Scoring Functions Using Bioactivity Data.
Molecular docking has become an essential tool in the early stages of structure-based drug discovery, enabling rapid virtual screening of large compound libraries against biological targets. However, the accuracy of binder selection is often limited by the available scoring functions. Here, we present a novel workflow, SMARTDock (Scoring with Machine learning and Activity for Ranking Targeted Docking), that enhances the virtual screening capabilities of GOLD docking by integrating publicly available bioactivity data, a protein-ligand interaction fingerprint (PADIF), and machine learning classification models within a user-friendly Docker environment. This platform-independent approach runs on different operating systems and is accessible to both computational and medicinal chemists. With only a ChEMBL target ID, a protein structure file, and a SMILES list of test compounds, users can build and apply target-specific scoring models to improve the enrichment of active compounds in the top ranks. SMARTDock implements the PADIF-based ML methodology to assist in virtual screening; previous validation of this underlying methodology demonstrated its capacity to enhance screening performance across multiple targets. Finally, we show the advantages and disadvantages of bioactivity classification in virtual screening tasks.
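The abstract's goal of "enrichment of active compounds in the top ranks" is usually quantified with an enrichment factor. The snippet below is a minimal sketch of that metric only, not SMARTDock code: the function name, the toy scores, and the activity labels are all made up for illustration, and the 20% cutoff is an arbitrary choice for a ten-compound example.

```python
def enrichment_factor(scores, labels, top_fraction=0.01):
    """Enrichment factor: hit rate in the top-ranked fraction divided by
    the hit rate in the whole library. Higher scores are ranked first.

    scores -- one model score per compound (hypothetical values here)
    labels -- 1 for a known active, 0 for an inactive/decoy
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")
    # Always inspect at least one compound, even for tiny libraries.
    n_top = max(1, round(n_total * top_fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy library: ten compounds, two of them active (indices 0 and 3).
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical scores from a generic and a target-specific model.
generic_scores = [0.95, 0.80, 0.20, 0.30, 0.85, 0.15, 0.25, 0.05, 0.35, 0.40]
target_specific_scores = [0.95, 0.10, 0.20, 0.90, 0.30, 0.15, 0.25, 0.05, 0.35, 0.40]

ef_generic = enrichment_factor(generic_scores, labels, top_fraction=0.2)
ef_specific = enrichment_factor(target_specific_scores, labels, top_fraction=0.2)
print(f"EF(20%) generic: {ef_generic:.2f}, target-specific: {ef_specific:.2f}")
```

An enrichment factor of 1 means the ranking is no better than random selection, so comparing the two values shows how much a rescoring model moves actives toward the top of the list.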
Also Worth Reading
AlphaFold for Docking Screens.
AlphaFold is an AI system developed by Google DeepMind to generate three-dimensional structures of proteins without experimental data. The models created with AlphaFold are available on the AlphaFold Protein Structure Database (AlphaFoldDB) (https://alphafold.ebi.ac.uk/). The AlphaFold database is searchable by sequence and protein identification. This chapter focuses on an AlphaFold model and its use for docking screens using Molegro Virtual Docker. We rely on Jupyter Notebooks to integrate docking simulations and build regression models based on the atomic coordinates of protein-pose complexes. Our study focuses on constructing a neural network regression model to predict the inhibition of cyclin-dependent kinase 19 (CDK19). This enzyme is a target for anticancer drugs and does not have experimental data for its atomic coordinates. We utilize the Molegro Data Modeller to construct a regression model based on docking results of inhibitors for which binding affinity data is available. All CDK19 datasets and Jupyter Notebooks discussed in this work are available at GitHub: https://github.com/azevedolab/docking#readme.
Geometric deep learning assists protein engineering. Opportunities and Challenges.
Protein engineering is experiencing a paradigmatic transformation through the integration of geometric deep learning (GDL) into computational design workflows. While traditional approaches such as rational design and directed evolution have achieved significant progress, they remain constrained by the vastness of sequence space and the cost of experimental validation. GDL overcomes these limitations by operating on non-Euclidean domains and by capturing the spatial, topological, and physicochemical features that govern protein function. This perspective provides a comprehensive and critical overview of GDL applications in stability prediction, functional annotation, molecular interaction modeling, and de novo protein design. It consolidates methodological principles, architectural diversity, and performance trends across representative studies, emphasizing how GDL enhances interpretability and generalization in protein science. Aimed at both computational method developers and experimental protein engineers, the review bridges algorithmic concepts with practical design considerations, offering guidance on data representation, model selection, and evaluation strategies. By integrating explainable artificial intelligence and structure-based validation within a unified conceptual framework, this work highlights how GDL can serve as a foundation for transparent, interpretable, and autonomous protein design. As GDL converges with generative modeling, molecular simulation, and high-throughput experimentation, it is poised to become a cornerstone technology for next-generation protein engineering and synthetic biology.
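For readers new to the "non-Euclidean domains" mentioned above, one common way to hand a structure to a graph-based model is a residue contact graph. The snippet below is a minimal sketch under stated assumptions, not the representation used in the review: the function name, the 8 Å cutoff, and the toy Cα coordinates are all illustrative choices.

```python
from math import dist


def contact_graph(ca_coords, cutoff=8.0):
    """Build an undirected residue contact graph as an adjacency list.

    ca_coords -- one (x, y, z) tuple per residue, in Angstroms
                 (here: invented C-alpha positions)
    cutoff    -- two residues are connected if their distance is <= cutoff
    """
    n = len(ca_coords)
    graph = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(ca_coords[i], ca_coords[j]) <= cutoff:
                # Store the edge in both directions (undirected graph).
                graph[i].append(j)
                graph[j].append(i)
    return graph


# Toy example: four residues spaced 3.8 Angstroms apart along the x-axis.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_graph = contact_graph(toy_coords)
for residue, neighbours in toy_graph.items():
    print(residue, neighbours)
```

Node features (residue type, physicochemical properties) and edge features (distances, orientations) would be attached to this skeleton before it is passed to a graph neural network.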
Modeling Protein-Protein Complexes by Combining pyDock and AlphaFold.
The lack of experimental structures for the majority of protein-protein complexes has motivated the development of a variety of strategies for the structural modeling of protein complexes, such as computational docking, which has been in active development for decades, and the more recent, ground-breaking artificial intelligence (AI)-based methodologies. Among the existing computational docking methods, Python docking (pyDock) has shown competitive predictive rates and high robustness over the years. However, the field has changed dramatically with the appearance of AI-based methods such as AlphaFold. While structure prediction of individual proteins is virtually solved by this program, the focus is now on how to improve the prediction of challenging cases like antibody-antigen complexes, multiprotein complexes, weak interactions, or highly flexible interacting proteins. Successful strategies are based on the generation of more diverse sets of models and the integration with other “classical” approaches that facilitate the identification of the correct models. Here, we will show in practical terms how to combine the structural modeling capabilities of AlphaFold with the energy-based scoring function in pyDock to improve structural predictions in challenging protein-protein complexes.
Research & AI Updates
- Monte Rosa Therapeutics to Present Interim MRT-8102 Phase 1 Study Results - The Manila Times
From the Industry
- Evidence Supports Safe, Effective Switching to Etanercept Biosimilars - Center for Biosimilars
- ProBioGen and Zag Bio™ Forge Strategic CMC Partnership to Advance Fc-Fusion Autoimmune Therapy - Biotech Newswire
- The new science and business of oral biologics - The Pharma Letter
- Our Human-Centric Approach to Partnership Amidst an Evolving Biotech Landscape - Sanofi
- French biotech TheraVectys weighs Hong Kong IPO - Bloomberg - Investing.com
- Piper Sandler: Biotech Funding Seems To Be Recovering (NYSE:PIPR) - Seeking Alpha
- Aktis aims for $209M windfall from 1st biotech IPO of 2026 - Fierce Biotech
Quick Reads
Exploring the Anti-Inflammatory Molecular Mechanism of Gentiana szechenyii Kanitz. Based on UPLC-MS/MS Combined With Network Pharmacology, Molecular Docking, and Molecular Dynamics Simulation.
This study explored the anti-inflammatory mechanisms of Gentiana szechenyii Kanitz.
Targeting spermidine synthase in Leishmania donovani: molecular docking and molecular dynamics simulation-based evaluation of Indian medicinal plant phytochemicals.
Visceral leishmaniasis, caused by Leishmania donovani, remains a critical global health challenge due to limited, toxic, and costly treatment options and rising drug resistance.
Convolutional neural network-assisted screening of natural product inhibitors against Naja naja venom: insights from molecular docking, molecular dynamics simulations and ADMET profiling.
Snakebite envenomation remains a major public health problem, particularly in tropical regions such as India, where Naja naja is the leading cause of snakebite-related death and disease.
Establishing FDA-approved oncology drugs as GPR176 inhibitor through homology modelling, molecular docking, MMGBSA, DFT, and molecular dynamics simulation.
Unraveling the mechanism of curcumin in coronary slow flow phenomenon through network pharmacology and molecular docking.
The coronary slow flow phenomenon (CSFP) is associated with an increased risk of adverse cardiovascular events, yet standardized treatment is lacking.
Molecular docking and dynamic simulation of Escherichia coli K-12 Elements as a Biosensor for Detecting 2,4,6-Trinitrotoluene (TNT).
Trinitrotoluene (TNT) is widely used in military and industrial fields due to its strong explosive properties and chemical stability.
Assessing the validity of leucine zipper constructs predicted by AlphaFold.
AP-1 transcription factors are a network of cellular regulators that combine in different dimer pairs to control a range of pathways involved in differentiation, growth, and cell death.
Pipeline Tip
Normalise thermal B-factors when comparing different crystal structures.
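One simple way to follow this tip is a per-structure z-score, which removes differences in overall B-factor scale between crystals. The snippet below is a minimal sketch, assuming the B-factors have already been extracted (for example, one value per Cα atom); the function name and the numbers are hypothetical, and the population standard deviation is a choice, not a requirement.

```python
from statistics import mean, pstdev


def normalise_b_factors(b_factors):
    """Return z-scored B-factors: (B - mean) / standard deviation.

    b_factors -- raw B-factors from ONE structure (e.g. one per C-alpha atom)
    """
    if len(b_factors) < 2:
        raise ValueError("need at least two B-factors to normalise")
    mu = mean(b_factors)
    sigma = pstdev(b_factors)  # population standard deviation
    if sigma == 0:
        raise ValueError("B-factors are constant; z-scores are undefined")
    return [(b - mu) / sigma for b in b_factors]


# Invented values: the second structure has the same relative flexibility
# profile, but every B-factor is twice as large (e.g. lower resolution).
structure_a = [10.0, 20.0, 30.0, 40.0]
structure_b = [20.0, 40.0, 60.0, 80.0]

z_a = normalise_b_factors(structure_a)
z_b = normalise_b_factors(structure_b)
for za, zb in zip(z_a, z_b):
    print(f"{za:+.3f}  {zb:+.3f}")
```

After normalisation the two profiles can be compared residue by residue, since each value now expresses flexibility relative to the rest of its own structure.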
Resources & Tools
- Dataset: PDB-REDO - Optimized protein structure database with refined models.
- Dataset: CATH - Hierarchical protein domain classification for structure and function.
- Tool: MultiFOLD/IntFOLD - High-performance protein structure prediction and quality assessment server.
- Tool: PyMOL - Gold standard for molecular visualization and publication-quality imaging.
- Event: Structural Biology Events (Open)
- Event: Protein Design Hub (LinkedIn Group) (Ongoing)
- Job: Mercor hiring Bioinformatics Data-Science Specialist in Greater Montreal Metropolitan Area - LinkedIn at Bioinformatics Careers
- Job: European Bioinformatics Institute | EMBL-EBI hiring Research Management Office Lead in England, United Kingdom - LinkedIn at Bioinformatics Careers
Deep learning is not a magic wand, but a powerful lens for structural biology. — Recep Adiyaman