Recep Adiyaman
Daily Signal January 07, 2026 · 7 min read

Issue #15: SMARTDock: A Toolkit for the Automated Development of Target-Specific Scoring Functions Using Bioactivity Data.

Protein Design Digest - 2026-01-07 - Discovery of PPARγ Partial Agonists for Treatment of Type 2 Diabetes Based on an Integrated Virtual Screening Strategy that Combines Fragment Molecular Orbital Calculations, Machine Learning, Molecular Docking, Interaction Fingerprint Filtering, and Molecular Dynamics Simulations.

Protein Design Daily

Daily curated signals from arXiv, PubMed, and BioRxiv.

Signal of the Day

SMARTDock: A Toolkit for the Automated Development of Target-Specific Scoring Functions Using Bioactivity Data.

Molecular docking has become an essential tool in the early stages of structure-based drug discovery, enabling rapid virtual screening of large compound libraries against biological targets. However, the accuracy of binder selection is often limited by the available scoring functions. Here, we present a novel workflow, SMARTDock (Scoring with Machine learning and Activity for Ranking Targeted Docking), that enhances the virtual screening capabilities of GOLD docking by integrating publicly available bioactivity data, a protein-ligand interaction fingerprint (PADIF), and machine learning classification models within a user-friendly Docker environment. This platform-independent approach runs on different operating systems and is accessible to both computational and medicinal chemists. With only a ChEMBL target ID, a protein structure file, and a SMILES list of test compounds, users can build and apply target-specific scoring models to improve the enrichment of active compounds in the top ranks. SMARTDock implements the PADIF-based ML methodology to assist in virtual screening. Previous validation of this underlying methodology demonstrated its capacity to enhance screening performance across multiple targets. Finally, we show the advantages and disadvantages of bioactivity classification in virtual screening tasks.
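The abstract's goal of improving "the enrichment of active compounds in the top ranks" is commonly quantified with an enrichment factor. The following is a minimal sketch of that metric, not part of SMARTDock itself: the function name `enrichment_factor`, the convention that a higher score means "more likely active", and the toy numbers in the usage note are all assumptions made for illustration.

```python
def enrichment_factor(scores, labels, top_fraction=0.01):
    """Enrichment factor of a ranked screening list.

    EF = (actives in top x% / compounds in top x%)
         / (actives in library / compounds in library)

    scores: one number per compound; higher is assumed to mean "more likely active".
    labels: 1 for a known active, 0 for an inactive or decoy.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equally long")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is needed to compute enrichment")

    # Size of the top-ranked slice; always inspect at least one compound.
    n_top = max(1, round(n_total * top_fraction))

    # Rank compounds from best to worst score and count actives in the slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])

    return (hits / n_top) / (n_actives / n_total)
```

As a toy check, a library of 10 compounds with 2 actives, both ranked in the top 20%, gives an enrichment factor of 5; a value of 1 corresponds to random selection.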

Why this matters:


Also Worth Reading

AlphaFold for Docking Screens.

AlphaFold is an AI system developed by Google DeepMind to generate three-dimensional structures of proteins without experimental data. The models created with AlphaFold are available on the AlphaFold Protein Structure Database (AlphaFoldDB) (https://alphafold.ebi.ac.uk/). The AlphaFold database is searchable by sequence and protein identification. This chapter focuses on an AlphaFold model and its use for docking screens using Molegro Virtual Docker. We rely on Jupyter Notebooks to integrate docking simulations and build regression models based on the atomic coordinates of protein-pose complexes. Our study focuses on constructing a neural network regression model to predict the inhibition of cyclin-dependent kinase 19 (CDK19). This enzyme is a target for anticancer drugs and does not have experimental data for its atomic coordinates. We utilize the Molegro Data Modeller to construct a regression model based on docking results of inhibitors for which binding affinity data is available. All CDK19 datasets and Jupyter Notebooks discussed in this work are available at GitHub: https://github.com/azevedolab/docking#readme.

Geometric deep learning assists protein engineering. Opportunities and Challenges.

Protein engineering is experiencing a paradigmatic transformation through the integration of geometric deep learning (GDL) into computational design workflows. While traditional approaches such as rational design and directed evolution have achieved significant progress, they remain constrained by the vastness of sequence space and the cost of experimental validation. GDL overcomes these limitations by operating on non-Euclidean domains and by capturing the spatial, topological, and physicochemical features that govern protein function. This perspective provides a comprehensive and critical overview of GDL applications in stability prediction, functional annotation, molecular interaction modeling, and de novo protein design. It consolidates methodological principles, architectural diversity, and performance trends across representative studies, emphasizing how GDL enhances interpretability and generalization in protein science. Aimed at both computational method developers and experimental protein engineers, the review bridges algorithmic concepts with practical design considerations, offering guidance on data representation, model selection, and evaluation strategies. By integrating explainable artificial intelligence and structure-based validation within a unified conceptual framework, this work highlights how GDL can serve as a foundation for transparent, interpretable, and autonomous protein design. As GDL converges with generative modeling, molecular simulation, and high-throughput experimentation, it is poised to become a cornerstone technology for next-generation protein engineering and synthetic biology.

Modeling Protein-Protein Complexes by Combining pyDock and AlphaFold.

The lack of experimental structures for the majority of protein-protein complexes has motivated the development of a variety of strategies for the structural modeling of protein complexes, such as computational docking, which has been in active development for decades, and the more recent, ground-breaking artificial intelligence (AI)-based methodologies. Among the existing computational docking methods, Python docking (pyDock) has shown competitive predictive rates and high robustness over the years. However, the field has changed dramatically with the appearance of AI-based methods such as AlphaFold. While structure prediction of individual proteins is virtually solved by this program, the focus is now on how to improve the prediction of challenging cases like antibody-antigen complexes, multiprotein complexes, weak interactions, or highly flexible interacting proteins. Successful strategies are based on the generation of more diverse sets of models and the integration with other "classical" approaches that facilitate the identification of the correct models. Here, we will show in practical terms how to combine the structural modeling capabilities of AlphaFold with the energy-based scoring function in pyDock to improve structural predictions in challenging protein-protein complexes.


Research & AI Updates

From the Industry


Quick Reads

Exploring the Anti-Inflammatory Molecular Mechanism of Gentiana szechenyii Kanitz. Based on UPLC-MS/MS Combined With Network Pharmacology, Molecular Docking, and Molecular Dynamics Simulation.

This study explored the anti-inflammatory mechanisms of Gentiana szechenyii Kanitz. Read more →

Targeting spermidine synthase in Leishmania donovani: molecular docking and molecular dynamics simulation-based evaluation of Indian medicinal plant phytochemicals.

Visceral leishmaniasis, caused by Leishmania donovani, remains a critical global health challenge due to limited, toxic, and costly treatment options and rising drug resistance. Read more →

Convolutional neural network-assisted screening of natural product inhibitors against Naja naja venom: insights from molecular docking, molecular dynamics simulations and ADMET profiling.

Snakebite envenomation remains a major public health problem, particularly in tropical regions such as India, where Naja naja is the leading cause of snakebite-related death and disease. Read more →

Establishing FDA-approved oncology drugs as GPR176 inhibitor through homology modelling, molecular docking, MMGBSA, DFT, and molecular dynamics simulation.

Unraveling the mechanism of curcumin in coronary slow flow phenomenon through network pharmacology and molecular docking.

The coronary slow flow phenomenon (CSFP) is associated with an increased risk of adverse cardiovascular events, yet standardized treatment is lacking. Read more →

Molecular docking and dynamic simulation of Escherichia coli K-12 Elements as a Biosensor for Detecting 2,4,6-Trinitrotoluene (TNT).

Trinitrotoluene (TNT) is widely used in military and industrial fields due to its strong explosive properties and chemical stability. Read more →

Assessing the validity of leucine zipper constructs predicted by AlphaFold.

AP-1 transcription factors are a network of cellular regulators that combine in different dimer pairs to control a range of pathways involved in differentiation, growth, and cell death. Read more →

Pipeline Tip

Normalise thermal B-factors when comparing different crystal structures.
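One common way to follow this tip is a per-structure z-score, which puts B-factors from crystals of different resolution and refinement on a comparable scale. The sketch below is one possible approach rather than a prescribed pipeline: the function names are hypothetical, it assumes standard fixed-column PDB ATOM records, and restricting the calculation to C-alpha atoms is a simplifying choice.

```python
from statistics import mean, pstdev


def read_b_factors(pdb_text, atom_name="CA"):
    """Collect B-factors of the selected atom type from PDB-format text."""
    values = []
    for line in pdb_text.splitlines():
        # Fixed-column PDB layout: atom name in columns 13-16,
        # temperature factor in columns 61-66 (1-based).
        if line.startswith("ATOM") and line[12:16].strip() == atom_name:
            values.append(float(line[60:66]))
    return values


def normalise_b_factors(values):
    """Z-score each B-factor: (B - mean) / standard deviation of the structure."""
    if not values:
        raise ValueError("no B-factors supplied")
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over this structure
    if sigma == 0:
        raise ValueError("all B-factors are identical; z-score is undefined")
    return [(b - mu) / sigma for b in values]
```

Applied separately to each structure, the normalised values have mean 0 and unit variance, so a residue with a z-score of 2 is unusually mobile relative to its own crystal regardless of the absolute B-factor scale.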


Resources & Tools

Deep learning is not a magic wand, but a powerful lens for structural biology. — Recep Adiyaman
