Recep Adiyaman
bioinformatics

Issue #17: Discovery of PPARγ Partial Agonists for Treatment of Type 2 Diabetes Based on an Integrated Virtual Screening Strategy that Combines Fragment Molecular Orbital Calculations, Machine Learning, Molecular Docking, Interaction Fingerprint Filtering, and Molecular Dynamics Simulations.

January 09, 2026 Daily Intelligence
Protein Design Daily

🧬 Protein Design Digest

Curated protein signals by Recep Adiyaman

🚀 Today’s Top Signal

Discovery of PPARγ Partial Agonists for Treatment of Type 2 Diabetes Based on an Integrated Virtual Screening Strategy that Combines Fragment Molecular Orbital Calculations, Machine Learning, Molecular Docking, Interaction Fingerprint Filtering, and Molecular Dynamics Simulations.

🧬 Abstract

Peroxisome proliferator-activated receptor γ (PPARγ) is a key therapeutic target for type 2 diabetes and cardiovascular diseases due to its central role in regulating glucose and lipid metabolism. While full PPARγ agonists exhibit efficacy, they are linked to adverse effects; in contrast, PPARγ partial agonists retain metabolic regulatory functions with improved safety, representing promising candidates for type 2 diabetes treatment. However, their action mechanisms and structure-activity relationships remain unclear. Herein, we developed an integrated virtual screening strategy combining fragment molecular orbital (FMO) calculations, machine learning, molecular docking, interaction fingerprint (IFP) filtering, and molecular dynamics (MD) simulations to identify potential PPARγ partial agonists and elucidate their interaction mechanisms. FMO analysis first confirmed interaction differences between PPARγ agonist classes at the binding pocket, pinpointing critical residues (CYS285, ARG288, ILE341, and SER342) for partial agonist activity. Using three machine learning algorithms (random forest, extra trees, and XGBoost) with extended connectivity fingerprints (ECFP), we constructed QSAR classification models and screened 9630 compounds. SHAP analysis highlighted key fingerprint fragments (positions 45, 1034, and 1243) governing bioactivity. Molecular docking and IFP refinement yielded six high-potency candidates, whose binding stability and partial agonist properties were validated via MD simulations, MM/PBSA binding free energy calculations, hydrogen bond analysis, and FMO calculations. Notably, these candidates did not directly interact with the AF2 domain, consistent with the canonical partial agonist mode of action. This multidisciplinary approach provides a framework for rational design of novel PPARγ partial agonists, and the identified molecules serve as promising leads for type 2 diabetes therapeutics.
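
The interaction fingerprint (IFP) filtering step can be pictured with a small sketch. The abstract does not describe the fingerprint encoding or any thresholds, so everything below is an assumption for illustration: each docked pose is reduced to a set of contacted residue labels, the reference pattern is the four residues named by the FMO analysis, `AF2_RESIDUES` is a placeholder label set standing in for AF2-domain contacts, and the pose names, contact lists, and the 0.5 cutoff are hypothetical.

```python
# Residues the abstract's FMO analysis flags as critical for partial agonism.
KEY_RESIDUES = frozenset({"CYS285", "ARG288", "ILE341", "SER342"})

# ASSUMPTION: placeholder label standing in for AF2-domain contacts;
# the abstract does not list which residues were checked.
AF2_RESIDUES = frozenset({"TYR473"})


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of contact labels."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def passes_ifp_filter(contacts, min_similarity=0.5):
    """Keep a pose that resembles the key-residue pattern and avoids AF2."""
    contacts = set(contacts)
    if contacts & AF2_RESIDUES:
        return False  # direct AF2 contact: not the partial-agonist pattern
    return tanimoto(contacts, KEY_RESIDUES) >= min_similarity


# Hypothetical docked poses: name -> residues contacted by the ligand.
poses = {
    "cand_A": {"CYS285", "ARG288", "ILE341", "SER342", "LEU330"},
    "cand_B": {"CYS285", "ARG288", "TYR473"},
    "cand_C": {"LEU330", "MET364"},
}

kept = sorted(name for name, c in poses.items() if passes_ifp_filter(c))
print(kept)
```

A set-based fingerprint keeps the sketch readable; a real IFP usually also encodes the interaction type (hydrogen bond, hydrophobic, and so on) per residue, which would turn each label into a residue/type pair without changing the filter logic.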

Why it matters: Offers a reusable multi-stage virtual screening framework for the rational design of PPARγ partial agonists and puts forward six candidate leads for type 2 diabetes.


⭐ Additional Signals

AlphaFold for Docking Screens.

AlphaFold is an AI system developed by Google DeepMind to generate three-dimensional structures of proteins without experimental data. The models created with AlphaFold are available on the AlphaFold Protein Structure Database (AlphaFoldDB) ( https://alphafold.ebi.ac.uk/ ). The AlphaFold database is searchable by sequence and protein identification. This chapter focuses on an AlphaFold model and its use for docking screens using Molegro Virtual Docker. We rely on Jupyter Notebooks to integrate docking simulations and build regression models based on the atomic coordinates of protein-pose complexes. Our study focuses on constructing a neural network regression model to predict the inhibition of cyclin-dependent kinase 19 (CDK19). This enzyme is a target for anticancer drugs and does not have experimental data for its atomic coordinates. We utilize the Molegro Data Modeller to construct a regression model based on docking results of inhibitors for which binding affinity data is available. All CDK19 datasets and Jupyter Notebooks discussed in this work are available at GitHub: https://github.com/azevedolab/docking#readme .

Geometric deep learning assists protein engineering. Opportunities and Challenges.

Protein engineering is experiencing a paradigmatic transformation through the integration of geometric deep learning (GDL) into computational design workflows. While traditional approaches such as rational design and directed evolution have achieved significant progress, they remain constrained by the vastness of sequence space and the cost of experimental validation. GDL overcomes these limitations by operating on non-Euclidean domains and by capturing the spatial, topological, and physicochemical features that govern protein function. This perspective provides a comprehensive and critical overview of GDL applications in stability prediction, functional annotation, molecular interaction modeling, and de novo protein design. It consolidates methodological principles, architectural diversity, and performance trends across representative studies, emphasizing how GDL enhances interpretability and generalization in protein science. Aimed at both computational method developers and experimental protein engineers, the review bridges algorithmic concepts with practical design considerations, offering guidance on data representation, model selection, and evaluation strategies. By integrating explainable artificial intelligence and structure-based validation within a unified conceptual framework, this work highlights how GDL can serve as a foundation for transparent, interpretable, and autonomous protein design. As GDL converges with generative modeling, molecular simulation, and high-throughput experimentation, it is poised to become a cornerstone technology for next-generation protein engineering and synthetic biology.

Modeling Protein-Protein Complexes by Combining pyDock and AlphaFold.

The lack of experimental structures for the majority of protein-protein complexes has motivated the development of a variety of strategies for the structural modeling of protein complexes, such as computational docking, in active development for the last decades, and the more recent artificial intelligence (AI)-based ground-breaking methodologies. Among the existing computational docking methods, Python docking (pyDock) has shown competitive predictive rates and high robustness over the years. However, the field has dramatically changed with the appearance of artificial intelligence (AI)-based methods, like AlphaFold. While structure prediction of individual proteins is virtually solved by this program, the focus is now on how to improve the prediction of challenging cases like antibody-antigen complexes, multiprotein complexes, weak interactions, or highly flexible interacting proteins. Successful strategies are based on the generation of more diverse sets of models and the integration with other “classical” approaches that facilitate the identification of the correct models. Here, we will show in practical terms how to combine the structural modeling capabilities of AlphaFold with the energy-based scoring function in pyDock to improve structural predictions in challenging protein-protein complexes.



⚡ Quick Reads

Exploring the Anti-Inflammatory Molecular Mechanism of Gentiana szechenyii Kanitz. Based on UPLC-MS/MS Combined With Network Pharmacology, Molecular Docking, and Molecular Dynamics Simulation.

This study explored the anti-inflammatory mechanisms of Gentiana szechenyii Kanitz. (GS), a Tibetan medicinal herb, by combining UPLC-MS/MS, network pharmacology, molecular docking, and molecular dynamics (MD) simulation. Using the lipopolysaccharide (LPS)-induced RAW264.7 cell inflammation model, the anti-inflammatory effect of GS was confirmed by measuring nitric oxide (NO) release and the levels of the inflammatory factors tumor necrosis factor (TNF) and interleukin-6 (IL-6). UPLC-MS/MS identified 40 constituents, whereas network analysis predicted 5 core compounds (isovitexin 4’,7-diglucoside, loganin, isoorientin-2″-O-glucoside, gentiopicroside, sweroside), 5 key targets (TNF, IL-6, GAPDH, epidermal growth factor receptor [EGFR], HSP90AA1), and three critical pathways (PI3K-Akt, hypoxia inducible factor-1 [HIF-1], IL-17). Molecular docking showed strong binding between core compounds and targets; the binding energies were all lower than -5 kcal mol⁻¹, among which isovitexin 4’,7-diglucoside had the lowest binding energy to EGFR (-9.4 kcal mol⁻¹). MD simulation confirmed stable binding of TNF with the five core compounds. This study comprehensively clarifies the pharmacodynamic material basis and mechanism of action of GS in anti-inflammation, providing an experimental basis for further development and utilization. It is expected to be applied to the adjuvant treatment of inflammation-related diseases such as chronic bronchitis and pharyngitis in the future, thereby promoting the modernization of Tibetan medicine.

SMARTDock: A Toolkit for the Automated Development of Target-Specific Scoring Functions Using Bioactivity Data.

Molecular docking has become an essential tool in the early stages of structure-based drug discovery, enabling rapid virtual screening of large compound libraries against biological targets. However, the accuracy of binder selection is often limited by the available scoring functions. Here, we present a novel workflow SMARTDock (Scoring with Machine learning and Activity for Ranking Targeted Docking) that enhances the virtual screening capabilities of GOLD docking by integrating publicly available bioactivity data, a protein-ligand interaction fingerprint (PADIF), and machine learning classification models within a user-friendly Docker environment. This platform-independent approach enables seamless use on different operating systems and is accessible to both computational and medicinal chemists. With only a ChEMBL target ID, a protein structure file, and a SMILES list of testing compounds, users can build and apply target-specific scoring models to improve the enrichment of active compounds in the top ranks. SMARTDock implements the PADIF-based ML methodology to assist in virtual screening. Previous validation of this underlying methodology demonstrated its capacity to enhance screening performance across multiple targets. Finally, we show the advantages and disadvantages in the bioactive classification in virtual screening tasks.
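
"Enrichment of active compounds in the top ranks" is commonly quantified with an enrichment factor (EF): the hit rate in the top-ranked fraction divided by the hit rate in the whole library. The sketch below is a generic illustration of that metric, not SMARTDock's own code; the function name, the scores, the labels, and the 20% cutoff are all made up for the example.

```python
def enrichment_factor(scores, labels, top_fraction=0.1, higher_is_better=True):
    """Enrichment factor of actives in the top-ranked fraction of a screen.

    scores: one ranking score per compound.
    labels: 1 for a known active, 0 for an inactive/decoy.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("need at least one active compound")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0],
                    reverse=higher_is_better)
    n_top = max(1, round(len(ranked) * top_fraction))
    hits_top = sum(label for _, label in ranked[:n_top])
    # Hit rate among the top-ranked compounds over the library-wide hit rate.
    return (hits_top / n_top) / (total_actives / len(ranked))


# Hypothetical screen of 10 compounds, 2 of them active.
labels = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
generic_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
target_specific_scores = [0.1, 0.2, 0.3, 0.95, 0.4, 0.5, 0.9, 0.6, 0.05, 0.0]

ef_generic = enrichment_factor(generic_scores, labels, top_fraction=0.2)
ef_specific = enrichment_factor(target_specific_scores, labels, top_fraction=0.2)
print(ef_generic, ef_specific)
```

An EF of 1 means the ranking does no better than random selection; in this toy data the second score places both actives in the top 20%, which is the kind of improvement a target-specific model aims for.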

Establishing FDA-approved oncology drugs as GPR176 inhibitor through homology modelling, molecular docking, MMGBSA, DFT, and molecular dynamics simulation.

Molecular docking and dynamic simulation of Escherichia coli K-12 Elements as a Biosensor for Detecting 2,4,6-Trinitrotoluene (TNT).

Trinitrotoluene (TNT) is widely used in military and industrial fields due to its strong explosive properties and chemical stability. However, its persistence in the environment and harmful effects on living organisms make it important to develop sensitive and selective detection methods. Previous research has identified the Escherichia coli genes yadG and aspC as promising components for TNT biosensors, based on their increased gene expression in response to TNT exposure. Although these findings are promising, it is still unclear whether the proteins produced from these genes directly interact with TNT at the molecular level. This study focuses on analyzing the binding interactions between TNT and the protein products of yadG and aspC using computational methods. Molecular docking showed that TNT binds more strongly to yadG (-6.81 ± 0.02 kcal/mol) than to aspC (-6.23 ± 0.00 kcal/mol). Further analysis using molecular dynamics simulations with MM-GBSA calculations confirmed that the yadG-TNT complex is more stable, with a binding free energy (ΔG) of -23.58 kJ/mol, in line with fluorescence data that also indicated stronger binding to yadG. TNT binding to yadG involves aromatic residues (Tyr-106, His-153) and hydrophobic contacts (Ala-150), which may promote π-π stacking and suggest reduced water occupancy. These features highlight key principles for protein engineering and suggest a clear route from computational findings to biosensor development.
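
Note that the abstract reports docking scores in kcal/mol and the MM-GBSA free energy in kJ/mol. A short conversion sketch puts the numbers in the same units (1 kcal = 4.184 kJ). The helper names are illustrative, and docking scores and MM-GBSA energies come from different methods, so matching units does not make them directly comparable.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie, exact by definition


def kj_to_kcal(value_kj):
    """Convert an energy from kJ/mol to kcal/mol."""
    return value_kj / KJ_PER_KCAL


def kcal_to_kj(value_kcal):
    """Convert an energy from kcal/mol to kJ/mol."""
    return value_kcal * KJ_PER_KCAL


# Values quoted in the abstract above.
mmgbsa_yadg_kj = -23.58      # MM-GBSA binding free energy, kJ/mol
docking_yadg_kcal = -6.81    # docking score, kcal/mol
docking_aspc_kcal = -6.23    # docking score, kcal/mol

print(f"MM-GBSA yadG: {kj_to_kcal(mmgbsa_yadg_kj):.2f} kcal/mol")
print(f"Docking yadG: {kcal_to_kj(docking_yadg_kcal):.2f} kJ/mol")
print(f"Docking aspC: {kcal_to_kj(docking_aspc_kcal):.2f} kJ/mol")
```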

Assessing the validity of leucine zipper constructs predicted by AlphaFold.

AP-1 transcription factors are a network of cellular regulators that combine in different dimer pairs to control a range of pathways involved in differentiation, growth, and cell death. They dimerize via leucine zipper coiled-coil domains that are preceded by a basic DNA binding domain. Depending on which AP-1 transcription factors dimerize, different DNA sequences will be recognized resulting in differential gene expression. The affinity of AP-1 transcription factors for each other dictates which dimers form. The relative concentration of AP-1 transcription factors varies with tissue type and environment, adding another layer of control to this integral network of cellular regulation. The development of artificial intelligence (AI)-based protein structure prediction methods gives us a new technique to investigate or predict how dimerization affects combinatorial control. All versions of AlphaFold2 and AlphaFold3 are AI/deep learning programs that predict 3D structures of proteins from an amino acid sequence and multiple sequence alignments of homologous proteins. To fully realize the potential of AI for structural biology, it is essential to understand its current capabilities and limitations. In this study, we used the classical example of an AP-1 dimer: Fos and Jun, and an array of over 2000 experimentally tested human leucine zippers to interrogate how AlphaFold models leucine zipper domains and if AlphaFold can be used to differentiate between probable and improbable dimer interfaces. We found that AlphaFold predicts highly confident leucine zipper dimers, even for dimer pairs such as the FosB homodimer, for which electrostatics are known to prevent their formation in vivo. This is an important case study concerning high-confidence but low-accuracy protein structure prediction.

A screening strategy for bioactive components from Amaranth: An integrated approach of network pharmacology, molecular docking and molecular dynamics simulation.

Amaranth is a traditional medicinal and forage plant with promising anti-inflammatory properties. To enhance its utilization in livestock and feed industries, this study investigated the bioactive compounds and mechanisms of Amaranth at different growth stages using metabolomics and network pharmacology. LC-MS/MS identified 266 metabolites, including key compounds such as ferulic acid, isoferulic acid, sinapic acid, and 13-HODE. A total of 132 inflammation-related targets were screened, and enrichment analysis revealed their involvement in ATP binding, inflammatory response, and PI3K-Akt/MAPK signaling pathways. Molecular docking and molecular dynamics simulations confirmed strong interactions between core targets (e.g., IL6, MMP9) and major compounds. These findings demonstrate that phenolic acids and fatty acids in Amaranth possess anti-inflammatory activity, underpinning its prospective use in the formulation of biofunctional feeds and in promoting the health of livestock.

Network pharmacology and molecular docking reveal mechanisms of amiodarone-induced pulmonary fibrosis.

Pulmonary fibrosis is a common end-stage outcome of various chronic lung diseases, characterized by excessive extracellular matrix deposition, alveolar structural destruction, and progressive loss of pulmonary function. Despite advances in understanding its pathogenesis, effective therapeutic options remain scarce, highlighting the need for novel strategies. Amiodarone, a widely prescribed antiarrhythmic drug, is associated with pulmonary fibrosis as a severe adverse effect; however, its molecular mechanisms remain incompletely understood. Network pharmacology, combined with molecular docking, has recently emerged as a powerful approach to systematically uncover key targets and pathways underlying drug-induced organ toxicity. This study aimed to elucidate the potential mechanisms of amiodarone-induced pulmonary fibrosis by integrating network pharmacology analysis and molecular docking, thereby providing a theoretical basis for future mechanistic studies and potential preventive or therapeutic strategies. Network pharmacology and molecular docking approaches were applied to explore the mechanisms of amiodarone-induced pulmonary fibrosis. Potential amiodarone targets were predicted using publicly available databases, while pulmonary fibrosis-related genes were retrieved from GeneCards, DisGeNET, and OMIM. Common drug-disease targets were identified through Venn diagram analysis. Protein-protein interaction (PPI) networks were constructed using STRING, and hub genes were determined through topological analysis. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted to identify biological processes and pathways involved. Molecular docking was performed to assess the binding affinity of amiodarone to key hub proteins. Finally, the predicted mechanisms were summarized and interpreted based on the integrated network and docking results.
A total of 101 KEGG pathways were enriched for the intersection of amiodarone and pulmonary fibrosis targets. PPI network analysis identified eight key hub genes: ABCB1, ERBB2, XIAP, ABL1, SRC, HIF1A, AKT1, and ADRB2. GO enrichment analysis indicated that these targets are primarily involved in membrane-to-nucleus signaling, regulation of phosphorylation, and chromatin remodeling. KEGG pathway analysis highlighted significant enrichment in EGFR/ErbB signaling, VEGF signaling, and renin secretion pathways. Molecular docking suggested favorable binding of amiodarone to the predicted target proteins. Based on docking-score-derived estimates, ABCB1 and AKT1 showed the strongest predicted binding (estimated Kd ≈ 0.37 μM), followed by ERBB2 (≈ 2.9 μM) and ADRB2 (≈ 7.0 μM). Collectively, these findings suggest a signaling framework in which membrane receptor activation propagates through tyrosine kinase cascades to regulate gene expression, thereby linking extracellular stimuli to transcriptional control in the pathogenesis of pulmonary fibrosis. This study systematically explored the potential mechanisms of amiodarone-induced pulmonary fibrosis by integrating network pharmacology, enrichment analysis, and molecular docking. Eight hub targets (ABCB1, ERBB2, XIAP, ABL1, SRC, HIF1A, AKT1, and ADRB2) and three critical signaling pathways (EGFR/ErbB signaling, VEGF signaling, and renin secretion) were identified, providing new insights into the complex mechanisms of amiodarone-associated pulmonary toxicity. The proposed membrane-to-nucleus signaling framework, supported by network topology and docking-based predictions, may help explain coordinated cellular responses implicated in fibrotic progression and could inform the prioritization of targets for future experimental validation and therapeutic development.
Collectively, these findings extend our understanding of drug-associated pulmonary fibrosis and provide a rationale for future studies aimed at risk stratification and potential preventive or therapeutic strategies for amiodarone-related pulmonary complications.
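
The abstract does not say how its Kd estimates were derived from docking scores. One common convention, assumed here purely for illustration, treats the docking score as a binding free energy and applies ΔG = RT ln(Kd) at 298.15 K. Under that assumption, the quoted Kd values can be mapped back to approximate scores; the function names below are hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal, temp_k=298.15):
    """Dissociation constant (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(dg_kcal / (R_KCAL * temp_k))


def dg_from_kd(kd_molar, temp_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return R_KCAL * temp_k * math.log(kd_molar)


# Kd estimates quoted in the abstract, converted from micromolar to molar.
quoted_kd_um = {"ABCB1/AKT1": 0.37, "ERBB2": 2.9, "ADRB2": 7.0}

for target, kd_um in quoted_kd_um.items():
    dg = dg_from_kd(kd_um * 1e-6)
    print(f"{target}: Kd = {kd_um} uM -> dG of about {dg:.2f} kcal/mol")
```

Because Kd depends exponentially on ΔG, a docking-score difference of about 1.4 kcal/mol already shifts the estimate roughly tenfold at room temperature, which is one reason such values are best read as rough rankings rather than measured affinities.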

In silico characterization and molecular docking of the MIOX gene in Nile tilapia (Oreochromis niloticus).

Myo-inositol oxygenase (MIOX) plays an essential role in metabolic pathways and cell processes, controls oxidative stress response mechanisms, and balances osmotic stress in aquatic organisms. Molecular docking and structural analysis of the MIOX gene have been accomplished in this work. The MIOX gene has a length of 3608 bp, which encodes 286 amino acids (AA). The secondary structure revealed α-helical and random coils containing 40.56% alpha helices, 38.81% random coils, 14.69% extended strands, and 5.94% beta turns. The subcellular localization results showed that 56% of the MIOX gene is found in cytoplasm and then 10% in lysosome. The Ramachandran plot analysis showed that 90.2% of residues fall in the most favored region and 9.8% in the additional allowed region. Virtual screening of ligands and molecular docking of inositol (CID-892) and D-glucuronic acid (CID-94715) showed the highest docking score values of -4.015 and -3.563, respectively. The Potential Energy OPLS3e was -1632.608 and -1545.687. Inositol and D-glucuronic acid interacted with different residues of MIOX protein. However, a greater binding affinity of MIOX was observed with inositol than with D-glucuronic acid. This signifies the biochemical role of inositol that helps in determining the enzymatic efficiency. So, this study offers insights into protein modeling, molecular docking, and virtual screening of ligands against the MIOX receptor, revealing aspects of drug design and preventive approaches for fish salinity tolerance.

💡 Pipeline Tip

Verify FASTA headers for special characters that break Rosetta pipelines.
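
A minimal sketch of such a check, using only the Python standard library. The whitelist of "safe" characters (letters, digits, underscore, dot, hyphen) is a conservative assumption rather than a documented Rosetta rule, and the function names and the example records are hypothetical.

```python
import re

# ASSUMPTION: conservative whitelist; adjust to what your pipeline tolerates.
SAFE_HEADER = re.compile(r"^[A-Za-z0-9_.\-]+$")
UNSAFE_RUN = re.compile(r"[^A-Za-z0-9_.\-]+")


def find_unsafe_headers(fasta_text):
    """Return (line_number, header) pairs for headers outside the whitelist."""
    problems = []
    for lineno, line in enumerate(fasta_text.splitlines(), start=1):
        if line.startswith(">"):
            header = line[1:].strip()
            if not SAFE_HEADER.match(header):
                problems.append((lineno, header))
    return problems


def sanitize_header(header):
    """Replace each run of non-whitelisted characters with one underscore."""
    cleaned = UNSAFE_RUN.sub("_", header.strip()).strip("_")
    return cleaned or "sequence"


# Made-up FASTA records for illustration.
example = """>design_001
MKTAYIAKQR
>sp|X00000|DEMO_HUMAN demo protein
MGETLGDSPI
>binder v2 (final)
MSEQNNTEMT
"""

for lineno, header in find_unsafe_headers(example):
    print(f"line {lineno}: {header!r} -> {sanitize_header(header)!r}")
```

Running the check before job submission is cheaper than debugging a failed run; if headers are rewritten, keeping a mapping from sanitized to original names preserves traceability.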


🛠️ Resources

Deep learning is not a magic wand, but a powerful lens for structural biology. — Recep Adiyaman
