AI Research

Protein–ligand data at scale to support machine learning


    Nvidia says ‘We never deprive American customers in order to serve the rest of the world’ — company says GAIN AI Act addresses a problem that doesn’t exist

    The GAIN AI Act, a bill proposed by U.S. senators earlier this week to regulate shipments of AI GPUs to adversaries and prioritize U.S. buyers, made quite a splash in America. In response, Nvidia issued a statement claiming that the U.S. was, is, and will remain its primary market, implying that no regulation is needed for the company to serve American customers.

    “The U.S. has always been and will continue to be our largest market,” a statement sent to Tom’s Hardware reads. “We never deprive American customers in order to serve the rest of the world. In trying to solve a problem that does not exist, the proposed bill would restrict competition worldwide in any industry that uses mainstream computing chips. While it may have good intentions, this bill is just another variation of the AI Diffusion Rule and would have similar effects on American leadership and the U.S. economy.”



    OpenAI Projects $115 Billion Cash Burn by 2029

    OpenAI has sharply raised its projected cash burn through 2029 to $115 billion, according to The Information. This marks an $80 billion increase from previous estimates, as the company ramps up spending to fuel the AI behind its ChatGPT chatbot.

    The company, which has become one of the world’s biggest renters of cloud servers, projects it will burn more than $8 billion this year, about $1.5 billion higher than its earlier forecast. The surge in spending comes as OpenAI seeks to maintain its lead in the rapidly growing artificial intelligence market.


    To control these soaring costs, OpenAI plans to develop its own data center server chips and facilities to power its technology.


    The company is partnering with U.S. semiconductor giant Broadcom to produce its first AI chip, which will be used internally rather than made available to customers, as reported by The Information.


    In addition to this initiative, OpenAI has expanded its partnership with Oracle, committing to a 4.5-gigawatt data center capacity to support its growing operations.


    This is part of OpenAI’s larger plan, the Stargate initiative, which includes a $500 billion investment and is also supported by Japan’s SoftBank Group. Google Cloud has also joined the group of suppliers supporting OpenAI’s infrastructure.


    OpenAI’s projected cash burn will more than double next year, reaching over $17 billion. It will continue to rise, with estimates of $35 billion in 2027 and $45 billion in 2028, according to The Information.

    PromptLocker scared ESET, but it was an experiment

    The PromptLocker malware, which was considered the world’s first ransomware created using artificial intelligence, turned out not to be a real attack at all, but a research project at New York University.

    On August 26, ESET announced that it had detected the first sample of artificial intelligence integrated into ransomware. The program was called PromptLocker. However, this turned out not to be the case: researchers from the Tandon School of Engineering at New York University had created the code.

    The university explained that PromptLocker is actually part of an experiment called Ransomware 3.0, conducted by a team from the Tandon School of Engineering. A representative of the school told the publication that a sample of the experimental code had been uploaded to the VirusTotal platform for malware analysis. It was there that ESET specialists discovered it, mistaking it for a real threat.

    According to ESET, the program used Lua scripts generated on the basis of strictly defined instructions. These scripts allowed the malware to scan the file system, analyze the contents, steal selected data, and perform encryption. At the same time, the sample did not implement destructive capabilities — a logical step, given that it was a controlled experiment.

    Nevertheless, the malicious code did function. New York University confirmed that its AI-based simulation system was able to go through all four classic stages of a ransomware attack: mapping the system, identifying valuable files, stealing or encrypting data, and creating a ransom note. Moreover, it was able to do this on various types of systems, from personal computers and corporate servers to industrial controllers.

    Should you be concerned? Yes, but with an important caveat: there is a big difference between an academic proof-of-concept demonstration and a real attack carried out by malicious actors. However, such research can also be useful to cybercriminals, as it demonstrates not only how such an attack works but also what it really costs to implement.


