The Impact of Large Language Models on Drug Discovery

Explore top LinkedIn content from expert professionals.

Summary

Large language models (LLMs) are revolutionizing drug discovery by streamlining processes, improving predictions, and enabling new ways to analyze complex biological data. These AI systems, trained to understand and generate human-like text, are now being tailored to tackle specific challenges in developing safer and more effective drugs faster than traditional methods.

  • Explore modular solutions: Use LLM-powered tools to combine multiple steps in drug discovery, such as data retrieval, molecule generation, and property prediction, into a seamless workflow (see the sketch after this summary).
  • Focus on explainability: Embrace AI models that offer clear insights into their reasoning to enhance trust and refine their application in scientific research.
  • Develop biology-specific models: Consider building or adopting LLMs trained on biological and clinical data to uncover novel therapeutic targets and improve drug design processes.
Summarized by AI based on LinkedIn member posts
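
As a minimal illustration of that modular-workflow idea, here is a Python sketch assuming RDKit is installed. The retrieve_seeds and generate_analogs functions are hypothetical placeholders standing in for real retrieval tools and generative models (such as the REINVENT and Mol2Mol tools mentioned in the first post below); only the property step uses real RDKit calls.

```python
# Minimal sketch of a modular discovery workflow: each stage is a plain
# Python callable, so a retrieval tool, a generative model, or a property
# predictor can be swapped without rewriting the rest of the pipeline.
# The retrieve/generate stubs are hypothetical; only the property step
# uses real RDKit descriptors.
from rdkit import Chem
from rdkit.Chem import Descriptors


def retrieve_seeds(query: str) -> list[str]:
    # Hypothetical stub: a real tool would query PubChem/ChEMBL or a
    # literature index for starting molecules matching the query.
    return ["CC(=O)Oc1ccccc1C(=O)O"]  # aspirin as a stand-in seed


def generate_analogs(seeds: list[str]) -> list[str]:
    # Hypothetical stub: a real tool would call a generative model
    # (e.g. REINVENT or Mol2Mol) to propose analogs of the seeds.
    return seeds


def predict_properties(smiles_list: list[str]) -> list[dict]:
    # Simple RDKit descriptors as a stand-in for heavier ADMET predictors.
    results = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue  # skip unparsable SMILES
        results.append({
            "smiles": smi,
            "mol_wt": Descriptors.MolWt(mol),
            "logp": Descriptors.MolLogP(mol),
        })
    return results


def run_workflow(query: str) -> list[dict]:
    # Chain the modular stages: retrieve -> generate -> predict.
    return predict_properties(generate_analogs(retrieve_seeds(query)))


if __name__ == "__main__":
    for record in run_workflow("BCL-2 inhibitors"):
        print(record)
```

Because each stage is an ordinary callable, a stronger generator or a dedicated ADMET predictor can be dropped in without changing the rest of the pipeline, which is the property the posts below emphasize.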
  • Amir Barati Farimani

    Associate Professor at Carnegie Mellon University

    8,525 followers

    🚀 Pushing the boundaries of AI in drug discovery! 🧬 “Large Language Model Agent for Modular Task Execution in Drug Discovery” — now on bioRxiv! In this work, we introduce AgentD, an LLM-powered agent that integrates language reasoning with domain-specific tools to automate and streamline the early-stage drug discovery pipeline.

    ✨ What can AgentD do?
      • Retrieve biomedical data (FASTA sequences, SMILES, literature) from the web and structured databases.
      • Answer tough, domain-specific scientific questions grounded in real literature (via RAG).
      • Generate diverse seed molecules (using REINVENT & Mol2Mol).
      • Predict critical ADMET properties and binding affinities.
      • Iteratively refine molecules to improve drug-likeness and safety.
      • Generate 3D protein–ligand complex structures for deeper analysis.

    🚀 Why is this exciting? Drug discovery typically takes 10–15 years and billions of dollars. AgentD tackles these bottlenecks by integrating all the pieces into one modular, flexible, LLM-driven framework — enabling rapid screening, prioritization, and structural evaluation of drug candidates.

    In our case study on BCL-2 for lymphocytic leukemia:
      ✅ Increased the number of drug-like molecules (QED > 0.6) from 34 to 55 after just two refinement rounds.
      ✅ Boosted the number of compounds satisfying empirical drug-likeness rules from 29 to 52.
      ✅ Generated 3D structures to prepare for docking and MD — all starting from a single query.

    The modular design means AgentD can easily incorporate new generative models, property predictors, and simulation tools, making it a robust foundation for next-generation AI-driven therapeutic discovery.

    📖 Check out the preprint here: https://lnkd.in/eysCq2_A

    #DrugDiscovery #AI #LargeLanguageModels #ComputationalBiology #GenerativeAI #MachineLearning #PharmaTech #LLM #Bioinformatics #CMU
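
The post above reports filtering candidates by QED > 0.6 and by empirical drug-likeness rules. The snippet below is a rough sketch of that kind of screen using RDKit's QED and Lipinski descriptors; it is not AgentD's code, and the candidate SMILES are arbitrary stand-ins.

```python
# Sketch of a drug-likeness screen like the one described in the post:
# flag molecules with QED > 0.6 and check Lipinski rule-of-five compliance.
# Uses standard RDKit descriptors; the candidate SMILES are placeholders.
from rdkit import Chem
from rdkit.Chem import QED, Descriptors, Lipinski


def passes_lipinski(mol) -> bool:
    """Empirical rule of five: MW <= 500, logP <= 5,
    <= 5 H-bond donors, <= 10 H-bond acceptors."""
    return (Descriptors.MolWt(mol) <= 500
            and Descriptors.MolLogP(mol) <= 5
            and Lipinski.NumHDonors(mol) <= 5
            and Lipinski.NumHAcceptors(mol) <= 10)


def screen(smiles_list: list[str], qed_cutoff: float = 0.6) -> list[dict]:
    rows = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue  # skip unparsable SMILES
        qed_score = QED.qed(mol)
        rows.append({
            "smiles": smi,
            "qed": round(qed_score, 3),
            "qed_pass": qed_score > qed_cutoff,
            "lipinski_pass": passes_lipinski(mol),
        })
    return rows


if __name__ == "__main__":
    candidates = [
        "CC(=O)Oc1ccccc1C(=O)O",       # aspirin
        "CC(C)Cc1ccc(cc1)C(C)C(=O)O",  # ibuprofen
    ]
    for row in screen(candidates):
        print(row)
```

In a refinement loop like the one described, molecules failing either check would be sent back to the generator for another round rather than discarded outright.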

  • Surprisingly informative interview with Scott Gottlieb about the genAI drug development company Xaira, announced yesterday with $1B of funding. (Most important: "Xaira" is pronounced "Zyra" 🙄)

    "What's different now is the use of large language models in the drug discovery and drug development process. You're seeing the large drug makers try to get instances of large language models within their own ecosystems so they can now load their own data on it and use it in a proprietary fashion."

    "Right now we're using models that are borrowed from the consumer tech world, so they haven't been built to solve for human biology, and we don't really have good clinical data training sets. That's the goal of the startup that we're funding: to try to build models that are purpose-built for biology."

    "A lot of the data that we have to train models on... hasn't necessarily been collected in a fashion where it's optimized for training an artificial intelligence model, a large language model. So you might have proteomic or genomic data, but it's not correlated with phenotypes."

    "It's also going to allow us, if we can truly understand the biology of different systems and different disease states, to identify targets that we wouldn't have perceived otherwise. Using this model you can try to design antibodies intelligently that bind to protein targets that you might not have been able to get antibodies to bind to before, because antibodies are the sum of small interactions and sometimes it's very hard to figure out how to design an antibody to bind to certain targets."

    "Instead of borrowing some of the models that have been developed for more of a technological application, you can actually develop models that are based on human biology and based on training sets that were built for the purpose of designing new models, and that's happening right now."

  • Luke Yun

    building AI computer fixer | AI Researcher @ Harvard Medical School, Oxford

    32,831 followers

    Google DeepMind just open-sourced TxGemma: the first efficient, generalist LLM suite for therapeutic development!

    Drug discovery has long been hindered by high failure rates, expensive experiments, and the need for specialized AI models for each step of the pipeline. TxGemma is a family of large language models trained to predict therapeutic properties, enable interactive reasoning, and orchestrate complex scientific workflows.

    1. Achieved superior or comparable results to state-of-the-art models on 64 out of 66 therapeutic tasks, surpassing specialist models on 26.
    2. Reduced the need for large training datasets in fine-tuning, making it suitable for data-limited applications like clinical trial outcome prediction.
    3. Introduced an explainable AI that allows scientists to interact in natural language, receive mechanistic reasoning for predictions, and engage in scientific discussions.
    4. Developed a therapeutic AI agent powered by Gemini 2.0, which surpassed leading models on complex chemistry and biology reasoning benchmarks (+9.8% on Humanity's Last Exam, +5.6% on ChemBench-Preference).

    Since Evo2 by NVIDIA, I've been on the lookout for papers using mechanistic interpretability for explainability. It has obvious benefits for medicine. The use of a conversational variant that explains its reasoning is great for informing the user of both the strengths and limitations of the model. I'd recommend looking at the example where the model is given a molecule's SMILES string and asked whether it can cross the blood-brain barrier.

    I'm a firm believer that more researchers in the field should be incorporating explainability into their models, and I'll be highlighting more research that does so here. It is essential for iterating on the right things faster, improving the models, and actually trusting them.

    Here's the awesome work: https://lnkd.in/gP--FXVU

    Congrats to Eric Wang, Samuel Schmidgall, Fan Zhang, Paul F. Jaeger, Rory P. and Tiffany Chen!

    I post my takes on the latest developments in health AI – connect with me to stay updated! Also, check out my health AI blog here: https://lnkd.in/g3nrQFxW
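
For readers who want to try the blood-brain-barrier style prompt mentioned above, a rough sketch with Hugging Face transformers follows. The model ID and the free-text prompt wording are assumptions; consult the official TxGemma model card for the released checkpoints and the exact task-prompt templates used for property-prediction tasks.

```python
# Rough sketch: asking a TxGemma-style chat model whether a molecule can
# cross the blood-brain barrier, given its SMILES string.
# ASSUMPTIONS: the checkpoint name and prompt below are illustrative only;
# the real model IDs and prompt templates are documented on the TxGemma
# model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/txgemma-9b-chat"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin, as an arbitrary example
prompt = (
    f"Given the molecule with SMILES {smiles}, can it cross the "
    "blood-brain barrier? Explain the structural reasoning behind "
    "your answer."
)

# Tokenize, generate, and print the model's explanation.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The chat variant is the one intended for this kind of free-form reasoning; the predict variants expect the structured task prompts and return short property labels instead of an explanation.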
