Large Language Models in Drug Discovery / Gangwal, A., Lavecchia, A. - (2026), pp. 437-468. [10.1007/978-3-031-98022-0_14]

Large Language Models in Drug Discovery

Lavecchia, Antonio
Last author
2026

Abstract

Large language models (LLMs) such as Generative Pre-trained Transformers (GPT) and Bidirectional Encoder Representations from Transformers (BERT) have transformed natural language processing (NLP) and are increasingly applied in drug discovery. Trained on vast datasets, these models excel at text generation, comprehension, and pattern recognition, which makes them well suited to analyzing biomedical data, predicting drug interactions, and identifying new drug candidates. LLMs can synthesize information from multiple sources, speeding up hypothesis generation and streamlining drug development. They can also help mitigate data scarcity by generating synthetic data for model training, which can improve prediction accuracy. In drug discovery, LLMs assist with molecule screening, target identification, and clinical trial optimization, reducing the time and cost associated with traditional methods. They are also being explored for predicting pharmacokinetics, toxicity, and drug–drug interactions (DDIs). Despite this potential, challenges remain, including data quality, interpretability, and the need for domain-specific adaptation. Future advances are expected to focus on integrating LLMs with other AI techniques, such as reinforcement learning (RL) and generative models, to further enhance their capabilities in drug discovery. Collaboration among academia, industry, and regulatory bodies will be crucial for overcoming these challenges and realizing the full potential of LLMs in delivering new, effective therapies.
ISBN: 9783031980213
ISBN: 9783031980220

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/1047028