Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment

📅 2024-05-20
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
This study investigates the scientific utility of large language models (LLMs) in biomedical hypothesis generation, specifically the discovery of FDA-approved non-oncology drug pairs that act synergistically and selectively against MCF7 breast cancer cells relative to non-tumorigenic MCF10A cells. Method: The authors treat LLM “hallucinations” as experimentally testable hypotheses, forming a closed loop of AI-driven hypothesis generation, wet-lab validation, and iterative refinement. They used domain-knowledge-guided prompting with GPT-4 and in vitro pharmacological screening with the Combination Index (CI) method. Contribution/Results: Of 12 initial drug combinations, three showed synergy scores above the positive controls; after GPT-4 was given these results, three of the four newly proposed combinations showed positive synergy scores. That is a 75% hit rate in the second round and six of 16 combinations (37.5%) across both rounds. The work provides laboratory evidence that LLMs can generate biologically meaningful, testable hypotheses and outlines a methodological framework for AI-enabled drug repurposing.
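The summary names the Combination Index (CI) method without giving its form. As a rough illustration only, the sketch below assumes the widely used two-drug Chou-Talalay form, CI = d1/Dx1 + d2/Dx2, where d1 and d2 are the doses used together to reach a given effect and Dx1 and Dx2 are the doses of each drug alone that reach the same effect. The function names, the tolerance band, and every dose value are hypothetical and are not taken from the paper.

```python
def combination_index(d1: float, d2: float, dx1: float, dx2: float) -> float:
    """Two-drug Combination Index at a single effect level.

    d1, d2   : doses of drug 1 and drug 2 used together (hypothetical inputs)
    dx1, dx2 : doses of each drug alone that give the same effect
    """
    if dx1 <= 0 or dx2 <= 0:
        raise ValueError("single-agent doses must be positive")
    return d1 / dx1 + d2 / dx2


def interpret(ci: float, tol: float = 0.1) -> str:
    """Label a CI value; the tolerance band around 1 is an assumption."""
    if ci < 1 - tol:
        return "synergy"
    if ci > 1 + tol:
        return "antagonism"
    return "additive"


# Made-up doses: each drug is used at a quarter of its single-agent dose.
ci = combination_index(d1=2.0, d2=5.0, dx1=8.0, dx2=20.0)
print(ci, interpret(ci))  # 0.5 synergy
```

A CI below 1 means the pair reaches the effect with less drug than additivity would predict. Whether the paper's synergy scores were computed this way is not stated in the text above.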

📝 Abstract
Large language models (LLMs) have transformed AI and achieved breakthrough performance on a wide range of tasks. In science, the most interesting application of LLMs is for hypothesis formation. A feature of LLMs, which results from their probabilistic structure, is that the output text is not necessarily a valid inference from the training text. These are termed hallucinations, and are harmful in many applications. In science, some hallucinations may be useful: novel hypotheses whose validity may be tested by laboratory experiments. Here, we experimentally test the application of LLMs as a source of scientific hypotheses using the domain of breast cancer treatment. We applied the LLM GPT4 to hypothesize novel synergistic pairs of FDA-approved noncancer drugs that target the MCF7 breast cancer cell line relative to the nontumorigenic breast cell line MCF10A. In the first round of laboratory experiments, GPT4 succeeded in discovering three drug combinations (out of twelve tested) with synergy scores above the positive controls. GPT4 then generated new combinations based on its initial results; this generated three more combinations with positive synergy scores (out of four tested). We conclude that LLMs are a valuable source of scientific hypotheses.
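The counts reported in the abstract can be turned into per-round and pooled hit rates. The short sketch below does that arithmetic; the variable names are illustrative, and pooling assumes the 12 first-round and 4 second-round combinations are 16 distinct pairs.

```python
from fractions import Fraction

# (hits, tested) per round, as reported in the abstract.
rounds = {"round 1": (3, 12), "round 2": (3, 4)}

# Exact per-round hit rates.
rates = {name: Fraction(hits, tested) for name, (hits, tested) in rounds.items()}

# Pooled rate across both rounds (assumes the 16 combinations are distinct).
total_hits = sum(hits for hits, _ in rounds.values())
total_tested = sum(tested for _, tested in rounds.values())
overall = Fraction(total_hits, total_tested)

for name, rate in rates.items():
    print(f"{name}: {float(rate):.1%}")   # round 1: 25.0%, round 2: 75.0%
print(f"overall: {float(overall):.1%}")   # overall: 37.5%
```

The second-round rate is not independent of the first, since GPT4 proposed those combinations after seeing the initial results, so the pooled figure is a summary rather than an estimate of a single underlying success probability.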
Problem

Research questions and friction points this paper is trying to address.

Testing LLMs for generating novel breast cancer drug hypotheses
Evaluating LLM-proposed drug combinations for synergistic effects
Validating AI-generated scientific hypotheses through laboratory experiments
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs generate novel drug synergy hypotheses
GPT4 tested in breast cancer treatment
Lab validation confirms LLM hypothesis value
Authors

A. Abdel-Rehim
Department of Chemical Engineering and Biotechnology, University of Cambridge, CB3 0AS, U.K.
Hector Zenil
Associate Professor @ King’s College London & Researcher @ The Francis Crick Institute
algorithmic information dynamics, causality, algorithmic probability, machine intelligence
Oghenejokpeme I. Orhobor
The National Institute of Agricultural Botany, Cambridge, CB3 0LE, U.K.
Marie Fisher
Arctoris Ltd, Oxford, OX14 4SA, UK.
Ross J. Collins
Arctoris Ltd, Oxford, OX14 4SA, UK.
Elizabeth Bourne
Arctoris Ltd, Oxford, OX14 4SA, UK.
Gareth W. Fearnley
Arctoris Ltd, Oxford, OX14 4SA, UK.
Emma Tate
Arctoris Ltd, Oxford, OX14 4SA, UK.
Holly X. Smith
Arctoris Ltd, Oxford, OX14 4SA, UK.
L. Soldatova
Department of Computing, Goldsmiths, University of London, SE14 6NW, U.K.
Ross D. King
Professor of Machine Intelligence, Chalmers University
Automation of Science, Drug Design, Artificial Intelligence, Machine Learning, Synthetic Biology