🤖 AI Summary
Automated identification of modes of action (MoAs) in high-throughput toxicity screening remains challenging because feature extraction is difficult and manual annotation is costly. To address this, we propose a self-supervised phenotypic representation learning method that extracts discriminative toxicological features directly from zebrafish embryo images, without requiring large-scale labeled data. Our approach is trained end-to-end on the EmbryoNet dataset and yields representations that effectively separate compounds by their underlying MoAs. Combined with downstream classifiers, it achieves accurate prediction of toxicity endpoints. Experiments demonstrate substantial improvements in MoA classification accuracy, strong generalization to unseen compounds, and scalability to diverse chemical classes. The integration of the method into the TOXBOX physical screening platform is discussed, with the aim of providing a practical, low-cost, automated, high-throughput machine learning solution for toxicity assessment of novel chemicals and materials.
📝 Abstract
High-throughput toxicity testing offers a fast and cost-effective way to test large numbers of compounds. A key component of such systems is automated evaluation by machine learning models. In this paper, we address critical challenges in this domain and demonstrate how representations learned via self-supervised learning can effectively identify toxicant-induced changes. We provide a proof of concept using the publicly available EmbryoNet dataset, which contains ten zebrafish embryo phenotypes elicited by various chemical compounds that target different processes in early embryonic development. Our analysis shows that representations learned with self-supervised learning are suitable for distinguishing between the modes of action of different compounds. Finally, we discuss the integration of machine learning models into a physical toxicity testing device in the context of the TOXBOX project.