🤖 AI Summary
Electric bicycle (e-bike) accident reports are predominantly unstructured text, which hinders quantitative safety analysis. To address this, we propose a four-agent LLM-based system that combines prompt engineering and information extraction to automatically identify and classify safety variables end to end, including causal categories, faulty components (e.g., pedals, tires, brakes), and environmental factors. An ordered logistic regression model then relates incident severity to the extracted factors and shows that device-related and human-related causes differ in both frequency and fatality: equipment issues are slightly more common, while human-related incidents are more often fatal. The method achieves a weighted F1 score of 0.87 on classification. The framework offers an interpretable, scalable, and empirically grounded basis for e-bike safety governance, vehicle design optimization, and evidence-informed policy formulation.
📝 Abstract
Electric bicycles (e-bikes) are rapidly increasing in use, and a rise in accident reports has raised safety concerns. However, e-bike incident reports are often written as unstructured narratives, which hinders quantitative safety analysis. This study introduces E-bike agents, a framework that uses large language model (LLM)-powered agents to classify incident reports and extract safety variables from them. The framework consists of four LLM agents, which handle data classification, information extraction, injury cause determination, and component linkage, to extract the key factors that can lead to e-bike accidents and to varying severity levels. We then use an ordered logit model to examine the relationship between incident severity and the extracted factors, such as gender, type of cause, and environmental conditions. Our results show that equipment issues are slightly more common than human-related ones, but human-related incidents are more often fatal. Among components, pedals, tires, and brakes are frequent contributors to accidents. The framework achieves a weighted F1 score of 0.87 in classification, highlighting the potential of LLMs to extract structured information from unstructured data in niche domains such as transportation. Our method offers a scalable way to improve e-bike safety analytics and provides actionable information for policymakers, designers, and regulators.
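For readers unfamiliar with the reported metric, the following is a minimal sketch of how a weighted F1 score is computed: the per-class F1 scores are averaged, with each class weighted by its share of the true labels. The function name `weighted_f1` and the severity labels in the toy data are assumptions made for illustration; they are not taken from the study, and the study's evaluation code may differ.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives and false positives for this class
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n_true - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is at least n_true >= 1
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (n_true / total) * f1
    return score


# Toy data with hypothetical severity labels (not from the paper)
y_true = ["minor", "minor", "severe", "fatal"]
y_pred = ["minor", "severe", "severe", "fatal"]
print(weighted_f1(y_true, y_pred))
```

Because each class is weighted by its support, frequent classes dominate the score, which matters when severity levels are imbalanced, as fatal incidents typically are relative to minor ones.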