🤖 AI Summary
Problem: Official FRA investigations of highway-rail grade crossing incidents, recorded on Form 57, can take days to weeks, which hinders timely local regulatory responses.
Method: We propose the first news-driven system for filing Form 57 automatically in real time. A vision-language model (VLM) with sample aggregation converts the form into a JSON schema, and grouped question answering that follows the intent of the form layout then extracts field values from news articles. A manually aligned dataset of news articles and FRA records supports evaluation. Together, these components enable structured parsing of a form that is semantically dense and irregularly formatted.
Contribution/Results: Experiments demonstrate substantial improvements in both information coverage and accuracy over baseline methods. The system completes an initial Form 57 within minutes, establishing for the first time the feasibility of news-based real-time railway accident reporting. It delivers timely, actionable data to support emergency response and regulatory decision-making.
📝 Abstract
Local railway committees need timely situational awareness after highway-rail grade crossing incidents, yet official Federal Railroad Administration (FRA) investigations can take days to weeks. We present a demo system that populates the Highway-Rail Grade Crossing Incident Data form (Form 57) from news in real time. Our approach addresses two core challenges: the form is visually irregular and semantically dense, and news is noisy. To address them, we design a pipeline that first converts Form 57 into a JSON schema using a vision-language model with sample aggregation, and then performs grouped question answering that follows the intent of the form layout to reduce ambiguity. In addition, we build an evaluation dataset by aligning scraped news articles with official FRA records and annotating which information is retrievable from the news. We then assess our system against various alternatives in terms of information retrieval accuracy and coverage.
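The two pipeline stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that sample aggregation means majority voting over several schema drafts sampled from the VLM, and that grouped question answering means one prompt per layout group. The field names, group names, and the functions `aggregate_schemas`, `group_fields`, and `build_group_prompt` are hypothetical, and the VLM and LLM calls themselves are left out.

```python
from collections import Counter, defaultdict

# Hypothetical schema drafts, standing in for repeated VLM samples of the form.
# Field and group names are illustrative, not actual Form 57 items.
SAMPLES = [
    {"incident_date": {"group": "when_where", "type": "date"},
     "county": {"group": "when_where", "type": "string"},
     "vehicle_speed": {"group": "highway_user", "type": "number"}},
    {"incident_date": {"group": "when_where", "type": "date"},
     "county": {"group": "when_where", "type": "string"},
     "vehicle_speed": {"group": "highway_user", "type": "string"},
     "page_footer": {"group": "misc", "type": "string"}},
    {"incident_date": {"group": "when_where", "type": "date"},
     "vehicle_speed": {"group": "highway_user", "type": "number"}},
]


def aggregate_schemas(samples, min_votes=None):
    """Keep fields seen in a majority of samples, with their most common spec."""
    if min_votes is None:
        min_votes = len(samples) // 2 + 1
    votes = defaultdict(Counter)
    for sample in samples:
        for name, spec in sample.items():
            votes[name][(spec["group"], spec["type"])] += 1
    schema = {}
    for name, counter in votes.items():
        if sum(counter.values()) >= min_votes:
            (group, field_type), _ = counter.most_common(1)[0]
            schema[name] = {"group": group, "type": field_type}
    return schema


def group_fields(schema):
    """Collect field names by layout group so related fields are asked together."""
    groups = defaultdict(list)
    for name, spec in schema.items():
        groups[spec["group"]].append(name)
    return dict(groups)


def build_group_prompt(group, fields, article):
    """Build one question-answering prompt covering every field in a group."""
    field_list = "\n".join(f"- {name}" for name in fields)
    return (
        f"Form section: {group}\n"
        f"Fill in these fields from the article, or answer 'unknown':\n"
        f"{field_list}\n\nArticle:\n{article}"
    )


if __name__ == "__main__":
    schema = aggregate_schemas(SAMPLES)
    for group, fields in group_fields(schema).items():
        print(build_group_prompt(group, fields, "A train struck a car."))
        print("---")
```

Under these assumptions, a field that only one draft proposes (here `page_footer`) is dropped, and a type disagreement (here `vehicle_speed`) is settled by the most frequent value. Asking about all fields of a group in one prompt gives the model the neighboring fields as context, which is one plausible way grouping could reduce ambiguity.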