The LCLStream Ecosystem for Multi-Institutional Dataset Exploration

📅 2025-10-04
🤖 AI Summary
Problem: Emerging requirements in X-ray science, including AI training on high-speed data streams, femtosecond-level time-of-flight analysis, and distributed crystallographic structure determination, demand a scalable, secure, and low-latency experimental data infrastructure.
Method: This paper introduces the first end-to-end experimental data streaming framework that integrates cloud-native microservices with traditional HPC batch processing. It combines a RESTful API-driven request service, OAuth 2.0 mutual authentication, Kafka-based high-throughput messaging, containerized microservices, and HPC job schedulers to provide high-throughput data buffering and secure cross-institutional sharing.
Contribution/Results: The framework achieves millisecond-scale real-time data distribution, supports customizable visualization and distributed structure solution, and has been validated via the LCLStreamer prototype deployed across multiple synchrotron facilities. It improves data access efficiency by 3–5× and significantly enhances multi-center collaborative research, advancing synchrotron science toward a "streaming experiment" paradigm.

📝 Abstract
We describe a new end-to-end experimental data streaming framework designed from the ground up to support new types of applications: AI training, extremely high-rate X-ray time-of-flight analysis, crystal structure determination with distributed processing, and custom data science applications and visualizers yet to be created. Throughout, our design choices merge cloud microservices with traditional HPC batch execution models for security and flexibility. This project makes a unique contribution to the DOE Integrated Research Infrastructure (IRI) landscape. By creating a flexible, API-driven data request service, we address a significant need for high-speed data streaming sources in the X-ray science data analysis community. By combining a data request API, a mutual-authentication web security framework, a job queue system, and a high-rate data buffer, and by complementing existing facility infrastructure, the LCLStreamer framework has prototyped and implemented several new paradigms critical for future-generation experiments.
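The request-then-queue pattern mentioned in the abstract can be illustrated with a small sketch. This is not LCLStreamer's actual API: the `DataRequest` fields, the `JobQueue` class, and its method names are all hypothetical, chosen only to show how a validated data request might be turned into a queued job.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Tuple
import itertools


@dataclass(frozen=True)
class DataRequest:
    """Hypothetical request body a client might send to a data request API."""
    experiment: str        # assumed experiment identifier
    run: int               # assumed run number
    detectors: List[str]   # assumed names of detectors to stream

    def validate(self) -> None:
        # Reject obviously malformed requests before they reach the queue.
        if not self.experiment:
            raise ValueError("experiment must be non-empty")
        if self.run < 0:
            raise ValueError("run must be non-negative")
        if not self.detectors:
            raise ValueError("at least one detector is required")


class JobQueue:
    """Toy in-memory FIFO standing in for a batch scheduler's job queue."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._pending: Deque[int] = deque()
        self._jobs: Dict[int, DataRequest] = {}

    def submit(self, request: DataRequest) -> int:
        """Validate the request, enqueue it, and return its job id."""
        request.validate()
        job_id = next(self._ids)
        self._jobs[job_id] = request
        self._pending.append(job_id)
        return job_id

    def next_job(self) -> Tuple[int, DataRequest]:
        """Pop the oldest pending job (first in, first out)."""
        job_id = self._pending.popleft()
        return job_id, self._jobs[job_id]
```

In a real deployment the queue would be a facility job scheduler and the request would arrive over an authenticated HTTP endpoint; the sketch only captures the validate, enqueue, dequeue flow.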
Problem

Research questions and friction points this paper is trying to address.

Supports AI training and high-rate X-ray analysis
Enables distributed crystal structure determination workflows
Provides API-driven high-speed data streaming for science
Innovation

Methods, ideas, or system contributions that make the work stand out.

API-driven data request service for flexible streaming
Cloud microservices merged with HPC batch execution
High-rate data buffer with mutual authentication security
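The idea of a high-rate data buffer sitting between a data source and its consumers can be sketched with a bounded queue. The page does not describe how the paper's buffer is implemented, so the following is a generic illustration using Python's standard `queue.Queue`; the function names and the integer "frames" are assumptions made for the example.

```python
import queue
import threading
from typing import List

SENTINEL = None  # marks the end of the stream


def producer(buf: queue.Queue, n_frames: int) -> None:
    """Push frames into the buffer; put() blocks while the buffer is full."""
    for i in range(n_frames):
        buf.put(i)  # backpressure: waits for free space instead of dropping data
    buf.put(SENTINEL)


def consumer(buf: queue.Queue, out: List[int]) -> None:
    """Drain frames until the end-of-stream marker arrives."""
    while True:
        item = buf.get()
        if item is SENTINEL:
            break
        out.append(item)


def stream(n_frames: int, capacity: int = 8) -> List[int]:
    """Run one producer thread against one consumer through a bounded buffer."""
    buf: queue.Queue = queue.Queue(maxsize=capacity)
    received: List[int] = []
    t = threading.Thread(target=producer, args=(buf, n_frames))
    t.start()
    consumer(buf, received)
    t.join()
    return received
```

The bounded capacity is the design point: a fast source is slowed to the consumer's pace rather than exhausting memory, at the cost of stalling the source when consumers fall behind.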
Authors

David Rogers
Research Engineer, Vanderbilt University

Valerio Mariani
LCLS, SLAC National Accelerator Laboratory, Menlo Park, California, USA

Cong Wang
LCLS, SLAC National Accelerator Laboratory, Menlo Park, California, USA

Ryan Coffee
LCLS-SLAC National Accelerator Lab
Molecular Physics, Ultrafast X-Ray Spectroscopy, Material Response to Electronic Excitation

Wilko Kroeger
LCLS, SLAC National Accelerator Laboratory, Menlo Park, California, USA

Murali Shankar
LCLS, SLAC National Accelerator Laboratory, Menlo Park, California, USA

Hans Thorsten Schwander
LCLS, SLAC National Accelerator Laboratory, Menlo Park, California, USA

Tom Beck
NCCS, Oak Ridge Leadership Computing Facility, supported by the US DOE Office of Science under Contract No. DE-AC05-00OR22725. Oak Ridge, Tennessee, USA

Frédéric Poitevin
SLAC National Accelerator Laboratory
Structural Biology, Machine Learning, Computer Vision

Jana Thayer
LCLS, SLAC National Accelerator Laboratory, supported by the US DOE Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515. Menlo Park, California, USA