Gienapp, L. et al. (2025). The German Commons – 154 Billion Tokens of Openly Licensed Text for German Language Models. CoRR, abs/2510.13996 (Data)
Gienapp, L. et al. (2025). Topic-Specific Classifiers are Better Relevance Judges than Prompted LLMs. CoRR, abs/2510.04633 (Methods, Evaluation)
Gienapp, L. et al. (2025). Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins (Methods)
Gienapp, L. et al. (2025). The Viability of Crowdsourcing for RAG Evaluation. ACM (Data, Evaluation)
Peters, J. et al. (2025). ml4xcube: Machine Learning Toolkits for Earth System Data Cubes (Methods)
Fröbe, M. et al. (2024). Resources for Combining Teaching and Research in Information Retrieval Courses. ACM (Teaching)
Gienapp, L. et al. (2024). Evaluating Generative Ad Hoc Information Retrieval. ACM (Evaluation)
Elstner, T. et al. (2023). Shared Tasks as Tutorials: A Methodical Approach. AAAI Press (Teaching)
Reimer, J. et al. (2023). The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives. ACM (Data)
Research Experience
Since 2025: Researcher, Deep Semantic Learning Group, Kassel University — Research on Generative Models for Search and Search for Generative Models
2022–2025: Researcher, ScaDS.AI Center for Scalable Data Analytics and Artificial Intelligence, Leipzig — Research on Generative Models for Search and Search for Generative Models
2019–2022: Researcher, Text Mining & Retrieval Group, Leipzig University — Research on Web Search, Crowdsourcing & Evaluation, and Plagiarism Detection
2017–2019: Student Assistant, Institute for Sociology, Leipzig University — Research Infrastructure, Technical Support, Experiment Assistance
2017–2019: Student Assistant, Institute for Translatology, Leipzig University — Programming, Typesetting, Research Assistance
Education
2019–2022: M.Sc. Data Science, Leipzig University
2019–2022: M.Sc. Digital Humanities, Leipzig University
2016–2019: B.Sc. Digital Humanities, Leipzig University