Sep 2022 – Aug 2025: Horizon Europe project HPLT – building open, large-scale data-extraction and model-building pipelines using petabytes of web data from the Internet Archive
Oct 2022 – Sep 2025: Horizon Europe project Utter – applying large multilingual, multimodal models to speech/dialogue translation and meeting assistance
Jan 2019 – Mar 2022: H2020 project Gourmet – improving translation for under-resourced languages in journalism
Jan 2019 – Mar 2022: H2020 project Elitr – researching spoken language translation, multilingual MT, and automatic minuting
Apr 2018 – Jul 2021: IARPA-funded Material project – working on ASR, MT, IR, and summarization for under-resourced languages
Feb 2015 – Jan 2018: H2020 project HimL (€3M, coordinator) – semantically accurate medical-domain MT for morphologically complex languages
Feb 2015 – Jan 2018: Cracker – coordination action for European MT research, shared tasks, workshops, and industrial outreach
Jan 2015 – Dec 2017: QT21 – improving statistical models for European languages with poor MT performance
Jan 2015 – Dec 2017: MMT – developing next-generation scalable, adaptable, open-source MT infrastructure
Feb 2012 – Jan 2015: MosesCore – supporting open-source MT via MT marathons, shared tasks, workshops, and Moses development
Jan 2012 – Dec 2014: Accept – improving translation of user-generated content via pre- and post-editing
Jan 2011 – Dec 2013: MLTMLV – automatic translation from standard German to dialects (e.g., Viennese)
Feb 2009 – Feb 2012: EUROMATRIXPLUS – user-focused MT tools, news translation, and core system improvements
Jan 2008 – Feb 2009: EUROMATRIX – statistical MT tools and evaluations for all EU language pairs
Sep 2005 – Dec 2007: TXM – biomedical NLP tools to assist curators