Published multiple papers, including 'An Extensive Exploration of Back-Translation in 60 Languages' (ACL 2023), 'High Recall Text Retrieval for Public Health Systematic Review' (FLAIRS-30 2017), 'Language Identification for Creating Language-Specific Twitter Collections' (NAACL-HLT LSM-12 2012), 'Cross Language Entity Linking' (IJCNLP-2011), and 'Addressing Morphological Variation in Alphabetic Languages' (SIGIR-2009), as well as his PhD thesis, 'Textual Representations for Corpus-Based Bilingual Retrieval' (2008).
Research Experience
Worked on several projects, including LITESABER (text analytics for low-resource languages), Apache JOSHUA (statistical machine translation in many languages), Knowledge Base Population (helped organize the inaugural NIST Text Analysis Conference Knowledge Base Population track), and HAIRCUT (Hopkins Automated Information Retriever for Combing Unstructured Text). Participated in numerous international evaluations, such as TREC, CLEF, NTCIR, and FIRE.
Background
A computer scientist with research interests in multilingual text retrieval, information extraction, and machine translation. Holds appointments at the Johns Hopkins University Applied Physics Laboratory and the Human Language Technology Center of Excellence.
Miscellany
Senior member of the Association for Computing Machinery (ACM) and a member of the Association for Computational Linguistics (ACL).