IEEE Access 2025: A Survey on Data Selection for Efficient Speech Processing
ACL 2025 Workshop on GEM: The Fellowship of the LLMs: Multi-Agent Workflows for Synthetic Preference Optimization Dataset Generation
COLING 2025: To Label or Not to Label: Hybrid Active Learning for Neural Machine Translation
EMNLP 2024 (Findings): Generalists vs. Specialists: Evaluating Large Language Models for Urdu
ACL 2024 Workshop on Low-Resource Machine Translation: Challenges in Urdu Machine Translation
ACL 2024 (Findings): Deepfake Defense: Constructing and Evaluating a Specialized Urdu Deepfake Audio Dataset
NeurIPS 2023 Workshop on Efficient Speech and Natural Language Processing: Representative Subset Selection for Efficient Fine-Tuning in Self-Supervised Speech Recognition
EMNLP 2023 (Findings): Data Pruning for Efficient Model Pruning in Neural Machine Translation
INTERSPEECH 2023: Self-Supervised Dataset Pruning for Efficient Training in Audio Anti-spoofing
INTERSPEECH 2022: Dataset Pruning for Resource-constrained Spoofed Audio Detection
AALTD - ECML PKDD 2021: RevDet: Robust and Memory Efficient Event Detection and Tracking in Large News Feeds
Background
Machine Learning Researcher. Research interests include data selection, neural machine translation, and speech processing.
Miscellany
Personal interests and tech stack include TypeScript, Vercel AI SDK, OpenAI, RAG, Tool Calling, Frontend, Backend, Shell Scripting, Live Streaming, AngularJS, Firebase, Magento 2 REST API, DHL XML Web Services, Call Courier API, Android, Sinch API, Google Maps API, and Firebase Cloud Functions.