About the job
Are you passionate about delivering mission-critical, high-quality machine learning models, using cutting-edge technology, in a dynamic environment? We are Compliance Engineering, a global team of more than 300 engineers and scientists who work on the most complex, mission-critical problems. We build and operate a suite of platforms and applications that prevent, detect, and mitigate regulatory and reputational risk across the firm. Our engineers have access to the latest technology and to massive amounts of structured and unstructured data, and leverage modern frameworks to build responsive and intuitive UX/UI and Big Data applications. Within Compliance Engineering, we are hiring for a Machine Learning Engineering role in Models Engineering. The firm is making a significant investment to improve the precision/recall of the Compliance models portfolio in 2024. To achieve that, we are hiring experienced MLEs who have developed and deployed ML models for big data in a distributed architecture.
Responsibilities
Work with large-scale structured and unstructured data
Drive end-to-end machine learning projects that have a high degree of scale and complexity
Build machine learning infrastructure, including feature engineering pipelines and models that operate at scale
Develop, productionize, and maintain ML models
Run ML experiments, continually tuning features and modeling approaches and documenting findings and results
Collaborate closely with ML researchers to accelerate the adoption of cutting-edge models
Perform code reviews and ensure code quality
Qualifications
Minimum
A Bachelor's or Master's degree in Computer Science, or a similar field of study.
10+ years of hands-on experience building scalable machine learning systems
Solid coding skills and strong Computer Science fundamentals (algorithms, data structures, software design)
Expertise in Python & PySpark
Experience working with distributed technologies such as Scala, PySpark, Iceberg, HDFS file formats (Avro, Parquet), and AWS/GCP, and with big data feature engineering.
Experience in system design, including evaluating the trade-offs of database choices and defining schemas for data storage.
Extensive experience with machine learning and deep learning toolkits (TensorFlow, PyTorch, scikit-learn, Hugging Face)
Preferred
Prior experience with LLMs and Prompt Engineering
Prior experience in architecting/deploying ML applications on AWS/GCP
Prior experience in code reviews and architecture design for distributed systems