About the job
We are looking for a Language Engineer to join MIDAS. In this role, you will write intuitive, labeler-friendly annotation guidelines that enable measurement of search quality, support data wrangling and analysis, define specifications for labeling UI templates, and report labeled-data quality metrics, all to deliver on stakeholder requirements and improve the customer experience. To achieve high accuracy and consistency in labeled data outputs, Language Engineers apply linguistic (e.g., semantics, syntax, pragmatics) and scripting expertise to solve natural language processing and language understanding challenges.
Responsibilities
Design and develop data annotation guidelines and workflows.
Manage and process large amounts of structured and unstructured data.
Adopt and design quality control metrics and methodology to evaluate the quality of data annotation.
Maximize productivity, process efficiency, and quality through streamlined workflows, process standardization, documentation, and periodic audits and investigations.
Handle annotation and data investigation requests from multiple stakeholders with high efficiency and quality in a fast-paced environment.
Collaborate with scientists, engineers, and product managers in defining metrics, guidelines, and workflows.
Initiate and contribute towards improvement projects, present solution proposals, and implement them.
Establish processes and mechanisms to onboard and train junior data associates on an ongoing basis.
Handle work prioritization and deliver based on business priorities.
Adapt to changes in conventions made in response to customers’ requests, and update workflows accordingly.
Qualifications
Minimum
2+ years of experience in computational linguistics, language data processing, semantics, or philosophy of language
Master's degree or above in Linguistics or a related field
5+ years of relevant professional experience
Proficiency in the Python scripting language
Knowledge of Regex, SQL, MS Excel, Git.
Ability to navigate a Unix terminal and use common command line tools.
Familiarity with annotation tools and workflows.
Excellent communication and strong organizational skills, with a keen eye for detail.
Comfortable working in a fast-paced, collaborative, and dynamic work environment.
Willingness to support several projects at one time and to accept reprioritization as necessary.
Preferred
Proficiency in French, German, Dutch, Italian, Spanish, or Japanese.
Experience in data science and quantitative research.
Experience with language annotation and other forms of data markup.
Hands-on experience with machine learning and deep learning techniques in the fields of NLP and search.
Experience with AWS services (S3, SageMaker, ML language services, etc.).
Knowledge of user experience concepts and methods.
Familiarity with online retail (e-commerce).