Sr Software Engineer, AI Tools – Quantizer

Qualcomm
Raleigh, North Carolina, United States of America | 2026-04-21 | Onsite

About the job

As a Qualcomm Machine Learning Engineer, you will develop and implement cutting-edge machine learning techniques that enable the efficient utilization of state-of-the-art solutions across various technology verticals. In this position you will be responsible for the software design and development of quantization tooling within the AISW Tools team. Our inference engine empowers developers to deploy neural network models on Snapdragon platforms at exceptional speeds while maintaining minimal power consumption. You will participate in the agentic transformation of Qualcomm’s development and tooling workflow. You will have the opportunity to show your passion for software design and development with your analytical, design, programming, and debugging skills.

Responsibilities

Collaborate with cross-functional teams in the AI Software team at Qualcomm to gain knowledge of the capabilities of QAIRT SDK and use it to optimize inference of AI models on Qualcomm AI accelerator IP.

Validate and optimize the performance and accuracy of quantized models through detailed analysis and testing of machine learning use cases.

Debug complex issues, perform root cause analysis, and ensure high system reliability.

Contribute to the team's adoption of agentic AI workflows, using tools such as Claude Code or similar frameworks to automate quantization experimentation, hyperparameter search, and model evaluation pipelines.

Participate in design and code reviews.

Work independently and lead junior team members. Your decision-making will impact your direct area of work and the work group.

Qualifications

Minimum

Bachelor's degree in Computer Science, Engineering, Information Systems, or related field and 2+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.

OR

Master's degree in Computer Science, Engineering, Information Systems, or related field and 1+ year of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.

OR

PhD in Computer Science, Engineering, Information Systems, or related field.

Bachelor's degree or equivalent in Engineering, Information Systems, Computer Science, or related field

4+ years software development experience using Python and/or C/C++

Strong software development skills (e.g. data structure and algorithm design, knowledge of object-oriented or other software design paradigms, software debugging and testing)

Strong communication skills (verbal, presentation, written)

Solid understanding of Machine Learning and Deep Learning theory

Experience working with at least one deep learning framework such as PyTorch, TensorFlow, ONNX, or JAX

Preferred

2+ years Python programming experience

Experience with or exposure to quantization techniques such as PTQ, QAT, AWQ, and SpinQuant

Familiarity with different NN architectures: DNNs, CNNs, RNNs/LSTMs, GANs, LLMs, etc.

Experience with AIMET, TorchAO or other quantization-focused libraries in the PyTorch ecosystem

Familiarity with GenAI model architectures and challenges working with them – LLM, LVM, LMM

Experience with optimizing software, specifically AI graph workloads, for embedded platforms

Familiarity with agentic AI frameworks (e.g. Claude Code) and experience applying them to automate experimentation or evaluation workflows

Ability to collaborate across a globally diverse team and multiple interests