About the job
We are seeking an ML Features Solutions Engineer to join our Product and Solution Engineering team, driving the development and optimization of core ML features for enterprise deployment. This role combines deep ML expertise with hands-on engineering, working at the intersection of ML research and product development to deliver production-grade capabilities to our customers.
Responsibilities
Design and implement core ML features including model optimization, quantization, and inference enhancements
Optimize model performance for latency, throughput, and memory efficiency on SambaNova hardware
Develop and improve features such as Function Calling, Structured Output, and JSON mode conformance
Create end-to-end ML solutions that showcase platform capabilities and accelerate customer adoption
Convert cutting-edge ML research into practical, deployable product features
Establish benchmarks and quality standards for ML features in production environments
Work with the SDK team to ensure ML features are properly exposed and documented for developers
Support enterprise customers implementing advanced ML features in their workflows
Partner with ML research, platform engineering, and customer teams
Qualifications
Minimum
Master’s degree or higher in Computer Science, Machine Learning, Electrical Engineering, or a related field
5+ years of industry experience in ML engineering or applied ML research
3+ years of hands-on experience with large language models and transformer architectures
Expert proficiency in Python and deep learning frameworks: PyTorch (required), TensorFlow, or JAX
Experience with model optimization techniques: quantization, pruning, distillation, and efficient inference
Strong understanding of LLM inference optimization: KV cache, batching strategies, and memory management
Experience deploying ML models to production at scale
Track record of translating research concepts into production features
Preferred
PhD in Machine Learning, NLP, or a related field
Experience with custom hardware acceleration (TPUs, custom ASICs)
Hands-on experience with inference frameworks: vLLM, TensorRT-LLM, or similar
Experience with function calling and tool use in LLMs
Knowledge of structured generation and constrained decoding
Experience with ML feature development in enterprise contexts
Contributions to open-source ML projects