Developed a technology that is ten times more cost- and energy-efficient than today's market solutions; Implemented highly optimized ASIC products for LLM inference in a 4 nm process technology.
Research Experience
Contributed to the development of the Latency Processing Unit (LPU), the world's first hardware accelerator dedicated to end-to-end LLM inference; Researched how to improve the efficiency of multi-LPU systems through innovative ESL technology.
Background
Focused on the field of generative AI, particularly the acceleration of large language models (LLMs).
Miscellany
Looking for talented and passionate individuals to join the team and contribute to the advancement of generative AI.