Developed novel side-channel attacks on sparse DNN accelerators capable of recovering model architectures despite encryption
Designed an algorithm and hardware architecture reducing DNN training time and energy by 4× at iso-accuracy
Proposed a convolutional layer fusion technique achieving up to 6× performance improvement
Integrated resistive memories with DNN accelerators for up to 6× energy savings
Created scheduling algorithms for tensor workloads that generate energy-efficient schedules faster than prior tools
Implemented efficient strong memory semantics and hardware transactional memory in GPUs
Used locality-sensitive hashing to significantly improve cache compression ratios and advocated decoupling compression from approximation
Developed an ML-based prefetcher that placed 2nd in the 2021 ML-Based Data Prefetching Competition
Developed an abstraction and toolchain to factor data structures and reduce cache misses
Designed an FPGA MAP equalizer supporting Faster-than-Nyquist signaling for 5G
Built fast, accurate static/dynamic program slicers for bug tracking in Android and Java
Invented a dynamic, path-aware taint analysis technique
Realized the 110-core EM² chip, validating execution migration for shared memory
SSC, a domain-specific language and compiler for biochemical reaction networks, enabled the nanomedicine game NanoDoc
Research Experience
Before UBC, worked in industry on Bluespec and high-speed network processors
Led research on DNN accelerator attacks/defenses, efficient DNN training/inference architectures, GPU memory semantics, cache compression/prefetching, split-frame rendering, 5G FPGA equalizers, program slicing, and taint analysis
Developed the Execution Migration Machine (EM²), a 110-core 45nm ASIC implementing unified shared memory via execution context migration
Created the scalable NoC simulator HORNET and researched oblivious routing, multi-path in-order delivery, and VC scheduling
Designed SSC, a domain-specific language and compiler for biochemical reaction networks, applied to immunology and cell-membrane signaling
Built Spanner, a protein structure modeling tool for threading sequences onto 3D templates