Welcome to my website! My full name is Hao-Jun Michael Shi, although I normally go by Michael.
I am a research scientist at Meta on the AI and Systems Co-Design Training team. My work centers on researching and implementing scalable, distributed training algorithms and systems for deep learning. Most recently, I have focused on developing scalable adaptive gradient methods for training neural networks, in particular an efficient PyTorch implementation of the Distributed Shampoo optimizer.
I hold a Ph.D. in numerical optimization, with some dabbling in deep learning and numerical analysis. During my Ph.D., I designed algorithms for neural network training and stochastic optimization (progressive batching quasi-Newton methods), noisy optimization (noise-tolerant quasi-Newton methods), and derivative-free optimization (adaptive finite-difference methods under noise). I also contributed to Facebook's open-source deep learning recommendation model (DLRM) and developed embedding compression techniques (QR embeddings) during a prior internship at Facebook.
In my free time, I like to eat good food, read interesting books, play basketball, root for UCLA sports teams (Go Bruins!), serve at my church, and spend time with my wife while attempting to entertain our cat Taro.