2021 Speakers — XR @ Cornell

Matthias Grundmann (Google)

Talk: On-device ML solutions for Mobile and Web

Abstract: In this talk, I will present several on-device Machine Learning (ML) solutions for mobile and web that power a wide range of impactful Google products. On-device ML has major benefits: it enables low-latency, offline, and privacy-preserving approaches. However, to ship these solutions in production, we need to overcome substantial technical challenges to deliver on-device ML in real time and with low latency. Once these challenges are solved, our solutions power applications like background replacement and light adjustment in Google Meet, AR effects in YouTube and Duo, gesture control of devices, and viewfinder tracking for Google Lens and Translate.

I will cover some of the core recipes behind Google’s on-device ML solutions, from model design, through the enabling ML solutions infrastructure (MediaPipe), to on-device ML inference acceleration. In particular, I will cover video segmentation, face meshes and iris tracking, hand tracking for gesture control, and body tracking to power 3D avatars. The covered solutions are also available to the research and developer community via MediaPipe, an open-source, cross-platform framework for building customizable ML pipelines for mobile, web, desktop, and Python.

Bio: Matthias Grundmann is a Director of Research at Google working in the areas of Machine Learning, Computer Vision, and Computational Video. He leads a vertical team of ~40 Applied ML and Software Engineers with a focus on Machine Learning solutions for Live ML (low-latency, on-device, and real-time). His team develops high-quality, cross-platform ML solutions (MediaPipe) driven by GPU/CPU-accelerated ML inference (TFLite GPU and XNNPack) for mobile and web. The wide portfolio of technologies his team develops includes solutions for hand and body tracking, high-fidelity facial geometry and iris estimation, video segmentation for Google Meet and YouTube, 2D object and calibration-free 6 DOF camera tracking, 3D object detection, Motion Photos, and Live Photo stabilization.

Matthias received his Ph.D. from the Georgia Institute of Technology in 2013 for his work on Computational Video with a focus on Video Stabilization and Rolling Shutter removal for YouTube. His work on Rolling Shutter removal won the best paper award at ICCP 2012. He was the recipient of the 2011 Ph.D. Google Fellowship in Computer Vision.