Google is responsible for storing and serving a significant chunk of the world's data, ranging from emails to cat videos to Cloud customer data. Doing that job at low cost requires Google datacenters to employ a mixture of storage hardware types, namely Flash memory and spinning magnetic disks (a.k.a. hard disks). An important way we manage this mixture is by caching hot data on Flash (where throughput is cheap) while keeping cold data on hard disk (where space is cheap). In this talk, we'll walk through a recently published paper on Google's Flash cache. We will discuss the context of Google's datacenter storage environment and then dive deep into how we applied some simple ML techniques to optimize the placement of data on Flash vs. hard disk.
Software Engineering Manager
Seth Pollen is an Engineering Manager at Google in Madison, Wisconsin. He received an M.S. in Computer Sciences from the University of Wisconsin in 2013 and has worked at Google (specifically in Storage) for 9 years.
GDSC Lead