Machine Learning Tech Talk – CacheSack: Admission Optimization for Google Datacenter Flash Caches

CS 1240 - 1210 W Dayton St, Madison, 53706 - University of Wisconsin-Madison
Thu, Oct 13, 2022, 5:00 PM (CDT)

Google is responsible for storing and serving a significant chunk of the world's data, ranging from emails to cat videos to Cloud customer data. Doing that job at low cost requires Google datacenters to employ a mixture of different storage hardware types, namely Flash memory and spinning magnetic disks (a.k.a. hard disks). An important way we manage this mixture is by caching hot data on Flash (where throughput is cheap) while keeping cold data on hard disk (where space is cheap). In this talk, we'll walk through a recently published paper on Google's Flash cache. We will discuss the context of Google's datacenter storage environment and then dive deep on how we applied some simple ML techniques to optimize placement of data on Flash vs. hard disk.



