What are tiny objects?

Where might they be seen? Why do we need them?

With Kangaroo, we're focusing on caching tiny objects, and the reason we're focused on tiny objects is that they're prevalent and they're an underserved use case. So where do we see tiny objects? Tiny objects appear in social graphs, such as Facebook's social graph, whose edges, which connect friends or posts people make, average around 100 bytes. We also see tiny objects in IoT metadata; for example, at Microsoft Azure, sensor metadata averages around 300 bytes. And we see them in things like tweets and other text data; for instance, Twitter tweets average fewer than 33 characters.

Why do we need to cache tiny objects?

With all these tiny objects at massive scales, we want to be able to serve them at scale to various applications, and one of the ways we do this is by caching massive amounts of data. When an application wants a tiny object, it typically sends the request to a caching layer; the caching layer returns the object if it has a hit, and otherwise, on a cache miss, the request is sent back to the database layer. The caches here have two main goals: they want to lower the average latency of the entire service, and they want to keep load off the backend services. To have a caching layer that's really effective at scale, these caches need to be really big, and one way to make them really big is to use flash, because it's 100 times cheaper per bit, so you can have much larger caches for the same cost. That's why, at scale, a lot of companies deploy flash caches.
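To make that request flow concrete, here is a minimal look-aside-cache sketch in Python. The class, the `database.fetch` helper, and the naive eviction policy are hypothetical simplifications used only to illustrate the hit/miss path described above; they are not Kangaroo's or CacheLib's actual API.

```python
class LookAsideCache:
    """Illustrative cache-then-database lookup (hypothetical API)."""

    def __init__(self, capacity, database):
        self.capacity = capacity
        self.database = database   # fallback store consulted on a miss
        self.store = {}            # in-memory stand-in for the cache contents

    def get(self, key):
        # Cache hit: serve the object directly from the caching layer.
        if key in self.store:
            return self.store[key]
        # Cache miss: forward the request to the database layer, then
        # populate the cache so repeated requests stay off the backend.
        value = self.database.fetch(key)
        if len(self.store) >= self.capacity:
            self.store.pop(next(iter(self.store)))  # naive eviction
        self.store[key] = value
        return value
```

Serving hits from the cache is what lowers average latency, and populating on misses is what keeps repeated requests for the same tiny objects off the backend services.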

With Kangaroo, we're really trying to solve this problem of caching billions of tiny objects on flash. If you look at the prior work on caching tiny objects on flash, it either has the problem of too many flash writes or a large memory overhead, and either way you're going to waste money on your cache. Kangaroo solves this problem while reducing misses: we reduce misses by 29% over the comparison systems while keeping writes and memory usage under production constraints. Kangaroo is open source and is integrated into CacheLib, Facebook's caching engine that they use in production, which is open source and can be found at cachelib.org.

Caching on flash

Now that I've introduced the problem Kangaroo is trying to solve and a little bit about what Kangaroo does, I'm going to talk about the challenges of caching on flash. Then I'll move on to how we can cache on flash while minimizing DRAM overhead, which will lead us into Kangaroo's design and, finally, the results.

When we think about caching on flash, we have all the challenges that come with caching in DRAM, plus additional ones. Flash allows us to have cheaper caches, but flash devices have limited write endurance: there are only so many times you can write data to flash before the device wears out and no longer works. Since caches are constantly changing what data they hold, this write endurance becomes a really significant factor that we have to take into account when building caches. In addition, flash caches have to write in at least 4 KB blocks, because this is the minimum read/write granularity of a flash device, and this write granularity is much bigger than the objects we're actually looking at. To address both of these problems together, most flash caches use a log-structured cache.
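As a rough illustration of the log-structured idea, here is a sketch of a write path that batches tiny objects in memory and only writes full 4 KB blocks to flash sequentially. The `flash.append_block` call, the buffering scheme, and the in-DRAM index are hypothetical simplifications for this talk, not Kangaroo's or CacheLib's actual implementation.

```python
BLOCK_SIZE = 4096  # minimum flash write granularity mentioned above

class LogStructuredCache:
    """Sketch: batch tiny objects in DRAM, write full 4 KB blocks to flash."""

    def __init__(self, flash):
        self.flash = flash         # hypothetical device with append_block()
        self.buffer = bytearray()  # in-memory staging buffer for one block
        self.pending = []          # (key, offset, length) awaiting a flush
        self.index = {}            # key -> (block_id, offset, length) in DRAM

    def put(self, key, value: bytes):
        # If this object would not fit in the current block, flush first so
        # every flash write is exactly one sequential 4 KB block.
        if len(self.buffer) + len(value) > BLOCK_SIZE:
            self._flush()
        self.pending.append((key, len(self.buffer), len(value)))
        self.buffer += value

    def _flush(self):
        # One 4 KB write amortized over many ~100-byte objects, instead of a
        # separate device write per object: this is what protects the flash
        # device's limited write endurance.
        block = bytes(self.buffer.ljust(BLOCK_SIZE, b'\x00'))
        block_id = self.flash.append_block(block)
        for key, offset, length in self.pending:
            self.index[key] = (block_id, offset, length)
        self.buffer = bytearray()
        self.pending = []
```

Note the per-object index kept in DRAM in this sketch: that kind of in-memory metadata is exactly the DRAM overhead the next part of the talk, and Kangaroo's design, aims to minimize.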
