Counting at Scale with HyperLogLog
Explore how to use Redis HyperLogLog to efficiently estimate cardinality with constant time operations and minimal memory usage. Understand key HyperLogLog commands and apply them in a practical Go application that tracks unique product views and maintains a top-viewed products list in real time.
We'll cover the following...
HyperLogLog
In Redis, HyperLogLog is a probabilistic data structure that can be used to count the number of members (cardinality) of a set. It trades perfect accuracy for efficient space utilization, which makes it better than a set as the number of elements increases.
The following characteristics make HyperLogLog suitable for high volume operations:
Memory usage: It uses a fixed amount of memory—less than 12 KB per key for up to 264 cardinalities.
Performance: HyperLogLog commands have constant time complexity—O(1) per key.
Accuracy trade-off: The cardinality returned is not exact, with a standard error of less than 1%—approximately 0.81% with default configuration.
Commands for HyperLogLog
To better understand how HyperLogLog works, let’s explore the commands associated with this data type.
The PFAdd command
Use the PFAdd command to add items to a HyperLogLog. In this example, we add user IDs to a HyperLogLog named views_from_user.
Behind the scenes, the cardinality of the set is calculated.
client.PFAdd(context.Background(), "views_from_user", "user1", "user2", "user1", "user3", "user4", "user3", "user5", "user6")
The PFCount command
With the PFCount command, we can retrieve the approximate count of items in a HyperLogLog. If the HyperLogLog doesn’t exist, the result of this command is 0. In the example below, we query the cardinality of the views_from_user HyperLogLog: ...