Introduction to HyperLogLog

Get an introduction to the HyperLogLog extension in PostgreSQL.

If you’ve been following recent developments in statistics, you might have heard of HyperLogLog, a state-of-the-art algorithm for estimating cardinality, that is, the number of distinct values in a set.

This technique is available for PostgreSQL through the postgresql-hll extension. The source is hosted on GitHub, and the PostgreSQL community packaging efforts provide packages for multiple operating systems, such as Debian and RHEL.

A HyperLogLog value behaves like a very special hash value. It aggregates enough information about the items it has seen into a single scalar value to estimate how many distinct items there were, at the cost of some precision.
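To make the precision loss concrete, here is a minimal sketch that compares an exact distinct count with the HyperLogLog estimate over the same generated data. It assumes the postgresql-hll extension is installed on the server and uses its default settings; the column aliases are only illustrative.

```sql
-- Assumes the postgresql-hll packages are already installed on the server.
CREATE EXTENSION IF NOT EXISTS hll;

-- Hash each integer, fold the hashes into one hll value,
-- then ask that value for its estimated cardinality.
SELECT count(DISTINCT i)                                 AS exact_count,
       hll_cardinality(hll_add_agg(hll_hash_integer(i))) AS estimated_count
FROM generate_series(1, 100000) AS i;
```

The estimated count should land close to the exact count, though usually not exactly on it, while the hll value itself stays small no matter how many rows were aggregated.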

Use case: Count unique visitors

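Say we’re counting unique visitors. With HyperLogLog, we can maintain a single value per day and then union those values together to obtain unique weekly or monthly visitor counts. Because a union estimates the distinct items across all of its inputs, a visitor who shows up on several days is still counted only once.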

Here’s an example in SQL of the magic provided by the hll extension:
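The following is a sketch of how this could look, not the original listing. The tables page_views and daily_uniques and their columns are hypothetical names chosen for illustration; the hll type and the functions hll_hash_integer, hll_add_agg, hll_union_agg, and hll_cardinality come from the postgresql-hll extension.

```sql
-- Assumes the postgresql-hll packages are already installed on the server.
CREATE EXTENSION IF NOT EXISTS hll;

-- Hypothetical raw data: one row per page view.
CREATE TABLE page_views (
    viewed_at  timestamptz NOT NULL,
    visitor_id integer     NOT NULL
);

-- Hypothetical rollup: a single hll value per day.
CREATE TABLE daily_uniques (
    day      date PRIMARY KEY,
    visitors hll  NOT NULL
);

-- Build the daily values by hashing each visitor id
-- and aggregating the hashes into one hll per day.
INSERT INTO daily_uniques (day, visitors)
SELECT viewed_at::date,
       hll_add_agg(hll_hash_integer(visitor_id))
FROM page_views
GROUP BY 1;

-- Estimated unique visitors per day.
SELECT day,
       hll_cardinality(visitors) AS unique_visitors
FROM daily_uniques
ORDER BY day;

-- Estimated unique visitors per week: union the daily values first,
-- so a visitor seen on several days is not counted more than once.
SELECT date_trunc('week', day)::date            AS week,
       hll_cardinality(hll_union_agg(visitors)) AS unique_visitors
FROM daily_uniques
GROUP BY 1
ORDER BY 1;
```

Swapping 'week' for 'month' in date_trunc gives monthly counts from the same daily values, without going back to the raw page views.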
