Skip to main content
Skip to main content

uniqHLL12

Calculates the approximate number of different argument values, using the HyperLogLog algorithm.

uniqHLL12(x[, ...])

Arguments

The function takes a variable number of parameters. Parameters can be Tuple, Array, Date, DateTime, String, or numeric types.

Returned value

Implementation details

Function:

  • Calculates a hash for all parameters in the aggregate, then uses it in calculations.

  • Uses the HyperLogLog algorithm to approximate the number of different argument values.

    2^12 5-bit cells are used. The size of the state is slightly more than 2.5 KB. The result is not very accurate (up to ~10% error) for small data sets (<10K elements). However, the result is fairly accurate for high-cardinality data sets (10K-100M), with a maximum error of ~1.6%. Starting from 100M, the estimation error increases, and the function will return very inaccurate results for data sets with extremely high cardinality (1B+ elements).

  • Provides the determinate result (it does not depend on the query processing order).

We do not recommend using this function. In most cases, use the uniq or uniqCombined function.

See Also

uniqHLL12

Introduced in: v1.1

Calculates the approximate number of different argument values, using the HyperLogLog algorithm.

Note

We do not recommend using this function. In most cases, use the uniq or uniqCombined function.

Details

Implementation details This function calculates a hash for all parameters in the aggregate, then uses it in calculations. It uses the HyperLogLog algorithm to approximate the number of different argument values.

2^12 5-bit cells are used. The size of the state is slightly more than 2.5 KB. The result is not very accurate (up to ~10% error) for small data sets (<10K elements). However, the result is fairly accurate for high-cardinality data sets (10K-100M), with a maximum error of ~1.6%. Starting from 100M, the estimation error increases, and the function will return very inaccurate results for data sets with extremely high cardinality (1B+ elements).

Provides a determinate result (it does not depend on the query processing order).

Syntax

uniqHLL12(x[, ...])

Arguments

Returned value

Returns a UInt64-type number representing the approximate number of different argument values. UInt64

Examples

Basic usage

CREATE TABLE example_hll
(
    id UInt32,
    category String
)
ENGINE = Memory;

INSERT INTO example_hll VALUES
(1, 'A'), (2, 'B'), (3, 'A'), (4, 'C'), (5, 'B'), (6, 'A');

SELECT uniqHLL12(category) AS hll_unique_categories
FROM example_hll;
┌─hll_unique_categories─┐
│                     3 │
└───────────────────────┘