approx_top_sum

Returns an array of the approximately most frequent values in the specified column together with their counts. Each occurrence of a value contributes its weight to that value's count, so the counts are weighted sums rather than plain occurrence counts. The resulting array is sorted in descending order of approximate frequency, not by the values themselves.

approx_top_sum(N)(column, weight)
approx_top_sum(N, reserved)(column, weight)

This function does not guarantee an exact result. In some situations the approximation introduces errors, and the function may return frequent values that are not the most frequent ones.
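When the approximation matters, one way to check it is to compare against an exact aggregation. The query below is a minimal sketch that computes exact weighted totals with `sum` and `GROUP BY` over the same sample data used in the example on this page; the alias `total_weight` is only illustrative, and the column type is written as `String` here.

```sql
-- Exact weighted totals per value, for comparison with approx_top_sum(2)(k, w).
SELECT k, sum(w) AS total_weight
FROM VALUES('k String, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10))
GROUP BY k
ORDER BY total_weight DESC
LIMIT 2;
```

For this data the exact totals are z = 10, x = 5 and y = 3, so the query returns z and x. The exact query has to keep a total for every distinct value, which is the cost the approximate function is meant to avoid on high-cardinality columns.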

Maximum value of N = 65536.

Parameters

  • N — The number of elements to return. Optional. Default value: 10.
  • reserved — Defines how many cells are reserved for values. If uniq(column) > reserved, the result will be approximate. Optional. Default value: N * 3.

Arguments

  • column — The value whose frequency is calculated.
  • weight — The weight. Each value is counted weight times in the frequency calculation. UInt64.

Example

Query:

SELECT approx_top_sum(2)(k, w)
FROM VALUES('k Char, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10))

Result:

┌─approx_top_sum(2)(k, w)─┐
│ [('z',10,0),('x',5,0)]  │
└─────────────────────────┘

approx_top_sum

Introduced in: v1.1

Syntax

approx_top_sum(N[, reserved])(column, weight)

Parameters

  • N — The number of elements to return. Optional. Default value: 10. Maximum value: 65536. UInt64
  • reserved — Optional. Defines how many cells are reserved for values. If uniq(column) > reserved, the result will be approximate. Default value: N * 3. UInt64
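As an illustration of the two-parameter form, the sketch below passes an explicit `reserved` value. The value 100 is an arbitrary choice for demonstration, not a recommendation, and the column type is written as `String` here.

```sql
-- N = 2 results, with 100 cells reserved instead of the default N * 3 = 6.
SELECT approx_top_sum(2, 100)(k, w)
FROM VALUES('k String, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10));
```

The sample data has only three distinct values, which is below `reserved`, so by the rule above the result should not need to be approximate. A larger `reserved` trades memory for accuracy when the column has many distinct values.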

Arguments

  • column — The name of the column for which to find the most frequent values. String
  • weight — The weight. Each value is counted weight times in the frequency calculation. UInt64

Returned value

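Returns an array of the approximately most frequent values and their weighted counts, sorted in descending order of approximate frequency. Each element is a tuple: the first field is the value, the second is its approximate weighted count, and the third reports the estimation error for that count (0 in the example output). Array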

Examples

Usage example

SELECT approx_top_sum(2)(k, w)
FROM VALUES('k Char, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10));

┌─approx_top_sum(2)(k, w)─┐
│ [('z',10,0),('x',5,0)]  │
└─────────────────────────┘
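The result is a single array of tuples, which can be awkward to consume directly. The following sketch expands it into one row per value using `arrayJoin` and `tupleElement`. The aliases `value`, `weighted_count` and `error` are illustrative names, and reading the third tuple field as an error estimate is an assumption based on the output shape shown above.

```sql
-- Expand the array of tuples into rows, one per returned value.
SELECT
    tupleElement(t, 1) AS value,
    tupleElement(t, 2) AS weighted_count,
    tupleElement(t, 3) AS error  -- assumed meaning of the third field
FROM
(
    SELECT arrayJoin(approx_top_sum(2)(k, w)) AS t
    FROM VALUES('k String, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10))
);
```

Keeping the aggregation in a subquery separates computing the top values from unpacking them, so the outer query can filter or join on the individual fields.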