topKWeighted

Introduced in: v1.1

Returns an array of the approximately most frequent values in the specified column, with each occurrence of a value counted according to its weight. The resulting array is sorted in descending order of approximate weighted frequency (not by the values themselves).

Syntax

topKWeighted(N)(column, weight)
topKWeighted(N, load_factor)(column, weight)
topKWeighted(N, load_factor, 'counts')(column, weight)

Parameters

  • N — The number of elements to return. Optional. Default value: 10. Type: UInt64.
  • load_factor — Optional. Defines how many cells are reserved for values. If uniq(column) > N * load_factor, the result of the function will be approximate. Default value: 3. Type: UInt64.
  • counts — Optional. Defines whether the result should contain an approximate count and error value for each returned element. Type: Bool.
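
The load_factor condition can be checked by hand before relying on a result. The query below is a minimal sketch, assuming N = 2 and the default load_factor of 3. The aliases distinct_values, reserved_cells and may_be_approximate are illustrative names, not part of the function's interface, and the comparison simply restates the condition from the parameter description.

```sql
-- Sketch: evaluate the documented condition uniq(column) > N * load_factor
-- for N = 2 and load_factor = 3 (illustrative values).
SELECT
    uniq(k) AS distinct_values,                               -- number of distinct values in the column
    2 * 3 AS reserved_cells,                                  -- N * load_factor
    distinct_values > reserved_cells AS may_be_approximate    -- 1 if the result may be approximate
FROM VALUES('k Char, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10))
```

With three distinct values and six reserved cells, the condition does not hold for this data set.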

Arguments

  • column — The name of the column for which to find the most frequent values.
  • weight — The weight. Each value is counted weight times in the frequency calculation. Type: UInt64.

Returned value

Returns an array of the values with the maximum approximate sum of weights. Type: Array.

Examples

Usage example

Query:

SELECT topKWeighted(2)(k, w) FROM
VALUES('k Char, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10));

Result:

┌─topKWeighted(2)(k, w)──┐
│ ['z','x']              │
└────────────────────────┘
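
To see where ['z','x'] comes from, the weighted totals can be computed exactly with ordinary aggregation. This is a sketch for cross-checking on small data, not a description of how topKWeighted works internally. The alias total_weight is an illustrative name.

```sql
-- Sketch: exact sum of weights per value, highest first.
-- On this data set the totals are z = 10, x = 5, y = 3.
SELECT
    k,
    sum(w) AS total_weight
FROM VALUES('k Char, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10))
GROUP BY k
ORDER BY total_weight DESC
LIMIT 2
```

Although 'y' appears most often (three rows), its total weight of 3 is below that of 'x' and 'z', so it is not among the top two.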

With counts parameter

Query:

SELECT topKWeighted(2, 10, 'counts')(k, w)
FROM VALUES('k Char, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10));

Result:

┌─topKWeighted(2, 10, 'counts')(k, w)─┐
│ [('z',10,0),('x',5,0)]              │
└─────────────────────────────────────┘
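
When the counts form is used, each array element is a tuple. One possible way to work with it as rows is to expand the array with arrayJoin and read the tuple elements by position. In this sketch the column aliases item, approx_weight and error are illustrative, and reading the three positions as (value, approximate count, error) is an assumption based on the parameter description and the result shown above.

```sql
-- Sketch: turn the array of tuples into one row per element.
SELECT
    t.1 AS item,           -- the value
    t.2 AS approx_weight,  -- approximate count (sum of weights)
    t.3 AS error           -- error value
FROM
(
    SELECT arrayJoin(topKWeighted(2, 10, 'counts')(k, w)) AS t
    FROM VALUES('k Char, w UInt64', ('y', 1), ('y', 1), ('x', 5), ('y', 1), ('z', 10))
)
```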