Skip to main content
Skip to main content

quantilesGK

quantilesGK works similarly to quantileGK but allows us to calculate quantities at different levels simultaneously and returns an array.

Syntax

quantilesGK(accuracy, level1, level2, ...)(expr)

Returned value

  • Array of quantiles of the specified levels.

Type of array values:

  • Float64 for numeric data type input.
  • Date if input values have the Date type.
  • DateTime if input values have the DateTime type.

Example

Query:

SELECT quantilesGK(1, 0.25, 0.5, 0.75)(number + 1)
FROM numbers(1000)

┌─quantilesGK(1, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [1,1,1]                                          │
└──────────────────────────────────────────────────┘

SELECT quantilesGK(10, 0.25, 0.5, 0.75)(number + 1)
FROM numbers(1000)

┌─quantilesGK(10, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [156,413,659]                                     │
└───────────────────────────────────────────────────┘
SELECT quantilesGK(100, 0.25, 0.5, 0.75)(number + 1)
FROM numbers(1000)

┌─quantilesGK(100, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [251,498,741]                                      │
└────────────────────────────────────────────────────┘

SELECT quantilesGK(1000, 0.25, 0.5, 0.75)(number + 1)
FROM numbers(1000)

┌─quantilesGK(1000, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [249,499,749]                                       │
└─────────────────────────────────────────────────────┘

quantilesGK

Introduced in: v23.4

Computes multiple quantiles of a numeric data sequence at different levels simultaneously using the Greenwald-Khanna algorithm.

This function works similarly with quantileGK but allows computing multiple quantile levels in a single pass, which is more efficient than calling individual quantile functions.

The Greenwald-Khanna algorithm is an algorithm used to compute quantiles on a stream of data in a highly efficient manner. It was introduced by Michael Greenwald and Sanjeev Khanna in 2001. The algorithm is highly efficient, taking only O(log n) space and O(log log n) time per item (where n is the size of the input). It is also highly accurate, providing approximate quantile values with controllable accuracy.

Syntax

quantilesGK(accuracy, level1, level2, ...)(expr)

Parameters

  • accuracy — Accuracy of quantiles. Constant positive integer. Larger accuracy value means less error. For example, if the accuracy argument is set to 100, the computed quantiles will have an error no greater than 1% with high probability. There is a trade-off between the accuracy of the computed quantiles and the computational complexity of the algorithm. UInt*
  • level — Levels of quantiles. One or more constant floating-point numbers from 0 to 1. Float*

Arguments

Returned value

Array of quantiles of the specified levels in the same order as the levels were specified. Array(Float64) or Array(Date) or Array(DateTime)

Examples

Computing multiple quantiles with GK algorithm

SELECT quantilesGK(1, 0.25, 0.5, 0.75)(number + 1) FROM numbers(1000);
┌─quantilesGK(1, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [1, 1, 1]                                        │
└──────────────────────────────────────────────────┘

Higher accuracy quantiles

SELECT quantilesGK(100, 0.25, 0.5, 0.75)(number + 1) FROM numbers(1000);
┌─quantilesGK(100, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [251, 498, 741]                                    │
└────────────────────────────────────────────────────┘