SQL的窗口函数  
 
 文章目录  SQL的窗口函数 1. 介绍 2. 聚合函数 0.数据准备 1. AVG 2. COUNT 3. MAX 4. MIN 5. 标准差 6. SUM   3. 排序函数 1. CUME_DIST 2. RANK, DENSE_RANK, ROW_NUMBER 3. PERCENT_RANK 4. NTILE   4. 值函数(偏移函数) 1. FIRST_VALUE 2. LAST_VALUE 3. LAG 4. LEAD 5. NTH_VALUE   END         
  
 
 1. 介绍  
窗口函数分类:聚合函数、排序函数和值函数(偏移函数)。  SQL中的窗口函数是一种特殊的函数,它能够在查询结果中创建一个窗口或者窗口集合,然后对这个窗口进行计算。窗口函数可以用于分析和汇总数据,同时保留原始查询结果的行顺序。  窗口函数基于查询结果的行数据进行计算,窗口函数运行在HAVING子句之后、 ORDER BY子句之前。窗口函数需要特殊的关键字OVER子句来指定窗口即触发一个窗口函数。
 
 2. 聚合函数  
AVG:该函数用于计算平均值。
COUNT:该函数用于计算记录数。
MAX:该函数用于计算最大值。
MIN:该函数用于计算最小值。
STDDEV:返回数值的总体标准差。
STDDEV_SAMP:返回一组数值(整数、小数或浮点)的样本标准差。
SUM:该函数用于计算汇总值。
  
 0.数据准备  
WITH  t_sales_info AS  ( SELECT  *  FROM  ( VALUES ( 'A' ,  '2023-06-01' ,  '2023' ,  '1' ,  36.0 ) ,  ( 'A' ,  '2023-09-01' ,  '2023' ,  '2' ,  48.0 ) ,  ( 'B' ,  '2023-09-07' ,  '2023' ,  '2' ,  48.0 ) ,  ( 'C' ,  '2023-10-10' ,  '2023' ,  '3' ,  23.0 ) ,  ( 'A' ,  '2023-11-11' ,  '2023' ,  '3' ,  23.0 ) ,  ( 'C' ,  '2023-12-11' ,  '2023' ,  '3' ,  23.0 ) ,  ( 'A' ,  '2024-01-01' ,  '2024' ,  '2' ,  48.0 ) ,  ( 'B' ,  '2024-02-02' ,  '2024' ,  '2' ,  48.0 ) ,  ( 'C' ,  '2024-02-04' ,  '2024' ,  '1' ,  36.0 ) ,  ( 'C' ,  '2024-02-11' ,  '2024' ,  '1' ,  36.0 ) ,  ( 'B' ,  '2024-03-16' ,  '2024' ,  '3' ,  23.0 ) ,  ( 'B' ,  '2024-04-01' ,  '2024' ,  '3' ,  23.0 ) ,  ( 'A' ,  '2024-04-14' ,  '2024' ,  '3' ,  23.0 ) ,  ( 'C' ,  '2024-04-21' ,  '2024' ,  '3' ,  23.0 ) ,  ( 'B' ,  '2024-05-07' ,  '2024' ,  '3' ,  23.0 ) )  AS  tbl_name( customer_id,  order_date,  year ,  product_id,  product_price) 
) 
  
 1. AVG  
该函数用于计算平均值。
输入值类型:BIGINT、DOUBLE或FLOAT。
返回值类型:DOUBLE。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  AVG ( product_price)  OVER ( PARTITION  BY  customer_id ORDER  BY  customer_id)  AS  price_avg 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price price_avg A 2024-04-14 2024 3 23.0 35.6 A 2024-01-01 2024 2 48.0 35.6 A 2023-09-01 2023 2 48.0 35.6 A 2023-06-01 2023 1 36.0 35.6 A 2023-11-11 2023 3 23.0 35.6 B 2024-05-07 2024 3 23.0 33.0 B 2023-09-07 2023 2 48.0 33.0 B 2024-02-02 2024 2 48.0 33.0 B 2024-03-16 2024 3 23.0 33.0 B 2024-04-01 2024 3 23.0 33.0 C 2024-04-21 2024 3 23.0 28.2 C 2023-10-10 2023 3 23.0 28.2 C 2023-12-11 2023 3 23.0 28.2 C 2024-02-04 2024 1 36.0 28.2 C 2024-02-11 2024 1 36.0 28.2 
 
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  AVG ( product_price)  OVER ( PARTITION  BY  customer_id ORDER  BY  ` year` )  AS  price_avg 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price price_avg A 2023-09-01 2023 2 48.0 35.666666666666664 A 2023-06-01 2023 1 36.0 35.666666666666664 A 2023-11-11 2023 3 23.0 35.666666666666664 A 2024-01-01 2024 2 48.0 35.6 A 2024-04-14 2024 3 23.0 35.6 B 2023-09-07 2023 2 48.0 48.0 B 2024-05-07 2024 3 23.0 33.0 B 2024-02-02 2024 2 48.0 33.0 B 2024-03-16 2024 3 23.0 33.0 B 2024-04-01 2024 3 23.0 33.0 C 2023-10-10 2023 3 23.0 23.0 C 2023-12-11 2023 3 23.0 23.0 C 2024-04-21 2024 3 23.0 28.2 C 2024-02-04 2024 1 36.0 28.2 C 2024-02-11 2024 1 36.0 28.2 
 
 2. COUNT  
该函数用于计算记录数。
输入值类型:数值、字符串类型或BOOLEAN类型。
返回值类型:BIGINT。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  COUNT ( product_id)  OVER ( PARTITION  BY  customer_id,  ` year`   ORDER  BY  ` year` )  AS  product_ct 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price product_ct A 2023-09-01 2023 2 48.0 3 A 2023-06-01 2023 1 36.0 3 A 2023-11-11 2023 3 23.0 3 A 2024-01-01 2024 2 48.0 2 A 2024-04-14 2024 3 23.0 2 B 2023-09-07 2023 2 48.0 1 B 2024-05-07 2024 3 23.0 4 B 2024-02-02 2024 2 48.0 4 B 2024-03-16 2024 3 23.0 4 B 2024-04-01 2024 3 23.0 4 C 2023-10-10 2023 3 23.0 2 C 2023-12-11 2023 3 23.0 2 C 2024-04-21 2024 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 3 
 
 3. MAX  
该函数用于计算最大值。
输入值类型:该函数支持输入任意类型的参数,但是BOOLEAN类型的数据不允许参与运算。
返回值类型:与该函数的输入值类型保持一致。注意:当列中的值为NULL时,该行不参与计算。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  MAX ( product_price)  OVER ( PARTITION  BY  customer_id ORDER  BY  customer_id)  AS  product_ct 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price product_ct A 2024-04-14 2024 3 23.0 48.0 A 2024-01-01 2024 2 48.0 48.0 A 2023-09-01 2023 2 48.0 48.0 A 2023-06-01 2023 1 36.0 48.0 A 2023-11-11 2023 3 23.0 48.0 B 2024-05-07 2024 3 23.0 48.0 B 2023-09-07 2023 2 48.0 48.0 B 2024-02-02 2024 2 48.0 48.0 B 2024-03-16 2024 3 23.0 48.0 B 2024-04-01 2024 3 23.0 48.0 C 2024-04-21 2024 3 23.0 36.0 C 2023-10-10 2023 3 23.0 36.0 C 2023-12-11 2023 3 23.0 36.0 C 2024-02-04 2024 1 36.0 36.0 C 2024-02-11 2024 1 36.0 36.0 
 
 4. MIN  
该函数用于计算最小值。
输入值类型:该函数支持输入任意类型的参数,但是BOOLEAN类型的数据不允许参与运算。
返回值类型:与该函数的输入值类型保持一致。注意:当列中的值为NULL时,该行不参与计算。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  MIN ( product_price)  OVER ( PARTITION  BY  customer_id ORDER  BY  customer_id)  AS  product_ct 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price product_ct A 2024-04-14 2024 3 23.0 23.0 A 2024-01-01 2024 2 48.0 23.0 A 2023-09-01 2023 2 48.0 23.0 A 2023-06-01 2023 1 36.0 23.0 A 2023-11-11 2023 3 23.0 23.0 B 2024-05-07 2024 3 23.0 23.0 B 2023-09-07 2023 2 48.0 23.0 B 2024-02-02 2024 2 48.0 23.0 B 2024-03-16 2024 3 23.0 23.0 B 2024-04-01 2024 3 23.0 23.0 C 2024-04-21 2024 3 23.0 23.0 C 2023-10-10 2023 3 23.0 23.0 C 2023-12-11 2023 3 23.0 23.0 C 2024-02-04 2024 1 36.0 23.0 C 2024-02-11 2024 1 36.0 23.0 
 
 5. 标准差  
STDDEV:返回数值的总体标准差。
STDDEV_SAMP:返回一组数值(整数、小数或浮点)的样本标准差。
样本标准差= 方差的算术平方根= s= sqrt(( ( x1- x) ^ 2  + ( x2- x) ^ 2  + ......( xn- x) ^ 2 ) / ( n- 1 ))  
总体标准差= σ= sqrt(( ( x1-x) ^2 +( x2-x) ^2 +.. .. .. ( xn-x) ^2) /n ) 
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  STDDEV( product_price)  OVER ( PARTITION  BY  customer_id ORDER  BY  customer_id)  AS  price_std,  STDDEV_SAMP( product_price)  OVER ( PARTITION  BY  customer_id ORDER  BY  customer_id)  AS  price_std_samp 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price price_std price_std_samp A 2024-04-14 2024 3 23.0 11.18212859879549 12.501999840025595 A 2024-01-01 2024 2 48.0 11.18212859879549 12.501999840025595 A 2023-09-01 2023 2 48.0 11.18212859879549 12.501999840025595 A 2023-06-01 2023 1 36.0 11.18212859879549 12.501999840025595 A 2023-11-11 2023 3 23.0 11.18212859879549 12.501999840025595 B 2024-05-07 2024 3 23.0 12.24744871391589 13.693063937629153 B 2023-09-07 2023 2 48.0 12.24744871391589 13.693063937629153 B 2024-02-02 2024 2 48.0 12.24744871391589 13.693063937629153 B 2024-03-16 2024 3 23.0 12.24744871391589 13.693063937629153 B 2024-04-01 2024 3 23.0 12.24744871391589 13.693063937629153 C 2024-04-21 2024 3 23.0 6.368673331236264 7.12039324756716 C 2023-10-10 2023 3 23.0 6.368673331236264 7.12039324756716 C 2023-12-11 2023 3 23.0 6.368673331236264 7.12039324756716 C 2024-02-04 2024 1 36.0 6.368673331236264 7.12039324756716 C 2024-02-11 2024 1 36.0 6.368673331236264 7.12039324756716 
 
 6. SUM  
该函数用于计算汇总值。
输入值类型:BIGINT、DOUBLE或FLOAT。
返回值类型:BIGINT。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  SUM ( product_price)  OVER ( PARTITION  BY  customer_id,  ` year`   ORDER  BY  ` year` )  AS  price_sum
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price price_sum A 2023-09-01 2023 2 48.0 107.0 A 2023-06-01 2023 1 36.0 107.0 A 2023-11-11 2023 3 23.0 107.0 A 2024-01-01 2024 2 48.0 71.0 A 2024-04-14 2024 3 23.0 71.0 B 2023-09-07 2023 2 48.0 48.0 B 2024-05-07 2024 3 23.0 117.0 B 2024-02-02 2024 2 48.0 117.0 B 2024-03-16 2024 3 23.0 117.0 B 2024-04-01 2024 3 23.0 117.0 C 2023-10-10 2023 3 23.0 46.0 C 2023-12-11 2023 3 23.0 46.0 C 2024-04-21 2024 3 23.0 95.0 C 2024-02-04 2024 1 36.0 95.0 C 2024-02-11 2024 1 36.0 95.0 
 
 3. 排序函数  
CUME_DIST:返回一组数值中每个值的累计分布。
RANK:返回数据集中每个值的排名。
DENSE_RANK:返回一组数值中每个数值的排名。
ROW_NUMBER:根据行在窗口分区内的顺序,为每行数据返回一个唯一的有序行号,行号从1开始。
PERCENT_RANK:返回数据集中每个数据的排名百分比,其结果由( r - 1 )  / ( n - 1 ) 计算得出。其中r为RANK( ) 计算的当前行排名, n为当前窗口分区内总的行数。
NTILE:将每个窗口分区的数据分散到桶号从1到n的n个桶中。
  
语法
 
function  over  ( partition  by  a order  by  b RANGE| ROWS  BETWEEN  start  AND  end )        
  
分区规范:用于将输入行分散到不同的分区中,过程和GROUP BY子句的分散过程相似。
排序规范:决定输入数据行在窗口函数中执行的顺序。
窗口区间:指定计算数据的窗口边界。
窗口区间支持RANGE、ROWS两种模式:RANGE按照计算列值的范围进行定义。ROWS按照计算列的行数进行范围定义。RANGE、ROWS中可以使用BETWEEN start AND end指定边界可取值。BETWEEN start AND end取值为:CURRENT ROW,当前行。N PRECEDING,前n行。UNBOUNDED PRECEDING,直到第1行。N FOLLOWING,后n行。UNBOUNDED FOLLOWING,直到最后1行。
  
 1. CUME_DIST  
返回一组数值中每个值的累计分布。
返回结果:在窗口分区中对窗口进行排序后的数据集,包括当前行和当前行之前的数据行数。排序中任何关联值均会计算成相同的分布值。
返回值类型:DOUBLE。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  CUME_DIST( )  OVER ( PARTITION  BY  customer_id ORDER  BY  order_date)  AS  cume_dist 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price cume_dist A 2023-06-01 2023 1 36.0 0.2 A 2023-09-01 2023 2 48.0 0.4 A 2023-11-11 2023 3 23.0 0.6 A 2024-01-01 2024 2 48.0 0.8 A 2024-04-14 2024 3 23.0 1.0 B 2023-09-07 2023 2 48.0 0.2 B 2024-02-02 2024 2 48.0 0.4 B 2024-03-16 2024 3 23.0 0.6 B 2024-04-01 2024 3 23.0 0.8 B 2024-05-07 2024 3 23.0 1.0 C 2023-10-10 2023 3 23.0 0.2 C 2023-12-11 2023 3 23.0 0.4 C 2024-02-04 2024 1 36.0 0.6 C 2024-02-11 2024 1 36.0 0.8 C 2024-04-21 2024 3 23.0 1.0 
 
 2. RANK, DENSE_RANK, ROW_NUMBER  
返回数据集中每个值的排名: RANK( )  会跳过重复值;  DENSE_RANK( )  不会跳过重复值序号;  ROW_NUMBER( )  不包含重复值, 相同名次按顺序排列; 
排名值是将当前行之前的行数加1,不包含当前行。因此,排序的关联值可能产生顺序上的空隙,而且这个排名会对每个窗口分区进行计算。
DENSE_RANK( ) 与RANK( ) 功能相似,但是DENSE_RANK( ) 关联值不会产生顺序上的空隙。
返回值类型:BIGINT。
  
SELECT  customer_id,  product_price,  RANK( )  OVER ( PARTITION  BY  customer_id ORDER  BY  product_price)  AS  rk ,  DENSE_RANK( )  OVER ( PARTITION  BY  customer_id ORDER  BY  product_price)  AS  d_rk ,  ROW_NUMBER( )  OVER ( PARTITION  BY  customer_id ORDER  BY  product_price)  AS  rn
FROM  t_sales_info ; 
  
customer_id product_price rk d_rk rn A 23.0 1 1 1 A 23.0 1 1 2 A 36.0 3 2 3 A 48.0 4 3 4 A 48.0 4 3 5 B 23.0 1 1 1 B 23.0 1 1 2 B 23.0 1 1 3 B 48.0 4 2 4 B 48.0 4 2 5 C 23.0 1 1 1 C 23.0 1 1 2 C 23.0 1 1 3 C 36.0 4 2 4 C 36.0 4 2 5 
 
 3. PERCENT_RANK  
命令说明:返回数据集中每个数据的排名百分比,其结果由( r - 1 )  / ( n - 1 ) 计算得出。其中,r为RANK( ) 计算的当前行排名, n为当前窗口分区内总的行数。
返回值类型:DOUBLE。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  RANK( )  OVER ( PARTITION  BY  customer_id ORDER  BY  product_price)  AS  rk ,  PERCENT_RANK( )  OVER ( PARTITION  BY  customer_id ORDER  BY  product_price)  AS  pr
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price rk pr A 2024-04-14 2024 3 23.0 1 0.0 A 2023-11-11 2023 3 23.0 1 0.0 A 2023-06-01 2023 1 36.0 3 0.5 A 2024-01-01 2024 2 48.0 4 0.75 A 2023-09-01 2023 2 48.0 4 0.75 B 2024-05-07 2024 3 23.0 1 0.0 B 2024-03-16 2024 3 23.0 1 0.0 B 2024-04-01 2024 3 23.0 1 0.0 B 2023-09-07 2023 2 48.0 4 0.75 B 2024-02-02 2024 2 48.0 4 0.75 C 2024-04-21 2024 3 23.0 1 0.0 C 2023-10-10 2023 3 23.0 1 0.0 C 2023-12-11 2023 3 23.0 1 0.0 C 2024-02-04 2024 1 36.0 4 0.75 C 2024-02-11 2024 1 36.0 4 0.75 
 
 4. NTILE  
命令说明:将每个窗口分区的数据分散到桶号从1到n的n个桶中。桶号值最多间隔1,如果窗口分区中的数据行数不能均匀地分散到每一个桶中,则剩余值将从第1个桶开始,每1个桶分1行数据。例如,有6行数据和4个桶, 最终桶号值为1 1  2  2  3  4 。返回值类型:BIGINT。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  NTILE( 4 )  OVER ( PARTITION  BY  ` year`   ORDER  BY  order_date)  AS  rk 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price rk A 2023-06-01 2023 1 36.0 1 A 2023-09-01 2023 2 48.0 1 B 2023-09-07 2023 2 48.0 2 C 2023-10-10 2023 3 23.0 2 A 2023-11-11 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 4 A 2024-01-01 2024 2 48.0 1 B 2024-02-02 2024 2 48.0 1 C 2024-02-04 2024 1 36.0 1 C 2024-02-11 2024 1 36.0 2 B 2024-03-16 2024 3 23.0 2 B 2024-04-01 2024 3 23.0 3 A 2024-04-14 2024 3 23.0 3 C 2024-04-21 2024 3 23.0 4 B 2024-05-07 2024 3 23.0 4 
 
 4. 值函数(偏移函数)  
FIRST_VALUE:返回窗口分区第1行的值。
LAST_VALUE返回窗口分区最后1行的值。
LAG:返回窗口内距离当前行之前偏移offset后的值。
LEAD:返回窗口内距离当前行偏移offset后的值。
NTH_VALUE:返回窗口内偏移指定offset后的值,偏移量从1开始。
  
 1. FIRST_VALUE  
命令说明:返回窗口分区第一行的值。返回值类型:与输入参数类型相同。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  FIRST_VALUE( product_id)  OVER ( PARTITION  BY  customer_id ORDER  BY  order_date)  AS  fv 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price fv A 2023-06-01 2023 1 36.0 1 A 2023-09-01 2023 2 48.0 1 A 2023-11-11 2023 3 23.0 1 A 2024-01-01 2024 2 48.0 1 A 2024-04-14 2024 3 23.0 1 B 2023-09-07 2023 2 48.0 2 B 2024-02-02 2024 2 48.0 2 B 2024-03-16 2024 3 23.0 2 B 2024-04-01 2024 3 23.0 2 B 2024-05-07 2024 3 23.0 2 C 2023-10-10 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 3 C 2024-04-21 2024 3 23.0 3 
 
 2. LAST_VALUE  
命令说明:返回窗口分区最后一行的值。LAST_VALUE默认统计范围是 rows between unbounded preceding and current row,即取当前行数据与当前行之前的数据进行比较。如果像FIRST_VALUE那样直接在每行数据中显示最后一行数据,需要在 order by 条件的后面加上语句:rows between unbounded preceding and unbounded following。返回值类型:与输入参数类型相同。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  FIRST_VALUE( product_id)  OVER ( PARTITION  BY  customer_id ORDER  BY  order_date)  AS  lv 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price lv A 2023-06-01 2023 1 36.0 1 A 2023-09-01 2023 2 48.0 2 A 2023-11-11 2023 3 23.0 3 A 2024-01-01 2024 2 48.0 2 A 2024-04-14 2024 3 23.0 3 B 2023-09-07 2023 2 48.0 2 B 2024-02-02 2024 2 48.0 2 B 2024-03-16 2024 3 23.0 3 B 2024-04-01 2024 3 23.0 3 B 2024-05-07 2024 3 23.0 3 C 2023-10-10 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 1 C 2024-02-11 2024 1 36.0 1 C 2024-04-21 2024 3 23.0 3 
 
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  LAST_VALUE( product_id)  OVER ( PARTITION  BY  customer_id ORDER  BY  order_date ROWS  BETWEEN  unbounded  preceding  AND  unbounded  following  )  AS  lv 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price lv A 2023-06-01 2023 1 36.0 3 A 2023-09-01 2023 2 48.0 3 A 2023-11-11 2023 3 23.0 3 A 2024-01-01 2024 2 48.0 3 A 2024-04-14 2024 3 23.0 3 B 2023-09-07 2023 2 48.0 3 B 2024-02-02 2024 2 48.0 3 B 2024-03-16 2024 3 23.0 3 B 2024-04-01 2024 3 23.0 3 B 2024-05-07 2024 3 23.0 3 C 2023-10-10 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 3 C 2024-04-21 2024 3 23.0 3 
 
 3. LAG  
LAG( x[ , offset[ , default_value] ] )    命令说明:返回窗口内距离当前行之前偏移offset后的值。
偏移量起始值是0,也就是当前数据行。偏移量可以是标量表达式,默认offset是1 。
如果偏移量的值是null或者大于窗口长度,则返回default_value;如果没有指定default_value,则返回null。
返回值类型:与输入参数类型相同。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  LAG( product_id,  1 ,  '无' )  OVER ( PARTITION  BY  customer_id ORDER  BY  order_date)  AS  lag1 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price lag1 A 2023-06-01 2023 1 36.0 无 A 2023-09-01 2023 2 48.0 1 A 2023-11-11 2023 3 23.0 2 A 2024-01-01 2024 2 48.0 3 A 2024-04-14 2024 3 23.0 2 B 2023-09-07 2023 2 48.0 无 B 2024-02-02 2024 2 48.0 2 B 2024-03-16 2024 3 23.0 2 B 2024-04-01 2024 3 23.0 3 B 2024-05-07 2024 3 23.0 3 C 2023-10-10 2023 3 23.0 无 C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 1 C 2024-04-21 2024 3 23.0 1 
 
 4. LEAD  
LEAD( x[ ,offset[ , default_value] ] )   命令说明:返回窗口内距离当前行偏移offset后的值。偏移量offset起始值是0,也就是当前数据行。偏移量可以是标量表达式,默认offset是1 。如果偏移量的值是null或者大于窗口长度,则返回default_value;如果没有指定default_value,则返回null。返回值类型:与输入参数类型相同。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  LEAD( product_id,  1 )  OVER ( PARTITION  BY  customer_id ORDER  BY  order_date)  AS  lead1 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price lead1 A 2023-06-01 2023 1 36.0 2 A 2023-09-01 2023 2 48.0 3 A 2023-11-11 2023 3 23.0 2 A 2024-01-01 2024 2 48.0 3 A 2024-04-14 2024 3 23.0 \N B 2023-09-07 2023 2 48.0 2 B 2024-02-02 2024 2 48.0 3 B 2024-03-16 2024 3 23.0 3 B 2024-04-01 2024 3 23.0 3 B 2024-05-07 2024 3 23.0 \N C 2023-10-10 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 1 C 2024-02-04 2024 1 36.0 1 C 2024-02-11 2024 1 36.0 3 C 2024-04-21 2024 3 23.0 \N 
 
 5. NTH_VALUE  
NTH_VALUE( x, offset)  命令说明:返回窗口内偏移指定offset后的值,偏移量从1开始。如果偏移量offset是null或者大于窗口内值的个数,则返回null;如果偏移量offset为0或者负数,则系统提示报错。返回值类型:与输入参数类型相同。
  
SELECT  customer_id,  order_date,  ` year` ,  product_id,  product_price,  NTH_VALUE( product_id,  2 )  OVER ( PARTITION  BY  customer_id ORDER  BY  order_date)  AS  nv1 
FROM  t_sales_info ; 
  
customer_id order_date year product_id product_price nv1 A 2023-06-01 2023 1 36.0 \N A 2023-09-01 2023 2 48.0 2 A 2023-11-11 2023 3 23.0 2 A 2024-01-01 2024 2 48.0 2 A 2024-04-14 2024 3 23.0 2 B 2023-09-07 2023 2 48.0 \N B 2024-02-02 2024 2 48.0 2 B 2024-03-16 2024 3 23.0 2 B 2024-04-01 2024 3 23.0 2 B 2024-05-07 2024 3 23.0 2 C 2023-10-10 2023 3 23.0 \N C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 3 C 2024-04-21 2024 3 23.0 3 
 
 END