力扣题
1、题目地址
1939. 主动请求确认消息的用户
2、模拟表
表:Signups
| Column Name | Type | 
|---|---|
| user_id | int | 
| time_stamp | datetime | 
- user_id is the primary key for this table.
- 每行包含有关 ID 为 user_id 的用户的注册时间的信息。
表:Confirmations
| Column Name | Type | 
|---|---|
| user_id | int | 
| time_stamp | datetime | 
| action | ENUM | 
- (user_id, time_stamp) is the primary key for this table.
- user_id is a foreign key with a reference to the Signups table.
- action is an ENUM of the type (‘confirmed’, ‘timeout’)
- 此表的每一行都表示 ID 为 user_id 的用户在 time_stamp 请求了确认消息,
- 并且该确认消息已被确认(‘confirmed’)或已过期(‘timeout’)。
3、要求
编写 SQL 查询以查找在 24 小时窗口内两次请求确认消息的用户的 ID。
 两个正好相隔 24 小时的消息被认为是在窗口内。 该操作不会影响答案,只会影响请求时间。
以任意顺序返回结果表。
查询结果格式如下例:
Signups 表:
| user_id | time_stamp | 
|---|---|
| 3 | 2020-03-21 10:16:13 | 
| 7 | 2020-01-04 13:57:59 | 
| 2 | 2020-07-29 23:09:44 | 
| 6 | 2020-12-09 10:39:37 | 
Confirmations 表:
| user_id | time_stamp | action | 
|---|---|---|
| 3 | 2021-01-06 03:30:46 | timeout | 
| 3 | 2021-01-06 03:37:45 | timeout | 
| 7 | 2021-06-12 11:57:29 | confirmed | 
| 7 | 2021-06-13 11:57:30 | confirmed | 
| 2 | 2021-01-22 00:00:00 | confirmed | 
| 2 | 2021-01-23 00:00:00 | timeout | 
| 6 | 2021-10-23 14:14:14 | confirmed | 
| 6 | 2021-10-24 14:14:13 | timeout | 
输出:
| user_id | 
|---|
| 2 | 
| 3 | 
| 6 | 
解析:
 用户 2 在彼此恰好 24 小时内请求了两条消息,因此我们将它们包括在内。
 用户 3 在 6 分 59 秒内请求了两条消息,因此我们将它们包括在内。
 用户 6 在 23 小时 59 分 59 秒内请求了两条消息,因此我们将它们包括在内。
 用户 7 在 24 小时 1 秒内请求了两条消息,因此我们将它们从答案中排除。
4、代码编写
知识点(先看这个方便理解代码)
TIMESTAMPDIFF ⭐️
语法: TIMESTAMPDIFF (interval, datetime_expr1, datetime_expr2)
说明: 返回结果 datetime_expr2 - datetime_expr1
参数: interval 可以为 SECOND - 秒、MINUTE - 分、HOUR - 时、DAY - 日、MONTH - 月、YEAR - 年
mysql> SELECT TIMESTAMPDIFF(DAY,'2024-01-03','2024-01-01'); 
-> -2
mysql> SELECT TIMESTAMPDIFF(DAY,'2024-01-01','2024-01-03'); 
-> 2
参考:MySQL – 计算时间差
Lead 和 Lag⭐️
语法:
 Lead (列名, 偏移量, 超出记录窗口时的默认值)
 Lag (列名, 偏移量, 超出记录窗口时的默认值)
 注:如果使用 Lead (列名) 或 Lag (列名) 语法,则偏移量默认1,默认值是 null(无显示)
说明:
 Lead:形象的理解就是把数据从下向上推,下端出现空格
 Lag:形象的理解就是把数据从上向下推,上端出现空格
原始数据 (items):
| item_id | item_brand | 
|---|---|
| 1 | Samsung | 
| 2 | Lenovo | 
| 3 | LG | 
| 4 | HP | 
Lead代码效果
SELECT *, Lead(item_brand, 1, 0) OVER (ORDER BY item_id) AS lead_data
FROM items
| item_id | item_brand | lead_data | 
|---|---|---|
| 1 | Samsung | Lenovo | 
| 2 | Lenovo | LG | 
| 3 | LG | HP | 
| 4 | HP | 0 | 
Lag代码效果
SELECT *, Lag(item_brand, 1, 0) OVER (ORDER BY item_id) AS lag_data
FROM items
| item_id | item_brand | lag_data | 
|---|---|---|
| 1 | Samsung | 0 | 
| 2 | Lenovo | Samsung | 
| 3 | LG | Lenovo | 
| 4 | HP | LG | 
参考:sql中的窗口函数:lead,lag
1、TIMESTAMPDIFF 配合 Lead 写法
SELECT DISTINCT user_id
FROM (SELECT user_id, TIMESTAMPDIFF(SECOND, time_stamp, LEAD(time_stamp, 1, 0) OVER (PARTITION BY user_id ORDER BY time_stamp)) AS diffFROM Confirmations
) AS one
WHERE diff IS NOT NULL
AND diff <= 24*60*60
代码分析
SELECT user_id, TIMESTAMPDIFF(SECOND, time_stamp, LEAD(time_stamp, 1, 0) OVER (PARTITION BY user_id ORDER BY time_stamp)
) AS diff
FROM Confirmations
Lead的效果是下向上推,例如
| 2       | 2021-01-22 00:00:00 |
| 2       | 2021-01-23 00:00:00 |
变成
| 2       | 2021-01-23 00:00:00 |
| 2       | null |
利用向上推的数据减去原先的数据,
 上面SQL语句 TIMESTAMPDIFF 的第二个参数是 2021-01-22 00:00:00,第三个参数是 2021-01-23 00:00:00
 下面出现空格,默认值null(无显示)没错,
| user_id | diff  |
| ------- | ----- |
| 2       | 86400 |
| 2       | null  |
| 3       | 419   |
| 3       | null  |
| 6       | 86399 |
| 6       | null  |
| 7       | 86401 |
| 7       | null  |
要小于等于86400(一天)里面只有三个符合要求
| user_id | diff  |
| ------- | ----- |
| 2       | 86400 |
| 3       | 419   |
| 6       | 86399 |
2、TIMESTAMPDIFF 配合 Lag 写法
SELECT DISTINCT user_id
FROM (SELECT user_id, TIMESTAMPDIFF(SECOND, time_stamp, LAG(time_stamp, 1, 0) OVER (PARTITION BY user_id ORDER BY time_stamp)) AS diffFROM Confirmations
) AS one
WHERE diff IS NOT NULL
AND ABS(diff) <= 24*60*60
Lag的效果是上向下推,例如
| 2       | 2021-01-22 00:00:00 |
| 2       | 2021-01-23 00:00:00 |
变成
| 2       | null |
| 2       | 2021-01-22 00:00:00 |
利用向上推的数据减去原先的数据,
 上面SQL语句 TIMESTAMPDIFF 的第二个参数是 2021-01-23 00:00:00,第三个参数是 2021-01-22 00:00:00
 上面出现空格,默认值null(无显示)没错,
| user_id | diff   |
| ------- | ------ |
| 2       | null   |
| 2       | -86400 |
| 3       | null   |
| 3       | -419   |
| 6       | null   |
| 6       | -86399 |
| 7       | null   |
| 7       | -86401 |
要小于等于86400(一天)里面只有三个符合要求(需要取绝对值)
| user_id | diff  |
| ------- | ----- |
| 2       | -86400 |
| 3       | -419   |
| 6       | -86399 |
你要简单写就把 TIMESTAMPDIFF 参数二和参数三换个位置就不用取绝对值了,主要还是演示多种情况,方便更好理解
SELECT DISTINCT user_id
FROM (SELECT user_id, TIMESTAMPDIFF(SECOND, LAG(time_stamp, 1, 0) OVER (PARTITION BY user_id ORDER BY time_stamp), time_stamp) AS diffFROM Confirmations
) AS one
WHERE diff IS NOT NULL
AND diff <= 24*60*60
注:ABS() 函数的效果和数学中的绝对值效果一样
3、单 TIMESTAMPDIFF 写法
有限制正范围的情况就只保证两个时间不相同即可(a.time_stamp != b.time_stamp 是为了不和自己连接,因为没有唯一键做区分)
SELECT DISTINCT a.user_id
FROM Confirmations a, Confirmations b
WHERE a.user_id = b.user_id
AND TIMESTAMPDIFF(SECOND, a.time_stamp, b.time_stamp) BETWEEN 0 AND 24*60*60
AND a.time_stamp != b.time_stamp
包含负范围的情况就还得控制算出的时间差是正的(b.time_stamp - a.time_stamp > 0)
SELECT DISTINCT a.user_id
FROM Confirmations a, Confirmations b
WHERE a.user_id = b.user_id
AND TIMESTAMPDIFF(SECOND, a.time_stamp, b.time_stamp) <= 86400
AND a.time_stamp < b.time_stamp