问题
nnnn现在有一张用户登陆日志表,该表包括user_id,ip,log_time三个字段,现在需要找出共同使用ip数量超过3个(含)的所有用户对。比如下面的示例数据,101和102用户共同使用的ip为4个,101和103用户共同使用的ip为3个,102和103用户共同使用的ip为3个。
nnnnuser_id ip log_timen101 192.168.10.101 2022-05-10 11:00:00 n101 192.168.10.101 2022-05-10 11:01:00 n101 192.168.10.102 2022-05-10 11:02:00 n101 192.168.10.103 2022-05-10 11:03:00 n101 192.168.10.104 2022-05-10 11:04:00 n102 192.168.10.101 2022-05-10 11:04:30 n102 192.168.10.102 2022-05-10 11:05:00 n102 192.168.10.103 2022-05-10 11:06:00 n102 192.168.10.104 2022-05-10 11:07:00 n103 192.168.10.102 2022-05-10 11:08:00 n103 192.168.10.103 2022-05-10 11:08:00 n103 192.168.10.104 2022-05-10 11:10:00 n104 192.168.10.103 2022-05-10 11:11:00 n104 192.168.10.104 2022-05-10 11:12:00 n105 192.168.10.105 2022-05-10 11:13:00
nnnn解答
nnnn类似相同好友问题,关键点是使用自连接。先对用户+ip去重之后,使用ip进行自关联,得到的t1.user_id和t2.user_id的关系就是有同一个ip。按这个ip进行聚合,出现的就是t1和t2的用户相同的ip集合,判断这个集合大小即可。
nnnnwith user_login as (nselect 101 as user_id ,'192.168.10.101' as ip ,'2022-05-10 11:00:00' as log_timenunion all nselect 101 as user_id ,'192.168.10.101' as ip ,'2022-05-10 11:01:00' as log_timenunion all nselect 101 as user_id ,'192.168.10.102' as ip ,'2022-05-10 11:02:00' as log_timenunion all nselect 101 as user_id ,'192.168.10.103' as ip ,'2022-05-10 11:03:00' as log_timenunion all nselect 101 as user_id ,'192.168.10.104' as ip ,'2022-05-10 11:04:00' as log_timenunion all nselect 102 as user_id ,'192.168.10.101' as ip ,'2022-05-10 11:04:30' as log_timenunion all nselect 102 as user_id ,'192.168.10.102' as ip ,'2022-05-10 11:05:00' as log_timenunion all nselect 102 as user_id ,'192.168.10.103' as ip ,'2022-05-10 11:06:00' as log_timenunion all nselect 102 as user_id ,'192.168.10.104' as ip ,'2022-05-10 11:07:00' as log_timenunion all nselect 103 as user_id ,'192.168.10.102' as ip ,'2022-05-10 11:08:00' as log_timenunion all nselect 103 as user_id ,'192.168.10.103' as ip ,'2022-05-10 11:08:00' as log_timenunion all nselect 103 as user_id ,'192.168.10.104' as ip ,'2022-05-10 11:10:00' as log_timenunion all nselect 104 as user_id ,'192.168.10.103' as ip ,'2022-05-10 11:11:00' as log_timenunion all nselect 104 as user_id ,'192.168.10.104' as ip ,'2022-05-10 11:12:00' as log_timenunion all nselect 105 as user_id ,'192.168.10.105' as ip ,'2022-05-10 11:13:00' as log_timen),ntmp asn(n select n user_idn ,ipn from user_login --实际换成自己的表或上面的样例数据n group by user_id,ip --同一个ip同一用户可能多次登录,先去重n)nnselectnt1.user_idn,t2.user_idn,count(t1.ip) as ip_cntnfrom tmp t1ninner join tmp t2non t1.ip=t2.ip --通过ip自关联nwhere t1.user_id<t2.user_id --因为存在101对102,102对101的情况,保留一种即可ngroup by t1.user_id,t2.user_idnhaving ip_cnt>=3 --保留用户对ip数量超过3个的(含)
n
Comments NOTHING