MySQL query optimization for large table joins
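I am building a report for a radio station. The system logs its online listeners, recording the IP, date, time, total listeners per user, and so on.
listeners table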
client_ip        date        time      date_time            listeners
---------------  ----------  --------  -------------------  ---------
166.147.81.179   2012-04-30  00:19:46  2012-04-30 00:19:46  1
64.12.243.203    2012-04-30  04:38:37  2012-04-30 04:38:37  1
198.228.211.195  2012-04-30  05:36:33  2012-04-30 05:36:33  1
198.228.211.195  2012-04-30  05:36:34  2012-04-30 05:36:34  2
198.228.211.195  2012-04-30  05:36:35  2012-04-30 05:36:35  2
198.228.211.195  2012-04-30  05:36:35  2012-04-30 05:36:35  3
166.147.81.179   2012-04-30  05:47:13  2012-04-30 05:47:13  2
76.170.251.97    2012-04-30  06:01:37  2012-04-30 06:01:37  2
76.170.251.97    2012-04-30  06:01:39  2012-04-30 06:01:39  2
76.170.251.97    2012-04-30  06:01:39  2012-04-30 06:01:39  2
It also keeps a record of song details (title, artist, album, length, date, time, and so on).
playlists table
title                       artist                           length_in_secs  played_date  played_time  start_date_time      end_date_time
--------------------------  -------------------------------  --------------  -----------  -----------  -------------------  -------------------
We Found Love               Rihanna                          184             2012-04-30   00:00:21     2012-04-30 00:00:21  2012-04-30 00:03:25
Photograph                  Nickelback                       216             2012-04-30   00:03:31     2012-04-30 00:03:31  2012-04-30 00:07:07
Not Over You                Gavin DeGraw                     214             2012-04-30   00:07:18     2012-04-30 00:07:18  2012-04-30 00:10:52
Stereo Hearts               Gym Class Heroes Ft Adam Levine  210             2012-04-30   00:10:55     2012-04-30 00:10:55  2012-04-30 00:14:25
I Gotta Feeling             Black Eyed Peas                  243             2012-04-30   00:15:03     2012-04-30 00:15:03  2012-04-30 00:19:06
One Thing Leads To Another  Fixx                             182             2012-04-30   00:19:14     2012-04-30 00:19:14  2012-04-30 00:22:16
Raise Your Glass            Pink                             202             2012-04-30   00:22:29     2012-04-30 00:22:29  2012-04-30 00:25:51
Better In Time              Leona Lewis                      216             2012-04-30   00:30:13     2012-04-30 00:30:13  2012-04-30 00:33:49
Tainted Love                Soft Cell                        153             2012-04-30   00:33:56     2012-04-30 00:33:56  2012-04-30 00:36:29
Haven't Met You Yet         Michael Buble'                   242             2012-04-30   00:37:14     2012-04-30 00:37:14  2012-04-30 00:41:16
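So the report requirement is "how many users listened to a song on a given date or within a date range", and this is the query I wrote. It gives the correct output (as far as I can tell), but the execution time is out of proportion to the data size: anywhere from 5 seconds to 5-10 minutes, depending on the date range.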
SELECT DATE_FORMAT(p.played_date, "%m/%d/%Y") `played_date`, p.played_time, p.length_in_secs, p.title, p.artist, RTRIM(p.label) `label`, RTRIM(p.album) `album`, IFNULL((SELECT SUM(l.listeners) FROM listeners `l` WHERE l.date_time >= p.start_date_time AND l.date_time <= p.end_date_time LIMIT 1), 0) `listeners` FROM playlists `p` WHERE p.title <> "" AND (p.played_date >= '2012-04-30' AND p.played_date <= '2012-05-30') HAVING listeners > 0 ORDER BY p.title ASC;
// formatted //
SELECT
  DATE_FORMAT(p.played_date, "%m/%d/%Y") `played_date`,
  p.played_time,
  p.length_in_secs,
  p.title,
  p.artist,
  RTRIM(p.label) `label`,
  RTRIM(p.album) `album`,
  IFNULL(
    (SELECT
      SUM(l.listeners)
    FROM
      listeners `l`
    WHERE l.date_time >= p.start_date_time
      AND l.date_time <= p.end_date_time
    LIMIT 1),
    0
  ) `listeners`
FROM
  playlists `p`
WHERE p.title <> ""
  AND (
    p.played_date >= '2012-04-30'
    AND p.played_date <= '2012-05-30'
  )
HAVING listeners > 0
ORDER BY p.title ASC
Output:
played_date  played_time  length_in_secs  title                  artist                    label               album               listeners
-----------  -----------  --------------  ---------------------  ------------------------  ------------------  ------------------  ---------
04/30/2012   08:06:26     228             Brighter Than The Sun  Colbie Caillat (Cal-Lay)  Universal Republic  All of You          9
04/30/2012   08:44:16     248             Breakfast At Tiffanys  Deep Blue Something                                               6
04/30/2012   18:06:40     253             Bizarre Love Triangle  New Order                                                         2
04/30/2012   17:05:21     183             Animal                 Neon Trees                Mercury             Habits              5
04/30/2012   08:58:05     253             Always Be My Baby      Mariah Carey                                                      2
04/30/2012   07:25:52     264             Already Gone           Kelly Clarkson            RCA                 All I Ever Wante    3
04/30/2012   16:21:33     236             All The Right Moves    One Republic              Interscope          Waking Up           7
04/30/2012   11:58:26     199             All That She Wants     Ace Of Base                                                       12
04/30/2012   11:14:17     247             All I Wanna Do         Sheryl Crow                                                       2
04/30/2012   16:12:59     235             A Thousand Miles       Vanessa Carlton                                                   5
Is there a way to optimize this query so it runs faster, or to write a new, faster one? Please advise. Thanks!
Using EXPLAIN
EXPLAIN playlists;
Field            Type              Null  Key  Default            Extra
---------------  ----------------  ----  ---  -----------------  ---------------------------
playlist_id      int(10) unsigned  NO    PRI  (NULL)             auto_increment
title            varchar(255)      YES        (NULL)
artist           varchar(255)      YES        (NULL)
label            varchar(255)      YES        (NULL)
album            varchar(255)      YES        (NULL)
length_in_secs   int(11)           NO         (NULL)
played_date      date              NO         (NULL)
played_time      time              NO         (NULL)
start_date_time  datetime          NO         (NULL)
end_date_time    datetime          NO         (NULL)
added_date       datetime          NO         (NULL)
modified_date    timestamp         NO         CURRENT_TIMESTAMP  on update CURRENT_TIMESTAMP
EXPLAIN listeners;
Field          Type                 Null  Key  Default            Extra
-------------  -------------------  ----  ---  -----------------  ---------------------------
listener_id    bigint(20) unsigned  NO    PRI  (NULL)             auto_increment
station_id     int(10) unsigned     NO         (NULL)
client_ip      varchar(50)          NO         (NULL)
time           time                 NO         (NULL)
date           date                 NO         (NULL)
date_time      datetime             YES        (NULL)
timestamp      bigint(20) unsigned  NO         (NULL)
listeners      int(10) unsigned     NO         (NULL)
processes      int(10) unsigned     NO         (NULL)
uid            int(10) unsigned     NO         (NULL)
user_agent     varchar(255)         YES        (NULL)
added_date     datetime             NO         (NULL)
modified_date  timestamp            NO         CURRENT_TIMESTAMP  on update CURRENT_TIMESTAMP
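Answer:
As discussed in the comments, your query doesn't actually do what you want it to do. Given the data you have, I would personally handle this outside SQL and build a table that stores the listener count for each song, which you can then query in SQL to get this information. If you absolutely want a single SQL query to do it, it would need to be something like this monster: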
SELECT
  DATE_FORMAT(p.played_date, "%m/%d/%Y") `played_date`,
  p.played_time,
  p.length_in_secs,
  p.title,
  p.artist,
  RTRIM(p.label) `label`,
  RTRIM(p.album) `album`,
  -- SMALLEST is a user-defined function, not a MySQL built-in (see below)
  SUM(SMALLEST(prev_listeners, next_listeners, dur_listeners)) AS listeners
FROM (
  -- inner query: one row per song and client IP
  SELECT
    p.start_date_time,
    -- listener count from the last log row before the song started
    SUBSTRING_INDEX(GROUP_CONCAT(l_before.listeners ORDER BY l_before.date_time DESC), ',', 1) AS prev_listeners,
    -- listener count from the first log row after the song ended
    SUBSTRING_INDEX(GROUP_CONCAT(l_after.listeners ORDER BY l_after.date_time ASC), ',', 1) AS next_listeners,
    -- lowest listener count logged while the song was playing
    MIN(l_during.listeners) AS dur_listeners
  FROM playlists p
  JOIN listeners l_before ON l_before.date_time < p.start_date_time
  LEFT JOIN listeners l_after ON l_before.client_ip = l_after.client_ip AND l_after.date_time > p.end_date_time
  LEFT JOIN listeners l_during ON l_before.client_ip = l_during.client_ip AND l_during.date_time BETWEEN p.start_date_time AND p.end_date_time
  WHERE p.title <> ""
    AND p.played_date BETWEEN '2012-04-30' AND '2012-05-30'
  GROUP BY p.start_date_time, l_before.client_ip
) l
JOIN playlists p USING (start_date_time)
GROUP BY p.start_date_time
ORDER BY p.start_date_time
where SMALLEST is a function that returns the smallest non-NULL argument.
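To make the intended behavior of SMALLEST concrete, here is a rough Python sketch. MySQL's built-in LEAST() returns NULL as soon as any argument is NULL, which is why a separate function is needed. The name `smallest` and the use of `None` to stand in for NULL are assumptions made for this illustration; the original answer does not show an implementation.

```python
def smallest(*values):
    """Return the smallest value that is not None, or None if all are None.

    None stands in for SQL NULL here; this is an illustrative sketch,
    not the SMALLEST function used by the query.
    """
    present = [value for value in values if value is not None]
    return min(present) if present else None


if __name__ == "__main__":
    # A song with a count before and after it, but no log row during it.
    print(smallest(3, 2, None))
```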
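This will take longer to run than your current query, but it is the most efficient way I can think of to get the actual number of listeners for each song.
Oh, and this assumes that when everyone from an IP address stops listening, the log records a row with zero listeners; otherwise there is really no way to do this.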
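For the outside-SQL route mentioned at the start of the answer, a minimal Python sketch might look like the following. It applies the same rule as the query: for each song and each client IP, take the smallest of the last count logged before the song, the counts logged during it, and the first count logged after it, then sum over IPs. The function name `listeners_per_song`, the tuple layouts, and the in-memory lists are assumptions for illustration. Real code would read rows from the two tables and write the totals to a summary table.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from datetime import datetime


def listeners_per_song(songs, log_rows):
    """Sum, over client IPs, the smallest listener count seen around each song.

    songs:    iterable of (title, start_date_time, end_date_time)
    log_rows: iterable of (client_ip, date_time, listeners)
    Both layouts are hypothetical stand-ins for rows read from the tables.
    """
    # Group the log by IP in time order so each song needs two binary searches per IP.
    times_by_ip = defaultdict(list)
    counts_by_ip = defaultdict(list)
    for ip, ts, count in sorted(log_rows, key=lambda row: (row[0], row[1])):
        times_by_ip[ip].append(ts)
        counts_by_ip[ip].append(count)

    totals = {}
    for title, start, end in songs:
        total = 0
        for ip, times in times_by_ip.items():
            counts = counts_by_ip[ip]
            first_during = bisect_left(times, start)  # first row with date_time >= start
            first_after = bisect_right(times, end)    # first row with date_time > end
            if first_during == 0:
                continue  # no row before the song: same effect as the inner JOIN on l_before
            candidates = [counts[first_during - 1]]              # last count before the song
            candidates.extend(counts[first_during:first_after])  # counts logged during the song
            if first_after < len(counts):
                candidates.append(counts[first_after])           # first count after the song
            total += min(candidates)
        totals[(title, start)] = total
    return totals


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    songs = [("We Found Love", datetime(2012, 4, 30, 0, 0, 21), datetime(2012, 4, 30, 0, 3, 25))]
    log_rows = [
        ("192.0.2.1", datetime(2012, 4, 30, 0, 0, 5), 2),
        ("192.0.2.1", datetime(2012, 4, 30, 0, 4, 0), 2),
    ]
    print(listeners_per_song(songs, log_rows))
```

Grouping the log by IP once and using binary search keeps the work per song proportional to the number of IPs plus the rows that fall inside the song, instead of rescanning the whole log for every song as the joins do.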