优化具有时区转换的Mysql查询并逐个小时
内容导读
互联网集市收集整理的这篇技术教程文章主要介绍了优化具有时区转换的Mysql查询并逐个小时,小编现在分享给大家,供广大互联网技能从业者学习和参考。文章包含4893字,纯文字阅读大概需要7分钟。
内容图文
![优化具有时区转换的Mysql查询并逐个小时](/upload/InfoBanner/zyjiaocheng/904/de4c12908d494c54b513d1a2a33353c5.jpg)
这是MySql 5.5中的表,有3000万条记录
CREATE TABLE `campaign_logs` (
`domain` varchar(50) DEFAULT NULL,
`campaign_id` varchar(50) DEFAULT NULL,
`subscriber_id` varchar(50) DEFAULT NULL,
`message` varchar(21000) DEFAULT NULL,
`log_time` datetime DEFAULT NULL,
`log_type` varchar(50) DEFAULT NULL,
`level` varchar(50) DEFAULT NULL,
`campaign_name` varchar(500) DEFAULT NULL,
KEY `subscriber_id_index` (`subscriber_id`),
KEY `log_type_index` (`log_type`),
KEY `log_time_index` (`log_time`),
KEY `campid_domain_logtype_logtime_subid_index` (`campaign_id`,`domain`,`log_type`,`log_time`,`subscriber_id`),
KEY `domain_logtype_logtime_index` (`domain`,`log_type`,`log_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
在下面的查询中,我正在按时间分组进行分组
QUERY
SELECT
log_type
,DATE_FORMAT(CONVERT_TZ(log_time,'+00:00','+05:30'),'%l %p') AS log_date
,count(*) AS total
,count(DISTINCT subscriber_id) d
FROM
stats.campaign_logs USE INDEX(campid_domain_logtype_logtime_subid_index)
WHERE
DOMAIN='xxx'
AND campaign_id='123'
AND log_type = 'EMAIL_OPENED'
AND log_time BETWEEN
CONVERT_TZ('2015-02-01 00:00:00','+00:00','+05:30') AND
CONVERT_TZ('2015-03-01 23:59:58','+00:00','+05:30')
GROUP BY log_date
UNION ALL
SELECT
log_type
,DATE_FORMAT(CONVERT_TZ(log_time,'+00:00','+05:30'),'%l %p') AS log_date
,count(*) AS total
,count(DISTINCT subscriber_id) d
FROM
stats.campaign_logs USE INDEX(campid_domain_logtype_logtime_subid_index)
WHERE
DOMAIN='xxx'
AND campaign_id='123'
AND log_type = 'EMAIL_SENT'
AND log_time BETWEEN
CONVERT_TZ('2015-02-01 00:00:00','+00:00','+05:30') AND
CONVERT_TZ('2015-03-01 23:59:58','+00:00','+05:30')
GROUP BY log_date
UNION ALL
SELECT
log_type
,DATE_FORMAT(CONVERT_TZ(log_time,'+00:00','+05:30'),'%l %p') AS log_date
,count(*) AS total
,count(DISTINCT subscriber_id) d
FROM
stats.campaign_logs USE INDEX(campid_domain_logtype_logtime_subid_index)
WHERE
DOMAIN='xxx'
AND campaign_id='123'
AND log_type = 'EMAIL_CLICKED'
AND log_time BETWEEN
CONVERT_TZ('2015-02-01 00:00:00','+00:00','+05:30') AND
CONVERT_TZ('2015-03-01 23:59:58','+00:00','+05:30')
GROUP BY log_date;
结果
上面的查询将给出这样的结果
+---------------+-------+----------------+-------------+
| EMAIL_CLICKED | 1 AM | 71 | 83 |
| EMAIL_CLICKED | 1 PM | 25 | 27 |
| EMAIL_SENT | 10 AM | 51 | 59 |
| EMAIL_OPENED | 10 PM | 16 | 18 |
这是上述查询的解释
说明
+----+--------------+---------------+-------+-------------------------------------------+-------------------------------------------+---------+------+--------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+---------------+-------+-------------------------------------------+-------------------------------------------+---------+------+--------+------------------------------------------+
| 1 | PRIMARY | campaign_logs | range | campid_domain_logtype_logtime_subid_index | campid_domain_logtype_logtime_subid_index | 468 | NULL | 55074 | Using where; Using index; Using filesort |
| 2 | UNION | campaign_logs | range | campid_domain_logtype_logtime_subid_index | campid_domain_logtype_logtime_subid_index | 468 | NULL | 330578 | Using where; Using index; Using filesort |
| 3 | UNION | campaign_logs | range | campid_domain_logtype_logtime_subid_index | campid_domain_logtype_logtime_subid_index | 468 | NULL | 1589 | Using where; Using index; Using filesort |
|NULL| UNION RESULT | <union1,2,3> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+---------------+-------+-------------------------------------------+-------------------------------------------+---------+------+--------+------------------------------------------+
优化?
我们在这张桌子上有一个覆盖索引.
此查询需要很长时间(超过1分钟).
如果我从查询中删除distinct_count(subscriber_id),那么我们将在1.5秒内获得结果,但我需要查询中的distinct_count subscriber_id.
有没有办法优化这个查询?
谢谢
解决方法:
您没有处理大量数据,因此该组不应该花费40秒 – 假设您不在桌面上具有大量锁定活动的非常繁忙的服务器上.
试试这个版本的查询(限于一个log_type):
SELECT log_type,
DATE_FORMAT(CONVERT_TZ(log_time,'+00:00','+05:30'),'%l %p') AS time,
count(DISTINCT subscriber_id) AS distinct_count,
count(subscriber_id) AS total_count
FROM stats.campaign_logs
WHERE DOMAIN = 'xxxx' AND
campaign_id='1234' AND
log_type = 'EMAIL_SENT' AND
log_time BETWEEN CONVERT_TZ('2015-02-07 00:00:00','+00:00','+05:30') AND CONVERT_TZ('2015-02-14 23:59:58','+00:00','+05:30')
GROUP BY time;
这应该最佳地使用索引.如果这很快,那么使用union all将行组合在一起.丑陋,但由于索引优化,有时union all比OR / IN快得多.
内容总结
以上是互联网集市为您收集整理的优化具有时区转换的Mysql查询并逐个小时全部内容,希望文章能够帮你解决优化具有时区转换的Mysql查询并逐个小时所遇到的程序开发问题。 如果觉得互联网集市技术教程内容还不错,欢迎将互联网集市网站推荐给程序员好友。
内容备注
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 gblab@vip.qq.com 举报,一经查实,本站将立刻删除。
内容手机端
扫描二维码推送至手机访问。