视频1 视频21 视频41 视频61 视频文章1 视频文章21 视频文章41 视频文章61 推荐1 推荐3 推荐5 推荐7 推荐9 推荐11 推荐13 推荐15 推荐17 推荐19 推荐21 推荐23 推荐25 推荐27 推荐29 推荐31 推荐33 推荐35 推荐37 推荐39 推荐41 推荐43 推荐45 推荐47 推荐49 关键词1 关键词101 关键词201 关键词301 关键词401 关键词501 关键词601 关键词701 关键词801 关键词901 关键词1001 关键词1101 关键词1201 关键词1301 关键词1401 关键词1501 关键词1601 关键词1701 关键词1801 关键词1901 视频扩展1 视频扩展6 视频扩展11 视频扩展16 文章1 文章201 文章401 文章601 文章801 文章1001 资讯1 资讯501 资讯1001 资讯1501 标签1 标签501 标签1001 关键词1 关键词501 关键词1001 关键词1501 专题2001
MySQL隐式类型的转换陷阱和规则_MySQL
2020-11-09 09:16:08 责编:小采
文档


前言

相信大家都知道隐式类型转换有无法命中索引的风险,在高并发、大数据量的情况下,命不中索引带来的后果非常严重。将数据库拖死,继而整个系统崩溃,对于大规模系统损失惨重。所以下面通过本文来好好学习下MySQL隐式类型的转换陷阱和规则。

1. 隐式类型转换实例

今天生产库上突然出现MySQL线程数告警,IOPS很高,实例会话里面出现许多类似下面的sql:(修改了相关字段和值)

SELECT f_col3_id,f_qq1_id FROM d_dbname.t_tb1 WHERE f_col1_id=1226391 and f_col2_id=1244378 and 
f_qq1_id in (12345,23456,34567,45678,567,670,701,012,90123,901231,901232,901233)

用 explain 看了下扫描行数和索引选择情况:

mysql>explain SELECT f_col3_id,f_qq1_id FROM d_dbname.t_tb1 WHERE f_col1_id=1226391 
and f_col2_id=1244378 and f_qq1_id in (12345,23456,34567,45678,567,670,701,012,90123,901231,901232,901233);
+------+---------------+---------+--------+--------------------------------+---------------+------------+--------+--------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+---------------+---------+--------+--------------------------------+---------------+------------+--------+--------+------------------------------------+
| 1 | SIMPLE | t_tb1 | ref | uid_type_frid,idx_corpid_qq1id | uid_type_frid | 8 | const | 1386 | Using index condition; Using where |
+------+---------------+---------+--------+--------------------------------+---------------+------------+--------+--------+------------------------------------+
共返回 1 行记录,花费 11.52 ms.

t_tb1 表上有个索引uid_type_frid(f_col2_id,f_type) idx_corp_id_qq1id(f_col1_id,f_qq1_id) ,而且如果选择后者时,f_qq1_id的过滤效果应该很佳,但却选择了前者。当使用 hint use index(idx_corp_id_qq1id)时:

mysql>explain extended SELECT f_col3_id,f_qq1_id FROM d_dbname.t_tb1 use index(idx_corpid_qq1id) WHERE f_col1_id=1226391 and f_col2_id=1244378 and f_qq1_id in (12345,23456,34567,45678,567,670,701,012,90123,901231,901232,901233);
+------+---------------+--------+--------+---------------------+------------------+------------+----------+-------------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+---------------+--------+--------+---------------------+------------------+------------+----------+-------------+------------------------------------+
| 1 | SIMPLE | t_tb1 | ref | idx_corpid_qq1id | idx_corpid_qq1id | 8 | const | 2375752 | Using index condition; Using where |
+---- -+---------------+--------+--------+---------------------+------------------+------------+----------+-------------+------------------------------------+
共返回 1 行记录,花费 17.48 ms.
mysql>show warnings;
+-----------------+----------------+-----------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-----------------+----------------+-----------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use range access on index 'idx_corpid_qq1id' due to type or collation conversion on field 'f_qq1_id' |
| Note | 1003 | /* select#1 */ select `d_dbname`.`t_tb1`.`f_col3_id` AS `f_col3_id`,`d_dbname`.`t_tb1`.`f_qq1_id` AS `f_qq1_id` from `d_dbname`.`t_tb1` USE INDEX (`idx_corpid_qq1id`) where |
| | | ((`d_dbname`.`t_tb1`.`f_col2_id` = 1244378) and (`d_dbname`.`t_tb1`.`f_col1_id` = 1226391) and (`d_dbname`.`t_tb1`.`f_qq1_id` in |
| | | (12345,23456,34567,45678,567,670,701,012,90123,901231,901232,901233))) |
+-----------------+----------------+-----------------------------------------------------------------------------------------------------------------------+
共返回 2 行记录,花费 10.81 ms.

rows列达到200w行,但问题也发现了:select_type应该是 range 才对,key_len看出来只用到了idx_corpid_qq1id索引的第一列。上面explain使用了 extended,所以show warnings;可以很明确的看到 f_qq1_id 出现了隐式类型转换:f_qq1_idvarchar,而后面的比较值是整型。

解决该问题就是避免出现隐式类型转换(implicit type conversion)带来的不可控:把f_qq1_id in的内容写成字符串:

mysql>explain SELECT f_col3_id,f_qq1_id FROM d_dbname.t_tb1 WHERE f_col1_id=1226391 and f_col2_id=1244378 and 
f_qq1_id in ('12345','23456','34567','45678','567','670','701','012','90123','901231');
+-------+---------------+--------+---------+--------------------------------+------------------+-------------+---------+---------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+---------------+--------+---------+--------------------------------+------------------+-------------+---------+---------+------------------------------------+
| 1 | SIMPLE | t_tb1 | range | uid_type_frid,idx_corpid_qq1id | idx_corpid_qq1id | 70 | | 40 | Using index condition; Using where |
+-------+---------------+--------+---------+--------------------------------+------------------+-------------+---------+---------+------------------------------------+
共返回 1 行记录,花费 12.41 ms.

扫描行数从1386减少为40。

类似的还出现过一例:

SELECT count(0) FROM d_dbname.t_tb2 where f_col1_id= '1931231' AND f_phone in(0980);
| Warning | 1292 | Truncated incorrect DOUBLE value: '1512-984356'

优化后直接从扫描rows 100w行降为1。

借这个机会,系统的来看一下mysql中的隐式类型转换。

2. mysql隐式转换规则

2.1 规则

下面来分析一下隐式转换的规则:

a. 两个参数至少有一个是 NULL 时,比较的结果也是 NULL,例外是使用 <=> 对两个 NULL 做比较时会返回 1,这两种情况都不需要做类型转换

b. 两个参数都是字符串,会按照字符串来比较,不做类型转换

c. 两个参数都是整数,按照整数来比较,不做类型转换

d. 十六进制的值和非数字做比较时,会被当做二进制串

e. 有一个参数是 TIMESTAMP DATETIME,并且另外一个参数是常量,常量会被转换为 timestamp

f. 有一个参数是 decimal 类型,如果另外一个参数是 decimal 或者整数,会将整数转换为 decimal 后进行比较,如果另外一个参数是浮点数,则会把 decimal 转换为浮点数进行比较

g. 所有其他情况下,两个参数都会被转换为浮点数再进行比较

mysql> select 11 + '11', 11 + 'aa', 'a1' + 'bb', 11 + '0.01a'; 
+-----------+-----------+-------------+--------------+
| 11 + '11' | 11 + 'aa' | 'a1' + 'bb' | 11 + '0.01a' |
+-----------+-----------+-------------+--------------+
| 22 | 11 | 0 | 11.01 |
+-----------+-----------+-------------+--------------+
1 row in set, 4 warnings (0.00 sec)
mysql> show warnings;
+---------+------+-------------------------------------------+
| Level | Code | Message |
+---------+------+-------------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: 'aa' |
| Warning | 1292 | Truncated incorrect DOUBLE value: 'a1' |
| Warning | 1292 | Truncated incorrect DOUBLE value: 'bb' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '0.01a' |
+---------+------+-------------------------------------------+
4 rows in set (0.00 sec)
mysql> select '11a' = 11, '11.0' = 11, '11.0' = '11', NULL = 1;
+------------+-------------+---------------+----------+
| '11a' = 11 | '11.0' = 11 | '11.0' = '11' | NULL = 1 |
+------------+-------------+---------------+----------+
| 1 | 1 | 0 | NULL |
+------------+-------------+---------------+----------+
1 row in set, 1 warning (0.01 sec)

上面可以看出11 + 'aa',由于操作符两边的类型不一样且符合第g条,aa要被转换成浮点型小数,然而转换失败(字母被截断),可以认为转成了 0,整数11被转成浮点型还是它自己,所以11 + 'aa' = 11。

0.01a转成double型也是被截断成0.01,所以11 + '0.01a' = 11.01。

等式比较也说明了这一点,'11a'和'11.0'转换后都等于 11,这也正是文章开头实例为什么没走索引的原因: varchar型的f_qq1_id,转换成浮点型比较时,等于 12345 的情况有无数种如12345a、12345.b等待,MySQL优化器无法确定索引是否更有效,所以选择了其它方案。

但并不是只要出现隐式类型转换,就会引起上面类似的性能问题,最终是要看转换后能否有效选择索引。像f_id = '654321'f_mtime between '2016-05-01 00:00:00' and '2016-05-04 23:59:59'就不会影响索引选择,因为前者f_id是整型,即使与后面的字符串型数字转换成double比较,依然能根据double确定f_id的值,索引依然有效。后者是因为符合第e条,只是右边的常量做了转换。

开发人员可能都只要存在这么一个隐式类型转换的坑,但却又经常不注意,所以干脆无需记住那么多规则,该什么类型就与什么类型比较。

2.2 隐式类型转换的安全问题

implicit type conversion 不仅可能引起性能问题,还有可能产生安全问题。

mysql> desc t_account;
+-----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+----------------+
| fid | int(11) | NO | PRI | NULL | auto_increment |
| fname | varchar(20) | YES | | NULL | |
| fpassword | varchar(50) | YES | | NULL | |
+-----------+-------------+------+-----+---------+----------------+
mysql> select * from t_account;
+-----+-----------+-------------+
| fid | fname | fpassword |
+-----+-----------+-------------+
| 1 | xiaoming | p_xiaoming |
| 2 | xiaoming1 | p_xiaoming1 |
+-----+-----------+-------------+

假如应用前端没有WAF防护,那么下面的sql很容易注入:

mysql> select * from t_account where fname='A' ;
fname传入 A' OR 1='1 
mysql> select * from t_account where fname='A' OR 1='1';

攻击者更聪明一点: fname传入 A'+'B ,fpassword传入 ccc'+0 :

mysql> select * from t_account where fname='A'+'B' and fpassword='ccc'+0;
+-----+-----------+-------------+
| fid | fname | fpassword |
+-----+-----------+-------------+
| 1 | xiaoming | p_xiaoming |
| 2 | xiaoming1 | p_xiaoming1 |
+-----+-----------+-------------+
2 rows in set, 7 warnings (0.00 sec)

总结

下载本文
显示全文
专题