MySQL字符数据查询拆分
问题描述
数据表中某字段为特定单词组字符串,特定字符分隔。
现有需求:在不影响原始数据的情况下,查询显示拆分后的单词,方便后续对其进行后续操作。
演示
演示数据源
-- 测试表结构create table word_table(words varchar(255) null);-- 表数据INSERT INTO test.word_table(words) VALUES ('apple,orange,lemon');INSERT INTO test.word_table(words) VALUES ('computer,apple');INSERT INTO test.word_table(words) VALUES ('laddy,man,woman');
涉及MySQL内置表/内置函数
MySQL内置表
-
mysql.help_topic
mysql.help_topic本身是Mysql的一张信息表,用来存储各种注释等帮助信息。这里需要借助该表中的help_topic_id字段数据,其有特点:help_topic_id 从0开始,固定数量的(和数据库版本有关),我这里为0~699。
-- 查看help_topic_id序列总数SELECT help_topic_id FROM mysql.help_topic ORDER BY help_topic_id ASC
-- 测试 SELECT help_topic_id FROM mysql.help_topic WHERE help_topic_id<2
+---------------+ | help_topic_id | +---------------+ | 0 | | 1 | +---------------+ 2 rows in set (0.00 sec)
内置函数
-
LENGTH(str)
返回str中的字符个数
-
REPLACE(str, old_sub_str, new_sub_str)
替换str中 旧的子字串 为 新的子字符串
-
SUBSTRING_INDEX(str, sub_str, count)
返回 str 中 第 count 次 查找到 sub_str 之前的字符串,不包含最后一次查找到的sub_str。
如:
SUBSTRING_INDEX("a,b,a,c,a,d",'a',3)
==>"a,b,a,c,"
注意:如果count为负数,则表示从右边查找,并返回右边的字符串。
mysql> -- 拆分单词 mysql> -- SUBSTRING_INDEX(str, sub_str, count):在str中查找delim,返回查找到的第count个sub_str之前的字符串(从左开始) mysql> -- 如果count为负数,则从右开始 mysql> -- 示例:取左边第一个单词 mysql> SELECT SUBSTRING_INDEX(T.words,',',1) FROM word_table T ; +--------------------------------+ | SUBSTRING_INDEX(T.words,',',1) | +--------------------------------+ | apple | | computer | | laddy | +--------------------------------+ 3 rows in set (0.00 sec)mysql> -- 示例:取左边二个单词 mysql> SELECT SUBSTRING_INDEX(T.words,',',2) FROM word_table T ; +--------------------------------+ | SUBSTRING_INDEX(T.words,',',2) | +--------------------------------+ | apple,orange | | computer,apple | | laddy,man | +--------------------------------+ 3 rows in set (0.00 sec)mysql> -- 示例:取右边一个单词,由此可以把每一个单词都拿出来 mysql> SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(T.words,',',1),',',-1) FROM word_table T ; +--------------------------------------------------------+ | SUBSTRING_INDEX(SUBSTRING_INDEX(T.words,',',1),',',-1) | +--------------------------------------------------------+ | apple | | computer | | laddy | +--------------------------------------------------------+ 3 rows in set (0.00 sec)mysql> SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(T.words,',',2),',',-1) FROM word_table T ; +--------------------------------------------------------+ | SUBSTRING_INDEX(SUBSTRING_INDEX(T.words,',',2),',',-1) | +--------------------------------------------------------+ | orange | | apple | | man | +--------------------------------------------------------+ 3 rows in set (0.00 sec)
基本实现思路
-
计算单词组字符串中的分隔符个数从而得出单词个数(分隔符个数+1)。
-- 计算单词个数,即分隔符的个数+1,便为单词个数-- LENGTH:计算字符串中的字符个数-- REPLACE: 分换字符串中的字符,这里是将分隔符删除SELECT words,(LENGTH(T.words) - LENGTH(REPLACE(T.words, ',',''))+1) AS wordsCount FROM word_table T
+--------------------+------------+ | words | wordsCount | +--------------------+------------+ | apple,orange,lemon | 3 | | computer,apple | 2 | | laddy,man,woman | 3 | +--------------------+------------+ 3 rows in set (0.00 sec)
-
根据单词个数显示查询行数。
-- 借助mysql.help_topic表中的help_topic_id 进行关联,进行拆分第一步-- help_topic_id 从0开始,固定数量的(和数据库版本有关),我这里为0~699-- 示例:help_topic_id < 2,则符合条件的有0,1SELECT T.words FROM word_table T INNER JOIN mysql.help_topic H ON H.help_topic_id < (LENGTH(T.words) - LENGTH(REPLACE(T.words, ',',''))+1)
+--------------------+ | words | +--------------------+ | apple,orange,lemon | | apple,orange,lemon | | apple,orange,lemon | | computer,apple | | computer,apple | | laddy,man,woman | | laddy,man,woman | | laddy,man,woman | +--------------------+ 8 rows in set (0.00 sec)
-
在该组的每一行,根据顺序依次切割出单词。
第一次切割出一个,第二次切割出两个,第三次切割出三个。
-- 通过 mysql.help_topic 的help_topic_id 序列ID来顺序分割SELECT SUBSTRING_INDEX(T.words,',',H.help_topic_id+1) FROM word_table T INNER JOIN mysql.help_topic HON H.help_topic_id < (LENGTH(T.words)-LENGTH(REPLACE(T.words,',',''))+1)
+------------------------------------------------+ | SUBSTRING_INDEX(T.words,',',H.help_topic_id+1) | +------------------------------------------------+ | apple | | apple,orange | | apple,orange,lemon | | computer | | computer,apple | | laddy | | laddy,man | | laddy,man,woman | +------------------------------------------------+ 8 rows in set (0.00 sec)
-
每行拿右边第一个单词即可。
-- 完整拆分操作SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(T.words,',',H.help_topic_id+1),',',-1) FROM word_table T INNER JOIN mysql.help_topic HON H.help_topic_id < (LENGTH(T.words)-LENGTH(REPLACE(T.words,',',''))+1)
+------------------------------------------------------------------------+ | SUBSTRING_INDEX(SUBSTRING_INDEX(T.words,',',H.help_topic_id+1),',',-1) | +------------------------------------------------------------------------+ | apple | | orange | | lemon | | computer | | apple | | laddy | | man | | woman | +------------------------------------------------------------------------+ 8 rows in set (0.00 sec)
参考博文
https://www.jb51.net/database/305918jub.htm
https://blog.csdn.net/qq_23845083/article/details/135410361
https://www.lidihuo.com/mysql/mysql-string-substring_index-function.html