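Table of contents
- I. Preparation
- 1. Import the dependency
- 2. Test data
- II. Full-text search
- 2.1. Match query (the query text is analyzed)
- 2.2. Phrase query
- 2.3. Query string query
- 2.4. Multi-field match query
- III. Term-level search
- 3.1. Term query
- 3.2. Terms query
- 3.3. Range query
- 3.4. Exists query (field is not null)
- 3.5. Prefix query
- 3.6. Wildcard query
- 3.7. Regexp query
- 3.8. Fuzzy query
- 3.9. IDs query (lookup by a set of ids)
- IV. Compound queries
- 4.1. Compound query
- 4.2. Sorting
- V. Analyzers
- 5.1. No analyzer specified (default analyzer)
- 5.2. IK analyzer, ik_max_word (most tokens)
- 5.3. IK analyzer, ik_smart (fewest tokens)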
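I. Preparation
1. Import the dependency
<!-- For Spring Boot <= 2.2.5 the Elasticsearch version must be set explicitly; the version pulled in by default is 6.x -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>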
2. Test data
PUT /yuangong
{
  "settings": {},
  "mappings": {
    "properties": {
      "name":   {"type": "text", "analyzer": "ik_max_word"},
      "alias":  {"type": "text"},
      "age":    {"type": "integer"},
      "sex":    {"type": "keyword"},
      "phone":  {"type": "text"},
      "title":  {"type": "text", "analyzer": "ik_max_word"},
      "slogan": {"type": "text"}
    }
  }
}

POST /yuangong/_doc/
{"name": "张三三", "alias": "小张", "age": 28, "sex": "男", "phone": "183xxxx0000", "title": "初级Java开发", "slogan": "苟富贵,勿相忘"}

POST /yuangong/_doc/
{"name": "李四", "alias": "小四", "age": 25, "sex": "男", "phone": "183xxxx0001", "title": "高级Java开发", "slogan": "行路难,行路难,多歧路,今安在"}

POST /yuangong/_doc/
{"name": "王五", "alias": "五哥", "age": 30, "sex": "男", "phone": "183xxxxx0002", "title": "资深Java开发", "slogan": "逾期感慨路难行,不如马上出发"}

POST /yuangong/_doc/
{"name": "王六", "alias": "名与", "age": 28, "sex": "男", "phone": "183xxxx0003", "title": "高级前端开发", "slogan": "超越自己"}

POST /yuangong/_doc/
{"name": "王狗蛋", "alias": "狗蛋", "age": 31, "sex": "男", "phone": "183xxxx0004", "title": "高级产品经理", "slogan": "明哲保身"}

POST /yuangong/_doc/
{"name": "王麻子", "alias": "逍遥子", "age": 30, "sex": "女", "phone": "183xxxx0005", "title": "资深业务专家", "slogan": "鹏之大不知其几千里也"}

POST /yuangong/_doc/
{"name": "周二", "alias": "二子", "age": 22, "sex": "男", "phone": "18300000006", "title": "初级测试开发", "slogan": "冷冷的冰雨在我脸上胡乱的拍"}
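II. Full-text search
2.1. Match query (the query text is analyzed)
The behavior depends on the data type of the queried field.
(1) If the field is a keyword, the query value is not split and must match the stored value in full.
(2) If the field is text, the query value is tokenized by the analyzer of that field.
Example: with no analyzer configured, 张三三 is split into 张, 三, 三, and the query matches documents containing 张 or 三.
Example: with the IK analyzer configured,
"name": {"type": "text","analyzer": "ik_max_word"}
张三三 is split into 张三 and 三三, and the query matches documents containing 张三 or 三三.
Example: with the IK analyzer configured, 三三 stays a single token, and the query matches documents containing 三三.
(3) The cases above use the default operator, OR. When the operator is set to AND:
Example: with no analyzer configured, 张三三 is split into 张, 三, 三, and the query matches documents containing both 张 and 三.
Example: with the IK analyzer configured, 张三三 is split into 张三 and 三三, and the query matches documents containing both 张三 and 三三.
Example: with the IK analyzer configured, 三三 stays a single token, and the query matches documents containing 三三.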
public void matchQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("name", "三三").operator(Operator.OR);
    doSearch(searchRequest, searchSourceBuilder, matchQueryBuilder);
}
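The OR and AND behavior described above can be sketched in plain Java without a running cluster. This is only an illustration of the matching rule, not how Elasticsearch implements it: the class `MatchSemantics` and its methods are hypothetical, the token lists are hard-coded from the examples in the text, and the second document (张三) is invented for contrast.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: models how a match query combines analyzed query tokens.
// A real analyzer (standard or IK) would produce the tokens inside Elasticsearch.
class MatchSemantics {

    // OR: the document matches if it contains at least one query token.
    static boolean matchesOr(Set<String> docTokens, List<String> queryTokens) {
        for (String token : queryTokens) {
            if (docTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // AND: the document matches only if it contains every query token.
    static boolean matchesAnd(Set<String> docTokens, List<String> queryTokens) {
        return docTokens.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        // Tokens the text attributes to the IK analyzer for the name 张三三.
        Set<String> zhangSanSan = new HashSet<>(Arrays.asList("张三", "三三"));
        // Hypothetical second document whose name is tokenized as 张三 only.
        Set<String> zhangSan = new HashSet<>(Arrays.asList("张三"));
        // The query text 张三三, analyzed the same way.
        List<String> query = Arrays.asList("张三", "三三");

        System.out.println(matchesOr(zhangSanSan, query));  // true
        System.out.println(matchesOr(zhangSan, query));     // true
        System.out.println(matchesAnd(zhangSanSan, query)); // true
        System.out.println(matchesAnd(zhangSan, query));    // false
    }
}
```

With OR, sharing a single token is enough; with AND, the hypothetical 张三 document drops out because it lacks the token 三三.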
2.2. Phrase query
public void matchPhraseQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    MatchPhraseQueryBuilder matchPhraseQueryBuilder = QueryBuilders.matchPhraseQuery("title", "JAVA开发").slop(1);
    doSearch(searchRequest, searchSourceBuilder, matchPhraseQueryBuilder);
}
2.3. Query string query
This query can be run without specifying a field.
public void queryString() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Without a field:
    // QueryStringQueryBuilder queryStringQueryBuilder = QueryBuilders.queryStringQuery("不");
    // Restricted to the slogan field:
    QueryStringQueryBuilder queryStringQueryBuilder = QueryBuilders.queryStringQuery("不").field("slogan");
    doSearch(searchRequest, searchSourceBuilder, queryStringQueryBuilder);
}
2.4. Multi-field match query
public void multiMatchQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Match the text 明 against both the alias and slogan fields
    MultiMatchQueryBuilder multiMatchQuery = QueryBuilders.multiMatchQuery("明", "alias", "slogan");
    doSearch(searchRequest, searchSourceBuilder, multiMatchQuery);
}
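III. Term-level search
3.1. Term query
Roughly comparable to LIKE in SQL, with one caveat: the query value is not analyzed and must equal an indexed term exactly. The result therefore depends on the data type of the field.
(1) The field is text and no analyzer is specified.
Example: 张三三 is split into 张, 三, 三 by the default analyzer.
A termQuery with the value 张三, 三三, or 张三三 finds nothing; only 张 or 三 produces a hit.
(2) The field is text and the IK analyzer is specified.
Example: 张三三 is split into 张三 and 三三 by the IK analyzer.
A termQuery with the value 张, 三, or 张三三 finds nothing; only 张三 or 三三 produces a hit.
(3) The field is a keyword.
Only the full value 张三三 produces a hit.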
public void termQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    TermQueryBuilder termQuery = QueryBuilders.termQuery("name", "张三三");
    doSearch(searchRequest, searchSourceBuilder, termQuery);
}
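The three cases above come down to one rule: the term query value is compared as-is with the terms stored for the field. The plain-Java sketch below models that rule under stated assumptions: `TermLookup` is a hypothetical helper, and the term lists are hard-coded from the examples in the text rather than produced by a real analyzer.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: a term query matches when its (unanalyzed) value
// equals one of the terms indexed for the field.
class TermLookup {

    static boolean termMatches(List<String> indexedTerms, String value) {
        return indexedTerms.contains(value);
    }

    public static void main(String[] args) {
        // Terms assumed to be indexed for the name 张三三 in each setup.
        List<String> standardTerms = Arrays.asList("张", "三", "三"); // text, default analyzer
        List<String> ikTerms = Arrays.asList("张三", "三三");          // text, IK analyzer
        List<String> keywordTerms = Arrays.asList("张三三");           // keyword, stored whole

        for (String value : Arrays.asList("张", "张三", "张三三")) {
            System.out.println(value
                    + " standard=" + termMatches(standardTerms, value)
                    + " ik=" + termMatches(ikTerms, value)
                    + " keyword=" + termMatches(keywordTerms, value));
        }
        // 张 standard=true ik=false keyword=false
        // 张三 standard=false ik=true keyword=false
        // 张三三 standard=false ik=false keyword=true
    }
}
```

This is why the termQuery("name", "张三三") call shown above would only hit if name were a keyword field; with the ik_max_word mapping from the test data, the tokenization described in the text leaves no such term in the index.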
3.2. Terms query
Similar to IN in SQL.
public void termsQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Matches documents whose slogan contains the term 不 or the term 鹏
    TermsQueryBuilder termsQuery = QueryBuilders.termsQuery("slogan", "不", "鹏");
    doSearch(searchRequest, searchSourceBuilder, termsQuery);
}
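3.3. Range query
/**
 * Greater than, or greater than or equal to, a value.
 * @param from the lower bound
 * @param includeLower whether the lower bound itself is included
 * @return this builder
 */
public RangeQueryBuilder from(Object from, boolean includeLower){}

/**
 * Less than, or less than or equal to, a value.
 * @param to the upper bound
 * @param includeUpper whether the upper bound itself is included
 * @return this builder
 */
public RangeQueryBuilder to(Object to, boolean includeUpper){}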
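To make the two boolean flags concrete, here is a small plain-Java model of the bounds check. It is a sketch of the semantics only: `AgeRange.inRange` is a hypothetical helper, not part of the Elasticsearch client, and the ages are taken from the test data above.

```java
// Illustration only: mirrors the meaning of from(lower, includeLower)
// and to(upper, includeUpper) for integer values.
class AgeRange {

    static boolean inRange(int value, int lower, boolean includeLower,
                           int upper, boolean includeUpper) {
        boolean aboveLower = includeLower ? value >= lower : value > lower;
        boolean belowUpper = includeUpper ? value <= upper : value < upper;
        return aboveLower && belowUpper;
    }

    public static void main(String[] args) {
        int[] ages = {22, 25, 28, 30, 31};
        // Lower bound 25 excluded, upper bound 30 included: 25 < age <= 30.
        for (int age : ages) {
            System.out.println(age + " -> " + inRange(age, 25, false, 30, true));
        }
        // 22 -> false
        // 25 -> false
        // 28 -> true
        // 30 -> true
        // 31 -> false
    }
}
```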
3.4. Exists query (field is not null)
Equivalent to column IS NOT NULL in SQL.
public void existsQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    ExistsQueryBuilder existsQuery = QueryBuilders.existsQuery("sex");
    doSearch(searchRequest, searchSourceBuilder, existsQuery);
}
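3.5. Prefix query
Note that the result of a prefix query depends on the analyzer, because the prefix is compared with the indexed terms rather than with the original string.
(1) If the field is a keyword, any leading part of the stored value works as a prefix.
(2) If the field is analyzed, the prefix must be the leading part of one of the terms produced by the analyzer.
Example: with no analyzer configured (default analyzer), 张三三 is split into 张, 三, 三. The prefix 张 matches; 张三 and 张三三 match nothing, because no indexed term starts with them.
Example: with the IK analyzer, 张三三 is split into 张三 and 三三. The prefix 张三 matches, as does any shorter leading part of an indexed term such as 张; 张三三 matches nothing.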
public void prefixQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    PrefixQueryBuilder prefixQuery = QueryBuilders.prefixQuery("title", "资深");
    doSearch(searchRequest, searchSourceBuilder, prefixQuery);
}
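The interaction between prefix queries and analyzers can be modeled the same way as for term queries: the prefix is checked against each indexed term. The sketch below is an assumption-laden illustration, with `PrefixLookup` as a hypothetical helper and the term lists hard-coded from the examples in the text.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: a prefix query matches when at least one indexed term
// starts with the (unanalyzed) prefix.
class PrefixLookup {

    static boolean prefixMatches(List<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Terms assumed to be indexed for the name 张三三 in each setup.
        List<String> standardTerms = Arrays.asList("张", "三", "三"); // text, default analyzer
        List<String> ikTerms = Arrays.asList("张三", "三三");          // text, IK analyzer
        List<String> keywordTerms = Arrays.asList("张三三");           // keyword, stored whole

        System.out.println(prefixMatches(standardTerms, "张"));   // true
        System.out.println(prefixMatches(standardTerms, "张三")); // false
        System.out.println(prefixMatches(ikTerms, "张三"));       // true
        System.out.println(prefixMatches(ikTerms, "张三三"));     // false
        System.out.println(prefixMatches(keywordTerms, "张三"));  // true
    }
}
```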
3.6. Wildcard query
public void wildcardQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // * matches any sequence of characters, including none
    WildcardQueryBuilder wildcardQuery = QueryBuilders.wildcardQuery("name", "张*三").boost(2);
    doSearch(searchRequest, searchSourceBuilder, wildcardQuery);
}
3.7. Regexp query
public void regexpQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    RegexpQueryBuilder regexpQueryBuilder = QueryBuilders.regexpQuery("title", "高级.*");
    doSearch(searchRequest, searchSourceBuilder, regexpQueryBuilder);
}
3.8. Fuzzy query
public void fuzzyQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    FuzzyQueryBuilder fuzzyQueryBuilder = QueryBuilders.fuzzyQuery("title", "高级");
    doSearch(searchRequest, searchSourceBuilder, fuzzyQueryBuilder);
}
3.9. IDs query (lookup by a set of ids)
public void idsQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // addIds accepts one or more document ids
    IdsQueryBuilder idsQueryBuilder = QueryBuilders.idsQuery().addIds("t1UY3nsB1bu3ZtXKDENc");
    doSearch(searchRequest, searchSourceBuilder, idsQueryBuilder);
}
IV. Compound queries
4.1. Compound query
public void boolQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    MatchQueryBuilder matchTitleQueryBuilder = QueryBuilders.matchQuery("title", "高级");
    // 25 < age <= 30 (lower bound excluded, upper bound included by default)
    RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("age").from(25, false).to(30);
    // The title must match, and the age must not fall in the range
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
            .must(matchTitleQueryBuilder)
            .mustNot(rangeQueryBuilder);
    doSearch(searchRequest, searchSourceBuilder, boolQueryBuilder);
}
4.2. Sorting
public void sortQuery() throws IOException {
    // Search request for the target index
    SearchRequest searchRequest = new SearchRequest("yuangong");
    // Builder for the search source
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    MatchQueryBuilder matchTitleQueryBuilder = QueryBuilders.matchQuery("title", "高级");
    RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("age").from(25, false).to(30);
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
            .must(matchTitleQueryBuilder)
            .mustNot(rangeQueryBuilder);
    // The sort itself is applied in doSearch
    doSearch(searchRequest, searchSourceBuilder, boolQueryBuilder);
}

public void doSearch(SearchRequest searchRequest, SearchSourceBuilder searchSourceBuilder, QueryBuilder query) throws IOException {
    // Sort by age ascending; text fields cannot be sorted on by default
    FieldSortBuilder age = SortBuilders.fieldSort("age").order(SortOrder.ASC);
    searchSourceBuilder.query(query).sort(age);
    // Attach the search source to the request
    searchRequest.source(searchSourceBuilder);
    // Execute the search; this sends an HTTP request to Elasticsearch
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    // Print the hits
    SearchHits hits = searchResponse.getHits();
    System.out.println(Arrays.toString(hits.getHits()));
}
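To see what the bool query plus the age sort is expected to return for the test data, the logic can be replayed in plain Java. This is a sketch under assumptions, not a substitute for running the query: `BoolQueryWalkthrough` and `Employee` are hypothetical, and `title.contains("高级")` stands in for the match query, assuming the analyzer emits 高级 as a token for these titles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration only: replays must / mustNot / sort on the test data in memory.
class BoolQueryWalkthrough {

    static final class Employee {
        final String name;
        final int age;
        final String title;

        Employee(String name, int age, String title) {
            this.name = name;
            this.age = age;
            this.title = title;
        }
    }

    // Name, age and title of the seven test documents.
    static List<Employee> testData() {
        return Arrays.asList(
                new Employee("张三三", 28, "初级Java开发"),
                new Employee("李四", 25, "高级Java开发"),
                new Employee("王五", 30, "资深Java开发"),
                new Employee("王六", 28, "高级前端开发"),
                new Employee("王狗蛋", 31, "高级产品经理"),
                new Employee("王麻子", 30, "资深业务专家"),
                new Employee("周二", 22, "初级测试开发"));
    }

    static List<String> search(List<Employee> employees) {
        List<Employee> hits = new ArrayList<>();
        for (Employee e : employees) {
            boolean must = e.title.contains("高级");      // stand-in for match on title
            boolean mustNot = e.age > 25 && e.age <= 30;  // from(25, false).to(30)
            if (must && !mustNot) {
                hits.add(e);
            }
        }
        hits.sort(Comparator.comparingInt(e -> e.age));   // age ascending
        List<String> names = new ArrayList<>();
        for (Employee e : hits) {
            names.add(e.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(search(testData())); // [李四, 王狗蛋]
    }
}
```

李四 (age 25) survives because the lower bound is exclusive, 王六 (age 28) is removed by mustNot, and 王狗蛋 (age 31) lies above the range.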
V. Analyzers
Use the _analyze API to check how the text you want to search for is tokenized.
5.1. No analyzer specified (default analyzer)
# No analyzer specified, so the default analyzer is used
POST _analyze
{
  "text": "资深Java开发"
}
Result:
{
  "tokens" : [
    {"token" : "资", "start_offset" : 0, "end_offset" : 1, "type" : "<IDEOGRAPHIC>", "position" : 0},
    {"token" : "深", "start_offset" : 1, "end_offset" : 2, "type" : "<IDEOGRAPHIC>", "position" : 1},
    {"token" : "java", "start_offset" : 2, "end_offset" : 6, "type" : "<ALPHANUM>", "position" : 2},
    {"token" : "开", "start_offset" : 6, "end_offset" : 7, "type" : "<IDEOGRAPHIC>", "position" : 3},
    {"token" : "发", "start_offset" : 7, "end_offset" : 8, "type" : "<IDEOGRAPHIC>", "position" : 4}
  ]
}
5.2. IK analyzer, ik_max_word (most tokens)
# Analyzer specified: IK in ik_max_word mode
POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "南京市长江大桥"
}
Result:
{
  "tokens" : [
    {"token" : "南京市", "start_offset" : 0, "end_offset" : 3, "type" : "CN_WORD", "position" : 0},
    {"token" : "南京", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 1},
    {"token" : "市长", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 2},
    {"token" : "长江大桥", "start_offset" : 3, "end_offset" : 7, "type" : "CN_WORD", "position" : 3},
    {"token" : "长江", "start_offset" : 3, "end_offset" : 5, "type" : "CN_WORD", "position" : 4},
    {"token" : "大桥", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 5}
  ]
}
5.3. IK analyzer, ik_smart (fewest tokens)
# Analyzer specified: IK in ik_smart mode
POST _analyze
{
  "text": "南京市长江大桥",
  "analyzer": "ik_smart"
}
Result:
{
  "tokens" : [
    {"token" : "南京市", "start_offset" : 0, "end_offset" : 3, "type" : "CN_WORD", "position" : 0},
    {"token" : "长江大桥", "start_offset" : 3, "end_offset" : 7, "type" : "CN_WORD", "position" : 1}
  ]
}