Making Elasticsearch Index Fields Case-Insensitive in Searches
Official guide: https://www.elastic.co/guide/en/elasticsearch/reference/current/normalizer.html
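When handling string data in Elasticsearch, if we want to store the whole string as a single term, we usually set its type to keyword. This setting can cause trouble, though: if the data is not cleaned properly before it is written, the same value may be stored with inconsistent casing. For example, apple and Apple both really mean apple, yet a search for apple does not return the document containing Apple. This is the problem a normalizer solves. Let's go straight to a hands-on example.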
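To make the problem concrete, here is a minimal Python sketch of how exact term lookup behaves. It is a toy model, not Elasticsearch's implementation: the dictionary-based index and the helper names build_term_index and term_lookup are assumptions made for illustration.

```python
from collections import defaultdict


def build_term_index(docs):
    """Map each stored value, unchanged, to the ids of the documents holding it."""
    index = defaultdict(list)
    for doc_id, value in docs.items():
        # A keyword field without a normalizer keeps the value exactly as written.
        index[value].append(doc_id)
    return index


def term_lookup(index, term):
    """Return the ids whose stored term equals the query term exactly."""
    return index.get(term, [])


docs = {1: "apple", 2: "Apple"}
index = build_term_index(docs)

print(term_lookup(index, "apple"))  # [1]
print(term_lookup(index, "Apple"))  # [2]
```

Each spelling is a separate term, so neither lookup finds the other document.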
Creating the index with a static mapping
PUT test
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "foo": {
        "type": "keyword",
        "normalizer": "my_normalizer"
      }
    }
  }
}
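Before indexing data, the normalizer can be checked with the analyze API, which accepts a normalizer name when called against an index. The Python sketch below only assembles the request line and body as text and does not contact a cluster; the helper name analyze_request is hypothetical.

```python
import json


def analyze_request(index, normalizer, text):
    """Build the console-style request for the index-level _analyze API."""
    body = {"normalizer": normalizer, "text": text}
    # ensure_ascii=False keeps characters such as "à" readable in the output.
    return f"GET {index}/_analyze\n{json.dumps(body, indent=2, ensure_ascii=False)}"


print(analyze_request("test", "my_normalizer", "BàR"))
```

Sent to a cluster where the index above exists, this request should come back with a single token, bar.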
Preparing the data
PUT test/_doc/1
{
  "foo": "BàR"
}
PUT test/_doc/2
{
  "foo": "bar"
}
PUT test/_doc/3
{
  "foo": "baz"
}
Testing the result
GET test/_search
{
  "query": {
    "term": {
      "foo": "BAR"
    }
  }
}
GET test/_search
{
  "query": {
    "match": {
      "foo": "BAR"
    }
  }
}
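According to the official guide, both queries match documents 1 and 2, because the normalizer is applied to the indexed value and also to the query input of term and match queries on the field. The Python sketch below imitates that behaviour on the three test documents. It is only an approximation: the normalize function is hypothetical, and stripping combining marks after NFKD decomposition stands in for the asciifolding filter, which covers more characters than this.

```python
import unicodedata


def normalize(value):
    """Rough stand-in for the lowercase + asciifolding filters (an assumption)."""
    lowered = value.lower()
    decomposed = unicodedata.normalize("NFKD", lowered)
    # Drop combining marks so that "à" becomes "a".
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


docs = {1: "BàR", 2: "bar", 3: "baz"}

# Index time: each value is normalized before it is stored as a term.
indexed = {doc_id: normalize(value) for doc_id, value in docs.items()}


def search(query):
    """Query time: the query input goes through the same normalization."""
    term = normalize(query)
    return sorted(doc_id for doc_id, stored in indexed.items() if stored == term)


print(search("BAR"))  # [1, 2]
```

Because the same normalization runs on both sides, BàR, bar, and BAR all collapse to the term bar, while baz stays distinct.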
A real-world index creation demo:
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 3,
    "refresh_interval": "1s",
    "translog": {
      "flush_threshold_size": "1.6gb"
    },
    "merge": {
      "scheduler": {
        "max_thread_count": "1"
      }
    },
    "index": {
      "routing": {
        "allocation": {
          "total_shards_per_node": "2"
        }
      }
    },
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "huid": { "index": true, "store": true, "type": "keyword" },
      "standard_name": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" },
      "hcode": { "index": true, "store": true, "type": "keyword" },
      "name": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" },
      "name_segments": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" },
      "name_segments_loc": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" },
      "pcode": { "index": true, "store": true, "type": "keyword" },
      "label": { "index": true, "store": true, "type": "keyword" },
      "hcreatetime": { "index": true, "store": true, "format": "yyyy-MM-dd HH:mm:ss", "type": "date" },
      "hupdatetime": { "index": true, "store": true, "format": "yyyy-MM-dd HH:mm:ss", "type": "date" },
      "create_by": { "index": true, "store": true, "type": "keyword" },
      "update_by": { "index": true, "store": true, "type": "keyword" },
      "hisvalid": { "index": true, "store": true, "type": "integer" },
      "src": { "index": true, "store": true, "type": "keyword" },
      "SEC_HCODE": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" },
      "SEC_TYPE": { "index": true, "store": true, "type": "keyword" },
      "EXCH_HCODE": { "index": true, "store": true, "type": "keyword" },
      "COMB_SYMBOL": { "index": true, "store": true, "type": "keyword" },
      "CNAME": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" },
      "CSNAME_PINYIN_FSIM": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" },
      "CSNAME": { "index": true, "store": true, "type": "keyword" },
      "ENAME": { "index": true, "store": true, "type": "keyword" },
      "ESNAME": { "index": true, "store": true, "type": "keyword" },
      "is_mstr_name": { "index": true, "store": true, "type": "integer" },
      "tag": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" },
      "name_rinse": { "index": true, "store": true, "type": "keyword", "normalizer": "my_normalizer" }
    }
  }
}
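Since most fields in the demo differ only in whether they carry the normalizer, a mapping of this shape can also be generated rather than written by hand. A minimal Python sketch follows; the helper name keyword_properties is hypothetical, and only a few field names from the demo are used.

```python
import json


def keyword_properties(plain_fields, normalized_fields, normalizer="my_normalizer"):
    """Build keyword field mappings; fields in normalized_fields get the normalizer."""
    properties = {}
    for name in plain_fields:
        properties[name] = {"index": True, "store": True, "type": "keyword"}
    for name in normalized_fields:
        properties[name] = {
            "index": True,
            "store": True,
            "type": "keyword",
            "normalizer": normalizer,
        }
    return properties


properties = keyword_properties(
    plain_fields=["huid", "hcode"],
    normalized_fields=["standard_name", "name"],
)
print(json.dumps({"mappings": {"properties": properties}}, indent=2))
```

Keeping the case-insensitive fields in one list makes it easier to see at a glance which fields are affected by the normalizer.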