Configuring the IK Analyzer for Elasticsearch
#Download: https://github.com/medcl/elasticsearch-analysis-ik/releases
cd /usr/local/elasticsearch-7.9.3/bin
./elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.9.3/elasticsearch-analysis-ik-7.9.3.zip
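The plugin version has to match the Elasticsearch version, which is why the 7.9.3 plugin is installed into a 7.9.3 node. A minimal Python sketch for building the install command for another version follows. It assumes that other releases use the same URL pattern as the 7.9.3 link above, which is worth checking on the releases page; `ik_install_command` is a hypothetical helper name.

```python
# Assumption: every release follows the URL pattern of the 7.9.3 download link.
RELEASE_URL = (
    "https://github.com/medcl/elasticsearch-analysis-ik/releases/download/"
    "v{version}/elasticsearch-analysis-ik-{version}.zip"
)


def ik_install_command(es_version: str) -> str:
    """Return the elasticsearch-plugin command line for the given ES version."""
    url = RELEASE_URL.format(version=es_version)
    return f"./elasticsearch-plugin install {url}"


if __name__ == "__main__":
    # Run the printed command from the Elasticsearch bin directory.
    print(ik_install_command("7.9.3"))
```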
Restart ES so that the plugin is loaded:
ps -ef | grep elastic
kill -9 1710    # 1710 is an example; use the PID reported by ps
./elasticsearch -d
#Configure the IK analyzer
PUT /ikdb?pretty
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "ik" : {
          "tokenizer" : "ik_max_word"
        }
      }
    }
  }
}
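#Define the mapping so that the subject field is analyzed with the "ik" analyzer configured above
PUT /ikdb/_mapping/article?include_type_name=true&pretty
{
  "properties": {
    "subject": {
      "type": "text",
      "analyzer": "ik"
    }
  }
}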
#Query
POST /ikdb/article/_search?pretty
{
  "query" : { "match" : { "subject" : "希拉里和韩国" }}
}
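Note: IK provides two analysis modes, each available as both an analyzer and a tokenizer:
ik_max_word: splits the text at the finest granularity, producing as many words as possible
Sentence: 我爱我的祖国 ("I love my motherland")
Result: 我|爱|我|的|祖|国|祖国
ik_smart: splits the text at the coarsest granularity; characters already assigned to one word are not reused in another
Sentence: 我爱我的祖国
Result: 我|爱|我|的|祖国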
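The difference between the two modes can be illustrated with a toy Python sketch. This is not IK's actual algorithm: it assumes a tiny hypothetical dictionary and uses plain dictionary lookup for the fine-grained split and greedy longest match for the coarse one. It only shows why the fine-grained mode emits overlapping tokens while the coarse mode uses each character once.

```python
# Hypothetical toy dictionary; IK ships a far larger one.
DICTIONARY = {"我", "爱", "的", "祖", "国", "祖国"}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def finest_split(text):
    """Emit every dictionary word found at every position (overlaps allowed)."""
    tokens = []
    for start in range(len(text)):
        last_end = min(len(text), start + MAX_WORD_LEN)
        for end in range(start + 1, last_end + 1):
            if text[start:end] in DICTIONARY:
                tokens.append(text[start:end])
    return tokens


def coarsest_split(text):
    """Greedy longest match: each character ends up in exactly one token."""
    tokens = []
    pos = 0
    while pos < len(text):
        for length in range(min(MAX_WORD_LEN, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            # A single character is always accepted so unknown text still advances.
            if length == 1 or candidate in DICTIONARY:
                tokens.append(candidate)
                pos += length
                break
    return tokens


if __name__ == "__main__":
    sentence = "我爱我的祖国"
    print("|".join(finest_split(sentence)))
    print("|".join(coarsest_split(sentence)))
```

With this toy dictionary the fine-grained split yields the same seven tokens as the ik_max_word result above (in position order), and the coarse split yields the five tokens of the ik_smart result.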
#Check the analysis result
GET /_analyze?pretty
{
  "analyzer": "ik_max_word",
  "text": "我爱我的祖国"
}
#Insert test data
POST /ikdb/article/_bulk?pretty
{ "index" : { "_id" : "1" } }
{"subject" : "“闺蜜”崔顺实被韩检方传唤 韩总统府促彻查真相" }
{ "index" : { "_id" : "2" } }
{"subject" : "韩举行“护国训练” 青瓦台:决不许国家安全出问题" }
{ "index" : { "_id" : "3" } }
{"subject" : "媒体称FBI已经取得搜查令 检视希拉里电邮" }
{ "index" : { "_id" : "4" } }
{"subject" : "村上春树获安徒生奖 演讲中谈及欧洲排外问题" }
{ "index" : { "_id" : "5" } }
{"subject" : "希拉里团队炮轰FBI 参院民主党领袖批其“违法”" }
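Headlines often contain quotation marks, and an unescaped ASCII double quote inside a JSON string makes the bulk request invalid. One way to avoid hand-escaping is to generate the bulk body with a JSON serializer. The following is a small Python sketch; `build_bulk_body` is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def build_bulk_body(subjects, first_id=1):
    """Build an NDJSON _bulk body: one action line and one source line per document."""
    lines = []
    for offset, subject in enumerate(subjects):
        lines.append(json.dumps({"index": {"_id": str(first_id + offset)}}))
        # json.dumps escapes embedded double quotes; ensure_ascii=False keeps Chinese readable.
        lines.append(json.dumps({"subject": subject}, ensure_ascii=False))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_bulk_body(['韩举行"护国训练" 青瓦台:决不许国家安全出问题']), end="")
```

The resulting text can be pasted below the `POST /ikdb/article/_bulk?pretty` line or sent with any HTTP client.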
#Run an analyzed search for "希拉里和韩国" ("Hillary and South Korea") with highlighting
POST /ikdb/article/_search?pretty
{
  "query" : { "match" : { "subject" : "希拉里和韩国" }},
  "highlight" : {
    "pre_tags" : ["<font color=red>"],
    "post_tags" : ["</font>"],
    "fields" : {
      "subject" : {}
    }
  }
}
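Outside of a console such as Kibana, the same highlighted search can be sent from a script. The sketch below uses only the Python standard library to prepare the request. The base URL `http://localhost:9200` in the usage part is an assumption about where the node listens, and `build_highlight_search` is a hypothetical helper name.

```python
import json
import urllib.request


def build_highlight_search(base_url, path, field, text):
    """Prepare (but do not send) a match query with highlighting on one field."""
    body = {
        "query": {"match": {field: text}},
        "highlight": {
            "pre_tags": ["<font color=red>"],
            "post_tags": ["</font>"],
            "fields": {field: {}},
        },
    }
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumption: a local node on the default port.
    request = build_highlight_search(
        "http://localhost:9200", "/ikdb/article/_search", "subject", "希拉里和韩国"
    )
    with urllib.request.urlopen(request) as response:
        print(response.read().decode("utf-8"))
```

In the response, each hit carries a `highlight` section in which the matched terms of `subject` are wrapped in the configured tags.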
#Update the mapping: add an author field (existing fields such as subject do not need to be repeated)
PUT /ikdb/_mapping/article?include_type_name=true&pretty
{
  "properties": {
    "author": {
      "index": true,
      "type": "keyword"
    }
  }
}
#Insert test data
POST /ikdb/article/_bulk?pretty
{ "index" : { "_id" : "6" } }
{"subject" : "“闺蜜”崔顺实被韩检方传唤 韩总统府促彻查真相","author":"czl" }
#Query
POST /ikdb/article/_search?pretty
{
  "query" : { "match" : { "author" : "czl" }}
}
#Hot-word dictionary updates