简介
Elasticsearch是一个高度可扩展的全文搜索和分析引擎,基于Apache Lucence(事实上,Lucence也是百度所采用的搜索引擎)构建,能够对大容量的数据进行接近实时的存储、搜索和分析操作。
安装
安装Java8
在安装Elasticsearch前,首先需要安装JAVA环境。
可以先执行以下命令查看是否安装:
java -version
如果没有安装需要先到oracle官网下载jdk-6u45-windows-x64.exe并安装。下载时会提示需要登录账号才能下载。
配置Java环境变量
在系统环境变量中,新建JAVA_HOME
变量,值为:C:\Program Files\Java\jdk1.6.0_45
:
在path这个系统环境变量中,添加:;%JAVA_HOME%\bin
:
添加CLASSPATH变量,值为:.;%JAVA_HOME%\lib\dt.jar;%JAVA_HOME%\lib\tools.jar;
打开cmd命令行,输入javac -version
,如果成功执行命令,则说明jdk安装成功。
安装Elasticsearch
下载地址:https://www.elastic.co/cn/downloads/elasticsearch
目前最新版是8.4.1
下载后解压压缩包,在config/elasticsearch.yml
末尾添加:
ingest.geoip.downloader.enabled: false
然后修改如下配置:
network.host: 127.0.0.1
http.port: 9200
双击bin/elasticsearch.bat
运行后,报错,再次打开elasticsearch.yml
,修改:
xpack.security.enabled: false
双击elasticsearch.bat
运行,浏览器访问127.0.0.1:9200
,看到以下界面则说明安装成功:
基本使用
创建索引
PUT http://127.0.0.1:9200/<index>
<index>
:索引的名称
注:也可以不创建索引,后续在添加文档时,如果没有索引会自动创建。
删除索引
DELETE http://127.0.0.1:9200/<index>
添加/更新文档
POST http://127.0.0.1:9200/<index>/_doc/<_id>
<index>
:索引的名称<_id>
:文档的唯一标识符
例如:
POST http://127.0.0.1:9200/movies/_doc/1
{"id":1,"title":"Kung Fu Panda","overview":"When the Valley of Peace is threatened, lazy Po the panda discovers his destiny as the \"chosen one\" and trains to become a kung fu hero, but transforming the unsleek slacker into a brave warrior won't be easy. It's up to Master Shifu and the Furious Five -- Tigress, Crane, Mantis, Viper and Monkey -- to give it a try.","genres":["Action","Adventure","Animation","Family","Comedy"],"poster":"https://image.tmdb.org/t/p/w500/wWt4JYXTg5Wr3xBW2phBrMKgp3x.jpg","release_date":1212537600}
可以使用Postman来操作:
批量添加文档
PUT http://127.0.0.1:9200/<index>/_bulk
例如:
PUT http://127.0.0.1:9200/movies/_bulk
{"index":{"_id":1}}
{"id":1,"title":"Kung Fu Panda","overview":"When the Valley of Peace is threatened, lazy Po the panda discovers his destiny as the \"chosen one\" and trains to become a kung fu hero, but transforming the unsleek slacker into a brave warrior won't be easy. It's up to Master Shifu and the Furious Five -- Tigress, Crane, Mantis, Viper and Monkey -- to give it a try.","genres":["Action","Adventure","Animation","Family","Comedy"],"poster":"https://image.tmdb.org/t/p/w500/wWt4JYXTg5Wr3xBW2phBrMKgp3x.jpg","release_date":1212537600}
{"index":{"_id":2}}
{"id":2,"title":"Batman","overview":"Batman has not been seen for ten years. A new breed of criminal ravages Gotham City, forcing 55-year-old Bruce Wayne back into the cape and cowl. But, does he still have what it takes to fight crime in a new era?","genres":["Action","Animation","Mystery"],"poster":"https://image.tmdb.org/t/p/w500/kkjTbwV1Xnj8wBL52PjOcXzTbnb.jpg","release_date":1345507200}
注:提交数据的最后一行有一行空行\n
不能省略。
获取文档
GET http://127.0.0.1:9200/<index>/_doc/<_id>
例如:
GET http://127.0.0.1:9200/movies/_doc/1
删除文档
DELETE http://127.0.0.1:9200/<index>/_doc/<_id>
搜索文档
URL参数查询
GET http://127.0.0.1:9200/<index>/_search?q=<keyword>&sort=<field>:<direction>
q
:使用q参数来运行查询参数搜索<keyword>
:查询字符串sort
:排序
例如:
GET http://127.0.0.1:9200/movies/_search?q=panda&sort=id:asc
url中的更多参数请查看官方文档
响应正文
搜素后得到如下响应正文:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "movies",
"_id": "1",
"_score": null,
"_ignored": [
"overview.keyword"
],
"_source": {
"id": 1,
"title": "Kung Fu Panda",
"overview": "When the Valley of Peace...",
"genres": [
"Action",
...
],
"poster": "https://image.xxx.jpg",
"release_date": 1212537600
},
"sort": [
1
]
}
]
}
}
- took – 执行搜索的时间(以毫秒为单位)
- timed_out – 搜索是否超时
- _shards – 搜索了多少个分片,以及搜索成功/失败的分片数
- hits – 搜索结果
- hits.total – 符合搜索条件的文档总数
- hits.hits – 搜索结果数组(默认为前10个文档)
- hits.sort – 结果的排序键
- hits._score – 文档的相关性,数字越高,文档越相关。
更多参数解释可以查看官方文档
DSL查询
Elasticsearch 提供了基于 JSON 的完整 Query DSL(Domain Specific Language)来定义查询。
- term查询
查询ID为1的文档:
GET http://127.0.0.1:9200/movies/_search
{
"query": {
"term": {
"id": 1
}
}
}
一般使用term
检索非文本的精确值,例如商品价格、商品ID、登录账号等。
- terms查询
相当于多个term检索, 类似于SQL中in关键字的用法, 即在某些给定的数据中检索:
GET http://127.0.0.1:9200/movies/_search
{
"query": {
"terms": {
"title.keyword": [
"Kung Fu Panda", "Batman"
]
}
}
}
一般使用term
检索非文本的精确值,例如商品价格、商品ID、登录账号等。
- match文本模糊查询
查询标题包含panda的文档:
GET http://127.0.0.1:9200/movies/_search
{
"query": {
"match": {
"title": "panda"
}
}
}
- keyword匹配精确值
使用keyword后,文本精确值必须是匹配值才算匹配成功
GET http://127.0.0.1:9200/movies/_search
{
"query": {
"match": {
"title.keyword": "Kung Fu Panda"
}
}
}
- match_phrase短语匹配
使用match时会自动分词,如果不想分词,想查询一个完整的短语就可以使用短语匹配
GET http://127.0.0.1:9200/movies/_search
{
"query": {
"match_phrase": {
"title": "Kung Fu"
}
}
}
- multi_match多字段匹配
在title和overview两个字段中匹配关键词
GET http://127.0.0.1:9200/movies/_search
{
"query": {
"multi_match": {
"query": "Kung Fu",
"fields": ["title","overview"]
}
}
}
注: 查询内容会分词
- filer过滤查询
filer与must,must_not,should的不同: 是否满足都不会增加评分
GET http://127.0.0.1:9200/movies/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "panda"
}
}
],
"filter": {
"range": {
"release_date": {
"gte": 1212537500,
"lte": 1212537700
}
}
}
}
}
}
- 复合查询:
例如查询标题中包含panda
并且发布时间在指定范围的文档
GET http://127.0.0.1:9200/movies/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"release_date": {
"gte": 1212537500,
"lte": 1212537700
}
}
},
{
"match": {
"title": "panda"
}
}
],
"boost": 1.0
}
}
}
bool复合查询可以理解为与
, 多个查询条件都要一起满足。
must,must_not,should
- must: 必须满足
- must_not : 必须不满足
- should: 满不满足都可以,满足评分会更高
跟多用法参考官方文档
参考:
https://blog.csdn.net/Tc_lccc/article/details/118061349
https://blog.csdn.net/weixin_30650039/article/details/98046946