Elasticsearch 7 SQL Explained
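When working with Elasticsearch I usually query data from Kibana using Query DSL. Every time I need Query DSL I find I have mostly forgotten it and have to review it again. I recently found that Elasticsearch supports SQL queries (since version 6.3), so I put together some notes on how to use it.
Introduction
Elasticsearch SQL is an X-Pack component that lets you run SQL-like queries against Elasticsearch in real time. Whether through the REST interface, the command line, or JDBC, any client can use SQL to natively search and aggregate data in Elasticsearch. You can think of Elasticsearch SQL as a translator that turns SQL into Query DSL.
Elasticsearch SQL has the following characteristics:
- Native integration: Elasticsearch SQL is built specifically for Elasticsearch.
- No external parts: no additional hardware, processes, runtimes, or libraries are needed to query Elasticsearch; Elasticsearch SQL runs directly inside Elasticsearch.
- Lightweight and efficient: Elasticsearch SQL does not abstract away Elasticsearch's search capabilities. On the contrary, it embraces SQL to expose full-text search, in real time and in a concise way.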
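Preparation
First install Elasticsearch and Kibana; version 7.17.0 is used here.
Once installed, open the Kibana Dev Tools console at http://127.0.0.1:5601/app/dev_tools#/console
Import the test data, available at: https://github.com/macrozheng/mall-learning/blob/master/document/json/accounts.json
Run the following command in Kibana Dev Tools. The listing shows only the first 10 records of that file; the query results later in this article (with account numbers such as 486 and 954) come from the full data set.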
POST /account/_bulk
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
{"index":{"_id":"13"}}
{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"nanettebates@quility.com","city":"Nogal","state":"VA"}
{"index":{"_id":"18"}}
{"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","employer":"Boink","email":"daleadams@boink.com","city":"Orick","state":"MD"}
{"index":{"_id":"20"}}
{"account_number":20,"balance":16418,"firstname":"Elinor","lastname":"Ratliff","age":36,"gender":"M","address":"282 Kings Place","employer":"Scentric","email":"elinorratliff@scentric.com","city":"Ribera","state":"WA"}
{"index":{"_id":"25"}}
{"account_number":25,"balance":40540,"firstname":"Virginia","lastname":"Ayala","age":39,"gender":"F","address":"171 Putnam Avenue","employer":"Filodyne","email":"virginiaayala@filodyne.com","city":"Nicholson","state":"PA"}
{"index":{"_id":"32"}}
{"account_number":32,"balance":48086,"firstname":"Dillard","lastname":"Mcpherson","age":34,"gender":"F","address":"702 Quentin Street","employer":"Quailcom","email":"dillardmcpherson@quailcom.com","city":"Veguita","state":"IN"}
{"index":{"_id":"37"}}
{"account_number":37,"balance":18612,"firstname":"Mcgee","lastname":"Mooney","age":39,"gender":"M","address":"826 Fillmore Place","employer":"Reversus","email":"mcgeemooney@reversus.com","city":"Tooleville","state":"OK"}
{"index":{"_id":"44"}}
{"account_number":44,"balance":34487,"firstname":"Aurelia","lastname":"Harding","age":37,"gender":"M","address":"502 Baycliff Terrace","employer":"Orbalix","email":"aureliaharding@orbalix.com","city":"Yardville","state":"DE"}
{"index":{"_id":"49"}}
{"account_number":49,"balance":29104,"firstname":"Fulton","lastname":"Holt","age":23,"gender":"F","address":"451 Humboldt Street","employer":"Anocha","email":"fultonholt@anocha.com","city":"Sunriver","state":"RI"}
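The bulk payload above alternates an action line and a document line. As a minimal sketch of how such a payload could be generated from records held in memory, the helper below (build_bulk_body is a hypothetical name, and using account_number as the _id is an assumption that mirrors the listing) serializes each record into that two-line form:

```python
import json


def build_bulk_body(docs, id_field="account_number"):
    """Return an NDJSON _bulk payload: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        # Action line: index the document under an explicit string _id, as in the listing above.
        lines.append(json.dumps({"index": {"_id": str(doc[id_field])}}, separators=(",", ":")))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc, separators=(",", ":")))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"account_number": 1, "balance": 39225, "firstname": "Amber"},
    {"account_number": 6, "balance": 5686, "firstname": "Hattie"},
]
body = build_bulk_body(docs)
print(body, end="")
```

The printed text can be pasted under POST /account/_bulk in Dev Tools, or sent as the request body by any HTTP client.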
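First SQL query
Let's use SQL to fetch the first 10 records. The format parameter controls the format of the response: txt returns plain text, which is easier to read, and the default is json.
Enter the following command in Kibana Dev Tools: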
POST /_sql?format=txt
{
"query": "SELECT account_number,address,age,balance FROM account LIMIT 10"
}
The result is shown below.
account_number | address | age | balance
---------------+--------------------+---------------+---------------
1 |880 Holmes Lane |32 |39225
6 |671 Bristol Street |36 |5686
13 |789 Madison Street |28 |32838
18 |467 Hutchinson Court|33 |4180
20 |282 Kings Place |36 |16418
25 |171 Putnam Avenue |39 |40540
32 |702 Quentin Street |34 |48086
37 |826 Fillmore Place |39 |18612
44 |502 Baycliff Terrace|37 |34487
49 |451 Humboldt Street |23 |29104
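As shown above, the _sql endpoint selects the SQL module, the query field holds the SQL statement to run, and format selects the format of the returned data. The available formats are listed below; their names are self-explanatory:
format | Accept HTTP header | Description |
---|---|---|
csv | text/csv | Comma-separated values |
json | application/json | JSON format |
tsv | text/tab-separated-values | Tab-separated values |
txt | text/plain | Plain-text format |
yaml | application/yaml | YAML |
cbor | application/cbor | Concise Binary Object Representation |
smile | application/smile | Another binary format, similar to CBOR |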
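Outside Kibana, the same call is an ordinary HTTP POST. The sketch below uses only the Python standard library to build such a request; build_sql_request is a hypothetical helper, and http://localhost:9200 assumes a local node without authentication:

```python
import json
import urllib.request


def build_sql_request(base_url, sql, fmt="json"):
    """Build (but do not send) a POST request for the _sql endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_sql?format={fmt}",  # response format chosen via the URL parameter
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_sql_request(
    "http://localhost:9200",  # assumed address of a local node
    "SELECT account_number, age FROM account LIMIT 10",
    fmt="csv",
)
print(req.get_method(), req.full_url)

# To actually send it against a running node:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before anything touches the cluster.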
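Translating SQL to DSL
When you need Query DSL, you can write the query in SQL first and then convert it with the Translate API.
For example, let's translate the following query: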
POST /_sql/translate
{
"query": "SELECT account_number,address,age,balance FROM account WHERE age>32 LIMIT 10"
}
The resulting Query DSL is shown below.
{
"size" : 10,
"query" : {
"range" : {
"age" : {
"from" : 32,
"to" : null,
"include_lower" : false,
"include_upper" : false,
"boost" : 1.0
}
}
},
"_source" : false,
"fields" : [
{
"field" : "account_number"
},
{
"field" : "address"
},
{
"field" : "age"
},
{
"field" : "balance"
}
],
"sort" : [
{
"_doc" : {
"order" : "asc"
}
}
]
}
You can then run the query using Query DSL syntax:
GET /account/_search
{
"size": 10,
"query": {
"range": {
"age": {
"from": 32,
"to": null,
"include_lower": false,
"include_upper": false,
"boost": 1
}
}
},
"_source": false,
"fields": [
{
"field": "account_number"
},
{
"field": "address"
},
{
"field": "age"
},
{
"field": "balance"
}
],
"sort": [
{
"_doc": {
"order": "asc"
}
}
]
}
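The range clause produced by the Translate API uses the verbose from / to / include_lower / include_upper form. It expresses the same condition as the shorter gt / gte / lt / lte form that is usually written by hand. A small sketch of that correspondence (simplify_range is a hypothetical helper, not part of any Elasticsearch client):

```python
def simplify_range(spec):
    """Convert a from/to/include_* range spec into the equivalent gt/gte/lt/lte form."""
    out = {}
    if spec.get("from") is not None:
        # An inclusive lower bound maps to gte, an exclusive one to gt.
        out["gte" if spec.get("include_lower", True) else "gt"] = spec["from"]
    if spec.get("to") is not None:
        # An inclusive upper bound maps to lte, an exclusive one to lt.
        out["lte" if spec.get("include_upper", True) else "lt"] = spec["to"]
    if spec.get("boost", 1.0) != 1.0:
        # Keep boost only when it differs from the default of 1.0.
        out["boost"] = spec["boost"]
    return out


translated = {"from": 32, "to": None, "include_lower": False, "include_upper": False, "boost": 1.0}
print(simplify_range(translated))  # {'gt': 32}
```

So the translated query for age>32 is simply a range query with "gt": 32 on the age field.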
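Mixing SQL and DSL
SQL and Query DSL can also be combined, for example by using Query DSL to set a filter condition.
To query records whose age is between 30 and 35, use the following request: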
POST /_sql?format=txt
{
"query": "SELECT account_number,address,age,balance FROM account",
"filter": {
"range": {
"age": {
"gte": 30,
"lte": 35
}
}
},
"fetch_size": 10
}
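When such requests are assembled in code, the SQL text and the DSL filter can be kept as separate pieces and merged into one body. A minimal sketch, in which sql_body and age_range_filter are hypothetical helper names:

```python
import json


def age_range_filter(low, high):
    """Query DSL range filter on the age field, inclusive on both ends."""
    return {"range": {"age": {"gte": low, "lte": high}}}


def sql_body(sql, dsl_filter=None, fetch_size=None):
    """Assemble the JSON body for POST /_sql from its optional parts."""
    body = {"query": sql}
    if dsl_filter is not None:
        body["filter"] = dsl_filter  # Query DSL filter combined with the SQL statement
    if fetch_size is not None:
        body["fetch_size"] = fetch_size  # number of rows per page
    return body


body = sql_body(
    "SELECT account_number,address,age,balance FROM account",
    dsl_filter=age_range_filter(30, 35),
    fetch_size=10,
)
print(json.dumps(body, indent=2))
```

The printed JSON matches the request body shown above and can be posted to /_sql unchanged.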
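Mapping between SQL and ES concepts
Although SQL and Elasticsearch use different terms for how data is organized (and have different semantics), they serve essentially the same purpose. The mapping between them is as follows:
SQL | Elasticsearch | Description |
---|---|---|
column | field | What Elasticsearch calls a field, SQL calls a column. Note that in Elasticsearch a field can contain multiple values of the same type (essentially a list), whereas in SQL a column holds exactly one value of its type. Elasticsearch SQL does its best to preserve SQL semantics and, depending on the query, rejects fields that return more than one value. |
row | document | Columns and fields do not exist by themselves; they are part of a row or a document. The two have slightly different semantics: a row tends to be strict (with more enforcement), while a document tends to be more flexible or loose (while still having a structure). |
table | index | The target that queries run against, in both SQL and Elasticsearch. |
schema | implicit | In a relational database, a schema is mainly a namespace for tables and is typically used as a security boundary. Elasticsearch does not provide an equivalent concept. |
Although the mapping between these concepts is not exact, they have more in common than they have differences.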
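Lexical structure
The lexical structure of ES SQL largely resembles that of ANSI SQL. ES SQL currently accepts only one command at a time, where a command is a sequence of tokens terminated by the end of the input stream. A token can be a keyword, an identifier (quoted or unquoted), a literal (or constant), or a special character symbol (typically a delimiter).
Keywords
Keywords are defined just as in ordinary SQL statements: SELECT, FROM and so on are keywords. Note that keywords are case-insensitive.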
SELECT * FROM my_table
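The example above has 4 tokens: SELECT, *, FROM and my_table. Of these, SELECT, * and FROM are keywords, that is, words with a fixed meaning in SQL, while my_table is an identifier, which names an SQL entity such as a table or a column.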
Identifiers
There are two kinds of identifiers, quoted and unquoted. For example:
SELECT ip_address FROM "hosts-*"
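The query above contains two identifiers: the unquoted ip_address and the quoted hosts-* (a wildcard pattern).
Because ip_address does not clash with any keyword, it needs no quotes. hosts-*, on the other hand, clashes with the - (minus) operator and with *, so it must be quoted.
📝 Note: avoid complex identifier names and names that clash with keywords where possible, and quote identifiers when writing queries; this removes ambiguity.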
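When SQL text is generated by a program, the quoting rule above can be applied mechanically. The sketch below is one conservative way to do it: quote_identifier is a hypothetical helper, the keyword set is only a small illustrative subset of the reserved words, and names containing a double quote are rejected outright rather than escaped.

```python
import re

# Names made only of letters, digits and underscores (not starting with a digit) are "plain".
_PLAIN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Small illustrative subset of reserved words; the real list is longer.
_KEYWORDS = {"SELECT", "FROM", "WHERE", "GROUP", "BY", "HAVING", "ORDER", "LIMIT",
             "AND", "OR", "NOT", "IN", "LIKE", "BETWEEN"}


def quote_identifier(name):
    """Return name unchanged if it is unambiguous, otherwise wrap it in double quotes."""
    if '"' in name:
        # Conservative assumption for this sketch: refuse instead of escaping.
        raise ValueError("identifier must not contain a double quote")
    if _PLAIN.fullmatch(name) and name.upper() not in _KEYWORDS:
        return name
    return f'"{name}"'


print(f"SELECT {quote_identifier('ip_address')} FROM {quote_identifier('hosts-*')}")
```

Quoting an identifier that did not strictly need it is harmless, so a helper like this can err on the side of quoting.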
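Literals (constants)
ES SQL supports two kinds of implicitly typed literals: strings and numbers.
- Strings: a string is delimited by single quotes, for example 'mysql'. A single quote inside the string is escaped by doubling it, for example 'Captain EO''s Voyage'.
- Numeric constants: a numeric constant can be written in decimal or scientific notation, for example: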
1969 -- integer notation
3.14 -- decimal notation
.1234 -- decimal notation starting with decimal point
4E5 -- scientific notation (with exponent marker)
1.2e-3 -- scientific notation with decimal point
A numeric constant that contains a decimal point or an exponent is parsed as Double. Otherwise it is parsed as Integer if the value fits, and as Long if it does not.
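The two literal rules can be sketched in a few lines. Both function names below are hypothetical. The numeric classifier assumes Integer is a 32-bit signed type and that exponent notation always yields Double, which is my reading of the official documentation rather than something verified here.

```python
def sql_string_literal(value):
    """Wrap value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def numeric_literal_type(text):
    """Classify an unsigned numeric literal as DOUBLE, INTEGER or LONG."""
    if "." in text or "e" in text.lower():
        # A decimal point or an exponent marker makes the literal a Double.
        return "DOUBLE"
    value = int(text)
    # Assumption: Integer is 32-bit signed; anything larger is treated as Long.
    return "INTEGER" if value <= 2**31 - 1 else "LONG"


print(sql_string_literal("Captain EO's Voyage"))
for literal in ["1969", "3.14", ".1234", "4E5", "1.2e-3", "3000000000"]:
    print(literal, numeric_literal_type(literal))
```

Escaping string values this way matters whenever user input ends up inside a SQL statement sent to the _sql endpoint.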
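Single quotes and double quotes
In SQL, single quotes and double quotes have different meanings and cannot be used interchangeably. Single quotes declare a string, while double quotes denote an identifier. For example: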
SELECT "first_name" FROM "musicians" WHERE "last_name" = 'Carroll'
In the example above, first_name, musicians and last_name are identifiers and use double quotes, while Carroll is a string and uses single quotes.
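Special characters
Some characters that are neither letters nor digits have a dedicated meaning distinct from that of an operator. The special characters are:
Character | Description |
---|---|
* | In some contexts denotes all fields of a table; can also be used as the argument of some aggregate functions. |
, | Separates the elements of a list. |
. | Used in numeric constants, or to separate identifier qualifiers (table, column, etc.). |
() | Used in specific SQL commands, in function declarations, or to enforce precedence. |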
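Operators
Most operators in ES SQL have the same precedence and are left-associative. To change the order of evaluation, use parentheses to force it. The table below lists the operators ES SQL supports, from highest to lowest precedence:
Operator | Associativity | Description |
---|---|---|
. | left | Qualifier separator |
:: | left | PostgreSQL-style type cast |
+ - | right | Unary plus and minus |
* / % | left | Multiplication, division, modulo |
+ - | left | Addition, subtraction |
BETWEEN IN LIKE | | Range containment, string matching |
< > <= >= = <=> <> != | | Comparison |
NOT | right | Logical negation |
AND | left | Logical conjunction |
OR | left | Logical disjunction |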
Comments
ES SQL supports two kinds of comments, single-line and multi-line. For example:
-- single line comment
/* multi
line
comment
that supports /* nested comments */
*/
Common SQL operations
Syntax overview
SQL queries in ES use essentially the same syntax as in a database. The general form is:
SELECT select_expr [, ...]
[ FROM table_name ]
[ WHERE condition ]
[ GROUP BY grouping_element [, ...] ]
[ HAVING condition]
[ ORDER BY expression [ ASC | DESC ] [, ...] ]
[ LIMIT [ count ] ]
[ PIVOT ( aggregation_expr FOR column IN ( value [ [ AS ] alias ] [, ...] ) ) ]
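The clause order in this grammar is fixed, which makes it easy to assemble statements programmatically. The sketch below is a hypothetical helper (build_select) that concatenates the optional clauses in that order; it does not cover PIVOT and performs no escaping, so it assumes trusted inputs.

```python
def build_select(columns, table, where=None, group_by=None, having=None,
                 order_by=None, limit=None):
    """Assemble a SELECT statement, emitting optional clauses in grammar order."""
    parts = ["SELECT " + ", ".join(columns), "FROM " + table]
    if where:
        parts.append("WHERE " + where)
    if group_by:
        parts.append("GROUP BY " + ", ".join(group_by))
    if having:
        parts.append("HAVING " + having)
    if order_by:
        parts.append("ORDER BY " + ", ".join(order_by))
    if limit is not None:
        parts.append(f"LIMIT {int(limit)}")
    return " ".join(parts)


sql = build_select(
    ["state", "COUNT(*)", "MAX(age)", "AVG(balance)"],
    "account",
    group_by=["state"],
    having="COUNT(*) > 15",
    limit=10,
)
print(sql)
```

The resulting string can be placed in the query field of a POST /_sql request.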
WHERE
The WHERE clause sets the query condition. For example, to find records whose state field is VA, use the following query.
POST /_sql?format=txt
{
"query": "SELECT account_number,address,age,balance,state FROM account WHERE state='VA' LIMIT 10"
}
The result:
account_number | address | age | balance | state
---------------+--------------------+---------------+---------------+---------------
13 |789 Madison Street |28 |32838 |VA
486 |991 Applegate Court |22 |35902 |VA
703 |489 Flatlands Avenue|29 |27443 |VA
835 |641 Royce Street |25 |46558 |VA
897 |731 Poplar Street |25 |45973 |VA
564 |842 Congress Street |22 |43631 |VA
588 |301 Anna Court |31 |43531 |VA
660 |916 Amersfort Place |33 |46427 |VA
797 |919 Quay Street |26 |6854 |VA
836 |953 Dinsmore Place |25 |20797 |VA
GROUP BY
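The GROUP BY clause groups the data so that you can compute per-group statistics such as the record count, the maximum age, and the average balance. The query is as follows.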
POST /_sql?format=txt
{
"query": "SELECT state,COUNT(*),MAX(age),AVG(balance) FROM account GROUP BY state LIMIT 10"
}
HAVING
The HAVING clause filters the grouped data a second time, for example keeping only groups with more than 15 records. The query is as follows.
POST /_sql?format=txt
{
"query": "SELECT state,COUNT(*),MAX(age),AVG(balance) FROM account GROUP BY state HAVING COUNT(*)>15 LIMIT 10"
}
The result:
state | COUNT(*) | MAX(age) | AVG(balance)
---------------+---------------+---------------+------------------
AK |22 |40 |26131.863636363636
AL |25 |40 |25739.56
AR |18 |39 |27238.166666666668
CA |17 |40 |22517.882352941175
CT |16 |39 |28278.4375
DC |24 |40 |23180.583333333332
FL |18 |38 |20443.444444444445
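To make the semantics of GROUP BY and HAVING concrete, the sketch below computes the same three aggregates in plain Python over the 10 sample records listed in the Preparation section. It groups by gender instead of state, because every one of those 10 records has a different state; the threshold of 4 is chosen only for this small sample.

```python
from collections import defaultdict

# (gender, age, balance) taken from the 10 sample records imported earlier.
rows = [
    ("M", 32, 39225), ("M", 36, 5686), ("F", 28, 32838), ("M", 33, 4180),
    ("M", 36, 16418), ("F", 39, 40540), ("F", 34, 48086), ("M", 39, 18612),
    ("M", 37, 34487), ("F", 23, 29104),
]

# GROUP BY gender: collect the rows of each group.
groups = defaultdict(list)
for gender, age, balance in rows:
    groups[gender].append((age, balance))

# SELECT COUNT(*), MAX(age), AVG(balance) for each group.
summary = {
    gender: (
        len(members),
        max(age for age, _ in members),
        sum(balance for _, balance in members) / len(members),
    )
    for gender, members in groups.items()
}

# HAVING COUNT(*) > 4: filter on the aggregated value, after grouping.
having = {gender: stats for gender, stats in summary.items() if stats[0] > 4}

print(summary)
print(having)
```

The key point is the order of evaluation: WHERE would filter individual rows before grouping, whereas HAVING filters whole groups after the aggregates have been computed.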
ORDER BY
The ORDER BY clause sorts the data, for example by the balance field from high to low. The query is as follows.
POST /_sql?format=txt
{
"query": "SELECT account_number,address,age,balance,state FROM account ORDER BY balance DESC LIMIT 10 "
}
The result:
account_number | address | age | balance | state
---------------+----------------------+---------------+---------------+---------------
248 |717 Hendrickson Place |36 |49989 |WA
854 |603 Cooper Street |25 |49795 |AL
240 |659 Highland Boulevard|35 |49741 |NH
97 |512 Cumberland Walk |40 |49671 |MO
842 |833 Bushwick Court |23 |49587 |TX
168 |975 Flatbush Avenue |20 |49568 |IL
803 |963 Highland Avenue |25 |49567 |MS
926 |833 Quincy Street |21 |49433 |VT
954 |688 Hart Street |22 |49404 |MD
572 |994 Chester Court |20 |49355 |UT
DESCRIBE
The DESCRIBE statement shows which fields a table (an index in ES) contains. For example, to view the fields of the account table, use the following query.
POST /_sql?format=txt
{
"query": "DESCRIBE account"
}
The result:
column | type | mapping
-----------------+---------------+---------------
account_number |BIGINT |long
address |VARCHAR |text
address.keyword |VARCHAR |keyword
age |BIGINT |long
balance |BIGINT |long
city |VARCHAR |text
city.keyword |VARCHAR |keyword
email |VARCHAR |text
email.keyword |VARCHAR |keyword
employer |VARCHAR |text
employer.keyword |VARCHAR |keyword
firstname |VARCHAR |text
firstname.keyword|VARCHAR |keyword
gender |VARCHAR |text
gender.keyword |VARCHAR |keyword
lastname |VARCHAR |text
lastname.keyword |VARCHAR |keyword
state |VARCHAR |text
state.keyword |VARCHAR |keyword
SHOW TABLES
The SHOW TABLES statement lists all tables (indices in ES).
POST /_sql?format=txt
{
"query": "SHOW TABLES"
}
The result:
#! this request accesses system indices: [.kibana_7.17.0_001, .kibana_task_manager_7.17.0_001], but in a future major version, direct access to system indices will be prevented by default
#! this request accesses system indices: [.apm-agent-configuration, .apm-custom-link, .async-search, .kibana_7.17.0_001, .kibana_task_manager_7.17.0_001, .tasks], but in a future major version, direct access to system indices will be prevented by default
catalog | name | type | kind
---------------+-------------------------------+---------------+---------------
my-application |.apm-agent-configuration |TABLE |INDEX
my-application |.apm-custom-link |TABLE |INDEX
my-application |.async-search |TABLE |INDEX
my-application |.kibana |VIEW |ALIAS
my-application |.kibana_7.17.0 |VIEW |ALIAS
my-application |.kibana_7.17.0_001 |TABLE |INDEX
my-application |.kibana_task_manager |VIEW |ALIAS
my-application |.kibana_task_manager_7.17.0 |VIEW |ALIAS
my-application |.kibana_task_manager_7.17.0_001|TABLE |INDEX
my-application |.tasks |TABLE |INDEX
my-application |account |TABLE |INDEX
my-application |kibana_sample_data_flights |TABLE |INDEX
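Supported functions
When querying ES data with SQL you can use not only common SQL functions but also some functions that are specific to ES.
Listing supported functions
The SHOW FUNCTIONS statement lists all supported functions. For example, to find all functions whose names contain DATE, use the following statement.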
POST /_sql?format=txt
{
"query": "SHOW FUNCTIONS LIKE '%DATE%'"
}
The result:
name | type
---------------+---------------
CURDATE |SCALAR
CURRENT_DATE |SCALAR
DATEADD |SCALAR
DATEDIFF |SCALAR
DATEPART |SCALAR
DATETIME_FORMAT|SCALAR
DATETIME_PARSE |SCALAR
DATETRUNC |SCALAR
DATE_ADD |SCALAR
DATE_DIFF |SCALAR
DATE_PARSE |SCALAR
DATE_PART |SCALAR
DATE_TRUNC |SCALAR
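The LIKE pattern used here follows the usual SQL convention: % matches any sequence of characters and _ matches exactly one character. As a rough illustration of that matching rule (not of how Elasticsearch implements it), the sketch below converts a LIKE pattern to a regular expression and filters a handful of function names; like_to_regex and like_filter are hypothetical helpers, and escape characters are not handled.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any sequence of characters, including none
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def like_filter(names, pattern):
    """Return the names that match the whole LIKE pattern."""
    regex = like_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


names = ["CURDATE", "DATEADD", "DATE_TRUNC", "NOW", "CURRENT_TIME"]
print(like_filter(names, "%DATE%"))
print(like_filter(names, "DATE_DD"))
```

With '%DATE%' the pattern matches DATE anywhere in the name, which is why both CURDATE and DATE_TRUNC appear in the output above.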
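Full-text search functions
Full-text search functions are specific to ES. Using the MATCH or QUERY function enables full-text search, and the SCORE function returns the relevance score of each hit.
MATCH()
Use the MATCH function to find records whose address contains Street.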
POST /_sql?format=txt
{
"query": "SELECT account_number,address,age,balance,SCORE() FROM account WHERE MATCH(address,'Street') LIMIT 10"
}
The result:
account_number | address | age | balance | SCORE()
---------------+-----------------------+---------------+---------------+---------------
6 |671 Bristol Street |36 |5686 |0.95395315
13 |789 Madison Street |28 |32838 |0.95395315
32 |702 Quentin Street |34 |48086 |0.95395315
49 |451 Humboldt Street |23 |29104 |0.95395315
51 |334 River Street |31 |14097 |0.95395315
63 |510 Sedgwick Street |30 |6077 |0.95395315
87 |446 Halleck Street |22 |1133 |0.95395315
107 |694 Jefferson Street |28 |48844 |0.95395315
138 |422 Malbone Street |39 |9006 |0.95395315
140 |878 Schermerhorn Street|32 |26696 |0.95395315
QUERY()
Use the QUERY function to find records whose address contains Street.
POST /_sql?format=txt
{
"query": "SELECT account_number,address,age,balance,SCORE() FROM account WHERE QUERY('address:Street') LIMIT 10"
}
The result:
account_number | address | age | balance | SCORE()
---------------+-----------------------+---------------+---------------+---------------
6 |671 Bristol Street |36 |5686 |0.95395315
13 |789 Madison Street |28 |32838 |0.95395315
32 |702 Quentin Street |34 |48086 |0.95395315
49 |451 Humboldt Street |23 |29104 |0.95395315
51 |334 River Street |31 |14097 |0.95395315
63 |510 Sedgwick Street |30 |6077 |0.95395315
87 |446 Halleck Street |22 |1133 |0.95395315
107 |694 Jefferson Street |28 |48844 |0.95395315
138 |422 Malbone Street |39 |9006 |0.95395315
140 |878 Schermerhorn Street|32 |26696 |0.95395315
SQL CLI
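If you would rather not go through Kibana to use ES SQL, you can use the SQL CLI bundled with ES; the command is in the bin directory of the ES installation.
Start the SQL CLI with the following command: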
elasticsearch-sql-cli http://localhost:9200
You can then type SQL statements directly; note that each statement must end with a semicolon.
SELECT account_number,address,age,balance FROM account LIMIT 10;
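Limitations of ES SQL
Querying ES with SQL has some limitations. It is not as powerful as native Query DSL, and support for nested fields and for some functions is limited, but it is generally enough for everyday data lookups.
ES SQL in practice
First we prepare some data; here we use the flight sample data provided by Kibana.
In Kibana, click Overview under Analytics in the left sidebar, choose Dashboard in the page on the right, click the Install some sample data link, and then click Sample flight data to load the flight data.
You can inspect the flight data with the following request: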
POST /kibana_sample_data_flights/_search
{
"query": {
"match_all": {}
}
}
Now let's see how to write some common SQL queries.
1. WHERE
Filter for flights whose destination country is US:
POST /_sql?format=txt
{
"query": "SELECT FlightNum, OriginWeather, OriginCountry, Carrier FROM kibana_sample_data_flights WHERE DestCountry = 'US'"
}
The result:
FlightNum | OriginWeather | OriginCountry | Carrier
---------------+-------------------+---------------+----------------
R43CELD |Cloudy |US |JetBeats
3YAQM9U |Clear |US |JetBeats
8SHQI41 |Cloudy |US |JetBeats
HF9AP10 |Sunny |US |JetBeats
ZTL6FPB |Heavy Fog |IT |ES-Air
TF9BTQL |Clear |JP |Kibana Airlines
T9QK7GX |Clear |IN |Logstash Airways
4AHGESO |Rain |ZA |Kibana Airlines
J684XSR |Sunny |AR |JetBeats
T390OH4 |Cloudy |IN |ES-Air
Q33SYKK |Sunny |KR |JetBeats
JBQ50Y2 |Clear |IT |Logstash Airways
2. GROUP BY
The GROUP BY clause groups data for aggregate statistics, for example to compute the average flight distance per group of flights:
POST /_sql?format=txt
{
"query": "SELECT count(*),max(DistanceMiles), avg(DistanceMiles) FROM kibana_sample_data_flights GROUP BY DestCountry"
}
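In the example above we group by destination country and compute, for each group, the number of flights, the maximum flight distance, and the average flight distance. The result: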
count(*) |max(DistanceMiles)|avg(DistanceMiles)
---------------+------------------+------------------
46 |7600.7158203125 |3233.800320625305
305 |12140.8603515625 |6603.605808945953
377 |9917.6455078125 |3128.910634331741
416 |10832.3994140625 |7915.6610843951885
944 |10600.296875 |4077.664177652133
691 |10293.208984375 |2775.8247816469493
45 |12075.3935546875 |7542.028591579861
1096 |12353.7802734375 |5037.134736095902
91 |10000.7255859375 |5683.497867123111
278 |10030.87109375 |3448.2222546090325
48 |9670.9072265625 |3278.826272328695
237 |10575.1279296875 |5419.154288118902
15 |10346.84765625 |3214.9680114746093
3. HAVING
HAVING filters the grouped data a second time, for example keeping only groups with more than 100 records. The query is as follows:
POST /_sql?format=txt
{
"query": "SELECT count(*),max(DistanceMiles), avg(DistanceMiles) FROM kibana_sample_data_flights GROUP BY DestCountry HAVING COUNT(*) > 100"
}
This keeps only the groups with more than 100 records. The result:
count(*) |max(DistanceMiles)|avg(DistanceMiles)
---------------+------------------+------------------
305 |12140.8603515625 |6603.605808945953
377 |9917.6455078125 |3128.910634331741
416 |10832.3994140625 |7915.6610843951885
944 |10600.296875 |4077.664177652133
691 |10293.208984375 |2775.8247816469493
1096 |12353.7802734375 |5037.134736095902
278 |10030.87109375 |3448.2222546090325
237 |10575.1279296875 |5419.154288118902
449 |10282.5048828125 |3213.2889483309536
373 |10774.0 |5064.675941446831
4. ORDER BY
ORDER BY sorts the results, for example by average flight distance in descending order. The query is as follows:
POST /_sql?format=txt
{
"query": "SELECT count(*),max(DistanceMiles), avg(DistanceMiles) as avgDistance FROM kibana_sample_data_flights GROUP BY DestCountry HAVING COUNT(*) > 100 ORDER BY avgDistance desc"
}
The example above sorts the data by average distance. The result:
count(*) |max(DistanceMiles)| avgDistance
---------------+------------------+------------------
416 |10832.3994140625 |7915.6610843951885
305 |12140.8603515625 |6603.605808945953
283 |10556.7587890625 |6030.0211101842015
237 |10575.1279296875 |5419.154288118902
214 |11447.2265625 |5323.084783429297
774 |11407.380859375 |5280.042444507589
116 |10553.98828125 |5118.16688169282
373 |10774.0 |5064.675941446831
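5. Pagination
There are several ways to limit or page through results: limit, top, and fetch_size. limit and top cap the total number of rows returned, while fetch_size sets the number of rows per page, and the response carries a cursor for fetching the next page.
(1) Using limit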
POST /_sql?format=txt
{
"query": "SELECT FlightNum, OriginWeather, OriginCountry, Carrier FROM kibana_sample_data_flights WHERE DestCountry = 'US' limit 10"
}
(2) Using top
POST /_sql?format=txt
{
"query": "SELECT top 10 FlightNum, OriginWeather, OriginCountry, Carrier FROM kibana_sample_data_flights WHERE DestCountry = 'US'"
}
(3) Using fetch_size
POST /_sql?format=txt
{
"query": "SELECT FlightNum, OriginWeather, OriginCountry, Carrier FROM kibana_sample_data_flights WHERE DestCountry = 'US'",
"fetch_size": 10
}
The result:
FlightNum | OriginWeather | OriginCountry | Carrier
---------------+---------------+---------------+----------------
R43CELD |Cloudy |US |JetBeats
3YAQM9U |Clear |US |JetBeats
8SHQI41 |Cloudy |US |JetBeats
HF9AP10 |Sunny |US |JetBeats
ZTL6FPB |Heavy Fog |IT |ES-Air
TF9BTQL |Clear |JP |Kibana Airlines
T9QK7GX |Clear |IN |Logstash Airways
4AHGESO |Rain |ZA |Kibana Airlines
J684XSR |Sunny |AR |JetBeats
T390OH4 |Cloudy |IN |ES-Air
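With fetch_size, only the first page comes back from the initial request. With the json format, the response body contains a cursor value, and posting {"cursor": ...} back to /_sql returns the next page until no cursor is returned. A minimal sketch of that loop follows; iter_rows is a hypothetical helper, and send stands for any function that posts a JSON body to /_sql and returns the decoded JSON response. A fake send is used here so the example runs without a cluster.

```python
def iter_rows(send, sql, fetch_size):
    """Yield every row of a SQL query, following cursors page by page."""
    response = send({"query": sql, "fetch_size": fetch_size})
    while True:
        yield from response.get("rows", [])
        cursor = response.get("cursor")
        if not cursor:
            break  # no cursor means this was the last page
        response = send({"cursor": cursor})


def fake_send(body):
    """Stand-in for the HTTP call: returns two canned pages."""
    pages = {
        None: {
            "columns": [{"name": "FlightNum", "type": "keyword"}],
            "rows": [["R43CELD"], ["3YAQM9U"]],
            "cursor": "page-2",  # made-up cursor value for illustration
        },
        "page-2": {"rows": [["8SHQI41"]]},
    }
    return pages[body.get("cursor")]


rows = list(iter_rows(
    fake_send,
    "SELECT FlightNum FROM kibana_sample_data_flights WHERE DestCountry = 'US'",
    fetch_size=2,
))
print(rows)
```

Passing the transport in as a function keeps the paging logic independent of the HTTP client in use.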
6. Subqueries
ES SQL supports simple subqueries of the form SELECT X FROM (SELECT * FROM Y):
POST /_sql?format=txt
{
"query": "SELECT avg(data.DistanceMiles) from (SELECT FlightNum, OriginWeather, OriginCountry, Carrier, DistanceMiles FROM kibana_sample_data_flights WHERE DestCountry = 'US') as data"
}
The result:
avg(data.DistanceMiles)
-----------------------
4714.944895442431
References
Official documentation: xpack-sql