index

An index is the Elasticsearch unit that holds documents.
We cover creating, viewing, and deleting an index in turn.

Create

How to create

An index is created with PUT /{indexName}.

Example:

PUT /kaka

Result:

#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "kaka"
}

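Explanation:

"acknowledged" : true: the index was created
"shards_acknowledged" : true: the shards were started
"index" : "kaka": the index name

Specifying parameters

Now for the first line of the output, the one starting with #!.
Before version 7, every new index got 5 shards by default. From version 7 on, the default is 1. To keep 5 shards, state the shard count explicitly when creating the index.
(We are on version 6, so this does not affect us yet.)
An example with explicit settings: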

PUT /kk
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 5
  }
}

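Kibana offers hints and auto-completion. However, if the first { sits on the same line as /kk instead of on a new line, completion stops working.

All letters must be lowercase

As noted in the previous chapter, "1. Tools, Concepts, and Cluster", index names must be all lowercase.

Example:

PUT /KKK

Result: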

{
  "error": {
    "root_cause": [
      {
        "type": "invalid_index_name_exception",
        "reason": "Invalid index name [KKK], must be lowercase",
        "index_uuid": "_na_",
        "index": "KKK"
      }
    ],
    "type": "invalid_index_name_exception",
    "reason": "Invalid index name [KKK], must be lowercase",
    "index_uuid": "_na_",
    "index": "KKK"
  },
  "status": 400
}

Explanation: the request fails because the name contains uppercase letters. The relevant lines are:

"type": "invalid_index_name_exception",
"reason": "Invalid index name [KKK], must be lowercase",
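
The lowercase rule can be checked on the client before sending the request. The following is a minimal sketch: the helper name has_uppercase is made up for illustration, and it covers only the lowercase rule shown above, not the other restrictions Elasticsearch places on index names.

```python
def has_uppercase(index_name: str) -> bool:
    """Return True if lowercasing the name would change it."""
    return index_name != index_name.lower()


for name in ("kaka", "kk", "KKK"):
    # Only the lowercase rule is checked here; Elasticsearch enforces more.
    verdict = "rejected (must be lowercase)" if has_uppercase(name) else "ok"
    print(name, "->", verdict)
```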

View

How to view

Indices are listed with GET /_cat/indices.
Example:

GET /_cat/indices

Result:

green  open .kibana_1            azRP1WYcSlSKg_Iu8kim_Q 1 0 4 1 16.8kb 16.8kb
yellow open kk                   a3akwYNuTG2Bq0maLLE0-A 5 1 0 0  1.1kb  1.1kb
green  open .kibana_task_manager HsnZtouxT2i40qSFeeO9ug 1 0 2 0 12.6kb 12.6kb
yellow open kaka                 bOal4r2bTzCKfoR57mvWqg 5 1 0 0  1.1kb  1.1kb

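Explanation:

In version 6.8.0 and later, Kibana creates two indices of its own, .kibana_1 and .kibana_task_manager.

Show the header

So what do green and yellow in the output mean?

Appending ?v to the command displays the column headers.

Example:

GET /_cat/indices?v

Result: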

health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_1            azRP1WYcSlSKg_Iu8kim_Q   1   0          4            1     16.8kb         16.8kb
yellow open   kk                   a3akwYNuTG2Bq0maLLE0-A   5   1          0            0      1.2kb          1.2kb
green  open   .kibana_task_manager HsnZtouxT2i40qSFeeO9ug   1   0          2            0     12.6kb         12.6kb
yellow open   kaka                 bOal4r2bTzCKfoR57mvWqg   5   1          0            0      1.2kb          1.2kb

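Explanation:

health: health of the index
- yellow: usable but not fully redundant. All primary shards are allocated, but at least one replica is not. Here each index asks for 1 replica and the cluster has a single node, and a replica is never placed on the same node as its primary.
- green: healthy (usable); all primary and replica shards are allocated
- red: unusable; at least one primary shard is not allocated
status: whether the index is open or closed
index: index name
uuid: unique identifier
pri: number of primary shards
rep: number of replicas
docs.count: number of documents
docs.deleted: number of deleted documents
store.size: storage size of primaries and replicas together
pri.store.size: storage size of the primary shards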
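
Because ?v adds a header row, the table is easy to turn into records keyed by column name. Below is a minimal sketch that parses a shortened copy of the output above; parse_cat_table is a hypothetical helper, and it assumes every column has a value in every row.

```python
def parse_cat_table(text: str) -> list[dict[str, str]]:
    """Split a whitespace-aligned _cat table into one dict per row."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


SAMPLE = """\
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_1 azRP1WYcSlSKg_Iu8kim_Q   1   0          4            1     16.8kb         16.8kb
yellow open   kk        a3akwYNuTG2Bq0maLLE0-A   5   1          0            0      1.2kb          1.2kb
"""

rows = parse_cat_table(SAMPLE)
# Indices whose replicas could not all be allocated.
yellow = [row["index"] for row in rows if row["health"] == "yellow"]
print(yellow)
```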

Delete

How to delete

An index is deleted with DELETE /{indexName}.
Example:

DELETE /kk

Result:

{
  "acknowledged" : true
}

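Note: deleting Kibana's own indices makes Kibana unusable. If that has already happened, restart Kibana; it recreates its indices and recovers.

Delete all

All indices are deleted with DELETE /_all.

Example:

DELETE /_all

Result: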

{
  "acknowledged" : true
}

Some sources say DELETE /* also deletes all indices, but in my test on 6.8.23 this no longer works.
Example:

DELETE /*

Result:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Wildcard expressions or all indices are not allowed"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Wildcard expressions or all indices are not allowed"
  },
  "status": 400
}

mapping

If an index is compared to a table, the mapping is the table schema.

Create

Example:

PUT /est
{
  "mappings": {
    "_doc": {
      "properties": {
        "id": {
          "type": "keyword"
        },
        "name": {
          "type": "text"
        },
        "age": {
          "type": "integer"
        },
        "bir": {
          "type": "date"
        }
      }
    }
  }
}

Result:

#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in create index requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', and requests are expected to omit the type name in mapping definitions.
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "est"
}

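Explanation:

est: the index name
mappings: keyword. Before version 6 an index could hold several types, hence the plural. From version 6 on an index has a single type, but the plural form was kept.
_doc: the type name; _doc follows the official Elasticsearch recommendation.
properties: keyword; the fields follow
id, name, age, bir: fields
type: keyword; the data type follows
keyword, text, integer, date: data types

Data types

Eight commonly used Elasticsearch data types are: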

text
keyword
date
integer
long
double
boolean
ip

Of these, text is analyzed (split into tokens), while keyword is not.

View

How to view

The index definition is retrieved with GET /{indexName}.

Example:

GET /est

Result:

#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in get indices requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
{
  "est" : {
    "aliases" : { },
    "mappings" : {
      "_doc" : {
        "properties" : {
          "age" : {
            "type" : "integer"
          },
          "bir" : {
            "type" : "date"
          },
          "id" : {
            "type" : "keyword"
          },
          "name" : {
            "type" : "text"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1643078198880",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "eZxussN2RUaHxkqHOrObPg",
        "version" : {
          "created" : "6082399"
        },
        "provided_name" : "est"
      }
    }
  }
}

Mapping only

The request above returns everything. To see only the mapping, use GET /{indexName}/_mapping.

Example:

GET /est/_mapping

Result:

#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in get mapping requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
{
  "est" : {
    "mappings" : {
      "_doc" : {
        "properties" : {
          "age" : {
            "type" : "integer"
          },
          "bir" : {
            "type" : "date"
          },
          "id" : {
            "type" : "keyword"
          },
          "name" : {
            "type" : "text"
          }
        }
      }
    }
  }
}

document

A document can be thought of as one row of a table.

Add

How to add

A document is added with POST /{indexName}/{typeName} and a JSON body.

Example:

POST /est/_doc
{
  "id": "一号",
  "name":"赵小六",
  "age":23,
  "bir":"2012-12-12"
}

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "5Y0ij34BoH8Nsaao-4rh",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

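POST and PUT

New documents are added with POST.
Some sources say to use PUT, but in my test on 6.8.23, PUT without an ID is rejected; only POST is allowed.
(With an explicit ID, PUT works as well.)

Example: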

PUT /est/_doc
{
  "id": "一号",
  "name":"赵小六",
  "age":23,
  "bir":"2012-12-12"
}

Result:

{
  "error": "Incorrect HTTP method for uri [/est/_doc?pretty] and method [PUT], allowed: [POST]",
  "status": 405
}

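In RESTful APIs, POST, GET, PUT, and DELETE usually map to create, read, update, and delete, but POST and PUT are sometimes used interchangeably.

Specifying the id

To choose the id ourselves, use this form:

POST /{indexName}/{typeName}/{id}

Example: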

POST /est/_doc/1
{
  "id": "一号",
  "name":"赵小六",
  "age":23,
  "bir":"2012-12-12"
}

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}

Get

A document is fetched with GET /{indexName}/{typeName}/{documentId}.

Example:

GET /est/_doc/1

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "id" : "一号",
    "name" : "赵小六",
    "age" : 23,
    "bir" : "2012-12-12"
  }
}

Delete

A document is deleted with DELETE /{indexName}/{typeName}/{documentId}.
Example:

DELETE /est/_doc/1

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}

Update

Update without keeping the original data

Posting a full body to an existing id updates the document.

Example:

POST /est/_doc/1
{
  "id": "一号",
  "name":"阿门",
  "age":23,
  "bir":"2012-12-12"
}

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 7,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 7,
  "_primary_term" : 1
}

Let's check the result.
Example:

GET /est/_doc/1

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 9,
  "_seq_no" : 9,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "id" : "一号",
    "name" : "阿门",
    "age" : 23,
    "bir" : "2012-12-12"
  }
}

The update succeeded.

One more, this time sending only a single field.

Example:

POST /est/_doc/1
{
  "id": "天字第一号"
}

GET /est/_doc/1

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 10,
  "_seq_no" : 10,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "id" : "天字第一号"
  }
}

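Explanation:
POST /{indexName}/{typeName}/{id} replaces the whole document, much like deleting it and adding it again; fields missing from the new body are lost.

Update keeping the original data

To update while keeping the original data, append the _update keyword.

Example:

POST /est/_doc/1/_update
{
  "doc": {
    "name": "娃哈哈"
  }
}

GET /est/_doc/1

Result: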

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 12,
  "_seq_no" : 12,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "id" : "一号",
    "name" : "娃哈哈",
    "age" : 23,
    "bir" : "2012-12-12"
  }
}
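
The difference between the two kinds of update can be modelled with plain dictionaries. This is only a sketch of the observable behaviour, not of how Elasticsearch implements it: replace_document and partial_update are hypothetical names, and the merge here is shallow, whereas a real partial update also merges nested objects.

```python
def replace_document(stored: dict, body: dict) -> dict:
    """POST /{index}/{type}/{id}: the new body replaces the stored document."""
    return dict(body)


def partial_update(stored: dict, doc: dict) -> dict:
    """POST /{index}/{type}/{id}/_update with "doc": merge into the stored document."""
    merged = dict(stored)
    merged.update(doc)  # shallow merge; nested objects are not handled here
    return merged


stored = {"id": "一号", "name": "阿门", "age": 23, "bir": "2012-12-12"}

replaced = replace_document(stored, {"id": "天字第一号"})
updated = partial_update(stored, {"name": "娃哈哈"})

print(replaced)  # only the id field is left
print(updated)   # name changed, the other fields kept
```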

Adding a field during an update

An update can also add a field that does not exist yet.

Example:

POST /est/_doc/1/_update
{
  "doc": {
    "name": "张三疯",
    "age": 11,
    "dpet": "武当派"
  }
}

GET /est/_doc/1

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 13,
  "_seq_no" : 13,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "id" : "一号",
    "name" : "张三疯",
    "age" : 11,
    "bir" : "2012-12-12",
    "dpet" : "武当派"
  }
}

It worked.
Now look at the mapping again.

Example:

GET /est/_mapping

Result:

#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in get mapping requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
{
  "est" : {
    "mappings" : {
      "_doc" : {
        "properties" : {
          "age" : {
            "type" : "integer"
          },
          "bir" : {
            "type" : "date"
          },
          "dpet" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "id" : {
            "type" : "keyword"
          },
          "name" : {
            "type" : "text"
          }
        }
      }
    }
  }
}

Script update

The last operation is an update by script.
Example:

POST /est/_doc/1/_update
{
  "script": "ctx._source.age += 5"
}

GET /est/_doc/1

Result:

{
  "_index" : "est",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 14,
  "_seq_no" : 14,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "id" : "一号",
    "name" : "张三疯",
    "age" : 16,
    "bir" : "2012-12-12",
    "dpet" : "武当派"
  }
}

Bulk

Bulk operations use the _bulk keyword.

Example:

POST /est/_doc/_bulk
{"index":{"_id":3}}
{"name":"张三三","age":11,"dpet":"武当派"}
{"delete":{"_id":2}}
{"update":{"_id":1}}
{"doc":{"name":"张三疯","age":11,"dpet":"武当派"}}
{"update":{"_id":2}}
{"doc":{"name":"张三疯","age":11,"dpet":"武当派"}}

Result:

{
  "took" : 9,
  "errors" : true,
  "items" : [
    {
      "index" : {
        "_index" : "est",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "delete" : {
        "_index" : "est",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 1,
        "result" : "not_found",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 404
      }
    },
    {
      "update" : {
        "_index" : "est",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 15,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 15,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "update" : {
        "_index" : "est",
        "_type" : "_doc",
        "_id" : "2",
        "status" : 404,
        "error" : {
          "type" : "document_missing_exception",
          "reason" : "[_doc][2]: document missing",
          "index_uuid" : "eZxussN2RUaHxkqHOrObPg",
          "shard" : "2",
          "index" : "est"
        }
      }
    }
  ]
}

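Explanation:
In the example request:

Each action line names the operation type and the document it targets; index and update are followed by a line with the body.
index adds (or replaces) a document
delete removes a document
update modifies a document

In the result:

The result of each operation is returned in order.
One failed operation does not fail the others.
(There are no transactions; Elasticsearch is a search engine, not a transactional database.)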
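
A _bulk body is newline-delimited JSON: one action line per operation, followed by a source line for index and update, and the body must end with a newline. The sketch below builds such a body with the standard json module; build_bulk_body and its (operation, metadata, source) tuple format are assumptions made for illustration, not part of any client library.

```python
import json


def build_bulk_body(actions) -> str:
    """Serialize (operation, metadata, source) tuples into a _bulk body."""
    lines = []
    for operation, metadata, source in actions:
        lines.append(json.dumps({operation: metadata}, ensure_ascii=False))
        if source is not None:  # delete has no source line
            lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # the trailing newline is required


body = build_bulk_body([
    ("index", {"_id": 3}, {"name": "张三三", "age": 11, "dpet": "武当派"}),
    ("delete", {"_id": 2}, None),
    ("update", {"_id": 1}, {"doc": {"name": "张三疯", "age": 11}}),
])
print(body, end="")
```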

Query

Query refers to advanced search.

Assume the following mapping and data:

Example:

PUT /ems
{
  "mappings": {
    "_doc": {
      "properties": {
        "name": {
          "type": "text"
        },
        "age": {
          "type": "integer"
        },
        "bir": {
          "type": "date"
        },
        "content": {
          "type": "text"
        },
        "address": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT /ems/_doc/_bulk
  {"index":{}}
  {"name":"亨利","age":32,"bir":"2012-12-12","content":"当时光的列车缓缓驶过酋长球场","address":"糖果盒"}
  {"index":{}}
  {"name":"范德萨","age":24,"bir":"2012-12-12","content":"再见，范德萨，不老的传说，曼联有你，一生有你。","address":"上海"}
  {"index":{}}
  {"name":"皮尔洛","age":8,"bir":"2012-12-12","content":"从你含泪向队友告别的那一刻起，红黑色的21号将不再是我们熟悉的身影","address":"北京"}
  {"index":{}}
  {"name":"卡洛斯","age":9,"bir":"2012-12-12","content":"卡洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给卡洛斯留下了不可抹去的金色记忆。","address":"南京"}
  {"index":{}}
  {"name":"罗纳尔多","age":43,"bir":"2012-12-12","content":"世上只有一个罗纳尔多！","address":"杭州"}
  {"index":{}}
  {"name":"卡卡","age":59,"bir":"2012-12-12","content":"天空，寄托着我的信仰。张开双臂，仰望天空，是对上天恩赐的感激。","address":"北京"}

URL and DSL

Elasticsearch offers two ways to query.

URL
Form: GET /index/type/_search?parameters
DSL (Domain Specific Language)
Form: GET /index/type/_search {}

The second is the officially recommended one. It sends the query as a JSON request body, which is more powerful and more concise.

The URL form only needs a brief look.
Example:

GET /ems/_doc/_search?q=*&sort=age:desc&size=5&from=0&_source=name,age,bir

Next we focus on the DSL.

match_all

match_all returns every document in the index.

Example:

GET /ems/_doc/_search
{
  "query": { "match_all": {} }
}

Result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9o1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "罗纳尔多",
          "age" : 43,
          "bir" : "2012-12-12",
          "content" : "世上只有一个罗纳尔多！",
          "address" : "杭州"
        }
      },

[part of the output omitted]

      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "841Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "范德萨",
          "age" : 24,
          "bir" : "2012-12-12",
          "content" : "再见，范德萨，不老的传说，曼联有你，一生有你。",
          "address" : "上海"
        }
      }
    ]
  }
}

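Response fields

took: time the query took, in milliseconds
timed_out: whether the query timed out
_shards: how many shards were queried, and how many succeeded, were skipped, or failed
hits object: the matched results
total: number of matching documents
max_score: highest relevance score among the matches
hits array: the matching documents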
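
On the client side the interesting part of a response is usually hits.total and the _source of each hit. A minimal sketch follows, assuming the 6.x response shape shown above, where hits.total is a plain integer; summarize_hits is a hypothetical helper and the response dict is a trimmed copy of the output above.

```python
def summarize_hits(response: dict):
    """Return the total hit count and the list of _source documents."""
    hits = response["hits"]
    sources = [hit["_source"] for hit in hits["hits"]]
    return hits["total"], sources


response = {
    "took": 4,
    "timed_out": False,
    "hits": {
        "total": 6,
        "max_score": 1.0,
        "hits": [
            {"_id": "9o1Bj34BoH8NsaaoSIo9", "_score": 1.0,
             "_source": {"name": "罗纳尔多", "age": 43}},
            {"_id": "841Bj34BoH8NsaaoSIo9", "_score": 1.0,
             "_source": {"name": "范德萨", "age": 24}},
        ],
    },
}

total, sources = summarize_hits(response)
print(total, [source["name"] for source in sources])
```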

sort-order

sort and order control the ordering of results.
Example:

GET /ems/_doc/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    },
    {
      "bir": {
        "order": "desc"
      }
    }
  ]
}

Result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : null,
    "hits" : [

[part of the output omitted]

      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9Y1Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "卡洛斯",
          "age" : 9,
          "bir" : "2012-12-12",
          "content" : "卡洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给卡洛斯留下了不可抹去的金色记忆。",
          "address" : "南京"
        },
        "sort" : [
          9,
          1355270400000
        ]
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9I1Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "皮尔洛",
          "age" : 8,
          "bir" : "2012-12-12",
          "content" : "从你含泪向队友告别的那一刻起，红黑色的21号将不再是我们熟悉的身影",
          "address" : "北京"
        },
        "sort" : [
          8,
          1355270400000
        ]
      }
    ]
  }
}

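Explanation:

We sorted on several fields by listing them in the "sort" array.
max_score and _score are null because an explicit sort order was given, so relevance scores are not computed.

One more: let's add name to the sort as well.
Example: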

GET /ems/_doc/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    },
    {
      "bir": {
        "order": "desc"
      }
    },
    {
      "name" :{
        "order": "desc"
      }
    }
  ]
}

Result:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "ems",
        "node": "4h-IhFgnQ4SG5s9XbJ1mBg",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    }
  },
  "status": 400
}

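The request fails.
Explanation: name is a text field. Text fields are analyzed and have fielddata disabled by default, so they cannot be used for sorting; as the error message suggests, sort on a keyword field instead.

from-size

size sets how many results to return; the default is 10.
from sets the offset of the first result to return.

Combined with sort, these implement pagination.

Example: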

GET /ems/_doc/_search
{
      "query": {"match_all": {}},
      "sort": [
        {
          "age": {
            "order": "desc"
          }
        }
      ],
      "size": 2, 
      "from": 1
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9o1Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "罗纳尔多",
          "age" : 43,
          "bir" : "2012-12-12",
          "content" : "世上只有一个罗纳尔多！",
          "address" : "杭州"
        },
        "sort" : [
          43
        ]
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        },
        "sort" : [
          32
        ]
      }
    ]
  }
}
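
For page-based navigation, from is simply the number of results on the preceding pages. A minimal sketch, assuming 1-based page numbers; page_params is a hypothetical helper name.

```python
def page_params(page: int, page_size: int) -> dict:
    """Translate a 1-based page number into from/size values."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {"from": (page - 1) * page_size, "size": page_size}


# Page 3 with 2 results per page skips the first 4 results.
print(page_params(3, 2))
```

The returned dictionary can be merged into the request body next to "query" and "sort".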

_source

The _source keyword accepts a string or an array. A string selects a single field; an array selects several.
Example:

GET /ems/_doc/_search
{
      "query": { "match_all": {} },
      "_source": "name"
}

Result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9o1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "罗纳尔多"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "941Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "卡卡"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9I1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "皮尔洛"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9Y1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "卡洛斯"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "亨利"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "841Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "范德萨"
        }
      }
    ]
  }
}

Example:

GET /ems/_doc/_search
{
  "query": {
    "match_all": {}
  },
  "_source": [
    "name",
    "age",
    "money"
  ]
}

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9o1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "罗纳尔多",
          "age" : 43
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "941Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "卡卡",
          "age" : 59
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9I1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "皮尔洛",
          "age" : 8
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9Y1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "卡洛斯",
          "age" : 9
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "亨利",
          "age" : 32
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "841Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "范德萨",
          "age" : 24
        }
      }
    ]
  }
}

Explanation:

Requesting a field that does not exist (money here) is not an error; the field is simply absent from the result.
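
The filtering behaviour seen above can be imitated on a plain dictionary. This sketch handles exact field names only; filter_source is a hypothetical helper, and it does not model any pattern matching that the real _source option may support.

```python
def filter_source(source: dict, fields) -> dict:
    """Keep only the requested fields; unknown fields are silently skipped."""
    if isinstance(fields, str):  # a single field given as a string
        fields = [fields]
    return {name: source[name] for name in fields if name in source}


doc = {"name": "卡卡", "age": 59, "bir": "2012-12-12", "address": "北京"}
print(filter_source(doc, "name"))
print(filter_source(doc, ["name", "age", "money"]))
```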

term

term: looks up an exact term
Example:

GET /ems/_doc/_search
{
  "query": {
    "term": {
      "address": {
        "value": "糖果盒"
      }
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 0.6931472,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      }
    ]
  }
}

One more: query by name.
Example:

GET /ems/_doc/_search
{
  "query": {
    "term": {
      "name": {
        "value": "亨利"
      }
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

No hits? But 亨利 is clearly in the data.

Example:

GET /ems/_doc/_search
{
  "query": {
    "term": {
      "name": {
        "value": "亨"
      }
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.7549128,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 0.7549128,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      }
    ]
  }
}

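Explanation:

A term query does not analyze the search string; it looks for that exact term in the index. The name field is a text field indexed with Elasticsearch's default analyzer, the standard analyzer, which splits English text into words and Chinese text into single characters. The index therefore holds 亨 and 利 but no term 亨利.
Of the eight data types listed earlier (text, keyword, date, integer, long, double, boolean, and ip), only text is analyzed.

We can look at what the standard analyzer produces.
Example: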

GET /_analyze
{
  "text": [
    "haha is good",
    "微风"
  ]
}

Result:

{
  "tokens" : [
    {
      "token" : "haha",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "is",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "good",
      "start_offset" : 8,
      "end_offset" : 12,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "微",
      "start_offset" : 13,
      "end_offset" : 14,
      "type" : "<IDEOGRAPHIC>",
      "position" : 3
    },
    {
      "token" : "风",
      "start_offset" : 14,
      "end_offset" : 15,
      "type" : "<IDEOGRAPHIC>",
      "position" : 4
    }
  ]
}

The Chinese text was split into single characters.
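
Why a term query for 亨利 misses while 亨 hits can be reproduced with a crude imitation of this tokenization. The regular expression below is an assumption made for illustration: it only treats runs of ASCII letters and digits as words and each character in the basic CJK range as its own token, whereas the real standard analyzer follows the Unicode text segmentation rules.

```python
import re

# Crude stand-in for the standard analyzer (illustration only).
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]")


def simple_tokens(text: str) -> list[str]:
    """Lowercased word tokens for ASCII text, single characters for Chinese."""
    return [match.group(0).lower() for match in TOKEN_PATTERN.finditer(text)]


print(simple_tokens("haha is good"))
print(simple_tokens("微风"))

# Terms stored for the name field of the document "亨利".
indexed_terms = set(simple_tokens("亨利"))
print("亨利" in indexed_terms)  # the whole name is not a stored term
print("亨" in indexed_terms)    # a single character is
```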

match_phrase

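We can also use match_phrase. It analyzes the query string into a list of terms, searches for those terms, and keeps only documents that contain all of them in the same relative positions as in the query.

Example: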

GET /ems/_doc/_search
{
  "query": {
    "match_phrase": {
      "name": "亨利"
    }
  }
}

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "WBamj34BCoX_YrYqZTh2",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      }
    ]
  }
}
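
The position requirement of match_phrase can be sketched as a search for the query tokens as a contiguous run inside the document tokens. The sketch assumes one token per character, which matches how the Chinese names in this data set are tokenized; phrase_matches is a hypothetical helper, and options that relax the position requirement are not modelled.

```python
def phrase_matches(field_text: str, phrase: str) -> bool:
    """True if the phrase tokens occur consecutively and in order."""
    doc_tokens = list(field_text)   # assumption: one token per character
    query_tokens = list(phrase)
    width = len(query_tokens)
    if width == 0:
        return False
    return any(
        doc_tokens[start:start + width] == query_tokens
        for start in range(len(doc_tokens) - width + 1)
    )


print(phrase_matches("亨利", "亨利"))       # same tokens, same order
print(phrase_matches("亨利", "利亨"))       # same tokens, wrong order
print(phrase_matches("罗纳尔多", "罗尔"))   # both tokens present, not adjacent
```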

terms

terms works like IN in SQL.

Example:

GET /ems/_doc/_search
{
  "query": {
    "terms": {
      "address": [
        "糖果盒",
        "上海"
      ]
    }
  }
}

Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "WBamj34BCoX_YrYqZTh2",
        "_score" : 1.0,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "WRamj34BCoX_YrYqZTh2",
        "_score" : 1.0,
        "_source" : {
          "name" : "范德萨",
          "age" : 24,
          "bir" : "2012-12-12",
          "content" : "再见，范德萨，不老的传说，曼联有你，一生有你。",
          "address" : "上海"
        }
      }
    ]
  }
}

range

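range finds documents whose field value lies in a given range.
There are four comparison operators:

lt: less than
lte: less than or equal to
gt: greater than
gte: greater than or equal to

Example: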

GET /ems/_doc/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 9,
        "lte": 30
      }
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9Y1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "卡洛斯",
          "age" : 9,
          "bir" : "2012-12-12",
          "content" : "卡洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给卡洛斯留下了不可抹去的金色记忆。",
          "address" : "南京"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "841Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "范德萨",
          "age" : 24,
          "bir" : "2012-12-12",
          "content" : "再见，范德萨，不老的传说，曼联有你，一生有你。",
          "address" : "上海"
        }
      }
    ]
  }
}

prefix

prefix finds documents containing a term that starts with the given prefix.

Example:

GET /ems/_doc/_search
{
  "query": {
    "prefix": {
      "address": {
        "value": "糖"
      }
    }
  }
}

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      }
    ]
  }
}

wildcard

wildcard runs a wildcard query.

? matches exactly one arbitrary character
* matches any number of arbitrary characters, including none

Example:

GET /ems/_doc/_search
{
  "query": {
    "wildcard": {
      "content": {
        "value": "当*"
      }
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      }
    ]
  }
}

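Some sources say * and ? cannot appear at the start of the pattern. In my tests a leading wildcard does work.
That said, given how an inverted index works, a leading wildcard is likely to make the query slow.
Example: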

GET /ems/_doc/_search
{
  "query": {
    "wildcard": {
      "content": {
        "value": "*场"
      }
    }
  }
}

Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      }
    ]
  }
}

fuzzy

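fuzzy finds documents containing terms similar to the given keyword.

The number of edits allowed depends on the length of the search keyword:
Length at most $2$: no fuzziness allowed; the maximum edit distance is $0$.
Length in $[3,5]$: one edit allowed; the maximum edit distance is $1$.
Length greater than $5$: the maximum edit distance is $2$.

Example: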

GET /ems/_doc/_search
{
  "query": {
    "fuzzy": {
      "address":"糖果果"
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.46209812,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 0.46209812,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      }
    ]
  }
}
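
The length rule and the idea of an edit can be made concrete with a small sketch. It uses the plain Levenshtein distance (insert, delete, substitute), which is a simplification: by default the fuzzy query also counts swapping two adjacent characters as a single edit, and that is not modelled here. The function names are hypothetical.

```python
def auto_max_edits(term: str) -> int:
    """Maximum edits allowed for a term, following the length rule above."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def levenshtein(a: str, b: str) -> int:
    """Number of single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete
                               current[j - 1] + 1,       # insert
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def fuzzy_matches(query: str, indexed_term: str) -> bool:
    return levenshtein(query, indexed_term) <= auto_max_edits(query)


print(fuzzy_matches("糖果果", "糖果盒"))  # length 3, one substitution
print(fuzzy_matches("糖盒", "糖果盒"))    # length 2, no edits allowed
```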

ids

ids takes an array of ids and returns the corresponding documents.

Example:

GET /ems/_doc/_search
{
  "query": {
    "ids": {
      "values": [
        "9Y1Bj34BoH8NsaaoSIo9",
        "841Bj34BoH8NsaaoSIo9"
      ]
    }
  }
}

Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9Y1Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "卡洛斯",
          "age" : 9,
          "bir" : "2012-12-12",
          "content" : "卡洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给卡洛斯留下了不可抹去的金色记忆。",
          "address" : "南京"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "841Bj34BoH8NsaaoSIo9",
        "_score" : 1.0,
        "_source" : {
          "name" : "范德萨",
          "age" : 24,
          "bir" : "2012-12-12",
          "content" : "再见，范德萨，不老的传说，曼联有你，一生有你。",
          "address" : "上海"
        }
      }
    ]
  }
}

bool

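bool: combines multiple conditions into a compound query.

must: roughly like and; every clause must match.
should: roughly like or; matching one clause is enough.
must_not: roughly like not; none of the clauses may match.

So why not simply call them and and or? Because they do not behave quite like and and or, as we will see shortly.

Example: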

GET /ems/_doc/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "age": {
              "gte": 0,
              "lte": 100
            }
          }
        }
      ],
      "must_not": [
        {
          "wildcard": {
            "address": {
              "value": "糖果?"
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "941Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "卡卡",
          "age" : 59,
          "bir" : "2012-12-12",
          "content" : "天空，寄托着我的信仰。张开双臂，仰望天空，是对上天恩赐的感激。",
          "address" : "北京"
        },
        "sort" : [
          59
        ]
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9o1Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "罗纳尔多",
          "age" : 43,
          "bir" : "2012-12-12",
          "content" : "世上只有一个罗纳尔多！",
          "address" : "杭州"
        },
        "sort" : [
          43
        ]
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "841Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "范德萨",
          "age" : 24,
          "bir" : "2012-12-12",
          "content" : "再见，范德萨，不老的传说，曼联有你，一生有你。",
          "address" : "上海"
        },
        "sort" : [
          24
        ]
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9Y1Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "卡洛斯",
          "age" : 9,
          "bir" : "2012-12-12",
          "content" : "卡洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给卡洛斯留下了不可抹去的金色记忆。",
          "address" : "南京"
        },
        "sort" : [
          9
        ]
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9I1Bj34BoH8NsaaoSIo9",
        "_score" : null,
        "_source" : {
          "name" : "皮尔洛",
          "age" : 8,
          "bir" : "2012-12-12",
          "content" : "从你含泪向队友告别的那一刻起，红黑色的21号将不再是我们熟悉的身影",
          "address" : "北京"
        },
        "sort" : [
          8
        ]
      }
    ]
  }
}

Next, let's see why the clauses are called must and should rather than and and or.

First, match a=1 or b=2.
Example:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "a": "1"
          }
        },
        {
          "match": {
            "b": "2"
          }
        }
      ]
    }
  }
}

That works. Now add one more condition: "and c=3".
In other words: "a=1 or b=2, and c=3".

Example:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "c": "3"
          }
        }
      ],
      "should": [
        {
          "match": {
            "a": "1"
          }
        },
        {
          "match": {
            "b": "2"
          }
        }
      ]
    }
  }
}

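Wrong!
When should sits alongside must or filter in the same bool, none of the should clauses is required to match by default; they only influence the score. Adding the minimum_should_match parameter gives the behavior we want.

Example: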

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "c": "3"
          }
        }
      ],
      "should": [
        {
          "match": {
            "a": "1"
          }
        },
        {
          "match": {
            "b": "2"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
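
The difference between should and a plain or can be made concrete with a small simulation. This is a simplified model written for illustration, not Elasticsearch code: each clause is a hypothetical (field, value) pair standing in for a match query, filter clauses are left out, and scoring is ignored. The default for minimum_should_match follows the rule described here: 1 when should stands alone, 0 when must is present.

```python
def bool_matches(doc, must=(), should=(), must_not=(), minimum_should_match=None):
    """Decide whether `doc` (a dict) satisfies a simplified bool query."""

    def hit(clause):
        field, value = clause
        return doc.get(field) == value

    if minimum_should_match is None:
        # should is only mandatory when it is not accompanied by must.
        minimum_should_match = 1 if should and not must else 0

    if not all(hit(c) for c in must):
        return False
    if any(hit(c) for c in must_not):
        return False
    return sum(hit(c) for c in should) >= minimum_should_match


# A document with c=3 but neither a=1 nor b=2.
doc = {"a": "0", "b": "0", "c": "3"}
should = [("a", "1"), ("b", "2")]

# Without minimum_should_match the should clauses are optional.
print(bool_matches(doc, must=[("c", "3")], should=should))
# With minimum_should_match=1 at least one should clause is required.
print(bool_matches(doc, must=[("c", "3")], should=should, minimum_should_match=1))
```

The first call accepts the document even though neither should clause matches, which is exactly why the query without minimum_should_match does not express "a=1 or b=2, and c=3".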

highlight

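Highlight query

highlight: highlights the matched keywords in the documents that satisfy the query.

Note that this is not a filter condition of the query; it is a second rendering pass over the query results.

Example: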

GET /ems/_doc/_search
{
  "query": {
    "term": {
      "content": {
        "value": "时"
      }
    }
  },
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}

Result:

{
  "took" : 40,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.73050237,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 0.73050237,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        },
        "highlight" : {
          "content" : [
            "当<em>时</em>光的列车缓缓驶过酋长球场"
          ]
        }
      }
    ]
  }
}

Explanation: matches are wrapped in the em tag, which browsers render in italics by default.

Custom highlight HTML tags

Use pre_tags and post_tags inside highlight to choose the tags.

Example:

GET /ems/_doc/_search
{
  "query": {
    "term": {
      "content": {
        "value": "时"
      }
    }
  },
  "highlight": {
    "pre_tags": [
      "<span style='color:red'>"
    ],
    "post_tags": [
      "</span>"
    ],
    "fields": {
      "*": {}
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.73050237,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "8o1Bj34BoH8NsaaoSIo9",
        "_score" : 0.73050237,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        },
        "highlight" : {
          "content" : [
            "当<span style='color:red'>时</span>光的列车缓缓驶过酋长球场"
          ]
        }
      }
    ]
  }
}
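
Highlighting as a rendering pass over results can be pictured with a few lines of Python. This is only a sketch under simplifying assumptions: it wraps exact substring occurrences, whereas the real highlighter works on analyzed tokens and their offsets. The function name and its parameters are made up for illustration; the default tags mirror the em tag seen in the first result.

```python
def highlight(text, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap every occurrence of `term` in `text` with the given tags."""
    if not term:
        return text
    return text.replace(term, pre_tag + term + post_tag)


content = "当时光的列车缓缓驶过酋长球场"

# Default tags, as in the first highlight example.
print(highlight(content, "时"))
# Custom tags, as in the pre_tags / post_tags example.
print(highlight(content, "时", "<span style='color:red'>", "</span>"))
```

The stored document is untouched in both cases; only the rendered copy carries the tags.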

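Multi-field highlighting

By default only the fields that the query matched are highlighted. Setting require_field_match to false enables highlighting across multiple fields.

Example: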

GET /ems/_doc/_search
{
  "query": {
    "term": {
      "content": "卡"
    }
  },
  "highlight": {
    "pre_tags": [
      "<span style='color:red'>"
    ],
    "post_tags": [
      "</span>"
    ],
    "require_field_match": false,
    "fields": {
      "*": {}
    }
  }
}

Result:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.9266379,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9Y1Bj34BoH8NsaaoSIo9",
        "_score" : 0.9266379,
        "_source" : {
          "name" : "卡洛斯",
          "age" : 9,
          "bir" : "2012-12-12",
          "content" : "卡洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给卡洛斯留下了不可抹去的金色记忆。",
          "address" : "南京"
        },
        "highlight" : {
          "name" : [
            "<span style='color:red'>卡</span>洛斯"
          ],
          "content" : [
            "<span style='color:red'>卡</span>洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给<span style='color:red'>卡</span>洛斯留下了不可抹去的金色记忆。"
          ]
        }
      }
    ]
  }
}

Note: require_field_match must be set to false, and the fields to highlight are listed under fields.

multi_match

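multi_match searches across multiple fields.
Its characteristics:

If a searched field is analyzed, the keyword is analyzed first and then searched.
If a searched field is not analyzed, the keyword is searched as-is.

So the entries in fields are usually analyzed fields.

Example: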

GET /ems/_doc/_search
{
  "query": {
    "multi_match": {
      "query": "卡卡",
      "fields": ["name","content"]
    }
  }
}

Result:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.8532758,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "9Y1Bj34BoH8NsaaoSIo9",
        "_score" : 1.8532758,
        "_source" : {
          "name" : "卡洛斯",
          "age" : 9,
          "bir" : "2012-12-12",
          "content" : "卡洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给卡洛斯留下了不可抹去的金色记忆。",
          "address" : "南京"
        }
      },
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "941Bj34BoH8NsaaoSIo9",
        "_score" : 0.7911257,
        "_source" : {
          "name" : "卡卡",
          "age" : 59,
          "bir" : "2012-12-12",
          "content" : "天空，寄托着我的信仰。张开双臂，仰望天空，是对上天恩赐的感激。",
          "address" : "北京"
        }
      }
    ]
  }
}

query_string

query_string is a multi-field analyzed query. Besides analyzing the query text at search time, it also lets you specify the analyzer.

Example:

GET /dangdang/book/_search
{
  "query": {
    "query_string": {
      "query": "中国声音",
      "analyzer": "ik_max_word", 
      "fields": ["name","content"]
    }
  }
}

We discuss analyzers in more detail below.

Filter

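Filter means filtering.

Filter queries

Elasticsearch has two kinds of search clauses.

Query: by default computes a score for every returned document and sorts by that score.
Filter: only selects the matching documents and does not compute a score.

So, in terms of performance alone, a filter is faster than a query.

Filter syntax

Example: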

GET /ems/_doc/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": {
        "term": {
          "age": 32
        }
      }
    }
  }
}

Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "WBamj34BCoX_YrYqZTh2",
        "_score" : 1.0,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        }
      }
    ]
  }
}
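
A request body of this shape can also be assembled programmatically. The helper below is a hypothetical convenience function, not part of any Elasticsearch client; it only builds the JSON body and relies on bool accepting a list of clauses under filter. Sending the body to a cluster is left out.

```python
import json


def filtered_query(*filters, query=None):
    """Build a bool query body whose filter clauses select documents without scoring them."""
    bool_body = {"filter": list(filters)}
    if query is not None:
        # The scoring part goes under must; the filters only narrow the result set.
        bool_body["must"] = [query]
    return {"query": {"bool": bool_body}}


body = filtered_query({"term": {"age": 32}}, query={"match_all": {}})
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Keeping the scoring clause and the filter clauses in separate arguments makes it harder to put a pure yes/no condition into the scored part of the query by accident.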

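When a request contains both a filter and a query, the filter is applied first to narrow down the candidates, and the query then runs on what remains.
Common filter types:
- term
- terms
- range
- exists: keeps the documents in which the given field exists and has a non-null value.

Here is an exists example.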

GET /ems/_doc/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "name": {
              "value": "中国"
            }
          }
        }
      ],
      "filter": {
        "exists": {
          "field": "haha"
        }
      }
    }
  }
}

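That was a lot of queries. So join queries should come next, right?
There are none.
Elasticsearch does support joins, but the official recommendation is to avoid them because they perform very poorly.
If a join is unavoidable, handle it in application code or by building a denormalized wide table.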

IK analyzer

Elasticsearch uses the standard analyzer by default, which splits Chinese text into single characters.

We can use the IK analyzer instead.
GitHub: https://github.com/medcl/elasticsearch-analysis-ik

Online installation

Run the following command in the bin directory to install it.

./elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.8.23/elasticsearch-analysis-ik-6.8.23.zip

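Restart Elasticsearch for the change to take effect.

Some sources say that after installing an analyzer you must delete the existing index data in Elasticsearch, that is, delete the data folder in the Elasticsearch installation directory.

In actual testing this is completely unnecessary.

Besides, would installing an analyzer really require deleting existing index data?
Elasticsearch may hold hundreds of GB, or even 1 TB, of data. Delete all of it and re-import it just to install an analyzer?
Hardly.

Finally, a few words on the elasticsearch-plugin subcommands.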

list：Lists installed elasticsearch plugins
install：Install a plugin
remove：removes a plugin from Elasticsearch

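Note that with the online installation, the IK configuration file is located at

{Elasticsearch installation directory}/config/analysis-ik/IKAnalyzer.cfg.xml

which differs from the location used by a local installation.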

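Local installation of IK

To install locally:

Transfer the IK analyzer archive to the server.
Unzip it.
unzip elasticsearch-analysis-ik-6.8.23.zip
- If the unzip command is not found, install it first:
yum install -y unzip

Move it into the plugins folder.

cd plugins/
cp -r ~/elasticsearch-analysis-ik-6.8.23 ./

Restart Elasticsearch for the change to take effect.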

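Note that with the local installation, the IK configuration file is located at

{Elasticsearch installation directory}/plugins/analysis-ik/config/IKAnalyzer.cfg.xml

which differs from the location used by the online installation.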

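Testing the IK analyzer

The IK analyzer provides two analysis modes:

ik_max_word: splits the text at the finest granularity.
ik_smart: splits the text at a coarse granularity.

Let's go straight to the examples.
Example: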

GET /_analyze
{
  "text": ["中华人民共和国国歌"],
  "analyzer": "ik_max_word"
}

Result:

{
  "tokens" : [
    {
      "token" : "中华人民共和国",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "中华人民",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "中华",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "华人",
      "start_offset" : 1,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "人民共和国",
      "start_offset" : 2,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "人民",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "共和国",
      "start_offset" : 4,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 6
    },
    {
      "token" : "共和",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 7
    },
    {
      "token" : "国",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "CN_CHAR",
      "position" : 8
    },
    {
      "token" : "国歌",
      "start_offset" : 7,
      "end_offset" : 9,
      "type" : "CN_WORD",
      "position" : 9
    }
  ]
}

Example:

GET /_analyze
{
  "text": ["中华人民共和国国歌"],
  "analyzer": "ik_smart"
}

Result:

{
  "tokens" : [
    {
      "token" : "中华人民共和国",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "国歌",
      "start_offset" : 7,
      "end_offset" : 9,
      "type" : "CN_WORD",
      "position" : 1
    }
  ]
}

Specifying an analyzer when creating an index

We can use analyzer and search_analyzer to specify the analyzer when creating an index.

Example:

PUT /ems
{
  "mappings":{
    "_doc":{
      "properties":{
        "name":{
          "type":"text",
           "analyzer": "ik_max_word",
           "search_analyzer": "ik_max_word"
        },
        "age":{
          "type":"integer"
        },
        "bir":{
          "type":"date"
        },
        "content":{
          "type":"text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "address":{
          "type":"keyword"
        }
      }
    }
  }
}

This assigns the analyzer to the name and content fields. Then bulk-load the sample documents:

PUT /ems/_doc/_bulk
  {"index":{}}
  {"name":"亨利","age":32,"bir":"2012-12-12","content":"当时光的列车缓缓驶过酋长球场","address":"糖果盒"}
  {"index":{}}
  {"name":"范德萨","age":24,"bir":"2012-12-12","content":"再见，范德萨，不老的传说，曼联有你，一生有你。","address":"上海"}
  {"index":{}}
  {"name":"皮尔洛","age":8,"bir":"2012-12-12","content":"从你含泪向队友告别的那一刻起，红黑色的21号将不再是我们熟悉的身影","address":"北京"}
  {"index":{}}
  {"name":"卡洛斯","age":9,"bir":"2012-12-12","content":"卡洛斯把自己的金色岁月留在了伯纳乌，而伯纳乌也给卡洛斯留下了不可抹去的金色记忆。","address":"南京"}
  {"index":{}}
  {"name":"罗纳尔多","age":43,"bir":"2012-12-12","content":"世上只有一个罗纳尔多！","address":"杭州"}
  {"index":{}}
  {"name":"卡卡","age":59,"bir":"2012-12-12","content":"天空，寄托着我的信仰。张开双臂，仰望天空，是对上天恩赐的感激。","address":"北京"}

Let's try it.
Example:

GET /ems/_doc/_search
{
  "query":{
    "term":{
      "content":"时光"
    }
  },
  "highlight": {
    "pre_tags": ["<span style='color:red'>"],
    "post_tags": ["</span>"],
    "fields": {
      "*":{}
    }
  }
}

Result:

{
  "took" : 40,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "_doc",
        "_id" : "WBamj34BCoX_YrYqZTh2",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "亨利",
          "age" : 32,
          "bir" : "2012-12-12",
          "content" : "当时光的列车缓缓驶过酋长球场",
          "address" : "糖果盒"
        },
        "highlight" : {
          "content" : [
            "当<span style='color:red'>时光</span>的列车缓缓驶过酋长球场"
          ]
        }
      }
    ]
  }
}

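Configuring extension words

IK supports custom extension dictionaries and stop-word dictionaries.

Extension dictionary: words you want added to the dictionary.
Stop-word dictionary: words you want removed from the dictionary.

Edit IKAnalyzer.cfg.xml to add extension words.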

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
        <comment>IK Analyzer 扩展配置</comment>
        <!--用户可以在这里配置自己的扩展字典 -->
        <entry key="ext_dict"></entry>
         <!--用户可以在这里配置自己的扩展停止词字典-->
        <entry key="ext_stopwords"></entry>
        <!--用户可以在这里配置远程扩展字典 -->
        <!-- <entry key="remote_ext_dict">words_location</entry> -->
        <!--用户可以在这里配置远程扩展停止词字典-->
        <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>