本文最后更新于：2024年3月18日凌晨

Elasticsearch 搜索

Elasticsearch 允许我们搜索存在于所有索引或一些特定索引中的文档。

1	`GET /users/_search?q=username:Tim`

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 0.2876821,
        "hits": [
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.2876821,
                "_source": {
                    "username": "Tim",
                    "password": "123456",
                    "age": 20,
                    "date": "2000/1/1",
                    "desc": "A person"
                }
            }
        ]
    }
}

score：代表该条记录的权重，结果按照权重大小排序。
如下这些参数可以使用统一资源标识符在搜索操作中传递。

参数	说明
q	此参数用于指定查询字符串，
fields	此参数用于在响应中选择返回字段，
sort	可以通过使用这个参数获得排序结果，这个参数的可能值是 `fieldName`, `fieldName:asc` 和 `fieldname:desc`
timeout	使用此参数限定搜索时间，响应只包含指定时间内的匹配，默认情况下，无超时，
terminate_after	可以将响应限制为每个分片的指定数量的文档，当到达这个数量以后，查询将提前终止，默认情况下不设置 `terminate_after`,
size	它表示要返回的命中数，默认值为 `10`,

DSL 查询

在 Elasticsearch 中，通过使用基于 JSON 的查询进行搜索，查询由两个子句组成。
- 叶查询子句：这些子句是匹配，项或范围的，它们在特定字段中查找特定值。
- 复合查询子句：这些查询是叶查询子句和其他复合查询的组合，用于提取所需的信息。
Elasticsearch 支持大量查询，查询从查询关键字开始，然后以 JSON 对象的形式在其中包含条件和过滤器，以下描述了不同类型的查询。

match

模糊查询，会使用分词器解析，先分析再通过分析的文档进行查询。
match 查询 keyword 字段： match 会被分词，而 keyword 不会被分词， match 的需要跟 keyword 的完全匹配。
match 查询 text 字段： match 分词， text 也分词，只要 match 的分词结果和 text 的分词结果有相同的就匹配。

POST /users/_search

{
    "query": {
        "match": {
            "desc": "Tim Mike"
        }
    },
    "_source": [
        "username",
        "desc"
    ],
    "sort": [
        {
            "age": {
                "order": "desc"
            }
        }
    ],
    "from": 0,
    "size": 2
}

如果有多个查询字段用空格隔开。
_source：指定查询的属性。
sort：进行排序。
from 和 size：分页的起始位置和页面大小。

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": null,
        "hits": [
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "2",
                "_score": null,
                "_source": {
                    "username": "Mike",
                    "desc": "I am Mike"
                },
                "sort": [
                    30
                ]
            },
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "1",
                "_score": null,
                "_source": {
                    "username": "Tim",
                    "desc": "I am Tim"
                },
                "sort": [
                    20
                ]
            }
        ]
    }
}

match_all

返回所有内容，并每个对象 score 为 1.0

POST /users/_search

{
    "query": {
        "match_all": {}
    }
}

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 1,
        "hits": [
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "username": "Mike",
                    "password": "333333",
                    "age": 30,
                    "date": "1999/1/1",
                    "desc": "I am Mike"
                }
            },
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "username": "Tim",
                    "password": "123456",
                    "age": 20,
                    "date": "2000/1/1",
                    "desc": "A person"
                }
            }
        ]
    }
}

multi_match

此查询将文本或短语与多个字段匹配。

POST /users/_search

{
    "query": {
        "multi_match": {
            "query": "Tim",
            "fields": [
                "username",
                "desc"
            ]
        }
    }
}

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 0.2876821,
        "hits": [
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.2876821,
                "_source": {
                    "username": "Tim",
                    "password": "123456",
                    "age": 20,
                    "date": "2000/1/1",
                    "desc": "I am Tim"
                }
            }
        ]
    }
}

match_phrase

match_phrase 匹配 keyword 字段： match_phrase 会被分词，而 keyword 不会被分词， match_phrase 的需要跟 keyword 的完全匹配。
match_phrase 匹配 text 字段： match_phrase 是分词的， text 也是分词的， match_phrase 的分词结果必须在 text 字段分词中都包含，而且顺序必须相同，而且必须都是连续的。

POST /users/_search

{
    "query": {
        "match_phrase": {
            "desc": "I am"
        }
    }
}

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 0.2876821,
        "hits": [
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "2",
                "_score": 0.2876821,
                "_source": {
                    "username": "Mike",
                    "password": "333333",
                    "age": 30,
                    "date": "1999/1/1",
                    "desc": "I am Mike"
                }
            },
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.2876821,
                "_source": {
                    "username": "Tim",
                    "password": "123456",
                    "age": 20,
                    "date": "2000/1/1",
                    "desc": "I am Tim"
                }
            }
        ]
    }
}

query_string

query_string 用查询解析器解析输入并在运算符周围分割文本。
query_string 无法查询查询 key 类型的字段。
query_string 查询 text 类型的字段：和 match_phrase 区别的是，不需要连续，顺序还可以调换。

POST /users/_search

{
    "query":{
        "query_string":{
            "query": "Ti* OR *ike",
            "fields" : ["username"],
        }
    }
}

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 0.2876821,
        "hits": [
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "2",
                "_score": 0.2876821,
                "_source": {
                    "username": "Mike",
                    "password": "333333",
                    "age": 30,
                    "date": "1999/1/1",
                    "desc": "I am Mike"
                }
            },
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.2876821,
                "_source": {
                    "username": "Tim",
                    "password": "123456",
                    "age": 20,
                    "date": "2000/1/1",
                    "desc": "I am Tim"
                }
            }
        ]
    }
}

term

使用倒排索引，代表完全匹配，即不进行分词器分析，文档中必须包含整个搜索的词汇。
term 查询 keyword 字段： term 不会分词，而 keyword 字段也不分词，需要完全匹配才可。
term 查询 text 字段：因为 text 字段会分词，而 term 不分词，所以 term 查询的条件必须是 text 字段分词后的某一个。

POST /users/_search

{
    "query": {
        "term": {
            "age": 20
        }
    }
}

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 1,
        "hits": [
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "username": "Tim",
                    "password": "123456",
                    "age": 20,
                    "date": "2000/1/1",
                    "desc": "I am Tim"
                }
            }
        ]
    }
}

range

此查询用于查找值的范围之间的值的对象。
- gte：大于和等于。
- gt：大于。
- lte：小于和等于。
- lt：小于。

POST /users/_search

{
    "query": {
        "range": {
            "age": {
                "gte": 25
            }
        }
    }
}

{
    "took": 19,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 1,
        "hits": [
            {
                "_index": "users",
                "_type": "_doc",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "username": "Mike",
                    "password": "333333",
                    "age": 30,
                    "date": "1999/1/1",
                    "desc": "I am Mike"
                }
            }
        ]
    }
}

bool

布尔查询，将子查询为真的放入查询结果中。
must：所有的语句都必须（must）匹配，与 AND 等价。
must_not：所有的语句都不能（must not）匹配，与 NOT 等价。
should：至少有一个语句要匹配，与 OR 等价。
filter：返回的文档必须满足 filter 子句的条件，但是跟 must 不一样的是，不会计算分值，并且可以使用缓存。

POST /users/_search

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "username": "Tim"
                    }
                },
                {
                    "match": {
                        "password": 123456
                    }
                }
            ],
            "filter": {
                "range": {
                    "age": {
                        "gte": 20
                    }
                }
            }
        }
    }
}

{
	"took": 2,
	"timed_out": false,
	"_shards": {
		"total": 3,
		"successful": 3,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 1,
			"relation": "eq"
		},
		"max_score": 0.5753642,
		"hits": [
			{
				"_index": "users",
				"_type": "_doc",
				"_id": "1",
				"_score": 0.5753642,
				"_source": {
					"username": "Tim",
					"password": "123456",
					"age": 20,
					"date": "2000/1/1",
					"desc": "I am Tim"
				}
			}
		]
	}
}

highlight

高亮查询，对匹配的关键字做高亮处理。

POST /users/_search

{
    "query": {
        "match": {
            "username": "Tim"
        }
    },
    "highlight": {
        "pre_tags": "<i style='color:red;'>",
        "post_tags": "</i>",
        "fields": {
            "name":{}
        }
    }
}

Software BackEnd Database ElasticSearch

本博客所有文章除特别声明外，均采用 CC BY-SA 4.0 协议，转载请注明出处！

Elasticsearch 文档上一篇

Elasticsearch 中文分词插件IK 下一篇