ES Study Notes 9.4: Request Body Search (Field Collapsing and Search After)


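This article shows how to collapse search results on a field value in Elasticsearch, so that only the top sorted document for each value is kept. It also covers Search After, an efficient alternative to pagination that avoids the performance problems of very deep pages. Worked examples cover basic field collapsing, expanding collapsed results, second-level collapsing, and how Search After works and where it applies.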

1. Field Collapsing

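ES can collapse search results based on the values of a field. Collapsing is done by selecting only the top sorted document for each collapse key. For example, the query below fetches the best tweet from each Twitter user and sorts the results by the number of likes (ascending):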

# Create the index. The user field must be mapped as keyword or a numeric type.
curl -X PUT "localhost:9200/twitter" -H 'Content-Type: application/json' -d'
{
    "mappings": {
        "_doc": {
            "properties": {
                "user": { "type": "keyword" }
            }
        }
    }
}
'
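The keyword-or-numeric requirement can be checked against a mapping before any data is indexed. The following is a minimal Python sketch, not an Elasticsearch API: `is_collapsible` and `NUMERIC_TYPES` are hypothetical names introduced here, and the check assumes that `doc_values` is enabled for these types unless the mapping sets it to false explicitly.

```python
# Hypothetical local helper (not part of Elasticsearch): inspects the
# "properties" section of a mapping to see whether a field can be collapsed on.
NUMERIC_TYPES = {
    "long", "integer", "short", "byte",
    "double", "float", "half_float", "scaled_float",
}


def is_collapsible(properties, field):
    """Return True if `field` is a keyword or numeric field with doc_values on."""
    spec = properties.get(field)
    if spec is None:
        # Field is not declared explicitly in this mapping.
        return False
    field_type = spec.get("type")
    type_ok = field_type == "keyword" or field_type in NUMERIC_TYPES
    # Assumption: doc_values counts as enabled unless explicitly set to False.
    return type_ok and spec.get("doc_values", True) is not False


properties = {"user": {"type": "keyword"}}
print(is_collapsible(properties, "user"))
```

A field mapped as `text`, or a keyword field with `"doc_values": false`, is rejected by this check, which mirrors the pitfall of collapsing on an unsuitable field.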
# Four sample documents, listed as _id-likes pairs: 1-10, 2-9, 3-8, 4-12
curl -X POST "localhost:9200/twitter/_doc/1" -H 'Content-Type: application/json' -d'
{
    "user": "kimchy1",
    "likes": 10,
    // Built-in Postman timestamp variable; curl does not expand it, so substitute a literal value there
    "post_date": {{$timestamp}},
    "message": "trying out Elasticsearch"
}
'

# Field collapsing search
curl -X GET "localhost:9200/twitter/_search" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    // Collapse the results on the user field; note that its type is keyword
    "collapse": {
        "field": "user"
    },
    // Sort by likes so that the top document of each group is selected
    "sort": ["likes"],
    // Offset of the first collapsed result
    "from": 1
}
'
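To make the interplay of `sort`, `collapse`, and `from` concrete, here is a minimal local model in Python. It is only a sketch of the observable behavior, not how Elasticsearch implements collapsing: `collapse_search` is a hypothetical function, and the `user` values for documents 2 to 4 are assumed (the sample data only gives their `_id` and `likes`).

```python
def collapse_search(docs, collapse_field, sort_field, from_=0, size=10):
    """Local model of a collapsed search: sort, keep one doc per key, then page."""
    # Ascending order, like "sort": ["likes"] in the request.
    ranked = sorted(docs, key=lambda d: d[sort_field])
    seen = set()
    groups = []
    for doc in ranked:
        key = doc[collapse_field]
        if key not in seen:
            # The first document in sort order represents its group.
            seen.add(key)
            groups.append(doc)
    return {
        "total": len(docs),  # counts matching documents before collapsing
        "hits": groups[from_:from_ + size],
    }


# _id and likes follow the sample data; user values for ids 2-4 are assumed.
docs = [
    {"_id": "1", "user": "kimchy1", "likes": 10},
    {"_id": "2", "user": "kimchy2", "likes": 9},
    {"_id": "3", "user": "kimchy3", "likes": 8},
    {"_id": "4", "user": "kimchy4", "likes": 12},
]
result = collapse_search(docs, "user", "likes", from_=1)
print(result["total"], [d["_id"] for d in result["hits"]])  # 4 ['2', '1', '4']
```

With four distinct users every document forms its own group, so `from: 1` simply skips the lowest-ranked group (likes 8). If two documents shared a user, only the one ranked first by `likes` would survive the collapse.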

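The total number of hits in the response is the number of matching documents before collapsing; the total number of distinct groups is unknown. The field used for collapsing must be a keyword or numeric field with doc_values enabled (watch out for this: it is exactly what tripped me up while testing). The final result is: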

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 4,
        "max_score": null,
        // 4 documents matched, but with an offset of 1 only the last 3 are shown; the first hit (likes = 8) is skipped
        "hits": [
            {
                "_index": "twitter",
                "_type": "_doc",
                "_id": "2",
                "_score": null