Differences between the object type, the nested type, and arrays in Elasticsearch

news 2026/2/7 16:32:45

I. Differences:

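0. As a rule of thumb, use the object type for fields whose value is a single JSON object, and the nested type for fields whose value is a JSON array of objects.
1. By default, ES maps a JSON object to the object type; a field becomes nested only when the mapping declares it explicitly.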

2. A field of type object can be queried directly with plain dot (.) notation, for example:

{
  "query": {
    "term": {
      "inspector.id": {"value": "222"}
    }
  }
}

3. A field of type nested must be queried with a nested query:

{
  "query": {
    "nested": {
      "path": "inspector",
      "query": {
        "term": {
          "inspector.id": {"value": "222"}
        }
      }
    }
  }
}

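4. The two query styles are not interchangeable: a plain dotted query finds nothing on a nested field, and a nested query fails on a field that is not mapped as nested. Changing a field from one type to the other therefore means changing the Java code that queries it, which is fairly risky.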

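5. The storage model differs. An array of objects is flattened when it is indexed, whereas each object in a nested array is indexed as a separate document. As a result, an AND condition over an object array can return results as if it were an OR, and aggregations over nested objects count nested documents rather than top-level documents, so the counts can look inflated (unless reverse_nested is added). To keep the objects in an array independent of each other, use the nested field type.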

II. Object-type fields

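1: Mapping

Note that when no type is written, as for categoryObj here, ES defaults to the object type, and the type is not displayed even when you view the mapping:

// Category, an object field. No type is written here, so ES defaults to object.
"categoryObj": {
  // "type": "object",
  "properties": {
    "class1": {"type": "keyword"},
    "class2": {"type": "keyword"},
    "class3": {"type": "keyword"}
  }
}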

2: Adding data

PUT test_index_20211220/e-com/1
{
  "id": "1",
  "name": "L＇oreal/欧莱雅复颜玻尿酸水光充盈导入膨润精华液",
  "brand": "欧莱雅",
  "price": 279,
  "desc": "补水 提拉紧致 淡化细纹",
  "categoryObj": {
    "class1": "欧莱雅",
    "class2": "补水",
    "class3": "面部护理"
  }
}

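3: Querying

In the sample data, the "categoryObj" field defaults to the object type (no type was set explicitly). For an object field, the query joins the full field path with ".":

GET test_index_20211220/_search
{
  "query": {
    "term": {
      "categoryObj.class1": "欧莱雅"
    }
  }
}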

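4: Behavior of object arrays

As noted in section I, the objects in a nested field are stored by ES as separate documents. What about object fields? ES flattens them internally, so what is stored is a flat structure. Take the categoryObj field as an example:

"categoryObj": [
  {"class1": "欧莱雅", "class2": "补水", "class3": "面部护理"},
  {"class1": "欧莱雅", "class2": "补水", "class3": "面部精华"},
  {"class1": "雅诗兰黛", "class2": "美白", "class3": "面霜"}
]

What is actually stored is:

{
  "categoryObj.class1": ["欧莱雅", "欧莱雅", "雅诗兰黛"],
  "categoryObj.class2": ["补水", "补水", "美白"],
  "categoryObj.class3": ["面部护理", "面部精华", "面霜"]
}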

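This sacrifices the independence of the objects, and it has consequences: in some cases an AND query over an object array effectively behaves like an OR query.
Suppose we query for "欧莱雅" in class1 and "美白" in class2 at the same time. Nothing should match, because no single object in categoryObj has both values, yet that is not what happens:

GET test_index_20211220/_search
{
  "query": {
    "bool": {
      // filter context
      "filter": {
        "bool": {
          "must": [
            {"term": {"categoryObj.class1": "欧莱雅"}},
            {"term": {"categoryObj.class2": "美白"}}
          ]
        }
      }
    }
  }
}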

The document is returned anyway.
So when a field holds an array of objects, the nested type is recommended.
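
To make the difference concrete, here is a minimal Java sketch that simulates the two matching behaviors in memory. It does not call Elasticsearch, and it is a simplified model rather than what ES does internally; the class and method names (ObjectVsNestedDemo, flatten, matchesAsObject, matchesAsNested, sampleCategories) are invented for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative in-memory model only; these names are not part of any Elasticsearch API.
public class ObjectVsNestedDemo {

    // Flatten an array of objects into "prefix.field" -> set of values,
    // roughly how an object-type field ends up being indexed.
    public static Map<String, Set<String>> flatten(String prefix, List<Map<String, String>> objects) {
        Map<String, Set<String>> flat = new TreeMap<>();
        for (Map<String, String> obj : objects) {
            for (Map.Entry<String, String> e : obj.entrySet()) {
                flat.computeIfAbsent(prefix + "." + e.getKey(), k -> new LinkedHashSet<>())
                    .add(e.getValue());
            }
        }
        return flat;
    }

    // Object semantics: each term only has to appear somewhere in the flattened values.
    public static boolean matchesAsObject(String prefix, List<Map<String, String>> objects,
                                          Map<String, String> terms) {
        Map<String, Set<String>> flat = flatten(prefix, objects);
        for (Map.Entry<String, String> t : terms.entrySet()) {
            Set<String> values = flat.get(prefix + "." + t.getKey());
            if (values == null || !values.contains(t.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Nested semantics: one single object has to satisfy every term.
    public static boolean matchesAsNested(List<Map<String, String>> objects, Map<String, String> terms) {
        for (Map<String, String> obj : objects) {
            boolean all = true;
            for (Map.Entry<String, String> t : terms.entrySet()) {
                if (!t.getValue().equals(obj.get(t.getKey()))) {
                    all = false;
                    break;
                }
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    // The three category objects from the example in this section.
    public static List<Map<String, String>> sampleCategories() {
        return List.of(
            Map.of("class1", "欧莱雅", "class2", "补水", "class3", "面部护理"),
            Map.of("class1", "欧莱雅", "class2", "补水", "class3", "面部精华"),
            Map.of("class1", "雅诗兰黛", "class2", "美白", "class3", "面霜"));
    }

    public static void main(String[] args) {
        Map<String, String> terms = Map.of("class1", "欧莱雅", "class2", "美白");
        System.out.println("object match: " + matchesAsObject("categoryObj", sampleCategories(), terms));
        System.out.println("nested match: " + matchesAsNested(sampleCategories(), terms));
    }
}
```

With the sample data, the flattened model reports a match and the per-object model does not, which mirrors the unexpected hit described in this section.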

III. Nested-type fields

1: Mapping

// Category, a nested field
"categoryNst": {
  "type": "nested",
  "properties": {
    "class1": {"type": "keyword"},
    "class2": {"type": "keyword"},
    "class3": {"type": "keyword"}
  }
}

2: Adding data

PUT test_index_20211220/e-com/1
{
  "id": "1",
  "name": "L＇oreal/欧莱雅复颜玻尿酸水光充盈导入膨润精华液",
  "brand": "欧莱雅",
  "price": 279,
  "desc": "补水 提拉紧致 淡化细纹",
  "categoryNst": {
    "class1": "欧莱雅",
    "class2": "补水",
    "class3": "面部护理"
  }
}

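3: Querying

GET test_index_20211220/_search
{
  "query": {
    "nested": {
      // path of the nested object being queried
      "path": "categoryNst",
      "query": {
        "term": {
          // In earlier versions a bare "class2": "补水" also worked, because path is already declared outside.
          // This changed in some version (it is unclear which one); the full "categoryNst.class2" is now required, otherwise the query fails.
          "categoryNst.class2": "补水"
        }
      }
    }
  }
}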

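4: Behavior of nested fields

A nested field indexes each of its members as a separate document. What does that mean in practice? Assume the "categoryNst" array of the document holds two object members, both with class2 set to "补水" (the PUT example above shows a single member; the aggregation result below reflects two).
ES indexes these two members as two separate documents, so it indexes three documents in total (one outer document plus two documents for the nested objects). A terms aggregation on the nested field makes this visible: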

GET test_index_20211220/_search
{
  "query": {
    "nested": {
      "path": "categoryNst",
      "query": {
        "term": {"categoryNst.class2": "补水"}
      }
    }
  },
  "aggs": {
    "nestedAgg": {
      "nested": {"path": "categoryNst"},
      "aggs": {
        "termAgg": {
          "terms": {
            // As in the query, this cannot be shortened to "class2": no error is raised, but the aggregation returns no results.
            "field": "categoryNst.class2"
          }
        }
      }
    }
  }
}

{
  "took": 6,
  "timed_out": false,
  "_shards": {"total": 2, "successful": 2, "skipped": 0, "failed": 0},
  "hits": {"total": 1, "max_score": 0.18232156, "hits": []},
  "aggregations": {
    "nestedAgg": {
      "doc_count": 2,
      "termAgg": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {"key": "补水", "doc_count": 2}
        ]
      }
    }
  }
}

There is only one top-level document, yet the aggregation reports 2. Is the result wrong? It is not: the aggregation counts nested documents. To count top-level documents instead, use reverse_nested and rewrite the aggregation part of the query above:

GET test_index_20211220/_search
{
  "size": 0,
  "query": {
    "nested": {
      "path": "categoryNst",
      "query": {
        "term": {"categoryNst.class2": "补水"}
      }
    }
  },
  "aggs": {
    "nestedAgg": {
      "nested": {"path": "categoryNst"},
      "aggs": {
        "termAgg": {
          "terms": {"field": "categoryNst.class2"},
          "aggs": {
            "reverseAgg": {"reverse_nested": {}}
          }
        }
      }
    }
  }
}
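
The counting difference can be sketched in plain Java, without Elasticsearch. This is a simplified model under stated assumptions: NestedDoc, countNestedDocs, and countParentDocs are hypothetical names, and the sample assumes one parent document (id "1") with two nested members whose class2 is "补水".

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative model only; these names are not part of any Elasticsearch API.
public class ReverseNestedDemo {

    // One nested object plus the id of the parent document that owns it.
    public static final class NestedDoc {
        final String parentId;
        final String class2;

        public NestedDoc(String parentId, String class2) {
            this.parentId = parentId;
            this.class2 = class2;
        }
    }

    // What a terms aggregation inside "nested" counts: one per nested document.
    public static Map<String, Integer> countNestedDocs(List<NestedDoc> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (NestedDoc d : docs) {
            counts.merge(d.class2, 1, Integer::sum);
        }
        return counts;
    }

    // What reverse_nested adds: one per distinct parent document in each bucket.
    public static Map<String, Integer> countParentDocs(List<NestedDoc> docs) {
        Map<String, Set<String>> parents = new TreeMap<>();
        for (NestedDoc d : docs) {
            parents.computeIfAbsent(d.class2, k -> new HashSet<>()).add(d.parentId);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : parents.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<NestedDoc> docs = List.of(
            new NestedDoc("1", "补水"),
            new NestedDoc("1", "补水"));
        System.out.println(countNestedDocs(docs)); // {补水=2}
        System.out.println(countParentDocs(docs)); // {补水=1}
    }
}
```

In the real response, the per-bucket count of top-level documents appears as the doc_count of the reverseAgg sub-aggregation, while the bucket's own doc_count still counts nested documents.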

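5: Querying from Java

import java.io.IOException;

import org.apache.http.HttpHost;
import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class NestedQueryExample {
    public static void main(String[] args) throws IOException {
        // errtype is not defined in the original snippet; here its values come from the command line
        String[] errtype = args;

        // Create the ES client; try-with-resources closes it even if the search fails
        try (RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

            // Create the search request against the "user" index
            SearchRequest searchRequest = new SearchRequest();
            searchRequest.indices("user");

            // Build the request body: a nested query on the nested field check_error
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            searchSourceBuilder.query(QueryBuilders.nestedQuery(
                    "check_error",
                    QueryBuilders.termsQuery("check_error.errtype", errtype),
                    ScoreMode.None));
            searchRequest.source(searchSourceBuilder);

            // Send the request
            SearchResponse search = esClient.search(searchRequest, RequestOptions.DEFAULT);

            // Print the source of every hit
            for (SearchHit hit : search.getHits()) {
                System.out.println(hit.getSourceAsString());
            }
        }
    }
}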

IV. Array fields

1: Mapping

ES has no dedicated array type. By default any field can hold one or more values, but all values in an array must be of the same type.

// Array field
"comments": {"type": "keyword"}

2: Adding data

When the array field (comments) holds only one value:

GET test_index_20211220/e-com/1
{
  "_index": "test_index_20211220",
  "_type": "e-com",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "id": "1",
    "name": "L＇oreal/欧莱雅复颜玻尿酸水光充盈导入膨润精华液",
    "brand": "欧莱雅",
    "price": 279,
    "desc": "补水 提拉紧致 淡化细纹",
    "comments": "还没有用，赠品跟欧莱雅旗舰店的同款赠品有差异。味道也不一样"
  }
}

As you can see, comments is not yet an array. Now add a second comment by overwriting the document:

PUT test_index_20211220/e-com/1
{
  "id": "1",
  "name": "L＇oreal/欧莱雅复颜玻尿酸水光充盈导入膨润精华液",
  "brand": "欧莱雅",
  "price": 279,
  "desc": "补水 提拉紧致 淡化细纹",
  "comments": [
    "还没有用，赠品跟欧莱雅旗舰店的同款赠品有差异。味道也不一样",
    "只有这支玻璃尿酸水光充盈是真的"
  ]
}

Query again: now that "comments" was indexed with more than one value, it comes back as an array, and the document version has been incremented by 1:

GET test_index_20211220/e-com/1
{
  "_index": "test_index_20211220",
  "_type": "e-com",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "id": "1",
    "name": "L＇oreal/欧莱雅复颜玻尿酸水光充盈导入膨润精华液",
    "brand": "欧莱雅",
    "price": 279,
    "desc": "补水 提拉紧致 淡化细纹",
    "comments": [
      "还没有用，赠品跟欧莱雅旗舰店的同款赠品有差异。味道也不一样",
      "只有这支玻璃尿酸水光充盈是真的"
    ]
  }
}

3: Querying

An array field is queried just like an ordinary field:

GET my_test_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"terms": {"label": ["10", "100"]}},
        {"term": {"name": {"value": "旺仔33333"}}}
      ]
    }
  }
}
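
Why an array field can be queried like an ordinary one is easier to see in a small model: a term matches if any value of the field equals it, whether the field holds one value or several. The following Java sketch illustrates that rule in memory only; MultiValueFieldDemo and its methods are hypothetical names, and the label values used are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Illustrative model only; these names are not part of any Elasticsearch API.
public class MultiValueFieldDemo {

    // Treat a single value and an array of values uniformly as a list.
    public static List<String> valuesOf(Object fieldValue) {
        List<String> values = new ArrayList<>();
        if (fieldValue instanceof Collection) {
            for (Object v : (Collection<?>) fieldValue) {
                values.add(String.valueOf(v));
            }
        } else if (fieldValue != null) {
            values.add(String.valueOf(fieldValue));
        }
        return values;
    }

    // term: matches when any value of the field equals the term exactly.
    public static boolean termMatches(Object fieldValue, String term) {
        return valuesOf(fieldValue).contains(term);
    }

    // terms: matches when any value of the field equals any of the listed terms.
    public static boolean termsMatches(Object fieldValue, List<String> terms) {
        for (String v : valuesOf(fieldValue)) {
            if (terms.contains(v)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical label values: one document with a single value, one with an array.
        System.out.println(termsMatches("10", List.of("10", "100")));
        System.out.println(termsMatches(List.of("20", "100"), List.of("10", "100")));
        System.out.println(termMatches(List.of("20", "30"), "10"));
    }
}
```

The first two calls report a match and the third does not, since neither "20" nor "30" equals "10".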
