>#Elasticsearch Documents: Basic CRUD and Bulk Operations

Published: 2020/8/23 16:36

![elasticsearch](https://img.tnblog.net/arcimg/hb/5f1adabe8df94fdb8331eb80e393c4a3.jpeg "elasticsearch")

[TOC]

<br/>

Document CRUD
------------

| Operation | Request |
| ------------ | ------------ |
| Index | PUT my_index/_doc/1 <br/> {"user":"mike","comment":"You know,for future"} |
| Create | PUT my_index/_create/1 <br/> {"user":"mike","comment":"You know,for future"} <br/> POST my_index/_doc (no ID given, one is generated automatically) <br/> {"user":"mike","comment":"You know,for search"} |
| Read | GET my_index/_doc/1 |
| Update | POST my_index/_update/1 <br/> { "doc":{"user":"mike","comment":"You know,for search"}} |
| Delete | DELETE my_index/_doc/1 |

- Type name: by convention, always use `_doc`
- Create
  - Fails if the ID already exists
- Index
  - If the ID does not exist, a new document is created. Otherwise the existing document is deleted first and a new one is created, and the version number increases
- Update
  - The document must already exist; an update only changes the fields that are supplied

Create a Document
------------

```bash
PUT users/_create/1
{
  "user" : "Jack",
  "post_date" : "2020-08-23T14:40:12",
  "message" : "trying out Elasticsearch"
}
```

>###First call

![](https://img.tnblog.net/arcimg/hb/223fd6c8e95b4720a6423919d175731f.png)

>###Second call

![](https://img.tnblog.net/arcimg/hb/5876e3db232549628478e056c4f3cd88.png)

>###Summary

- A document ID can either be generated automatically or be specified explicitly
- When you call `POST users/_doc`
  - the system generates the document ID automatically
- When you create with `PUT users/_create/1`, `_create` is stated explicitly in the URL; if that ID already exists, the operation fails.

Get a Document
------------

```bash
GET users/_doc/1
```

>###Retrieving a document

![](https://img.tnblog.net/arcimg/hb/2d7dfb6457814fa5b540272bde6faf3b.png)

>###Summary

- If the document is found, HTTP 200 is returned
  - Metadata such as `_index` and `_type`
  - Version information: for documents with the same ID, the version number keeps increasing even after a deletion
  - `_source` contains all of the document's original fields by default
- If the document is not found, HTTP 404 is returned

Index a Document
------------

```bash
PUT users/_doc/1
{
  "user" : "Bob He(MinYang He)"
}
```

![](https://img.tnblog.net/arcimg/hb/e4ca1359184b4edebb99920f7e90ac63.png)

- How Index differs from Create: if the document does not exist, a new document is indexed. Otherwise the existing document is deleted and the new one is indexed, and the version increases by 1.

Update a Document
------------

```bash
POST users/_update/1
{
  "doc" : {
    "albums":["Album1","Album2"]
  }
}
```

![](https://img.tnblog.net/arcimg/hb/9189a48dbb4f42d69fb06308a7b97d66.png)

- Unlike Index, Update does not replace the existing document: the fields you send are merged into it.
- It uses the POST method, and the payload must be wrapped in `"doc"`.

Bulk API
------------

```bash
# First run
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test2", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
```

![](https://img.tnblog.net/arcimg/hb/5dbd98ea5f8f4751a87765ba64264911.png)

![](https://img.tnblog.net/arcimg/hb/20b529e604c84e77a64e329d80dce2ef.png)

- A single API call can operate on several different indices
- Four operation types are supported
  - `index`
  - `create`
  - `update`
  - `delete`
- The index can be specified in the URL or in the request payload
- If one operation in the request fails, the other operations are not affected
- The response contains the result of every individual operation

Batch Read: mget
------------

```bash
GET /_mget
{
  "docs" : [
    { "_index" : "test", "_id" : "1" },
    { "_index" : "test", "_id" : "2" }
  ]
}
```

![](https://img.tnblog.net/arcimg/hb/b21dc4eb35a64922ad4dabd976e5930c.png)

tn>Batch operations reduce the overhead of network connections and improve performance.

Batch Search: msearch
------------

```bash
POST kibana_sample_data_ecommerce/_msearch
{}
{"query" : {"match_all" : {}},"size":1}
{"index" : "kibana_sample_data_flights"}
{"query" : {"match_all" : {}},"size":2}
```

![](https://img.tnblog.net/arcimg/hb/5738e96cb4ea4a0aa8e75cc7d1179299.png)

Complete Demo
------------

- Create Document (auto-generated ID)
- Get Document By Id
- Create Document (explicit ID)
- Index Document
- Update Document

```bash
############Create Document############
# Create a document; the _id is generated automatically
POST users/_doc
{
  "user" : "Mike",
  "post_date" : "2019-04-15T14:12:12",
  "message" : "trying out Kibana"
}

# Create a document with an explicit ID; fails if the ID already exists
PUT users/_doc/1?op_type=create
{
  "user" : "Jack",
  "post_date" : "2019-05-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

# Create a document with an explicit ID; fails if the ID already exists
PUT users/_create/1
{
  "user" : "Jack",
  "post_date" : "2019-05-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

### Get Document by ID
# Get the document by ID
GET users/_doc/1

### Index & Update
# Index with an explicit ID (the existing document is deleted, then the new one is written)
GET users/_doc/1

PUT users/_doc/1
{
  "user" : "Mike"
}

#GET users/_doc/1
# Add fields to the existing document
POST users/_update/1
{
  "doc":{
    "post_date" : "2019-05-15T14:12:12",
    "message" : "trying out Elasticsearch"
  }
}

### Delete by Id
# Delete the document
DELETE users/_doc/1

### Bulk operations
# Run the request twice and compare the results
# First run
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test2", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }

# Second run
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test2", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }

### mget operations
GET /_mget
{
  "docs" : [
    { "_index" : "test", "_id" : "1" },
    { "_index" : "test", "_id" : "2" }
  ]
}

# Index specified in the URI
GET /test/_mget
{
  "docs" : [
    { "_id" : "1" },
    { "_id" : "2" }
  ]
}

GET /_mget
{
  "docs" : [
    { "_index" : "test", "_id" : "1", "_source" : false },
    { "_index" : "test", "_id" : "2", "_source" : ["field3", "field4"] },
    {
      "_index" : "test",
      "_id" : "3",
      "_source" : { "include": ["user"], "exclude": ["user.location"] }
    }
  ]
}

### msearch operations
POST kibana_sample_data_ecommerce/_msearch
{}
{"query" : {"match_all" : {}},"size":1}
{"index" : "kibana_sample_data_flights"}
{"query" : {"match_all" : {}},"size":2}

### Clean up test data
# Delete the test indices
DELETE users
DELETE test
DELETE test2
```

Common Errors
------------

| Problem | Cause |
| ------------ | ------------ |
| Cannot connect | Network failure or a problem inside the cluster |
| Connection cannot be closed | Network failure or a node error |
| 429 | The cluster is too busy to handle some of the current requests. Retry the request, or add nodes to Elasticsearch to increase throughput |
| 4xx | The request body is malformed |
| 500 | Internal cluster error |

tn>Avoid sending too much data in a single API request: it can put excessive pressure on Elasticsearch and end up degrading performance.
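
The advice about keeping each request small can be illustrated with a short Python sketch that serializes bulk actions into the newline-delimited JSON body used by `_bulk` and splits them into batches. The helper names `to_ndjson` and `chunked`, the `BulkAction` tuple shape, and the batch size of 2 are assumptions made for this illustration, not part of Elasticsearch or of the original demo. Actually sending each payload would need an HTTP client, which is left out here.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical shape: (action name, action metadata, optional body line)
BulkAction = Tuple[str, Dict[str, Any], Optional[Dict[str, Any]]]


def to_ndjson(actions: Iterable[BulkAction]) -> str:
    """Serialize actions into a newline-delimited JSON body for _bulk."""
    lines: List[str] = []
    for name, meta, body in actions:
        lines.append(json.dumps({name: meta}))
        # "delete" has no body line; index, create and update do.
        if body is not None:
            lines.append(json.dumps(body))
    # The last line of a bulk body must also end with a newline.
    return "\n".join(lines) + "\n"


def chunked(actions: List[BulkAction], max_actions: int) -> Iterator[List[BulkAction]]:
    """Yield consecutive slices holding at most max_actions actions each."""
    for start in range(0, len(actions), max_actions):
        yield actions[start:start + max_actions]


# The same four operations as in the Bulk API example above.
actions: List[BulkAction] = [
    ("index", {"_index": "test", "_id": "1"}, {"field1": "value1"}),
    ("delete", {"_index": "test", "_id": "2"}, None),
    ("create", {"_index": "test2", "_id": "3"}, {"field1": "value3"}),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field2": "value2"}}),
]

# Assumed batch size of 2 actions per request; tune it for your own cluster.
payloads = [to_ndjson(batch) for batch in chunked(actions, max_actions=2)]
for payload in payloads:
    print(payload)
```

Each element of `payloads` is a complete request body for `POST _bulk`. Batching by action count is only one option; batching by payload size in bytes is an equally reasonable choice when documents vary a lot in size.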