ElasticSearch Series 04: CRUD for Indices and Documents

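Introduction: In the previous installment we covered ES data types, and some readers reported that they could not follow the statements used there. Today TeHero walks through the CRUD operations on ES indices and documents. Mastering these basic operations makes it much easier to study the rest systematically, so let's get started!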

1. Index CRUD

1) Create

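# Create an index named tehero_index
PUT /tehero_index?pretty
{
# Index settings
  "settings": {
    "index": {
      "number_of_shards": 1, # number of primary shards set to 1; the default is 5 in ES 6.x (1 from 7.0 on)
      "number_of_replicas": 1 # number of replicas set to 1; the default is 1
    }
  },
# Mapping configuration
  "mappings": {
    "_doc": { # type name; _doc is strongly recommended
      "dynamic": false, # dynamic mapping setting
# Field definitions
      "properties": {
        "id": {
          "type": "integer"  # field id, of type integer
        },
        "name": {
          "type": "text",
          "analyzer": "ik_max_word", # analyzer used at index time (requires the IK analysis plugin)
          "search_analyzer": "ik_smart"  # analyzer used at search time
        },
        "createAt": {
          "type": "date"
        }
      }
    }
  }
}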

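Note: dynamic is the switch for dynamic mapping and has three states: true (the default) adds new fields to the mapping dynamically; false (recommended) ignores new fields, so no mapping is added for them, although they are still present in _source; strict throws an exception when a new field is encountered.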
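
To make the three states concrete, here is a small toy model in Python. It is not Elasticsearch code: the function name index_document and the sample field nickname are made up for illustration, and the model only mimics the behavior described in the note above.

```python
# Toy model of the three `dynamic` settings (illustration only, not ES code).

def index_document(mapped_fields, doc, dynamic="true"):
    """Return (mapped_fields_after, stored_source) for one incoming document."""
    mapped = set(mapped_fields)
    unknown = [field for field in doc if field not in mapped]

    if dynamic == "strict" and unknown:
        # strict: the document is rejected outright
        raise ValueError(f"unmapped fields not allowed: {unknown}")
    if dynamic == "true":
        # true: unknown fields are added to the mapping
        mapped.update(unknown)
    # false: the mapping is left unchanged, but _source keeps the full document
    return mapped, dict(doc)


mapped = {"id", "name", "createAt"}
doc = {"name": "Te Hero", "nickname": "hero"}  # nickname is not in the mapping

after_true, _ = index_document(mapped, doc, "true")
after_false, source = index_document(mapped, doc, "false")

print(sorted(after_true))   # mapping now includes nickname
print(sorted(after_false))  # mapping unchanged
print(source)               # nickname is still stored in _source
```

With dynamic set to false, as in tehero_index, the extra field is kept in _source but cannot be searched, because no mapping exists for it.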

# The response looks like this:
{
  "acknowledged": true, # whether the index was successfully created in the cluster
  "shards_acknowledged": true,
  "index": "tehero_index"
}

2) Read

GET /tehero_index  # index name; several indices, or all of them, can be retrieved at once
e.g. GET /*    GET /tehero_index,other_index

GET /_cat/indices?v  # list all indices

Result of GET /tehero_index:

{
  "tehero_index": {
    "aliases": {},
    "mappings": {
      "_doc": {
        "dynamic": "false",
        "properties": {
          "createAt": {
            "type": "date"
          },
          "id": {
            "type": "integer"
          },
          "name": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_smart"
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1589271136921",
        "number_of_shards": "1",
        "number_of_replicas": "1",
        "uuid": "xueDIxeUQnGBQTms65wA6Q",
        "version": {
          "created": "6050499"
        },
        "provided_name": "tehero_index"
      }
    }
  }
}

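3) Update

ES provides a set of statements for modifying an index, covering the number of replicas, adding fields, the refresh_interval value, the index analyzers (covered in depth later), and aliases (TeHero will devote a separate article to aliases, which are extremely useful in practice).

Let's start with the commonly used syntax:

# Change the number of replicas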
PUT /tehero_index/_settings
{
    "index" : {
        "number_of_replicas" : 2
    }
}

# Change the refresh interval; the default is 1s
PUT /tehero_index/_settings
{
    "index" : {
        "refresh_interval" : "2s"
    }
}

# Add a new field, age
PUT /tehero_index/_mapping/_doc
{
  "properties": {
    "age": {
      "type": "integer"
    }
  }
}

After the updates, let's look at the index configuration again:

GET /tehero_index
Result:
{
  "tehero_index": {
    "aliases": {},
    "mappings": {
      "_doc": {
        "dynamic": "false",
        "properties": {
          "age": {
            "type": "integer"
          },
          "createAt": {
            "type": "date"
          },
          "id": {
            "type": "integer"
          },
          "name": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_smart"
          }
        }
      }
    },
    "settings": {
      "index": {
        "refresh_interval": "2s",
        "number_of_shards": "1",
        "provided_name": "tehero_index",
        "creation_date": "1589271136921",
        "number_of_replicas": "2",
        "uuid": "xueDIxeUQnGBQTms65wA6Q",
        "version": {
          "created": "6050499"
        }
      }
    }
  }
}
The changes have been applied.
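
If you want to issue the same settings updates from a script instead of the console, the requests can be built with Python's standard library. This is only a sketch: the address in ES_URL (a local, unsecured node) and the helper name build_request are assumptions, and the block only builds the requests without sending them.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local node without authentication


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the ES REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        url=f"{ES_URL}{path}",
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


# Same payloads as the two _settings calls above
replicas_req = build_request(
    "PUT", "/tehero_index/_settings", {"index": {"number_of_replicas": 2}}
)
refresh_req = build_request(
    "PUT", "/tehero_index/_settings", {"index": {"refresh_interval": "2s"}}
)

print(replicas_req.get_method(), replicas_req.full_url)
print(replicas_req.data.decode("utf-8"))

# To actually send a request (requires a running cluster):
# with urllib.request.urlopen(replicas_req) as resp:
#     print(json.load(resp))
```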

4) Delete

# Delete the index
DELETE /tehero_index
# Check whether the index still exists
HEAD tehero_index
Returns: 404 - Not Found

2. Document CRUD

1) Create

# Add a single document, setting its ES id to 1
PUT /tehero_index/_doc/1?pretty
{
  "name": "Te Hero"
}
# Add a single document and let ES generate the id automatically
POST /tehero_index/_doc?pretty
{
  "name": "Te Hero2"
}

# Use the op_type parameter to force a specific operation
PUT tehero_index/_doc/1?op_type=create
{
     "name": "Te Hero3"
}
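Note: with op_type=create, ES reports "version_conflict_engine_exception" if the id already exists.
The op_type parameter is useful in practice when synchronizing data; TeHero will cover it in detail later, when discussing data synchronization between a database and ES.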
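
The difference between a plain PUT and op_type=create can be pictured with a tiny in-memory model. This is a toy sketch, not ES code: the class ToyIndex and the exception VersionConflict are invented names that only imitate the behavior described above.

```python
# Toy in-memory model of "index" vs "create" semantics (illustration only).

class VersionConflict(Exception):
    """Stands in for version_conflict_engine_exception."""


class ToyIndex:
    def __init__(self):
        self.docs = {}  # id -> (version, source)

    def put(self, doc_id, source, op_type="index"):
        exists = doc_id in self.docs
        if op_type == "create" and exists:
            # create refuses to overwrite an existing document
            raise VersionConflict(f"document {doc_id} already exists")
        version = self.docs[doc_id][0] + 1 if exists else 1
        self.docs[doc_id] = (version, source)
        return "updated" if exists else "created"


idx = ToyIndex()
first = idx.put("1", {"name": "Te Hero"})           # id is new
second = idx.put("1", {"name": "Te Hero-update"})   # same id, plain index
try:
    idx.put("1", {"name": "Te Hero3"}, op_type="create")
    conflict = False
except VersionConflict:
    conflict = True

print(first, second, conflict)
```

This is why op_type=create is handy for synchronization jobs: a replayed insert fails loudly instead of silently overwriting newer data.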

Let's query the data to see the result: GET /tehero_index/_doc/_search

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "tehero_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "Te Hero"
        }
      },
      {
        "_index": "tehero_index",
        "_type": "_doc",
        "_id": "P7-FCHIBJxE1TMY0WNGN",
        "_score": 1,
        "_source": {
          "name": "Te Hero2"
        }
      }
    ]
  }
}
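
When a response like this is consumed from code, usually only the total and the id and _source of each hit matter. Here is a minimal Python sketch; the helper name summarize_hits is made up, and the sample dict is a trimmed copy of the response above.

```python
def summarize_hits(response):
    """Return (total, [(id, source), ...]) from a search response dict."""
    hits = response["hits"]
    total = hits["total"]
    # ES 6.x returns a plain integer; ES 7+ returns {"value": n, "relation": ...}
    if isinstance(total, dict):
        total = total["value"]
    return total, [(hit["_id"], hit["_source"]) for hit in hits["hits"]]


# Trimmed copy of the response shown above
response = {
    "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
            {"_id": "1", "_score": 1, "_source": {"name": "Te Hero"}},
            {"_id": "P7-FCHIBJxE1TMY0WNGN", "_score": 1,
             "_source": {"name": "Te Hero2"}},
        ],
    }
}

total, rows = summarize_hits(response)
print(total)
for doc_id, source in rows:
    print(doc_id, source["name"])
```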

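2) Update

# Update a single document by id
(ps: the update statement is identical to the create statement. Think of it as an upsert by ID: if the document exists it is replaced as a whole; if not, it is created)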
PUT /tehero_index/_doc/1?pretty
{
  "name": "Te Hero-update"
}

# For documents matching the query id=10, set name="updated name"
(by default a version conflict aborts _update_by_query; append ?conflicts=proceed to the URL to keep going instead)
POST tehero_index/_update_by_query
{
  "script": {
    "source": "ctx._source.name = params.name",
    "lang": "painless",
    "params":{
      "name":"updated name"
    }
  },
  "query": {
    "term": {
      "id": "10"
    }
  }
}

As for the Update By Query API, TeHero classifies its use as advanced material; later chapters will cover it in more depth.
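
Until then, it may help to picture what the request does: select the documents matching the query, then run the script against the _source of each one. The Python below is a toy equivalent, not the ES implementation; the function update_by_query and the sample documents are invented for illustration.

```python
def update_by_query(docs, matches, script, params):
    """Apply script(source, params) to every document whose source matches."""
    updated = 0
    for source in docs.values():
        if matches(source):
            script(source, params)
            updated += 1
    return updated


def set_name(source, params):
    # Plays the role of the painless line: ctx._source.name = params.name
    source["name"] = params["name"]


# Hypothetical documents, keyed by _id
docs = {
    "a": {"id": 10, "name": "old name"},
    "b": {"id": 11, "name": "other"},
}

count = update_by_query(
    docs,
    matches=lambda source: source.get("id") == 10,  # the term query on id
    script=set_name,
    params={"name": "updated name"},
)
print(count, docs)
```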

3) Read

# 1. Get a single document by id
GET /tehero_index/_doc/1
Result:
{
  "_index": "tehero_index",
  "_type": "_doc",
  "_id": "1",
  "_version": 5,
  "found": true,
  "_source": {
    "name": "Te Hero-update",
    "age": 18
  }
}

# 2. Get all documents in the index
GET /tehero_index/_doc/_search
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "tehero_index",
        "_type": "_doc",
        "_id": "P7-FCHIBJxE1TMY0WNGN",
        "_score": 1,
        "_source": {
          "name": "Te Hero2"
        }
      },
      {
        "_index": "tehero_index",
        "_type": "_doc",
        "_id": "_update",
        "_score": 1,
        "_source": {
          "name": "Te Hero3"
        }
      },
      {
        "_index": "tehero_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "Te Hero-update",
          "age": 18
        }
      }
    ]
  }
}

# 3. Query with a condition (covered in detail in the next installment)
GET /tehero_index/_doc/_search
{
  "query": {
    "match": {
      "name": "2"
    }
  }
}
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.9808292,
    "hits": [
      {
        "_index": "tehero_index",
        "_type": "_doc",
        "_id": "P7-FCHIBJxE1TMY0WNGN",
        "_score": 0.9808292,
        "_source": {
          "name": "Te Hero2"
        }
      }
    ]
  }
}

4) Delete

# 1. Delete a single document by id
DELETE /tehero_index/_doc/1

# 2. Delete by query
POST tehero_index/_delete_by_query
{
  "query": { 
    "match": {
     "name": "2"
    }
  }
}

3. Bulk Operations: the Bulk API

# Bulk operations
POST _bulk
{ "index" : { "_index" : "tehero_test1", "_type" : "_doc", "_id" : "1" } }
{ "this_is_field1" : "this_is_index_value" }
{ "delete" : { "_index" : "tehero_test1", "_type" : "_doc", "_id" : "2" } }
{ "create" : { "_index" : "tehero_test1", "_type" : "_doc", "_id" : "3" } }
{ "this_is_field3" : "this_is_create_value" }
{ "update" : {"_id" : "1", "_type" : "_doc", "_index" : "tehero_test1"} }
{ "doc" : {"this_is_field2" : "this_is_update_value"} }

# Query all documents
GET /tehero_test1/_doc/_search
Result:
{
  "took": 33,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "tehero_test1",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "this_is_field1": "this_is_index_value",
          "this_is_field2": "this_is_update_value"
        }
      },
      {
        "_index": "tehero_test1",
        "_type": "_doc",
        "_id": "3",
        "_score": 1,
        "_source": {
          "this_is_field3": "this_is_create_value"
        }
      }
    ]
  }
}

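Note: what did POST _bulk actually do?

1. If the index "tehero_test1" does not exist, an index with that name is created. Then, if a document with id = 1 already exists it is overwritten; otherwise a document with id = 1 is inserted;

2. The document with id = 2 is deleted;

3. A document with id = 3 is inserted; if it already exists, an error is reported for that item;

4. The document with id = 1 is updated.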

ps:

Bulk operations are used a lot in practice: sending many operations in one request cuts down on IO and improves efficiency!
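
The bulk body is newline-delimited JSON: one action line, followed by a payload line for every action except delete. When generating it from code, a small helper keeps the format straight. The sketch below is one possible way to do it; the helper name build_bulk_body and the tuple layout are assumptions, not part of any ES client.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, payload) tuples into an NDJSON bulk body.

    payload is None for actions that take no second line (delete).
    """
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}))
        if payload is not None:
            lines.append(json.dumps(payload))
    # The Bulk API expects the body to end with a newline
    return "\n".join(lines) + "\n"


# The same four operations as the POST _bulk example above
ops = [
    ("index", {"_index": "tehero_test1", "_type": "_doc", "_id": "1"},
     {"this_is_field1": "this_is_index_value"}),
    ("delete", {"_index": "tehero_test1", "_type": "_doc", "_id": "2"}, None),
    ("create", {"_index": "tehero_test1", "_type": "_doc", "_id": "3"},
     {"this_is_field3": "this_is_create_value"}),
    ("update", {"_index": "tehero_test1", "_type": "_doc", "_id": "1"},
     {"doc": {"this_is_field2": "this_is_update_value"}}),
]

body = build_bulk_body(ops)
print(body, end="")
```

When sending this body over HTTP, set the Content-Type header to application/x-ndjson.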

Coming up next: what is an inverted index?

Finally, here is the ElasticSearch knowledge mind map I put together while studying, for your reference:

[Image: ElasticSearch knowledge mind map]
