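ELK Notes 8 -- index

1. Ways to create an index

1.1 Creating an index directly
1.2 Creating an index by current date
1.3 Creating an index with rollover

2. Common index settings

2.1 Basic settings
2.2 Common fixes for unassigned index shards
2.3 Handling a red cluster
2.4 Common ways to speed up consumption
2.5 Common questions

3. Notes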

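1. Ways to create an index

1.1 Creating an index directly

This is the most basic approach. To write to such an index, simply specify index01 as the index in Logstash.

Its drawback is that under heavy log volume all data accumulates in a single index, which hurts performance.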

PUT index01
{}
out:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "index01"
}

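1.2 Creating an index by current date

With this approach a new date-stamped index is created each day as data is written. Configure the Logstash output as follows:

index => "index02-%{+YYYY.MM.dd}"

The request below creates today's index by hand using date math; the path is the URL-encoded form of <index02-{now/d}>.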

PUT /%3Cindex02-%7Bnow%2Fd%7D%3E
{}
out:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "index02-2020.06.21"
}
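The encoded path in the request above can be reproduced programmatically. The following is a minimal Python sketch, not part of the original notes; the helper names `encode_date_math_name` and `resolve_daily_name` are hypothetical, and the resolver assumes the default `yyyy.MM.dd` date format evaluated in UTC.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def encode_date_math_name(name: str) -> str:
    """Percent-encode a date math index name for use in a request path."""
    # safe="" also encodes "/", which would otherwise split the URL path.
    return quote(name, safe="")


def resolve_daily_name(prefix: str, when: datetime) -> str:
    """Mimic how <prefix-{now/d}> resolves: the UTC date as yyyy.MM.dd."""
    return f"{prefix}-{when.astimezone(timezone.utc):%Y.%m.%d}"


print(encode_date_math_name("<index02-{now/d}>"))
print(resolve_daily_name("index02", datetime(2020, 6, 21, 8, 0, tzinfo=timezone.utc)))
```

The first call yields the path used in the PUT request, and the second yields the index name reported in the response for that day.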

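1.3 Creating an index with rollover

This approach creates an index whose name carries a date and a sequence number, and attaches a _write alias to it.

Each rollover creates a new index named with the current date and the sequence number plus 1, then points the _write alias at the newest index.

Writers therefore only ever target the _write alias, which makes this approach well suited to production use.

In practice, if the data volume is large, add a cron job that triggers rollover on a schedule according to your policy.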

PUT /%3Cindex03-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "index03_write": {}
  }
}
out:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "index03-2020.06.21-000001"
}
rollover:
POST index03_write/_rollover
out:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "old_index" : "index03-2020.06.21-000001",
  "new_index" : "index03-2020.06.21-000002",
  "rolled_over" : true,
  "dry_run" : false,
  "conditions" : { }
}
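The naming rule and the cron-driven rollover described above can be sketched in Python. This is an illustration only: `next_rollover_name` and `rollover_request` are hypothetical helpers, the name pattern is assumed to be `prefix-yyyy.MM.dd-NNNNNN` as in the example, and the thresholds passed in are placeholders. `max_age` and `max_docs` are standard rollover conditions; when conditions are supplied, the rollover only happens if at least one is met.

```python
import re
from datetime import date
from typing import Tuple


def next_rollover_name(current: str, today: date) -> str:
    """Sketch of the naming rule: refresh the date, add 1 to the counter."""
    match = re.fullmatch(r"(?P<prefix>.+)-\d{4}\.\d{2}\.\d{2}-(?P<seq>\d+)", current)
    if match is None:
        raise ValueError(f"unexpected index name: {current!r}")
    seq = int(match["seq"]) + 1
    width = len(match["seq"])  # keep the zero padding of the original name
    return f"{match['prefix']}-{today:%Y.%m.%d}-{seq:0{width}d}"


def rollover_request(alias: str, max_age: str, max_docs: int) -> Tuple[str, str, dict]:
    """Build (method, path, body) for a conditional rollover a cron job could send."""
    body = {"conditions": {"max_age": max_age, "max_docs": max_docs}}
    return "POST", f"/{alias}/_rollover", body


print(next_rollover_name("index03-2020.06.21-000001", date(2020, 6, 21)))
print(rollover_request("index03_write", "1d", 1000000))
```

Running the conditional request from cron is cheap because it is a no-op until a threshold is reached.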

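2. Common index settings

2.1 Basic settings

index.number_of_replicas: number of replicas
index.routing.allocation.require.zone: the zone (a custom node attribute) the index must be stored in
index.routing.allocation.include._name: names of the nodes the index may be stored on
index.routing.allocation.include._ip: IPs of the nodes the index may be stored on
index.routing.allocation.total_shards_per_node: maximum number of shards of this index allowed on a single node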
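The dotted setting names listed above correspond to the nested JSON bodies used in the requests throughout these notes. A small Python sketch of that mapping follows; `nest_settings` is a hypothetical helper written for illustration, and the sample values are placeholders.

```python
import json


def nest_settings(flat: dict) -> dict:
    """Expand dotted setting keys into the nested form used in request bodies."""
    nested: dict = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # Walk down, creating intermediate objects as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


body = nest_settings({
    "index.number_of_replicas": 1,
    "index.routing.allocation.require.zone": "hot",
})
print(json.dumps(body, indent=2))
```

The printed body has the same shape as the PUT index_name/_settings examples in section 2.2.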

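2.2 Common fixes for unassigned index shards

A yellow cluster usually means some shards have not been allocated. The five common causes and their fixes are listed below.

zone restriction (an empty value clears it)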

PUT index_name/_settings
{
  "index" : {
    "routing" : {
      "allocation" : {
        "require" : {
          "zone" : ""
        }
      }
    }
  }
}

include._ip restriction

PUT index_name/_settings
{
  "index" : {
    "routing" : {
      "allocation" : {
        "include" : {
          "_ip" : ""
        }
      }
    }
  }
}

include._name restriction

PUT index_name/_settings
{
  "index" : {
    "routing" : {
      "allocation" : {
        "include" : {
          "_name" : ""
        }
      }
    }
  }
}

total_shards_per_node is too small, so the shards cannot all be allocated across the nodes

# adjust this value to match the number of nodes and shards
PUT index_name/_settings
{
  "index" : {
    "routing" : {
      "allocation" : {
        "total_shards_per_node" : "3"
      }
    }
  }
}
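A rough way to pick a workable total_shards_per_node is to divide the total number of shard copies by the number of data nodes and round up. The Python sketch below illustrates that arithmetic; `min_total_shards_per_node` is a hypothetical helper, and it assumes shards may be placed on any data node (no zone, IP or name filters).

```python
import math


def min_total_shards_per_node(primaries: int, replicas: int, data_nodes: int) -> int:
    """Smallest per-node limit that still lets every shard copy be allocated."""
    if data_nodes < 1:
        raise ValueError("at least one data node is required")
    copies = primaries * (1 + replicas)  # primaries plus all replica copies
    return math.ceil(copies / data_nodes)


# 6 primaries with 1 replica each on 4 data nodes: 12 copies, so at least 3 per node.
print(min_total_shards_per_node(6, 1, 4))
```

Any value below this minimum leaves some shards unassigned; a slightly larger value leaves headroom for a node failure.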

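Too few data nodes, so primary and replica shards cannot both be allocated

Case study:

A user received an alert: the cluster is yellow.

Running GET _cluster/allocation/explain on the cluster returned the result below: a replica cannot be placed on the same node as the primary of the same shard, which suggests there are not enough nodes.

"explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[id5_equip_v1][1], node[Htz1RdjST-2gdvkti0xfCQ], [P], s[STARTED], a[id=NaGQ1AraS_SQwUKb-1j2xw]]"

GET _cat/nodes?v then showed that the cluster had only one data node and 3 master nodes. The cluster was therefore yellow because it had too few data nodes while the index was configured with a non-zero replica count.

Solution 1:
Add data nodes so that the replica shards can be allocated.
Solution 2:
If the user can accept it, set the replica count to 0. Test clusters can generally run with 0 replicas; production clusters should keep 1 replica.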
PUT index_name/_settings
{
  "index": {
    "number_of_replicas":"0"
  }
}
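The reasoning in this case study reduces to a simple rule: each copy of a shard needs its own data node, so an index can have at most (data nodes - 1) assigned replicas. The Python sketch below illustrates the rule; both helpers are hypothetical and ignore every other allocation constraint (disk watermarks, filters, per-node shard limits).

```python
def max_assignable_replicas(data_nodes: int) -> int:
    """Each shard copy must live on a different data node."""
    return max(data_nodes - 1, 0)


def expected_health(replicas: int, data_nodes: int) -> str:
    """Predict index health from replica count and data node count alone."""
    if data_nodes < 1:
        return "red"  # nowhere to place even the primaries
    return "green" if replicas <= max_assignable_replicas(data_nodes) else "yellow"


print(expected_health(replicas=1, data_nodes=1))  # the situation in the case study
print(expected_health(replicas=0, data_nodes=1))  # after applying solution 2
print(expected_health(replicas=1, data_nodes=2))  # after applying solution 1
```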

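2.3 Handling a red cluster

The author's cluster went red after a disk failure, because some test indices had no replicas.

The options at this point are:

1) Delete the red index

All of its data is lost.

2) Use reroute to allocate an empty primary shard for the red index

Data in the lost shards is gone, but the shards that survived remain usable.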

GET _cluster/allocation/explain
out:
{
  "index" : "dmesg001-2020.07.15-000003",
  "shard" : 1,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2020-07-15T15:27:55.563Z",
    "details" : "node_left [J2H0nVv9T-uH-ZymDUJ2yQ]",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster",
  ......
}

For shards that failed to allocate, first try reroute with the retry_failed parameter, or try _flush. If neither repairs the index, allocate an empty primary shard.
POST /_cluster/reroute?retry_failed=true

POST red_index_name/_flush

POST _cluster/reroute
{
  "commands" : [
    {
      "allocate_empty_primary" : {
        "index" : "dmesg001-2020.07.15-000003",
        "shard" : 1,
        "node" : "node-3",
        "accept_data_loss" : true
      }
    }
  ]
}
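Because allocate_empty_primary discards data, it helps to build the command only from an explain result that confirms no valid copy remains. The Python sketch below shows one way to do that; `empty_primary_command` is a hypothetical helper, and the target node name is an input you must choose yourself.

```python
import json


def empty_primary_command(explain: dict, node: str) -> dict:
    """Build a reroute body, but only for a primary with no valid shard copy."""
    if not explain.get("primary"):
        raise ValueError("refusing: shard is not a primary")
    if explain.get("can_allocate") != "no_valid_shard_copy":
        raise ValueError("refusing: a valid copy may still exist, try retry_failed first")
    return {
        "commands": [
            {
                "allocate_empty_primary": {
                    "index": explain["index"],
                    "shard": explain["shard"],
                    "node": node,
                    "accept_data_loss": True,  # explicit acknowledgement of data loss
                }
            }
        ]
    }


explain_result = {
    "index": "dmesg001-2020.07.15-000003",
    "shard": 1,
    "primary": True,
    "can_allocate": "no_valid_shard_copy",
}
print(json.dumps(empty_primary_command(explain_result, "node-3"), indent=2))
```

The guard clauses make the destructive step fail loudly when the explain output does not match the situation described in this section.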

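2.4 Common ways to speed up consumption

Increase the number of shards

When Logstash has ample resources, too few shards can become a write bottleneck, so increase the shard count moderately.

Increase the number of Logstash instances

When there are plenty of shards but few Logstash instances, consumption can become the bottleneck. Adding Logstash instances raises the consumption rate.

Increase the number of Kafka topic partitions

When both the shard count and the Logstash count are sufficient, too few topic partitions still cap the consumption rate, because within a consumer group each partition is read by at most one consumer. In that case, increase the topic's partition count.

Write indices to SSD nodes

A cluster of some scale can be split into hot, stale and freeze nodes: hot nodes use SSDs, stale nodes use ordinary spinning disks, and freeze nodes use the slowest spinning disks.

Important indices are generally written straight to hot nodes (for the fastest write rate), ordinary indices are written to stale nodes, and frozen indices are kept on freeze nodes. When an index on stale nodes cannot keep up with its write load, roll it over with settings that allocate the new index to hot nodes to raise its write rate.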

POST a01_logtail_write/_rollover
{
  "settings": {
    "index": {
      "number_of_shards": "6",
      "routing": {
        "allocation": {
          "require": {
            "zone": "hot"
          }
        }
      }
    }
  }
}

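Remove replicas

Removing replicas noticeably raises the write rate, but a node failure may then lose data. When the data is not critical, replicas can be dropped in an emergency to raise the write rate quickly, then restored once the consumer lag is gone.

Use hangout instead of Logstash

With the same resources, hangout consumes noticeably faster than Logstash, typically 2-3 times as fast.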
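The Kafka partition point above can be made concrete with a little arithmetic: consumers beyond the partition count sit idle. The Python sketch below is illustrative only; the helper names are hypothetical, and `threads_per_instance` stands for the consumer_threads option of the Logstash kafka input, assuming all instances share one consumer group.

```python
def active_consumers(partitions: int, instances: int, threads_per_instance: int) -> int:
    """Consumer threads that actually receive data from the topic."""
    # Each partition is assigned to at most one consumer in the group.
    return min(partitions, instances * threads_per_instance)


def idle_consumers(partitions: int, instances: int, threads_per_instance: int) -> int:
    """Consumer threads left without a partition."""
    total = instances * threads_per_instance
    return total - active_consumers(partitions, instances, threads_per_instance)


# 3 Logstash instances with 2 threads each against a 4-partition topic.
print(active_consumers(4, 3, 2), idle_consumers(4, 3, 2))
```

If the idle count is above zero, adding Logstash instances will not help until the topic has more partitions.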

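2.5 Common questions

How to change the shard count of an existing index

Normally the shard count cannot be changed once the index exists. For rollover indices or date-based indices, update the index template instead, so that the next index created picks up the new shard count automatically.

After creating a template like the one below, just update number_of_shards in its settings: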

PUT _template/index_03
{
  "index_patterns": ["index03-*"],
  "settings": {
    "number_of_shards": 2
  }
}

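Common commands for checking cluster status

GET _cluster/allocation/explain
GET _cat/health?v
GET _cat/indices?v&health=yellow
GET _cat/shards?h=index,shard,prirep,state,unassigned.reason
GET index_name/_settings
Without a request body, explain does not distinguish primaries from replicas and reports on only one shard. When several shards are unhealthy, first use _cat/shards?v to find the unassigned primary shards, then call explain with a body to see the specific reason: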
GET _cluster/allocation/explain
{
 "index": "my-index-000001",
 "shard": 0,
 "primary": true
}
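The two-step workflow above (list shards, then explain each unassigned primary) can be scripted. The Python sketch below parses the text returned by GET _cat/shards?h=index,shard,prirep,state,unassigned.reason and builds one explain body per unassigned primary. `unassigned_primaries` is a hypothetical helper, and the sample text is made up for illustration.

```python
def unassigned_primaries(cat_shards_text: str) -> list:
    """Return explain request bodies for every unassigned primary shard."""
    bodies = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        index, shard, prirep, state = fields[:4]
        # prirep is "p" for primaries and "r" for replicas.
        if prirep == "p" and state == "UNASSIGNED":
            bodies.append({"index": index, "shard": int(shard), "primary": True})
    return bodies


sample = """\
dmesg001-2020.07.15-000003 1 p UNASSIGNED NODE_LEFT
dmesg001-2020.07.15-000003 0 p STARTED
index01 0 r UNASSIGNED INDEX_CREATED
"""
for body in unassigned_primaries(sample):
    print(body)
```

Each printed body can be sent with GET _cluster/allocation/explain to get the reason for that specific shard.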

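Querying and updating with curl

Cluster without authentication: curl h01:9201/_cat/indices
Cluster with authentication: curl -u elastic:elastic [-XGET may be omitted] h01:9204/_cat/indices ; the -u option supplies the user credentials
PUT data with curl: curl -u elastic:elastic -H "Content-Type: application/json" -XPUT h01:9204/test1122/_settings -d '{"index": { "number_of_replicas": "0"}}'
-XPUT is mandatory here, and so is -H "Content-Type: application/json"; without the header the request fails with the following 406 error: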
{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}
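The same request can be built from a script. The following Python sketch uses only the standard library and constructs the PUT request without sending it; `build_settings_request` is a hypothetical helper, and plain HTTP is assumed because the curl commands above use no scheme.

```python
import base64
import json
from urllib.request import Request


def build_settings_request(host: str, index: str, settings: dict,
                           user: str, password: str) -> Request:
    """Build a PUT <index>/_settings request with JSON body and basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        url=f"http://{host}/{index}/_settings",
        data=json.dumps(settings).encode(),
        headers={
            # Without this header Elasticsearch answers with the 406 error.
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="PUT",
    )


req = build_settings_request(
    "h01:9204", "test1122", {"index": {"number_of_replicas": "0"}}, "elastic", "elastic"
)
print(req.get_method(), req.full_url)
```

Passing the object to urllib.request.urlopen would send it; that step is left out so the sketch runs without a cluster.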
