
Installing an ELK Log Analysis System on CentOS 7

1. ELK Overview

Put simply, ELK is a combination of three open-source tools, Elasticsearch, Logstash, and Kibana, each covering a different part of the job. ELK is also known as the ELK Stack; the official domain is elastic.co. Its main strengths are:

Flexible processing: Elasticsearch provides real-time full-text indexing with powerful search.

Relatively simple configuration: Elasticsearch exposes JSON interfaces throughout, Logstash uses modular configuration, and Kibana's configuration file is simpler still.

Efficient retrieval: thanks to a well-designed engine, even real-time queries over tens of billions of records can respond within seconds.

Linear cluster scaling: both Elasticsearch and Logstash scale out flexibly.

Polished front end: Kibana's UI is attractive and easy to operate.

What is Elasticsearch?

A highly scalable open-source full-text search and analytics engine. It offers real-time full-text search, high availability through distribution, and an API, and can handle large volumes of log data from sources such as nginx, Tomcat, and system logs.

Architecture: two ES nodes form a cluster, logs are collected with logstash or filebeat, redis acts as the message queue, and the results are displayed in kibana.

What is Logstash?

It collects and forwards logs through plugins, supports log filtering, and can parse both plain-text logs and custom JSON-formatted logs.

What is Kibana?

It mainly pulls data from Elasticsearch through its API and renders it as front-end visualizations.

2. Overall Architecture

This walkthrough uses two CentOS 7 servers (192.168.0.21 and 192.168.0.22), with all components at version 5.6.1. The overall architecture is shown below.

[Architecture diagram: filebeat collects the logs, redis queues them, logstash processes them into the two-node elasticsearch cluster, and kibana displays them]

3. Preparation

1. Download version 5.6.1 of filebeat, logstash, elasticsearch, and kibana from https://www.elastic.co/downloads

2. Install redis on 192.168.0.22 (see my separate redis installation guide).

4. Server Initialization

Disable the firewall and SELinux:

systemctl stop firewalld
systemctl disable firewalld
systemctl stop NetworkManager
systemctl disable NetworkManager
setenforce 0
sed -i s/SELINUX=enforcing/SELINUX=disabled/g /etc/selinux/config
# raise the open-file limit
echo "* - nofile 265536">> /etc/security/limits.conf
           

5. Installing the Elasticsearch Cluster

1) Install the Java environment first

After installing Java, symlink the binary into /usr/bin/:

ln -s /usr/java/jdk1.8.0_91/bin/java /usr/bin/java

Otherwise elasticsearch fails to start with the error:

elasticsearch: which: no java in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)
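A quick check that the symlinked binary resolves:

java -version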

2) yum -y install elasticsearch-5.6.1.rpm

3) Edit the configuration file

On the first node:

vim /etc/elasticsearch/elasticsearch.yml
cluster.name: es-cluster   # cluster name
node.name: es-1        # node name
path.data: /data/esdata    # ES data directory
path.logs: /var/log/elasticsearch   # ES log directory
bootstrap.memory_lock: true     # keep the heap out of swap; enable when RAM is plentiful
network.host: 192.168.0.21       # listen address of this host
http.port: 9200                 # listen port
discovery.zen.ping.unicast.hosts: ["192.168.0.21", "192.168.0.22"]   # unicast discovery avoids broadcast storms; required on public networks
           

The second node's config differs only in node.name: es-2 and network.host: 192.168.0.22. Then create the directories on both nodes:

mkdir -p /data/esdata
mkdir -p /var/log/elasticsearch
cd /data/
chown elasticsearch:elasticsearch esdata
           

If the server has ample memory, it is worth tuning the ES JVM heap in /etc/elasticsearch/jvm.options. Set the minimum and maximum heap sizes to the same value; the official docs recommend keeping the maximum under about 30 GB.
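For instance, on a host with 16 GB of RAM the heap might be set like this (illustrative values, not from the original setup):

-Xms8g
-Xmx8g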

Start ES:

/etc/init.d/elasticsearch start
           

elasticsearch refuses to start; the log shows an error (about memory locking, judging from the fix below):

[Screenshot: elasticsearch startup error in the log]

The fix is documented at https://www.elastic.co/guide/en/elasticsearch/reference/5.6/setting-system-settings.html

Apply it as follows:

systemctl edit elasticsearch
# add the following two lines to the override file
[Service]
LimitMEMLOCK=infinity
# reload systemd so the change takes effect
systemctl daemon-reload
# start again; this time it comes up normally
/etc/init.d/elasticsearch start
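Once both nodes are up, the cluster can be verified over the standard HTTP API:

curl http://192.168.0.21:9200/_cat/nodes?v
curl http://192.168.0.21:9200/_cluster/health?pretty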
           

4) Installing the head plugin for elasticsearch

Plugins add functionality. Some are provided officially (most of those are paid), and others come from the community; they can monitor cluster state and manage configuration. head is one of the community plugins.

Installing the 5.x version of the head plugin:

From elasticsearch 5.x onward, head can no longer be installed directly as a plugin; it has to run as a standalone service. Git repo: https://github.com/mobz/elasticsearch-head

yum -y install npm
# npm (Node Package Manager) ships with NodeJS; it is the package manager JavaScript developers use to download, install, publish, and manage packages.
git clone git://github.com/mobz/elasticsearch-head.git
cd elasticsearch-head
npm install grunt --save
ll node_modules/grunt   # confirm the files were generated
npm install             # run the installation
           
npm run start &   # start the service in the background
           

Modify the elasticsearch configuration:

Enable cross-origin (CORS) access, then restart the elasticsearch service:

vim /etc/elasticsearch/elasticsearch.yml
# append the following two lines at the bottom
http.cors.enabled: true
http.cors.allow-origin: "*"
# restart elasticsearch: /etc/init.d/elasticsearch restart
           

Open http://192.168.0.21:9100/ in a browser.

Enter the address of one of the ES nodes in the input box and click "Connect" to see the cluster information.


5) Monitoring the ES cluster with a Python script

import json
import subprocess

# query cluster health from one of the ES nodes
obj = subprocess.Popen(
    "curl -sXGET http://192.168.0.21:9200/_cluster/health?pretty=true",
    shell=True, stdout=subprocess.PIPE)
data = obj.stdout.read()

# parse the JSON response rather than eval()-ing it
health = json.loads(data)
status = health.get("status")

# print 50 when the cluster is green, 100 otherwise
if status == "green":
    print("50")
else:
    print("100")
           

6. Installing Filebeat

Logs can be collected with an agent, or even pushed via syslog straight to a log-filtering or log-storage host. For agent-based collection, logstash or filebeat are the recommended choices.

Comparing the two: logstash combines log collection and log processing in one process, but a running logstash instance consumes several hundred MB of memory.

filebeat is a lightweight log collector released by Elastic, written in Go. It uses far fewer resources than logstash, only a few tens of MB of memory in normal operation, and is meant for collecting logs on servers that have no Java installed. Its drawback is that it lacks logstash's filter plugin functionality.

The idea: do no processing at the collection stage, only ship the logs, keeping the load on the client servers low; all processing happens centrally in logstash. Hence the architecture introduced above.

yum -y install filebeat-5.6.1-x86_64.rpm
           

filebeat is likewise split into input and output. After consulting the official docs at https://www.elastic.co/guide/en/beats/filebeat/5.6/configuration-filebeat-options.html, the configuration files come out as follows.

On 192.168.0.21, /etc/filebeat/filebeat.yml:

filebeat.prospectors:
- input_type: log
  paths:
    - /data/access.log        # log file to ship
  exclude_lines: ["^DBG","^$"]
  document_type: nginx-accesslog-021

output.redis:
  hosts: ["192.168.0.22"]
  port: "6379"
  datatype: "list"
  password: "123456"
  key: "nginx-accesslog-021"  # a custom key name makes downstream processing easier
  db: 0
  timeout: 5
           

On 192.168.0.22, /etc/filebeat/filebeat.yml:

filebeat.prospectors:
- input_type: log
  paths:
    - /data/access.log        # log file to ship
  exclude_lines: ["^DBG","^$"]
  document_type: nginx-accesslog-022

output.redis:
  hosts: ["192.168.0.22"]
  port: "6379"
  password: "123456"
  key: "nginx-accesslog-022"  # a custom key name makes downstream processing easier
  db: 1
  timeout: 5
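Before starting filebeat, it can be worth letting it validate the YAML; with the 5.x RPM layout, the check would look roughly like this (-configtest is the 5.x flag):

/usr/share/filebeat/bin/filebeat -configtest -c /etc/filebeat/filebeat.yml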
           

7. Installing Logstash

yum -y install logstash-5.6.1.rpm
           

We'll start by using logstash to process logs that are not in JSON format.

logstash filters and processes logs with its filter plugins; here we mainly use grok and geoip, though others such as mutate and date exist.

For non-JSON logs, logstash extracts each field by matching regular expressions.

logstash ships with a large set of predefined patterns; on an RPM install they live under a directory like /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-*/patterns.

For example, a logstash filter that parses Apache logs:

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
}
           

Logstash's built-in patterns are defined by nesting, one pattern referencing another; COMBINEDAPACHELOG, for example, builds on COMMONAPACHELOG.
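For reference, the relevant definitions in logstash-patterns-core read roughly like this (quoted from the 5.x-era pattern files from memory, so verify against your local copy):

COMMONAPACHELOG %{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}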

Another example:

%{IP:client} means: match the log content against the IP pattern and store whatever matches under the key client.

First, a look at the default nginx log format:

192.168.0.200 - - [24/Dec/2018:14:38:14 +0800] "GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
           

The grok debugger can generate the pattern for us: http://grokdebug.herokuapp.com/discover?#

Paste the nginx log line above into the input box and click discover; the generated pattern turns out to be the same as Apache's, COMBINEDAPACHELOG.


Create/modify the logstash config /etc/logstash/conf.d/nginx-access.conf with the following content:

input {
  redis {
    data_type => "list"
    host => "192.168.0.22"
    db => "0"
    port => "6379"
    key => "nginx-accesslog-021"
    password => "123456"
  }
}

input {
  redis {
    data_type => "list"
    host => "192.168.0.22"
    db => "1"
    port => "6379"
    key => "nginx-accesslog-022"
    password => "123456"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  if [type] == "nginx-accesslog-021" {
    elasticsearch {
      hosts => ["192.168.0.21:9200"]
      index => "logstash-nginx-accesslog-021-%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
  }

  if [type] == "nginx-accesslog-022" {
    elasticsearch {
      hosts => ["192.168.0.21:9200"]
      index => "logstash-nginx-accesslog-022-%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
  }
}
           

Two input blocks pull the collected logs out of the different redis DBs and keys, the grok filter parses them, and the output routes each type into its own ES index. stdout is added for easier debugging.
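logstash can also syntax-check the file before a real run; -t (short for --config.test_and_exit) parses the config and exits:

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/nginx-access.conf -t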

8. Installing Kibana

yum -y install kibana-5.6.1-x86_64.rpm

Edit the kibana config /etc/kibana/kibana.yml:

server.port: 5601
server.host: "192.168.0.21"
elasticsearch.url: "http://192.168.0.21:9200"
           

9. Testing

1) Start filebeat and redis first

/etc/init.d/filebeat start
./redis-server ../conf/redis.conf
           

2) Generate nginx logs

For convenience, echo two nginx log lines each into the nginx log files on .21 and .22:

echo '223.99.202.178 - - [24/Dec/2018:11:39:07 +0800] "POST /user/doLogin HTTP/1.1" 200 111 "https://www.hehehe.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"' >> /data/access.log
           

3) Check that the logs reached redis

192.168.0.22:6379> select 0
OK
192.168.0.22:6379> keys *
1) "nginx-accesslog-021"
192.168.0.22:6379> llen nginx-accesslog-021
(integer) 2
192.168.0.22:6379> select 1
OK
192.168.0.22:6379[1]> keys *
1) "nginx-accesslog-022"
192.168.0.22:6379[1]> llen nginx-accesslog-022
(integer) 2
192.168.0.22:6379[1]> 
           

The nginx logs from both servers have arrived in redis.

4) Start logstash in debug mode and watch the logs being processed

[root@localhost elasticsearch-head]# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/nginx-access.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path //usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
{
        "request" => "/user/doLogin",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
         "offset" => 3642,
           "auth" => "-",
          "ident" => "-",
     "input_type" => "log",
           "verb" => "POST",
         "source" => "/data/access.log",
        "message" => "223.99.202.178 - - [24/Dec/2018:11:39:07 +0800] \"POST /user/doLogin HTTP/1.1\" 200 111 \"https://www.hehehe.com/\" \"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
           "type" => "nginx-accesslog-022",
       "referrer" => "\"https://www.hehehe.com/\"",
     "@timestamp" => 2018-12-27T05:56:15.319Z,
       "response" => "200",
          "bytes" => "111",
       "clientip" => "223.99.202.178",
           "beat" => {
            "name" => "localhost.localdomain",
        "hostname" => "localhost.localdomain",
         "version" => "5.6.1"
    },
       "@version" => "1",
    "httpversion" => "1.1",
      "timestamp" => "24/Dec/2018:11:39:07 +0800"
}
{
        "request" => "/user/doLogin",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
         "offset" => 3870,
           "auth" => "-",
          "ident" => "-",
     "input_type" => "log",
           "verb" => "POST",
         "source" => "/data/access.log",
        "message" => "223.99.202.178 - - [24/Dec/2018:11:39:07 +0800] \"POST /user/doLogin HTTP/1.1\" 200 111 \"https://www.hehehe.com/\" \"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
           "type" => "nginx-accesslog-022",
       "referrer" => "\"https://www.hehehe.com/\"",
     "@timestamp" => 2018-12-27T05:56:15.319Z,
       "response" => "200",
          "bytes" => "111",
       "clientip" => "223.99.202.178",
           "beat" => {
            "name" => "localhost.localdomain",
        "hostname" => "localhost.localdomain",
         "version" => "5.6.1"
    },
       "@version" => "1",
    "httpversion" => "1.1",
      "timestamp" => "24/Dec/2018:11:39:07 +0800"
}
{
        "request" => "/user/doLogin",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
         "offset" => 21195,
           "auth" => "-",
          "ident" => "-",
     "input_type" => "log",
           "verb" => "POST",
         "source" => "/data/access.log",
        "message" => "223.99.202.178 - - [24/Dec/2018:11:39:07 +0800] \"POST /user/doLogin HTTP/1.1\" 200 111 \"https://www.hehehe.com/\" \"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
           "type" => "nginx-accesslog-021",
       "referrer" => "\"https://www.hehehe.com/\"",
     "@timestamp" => 2018-12-27T05:54:18.986Z,
       "response" => "200",
          "bytes" => "111",
       "clientip" => "223.99.202.178",
           "beat" => {
            "name" => "localhost",
        "hostname" => "localhost",
         "version" => "5.6.1"
    },
       "@version" => "1",
    "httpversion" => "1.1",
      "timestamp" => "24/Dec/2018:11:39:07 +0800"
}
{
        "request" => "/user/doLogin",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
         "offset" => 21423,
           "auth" => "-",
          "ident" => "-",
     "input_type" => "log",
           "verb" => "POST",
         "source" => "/data/access.log",
        "message" => "223.99.202.178 - - [24/Dec/2018:11:39:07 +0800] \"POST /user/doLogin HTTP/1.1\" 200 111 \"https://www.hehehe.com/\" \"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
           "type" => "nginx-accesslog-021",
       "referrer" => "\"https://www.hehehe.com/\"",
     "@timestamp" => 2018-12-27T05:54:18.986Z,
       "response" => "200",
          "bytes" => "111",
       "clientip" => "223.99.202.178",
           "beat" => {
            "name" => "localhost",
        "hostname" => "localhost",
         "version" => "5.6.1"
    },
       "@version" => "1",
    "httpversion" => "1.1",
      "timestamp" => "24/Dec/2018:11:39:07 +0800"
}
           

logstash has pulled the data from redis; checking redis again:

192.168.0.22:6379[1]> select 0
OK
192.168.0.22:6379> keys *
(empty list or set)
192.168.0.22:6379> select 1
OK
192.168.0.22:6379[1]> keys *
(empty list or set)
192.168.0.22:6379[1]> 
           

As expected, logstash has drained the data from redis.

5) Check the data in ES

The elasticsearch cluster in head now looks like this:

[Screenshot: the head UI showing the two newly created indices]

Two indices have been created; the data has arrived in ES.
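The same can be confirmed from the command line:

curl http://192.168.0.21:9200/_cat/indices?v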

6) Start kibana and view the data

/etc/init.d/kibana start
           

Open kibana in a browser: http://192.168.0.21:5601

In the index pattern box, enter a pattern matching the nginx index for 192.168.0.21 (logstash-nginx-accesslog-021-*).


kibana now shows the logs we just collected.

10. Displaying client IP locations with the GeoIP plugin

The nginx access log collected through logstash already contains the client IP, but the raw IP alone is not enough: showing the geographic origin of requests in kibana requires the GeoIP database. GeoIP is the most widely used free IP geolocation database (paid editions also exist). Given an address, it returns the corresponding region information, country, province/city, latitude and longitude, and so on, which is very useful for map visualizations and per-region statistics.

1) Download and extract the GeoIP database file

logstash 2.x used:

http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz

For logstash 5.x this changed to:

https://geolite.maxmind.com/download/geoip/database/GeoLite2-City.tar.gz

cd /etc/logstash/
wget https://geolite.maxmind.com/download/geoip/database/GeoLite2-City.tar.gz
gunzip GeoLite2-City.tar.gz
tar -xvf GeoLite2-City.tar
           

Modify /etc/logstash/conf.d/nginx-access.conf to use the GeoIP plugin, and remove the message field with the mutate plugin (it merely duplicates the parsed fields):

input {
  redis {
    data_type => "list"
    host => "192.168.0.22"
    db => "0"
    port => "6379"
    key => "nginx-accesslog-021"
    password => "123456"
  }
}

input {
  redis {
    data_type => "list"
    host => "192.168.0.22"
    db => "1"
    port => "6379"
    key => "nginx-accesslog-022"
    password => "123456"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }

  if [clientip] !~ "^127\.|^192\.168\.|^172\.1[6-9]\.|^172\.2[0-9]\.|^172\.3[01]\.|^10\." {   # skip private addresses
    geoip {
      source => "clientip"   # field containing the IP to look up
      target => "geoip"      # store the geoip data under this field
      database => "/etc/logstash/GeoLite2-City_20181218/GeoLite2-City.mmdb"  # path to the GeoIP database
    }
  }

  mutate {
    remove_field => "message"
  }
}

output {
  if [type] == "nginx-accesslog-021" {
    elasticsearch {
      hosts => ["192.168.0.21:9200"]
      index => "logstash-nginx-accesslog-021-%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
  }

  if [type] == "nginx-accesslog-022" {
    elasticsearch {
      hosts => ["192.168.0.21:9200"]
      index => "logstash-nginx-accesslog-022-%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
  }
}
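A side note on index naming: the logstash- prefix in the index names is deliberate. logstash's default index template only applies to indices matching logstash-*, and (to the best of my knowledge) it is that template which maps geoip.location as a geo_point, the field type kibana's tile map requires; dropping the prefix would break the map visualization.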
           

With the configuration in place, echo two more log lines into /data/access.log on the first server, one with a public source IP and one with a private one:

echo '223.99.202.178 - - [24/Dec/2018:11:39:07 +0800] "POST /user/doLogin HTTP/1.1" 200 111 "https://www.hehehe.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"' >> /data/access.log
echo '192.168.1.210 - - [24/Dec/2018:11:39:07 +0800] "POST /user/doLogin HTTP/1.1" 200 111 "https://www.hehehe.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"' >> /data/access.log
           

Start logstash in debug mode again:

[root@localhost elasticsearch-head]# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/nginx-access.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path //usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
{
        "request" => "/user/doLogin",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
          "geoip" => {
             "city_name" => "Jinan",
              "timezone" => "Asia/Shanghai",
                    "ip" => "223.99.202.178",
              "latitude" => 36.6683,
          "country_name" => "China",
         "country_code2" => "CN",
        "continent_code" => "AS",
         "country_code3" => "CN",
           "region_name" => "Shandong",
              "location" => {
            "lon" => 116.9972,
            "lat" => 36.6683
        },
           "region_code" => "SD",
             "longitude" => 116.9972
    },
         "offset" => 22106,
           "auth" => "-",
          "ident" => "-",
     "input_type" => "log",
           "verb" => "POST",
         "source" => "/data/access.log",
           "type" => "nginx-accesslog-021",
       "referrer" => "\"https://www.hehehe.com/\"",
     "@timestamp" => 2018-12-27T07:01:09.563Z,
       "response" => "200",
          "bytes" => "111",
       "clientip" => "223.99.202.178",
           "beat" => {
            "name" => "localhost",
        "hostname" => "localhost",
         "version" => "5.6.1"
    },
       "@version" => "1",
    "httpversion" => "1.1",
      "timestamp" => "24/Dec/2018:11:39:07 +0800"
}
{
        "request" => "/user/doLogin",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36\"",
         "offset" => 22333,
           "auth" => "-",
          "ident" => "-",
     "input_type" => "log",
           "verb" => "POST",
         "source" => "/data/access.log",
           "type" => "nginx-accesslog-021",
       "referrer" => "\"https://www.hehehe.com/\"",
     "@timestamp" => 2018-12-27T07:01:09.563Z,
       "response" => "200",
          "bytes" => "111",
       "clientip" => "192.168.1.210",
           "beat" => {
            "name" => "localhost",
        "hostname" => "localhost",
         "version" => "5.6.1"
    },
       "@version" => "1",
    "httpversion" => "1.1",
      "timestamp" => "24/Dec/2018:11:39:07 +0800"
}
           

The event with the public address carries a geoip field containing the source IP's continent, country, province, city, latitude/longitude, and so on; the private address gets no geoip field. Both events have had their message field removed, confirming the logstash configuration works as intended.

Then display the public IP in kibana (only a single record here; with a higher volume of production logs the effect is much better):

[Screenshot: kibana map visualization of the client IP's location]

With that, ELK collection of non-JSON logs is complete.

11. Collecting nginx logs in JSON format

The reason for writing logs as JSON before collecting them is that JSON needs no regex matching, saving logstash the time spent on pattern matching and reducing the hardware load on the logstash server. nginx can write its logs in JSON, so it is best to configure it that way before collection.

Add a log format definition like the following to the nginx configuration:

log_format access_json '{"clientip":"$remote_addr","timestamp":"$time_local","host":"$server_addr","request":"$request","size":"$body_bytes_sent","responsetime":"$request_time","upstreamaddr":"$upstream_addr","upstreamtime":"$upstream_response_time","http_host":"$http_host","url":"$uri","xff":"$http_x_forwarded_for","referer":"$http_referer","status":"$status","useragent":"$http_user_agent"}';
           

Then reference this log format:

access_log  /data/access.log  access_json;
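It is worth validating the nginx configuration before reloading:

/usr/local/nginx/sbin/nginx -t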
           

Then reload nginx:

/usr/local/nginx/sbin/nginx -s reload
           

Parameter details:

Parameter                 Description                                                  Example
$remote_addr              client address                                               211.28.65.253
$time_local               access time and timezone                                     27/Dec/2018:17:07:47 +0800
$server_addr              address of the server that handled the request               192.168.0.21
$request                  request URI and HTTP protocol                                "GET /index.html HTTP/1.1"
$body_bytes_sent          size of the body sent to the client                          612
$request_time             total time for the whole request                             0.205
$upstream_addr            upstream address, i.e. the host that actually served it      10.10.10.100:80
$upstream_response_time   upstream response time for the request                       0.002
$http_host                requested host, i.e. what was typed in the browser (IP or domain)   www.wang.com 192.168.100.100
$uri                      request URI                                                  /index.html
$http_x_forwarded_for     behind a proxy the web server cannot see the client's real IP, so proxies add an X-Forwarded-For header carrying the connecting client's IP, letting the web server recover the real address
$http_referer             URL the request was referred from                            https://www.baidu.com/
$status                   HTTP status                                                  200
$http_user_agent          client browser/terminal details                              "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
           

The filebeat config /etc/filebeat/filebeat.yml stays the same:

filebeat.prospectors:
- input_type: log
  paths:
    - /data/access.log        # log file to ship
  exclude_lines: ["^DBG","^$"]
  document_type: nginx-accesslog-021

output.redis:
  hosts: ["192.168.0.22"]
  port: "6379"
  datatype: "list"
  password: "123456"
  key: "nginx-accesslog-021"  # a custom key name makes downstream processing easier
  db: 0
  timeout: 5
           

Stop filebeat for now, since we will hand-edit the IP address in the log below.

/etc/init.d/filebeat stop
           

Create a new logstash config /etc/logstash/conf.d/nginx-access-json.conf. The json filter takes over from grok; the geoip block is kept (the debug output further down still shows a geoip field):

input {
  redis {
    data_type => "list"
    host => "192.168.0.22"
    db => "0"
    port => "6379"
    key => "nginx-accesslog-021"
    password => "123456"
  }
}

input {
  redis {
    data_type => "list"
    host => "192.168.0.22"
    db => "1"
    port => "6379"
    key => "nginx-accesslog-022"
    password => "123456"
  }
}

filter {
  json {
    source => "message"
  }

  if [clientip] !~ "^127\.|^192\.168\.|^172\.1[6-9]\.|^172\.2[0-9]\.|^172\.3[01]\.|^10\." {   # skip private addresses
    geoip {
      source => "clientip"
      target => "geoip"
      database => "/etc/logstash/GeoLite2-City_20181218/GeoLite2-City.mmdb"
    }
  }

  mutate {
    remove_field => "message"
  }
}

output {
  if [type] == "nginx-accesslog-021" {
    elasticsearch {
      hosts => ["192.168.0.21:9200"]
      index => "logstash-nginx-accesslog-021-%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
  }

  if [type] == "nginx-accesslog-022" {
    elasticsearch {
      hosts => ["192.168.0.21:9200"]
      index => "logstash-nginx-accesslog-022-%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
  }
}
           

Request nginx's index.html from a browser to generate a fresh line in /data/access.log, then manually change that line's IP to a public address, like this:

{"clientip":"106.9.119.46","timestamp":"28/Dec/2018:15:48:22 +0800","host":"192.168.0.21","request":"GET /favicon.ico HTTP/1.1","size":"572","responsetime":"0.000","upstreamaddr":"-","upstreamtime":"-","http_host":"abc.def.com","url":"/favicon.ico","xff":"-","referer":"http://abc.def.com/index.html","status":"200","useragent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}
           

Start filebeat:

/etc/init.d/filebeat start
           

Start logstash in debug mode:

[root@localhost elasticsearch-head]# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/nginx-access-json.conf
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path //usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
{
         "request" => "GET /favicon.ico HTTP/1.1",
    "upstreamaddr" => "-",
         "referer" => "http://abc.def.com/index.html",
           "geoip" => {
                    "ip" => "106.9.119.46",
              "latitude" => 34.7725,
          "country_name" => "China",
         "country_code2" => "CN",
        "continent_code" => "AS",
         "country_code3" => "CN",
              "location" => {
            "lon" => 113.7266,
            "lat" => 34.7725
        },
             "longitude" => 113.7266
    },
          "offset" => 40236,
      "input_type" => "log",
       "useragent" => "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
          "source" => "/data/access.log",
            "type" => "nginx-accesslog-021",
       "http_host" => "abc.def.com",
             "url" => "/favicon.ico",
      "@timestamp" => 2018-12-28T07:58:33.594Z,
            "size" => "572",
        "clientip" => "106.9.119.46",
            "beat" => {
            "name" => "localhost",
        "hostname" => "localhost",
         "version" => "5.6.1"
    },
        "@version" => "1",
            "host" => "192.168.0.21",
    "responsetime" => "0.000",
             "xff" => "-",
    "upstreamtime" => "-",
       "timestamp" => "28/Dec/2018:15:48:22 +0800",
          "status" => "200"
}
           

From here on, presenting the data in kibana works just like before.
