Performance analysis in Go with pprof and go-torch

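In software development, going live is not the end of the road. Once a program is in production you still need to sample how it behaves at runtime and refactor existing features so that it runs faster and more stably. Go ships pprof with its toolchain, which makes it much easier to find the parts of a program that use the most memory and CPU. Combined with Uber's flame graph tool, which presents the results visually, the analysis becomes simpler and clearer.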

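pprof comes in two packages for analyzing a program: net/http/pprof and runtime/pprof. net/http/pprof is simply a wrapper around runtime/pprof that exposes the profiles over HTTP, as its source code shows.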

Analyzing a web service with net/http/pprof

Profiling a web project with pprof is very simple: you only need to import the package.

_ "net/http/pprof"

Write a small web server:

package main

import (
    "fmt"
    "math/rand"
    "net/http"
    _ "net/http/pprof"
    "time"
)

var Count int64 = 0

func main() {
    go calCount()
    http.HandleFunc("/test", test)
    http.HandleFunc("/data", handlerData)
    err := http.ListenAndServe(":9909", nil)
    if err != nil {
        panic(err)
    }
}

func handlerData(w http.ResponseWriter, r *http.Request) {
    qUrl := r.URL
    fmt.Println(qUrl)
    fibRev := Fib()
    var fib uint64
    for i := 0; i < 5000; i++ {
        fib = fibRev()
        fmt.Println("fib = ", fib)
    }
    str := RandomStr(RandomInt(100, 500))
    str = fmt.Sprintf("Fib = %d; String = %s", fib, str)
    w.Write([]byte(str))
}

func test(w http.ResponseWriter, r *http.Request) {
    fibRev := Fib()
    var fib uint64
    index := Count
    arr := make([]uint64, index)
    var i int64
    for ; i < index; i++ {
        fib = fibRev()
        arr[i] = fib
        fmt.Println("fib = ", fib)
    }
    time.Sleep(time.Millisecond * 500)
    str := fmt.Sprintf("Fib = %v", arr)
    w.Write([]byte(str))
}

func Fib() func() uint64 {
    var x, y uint64 = 0, 1
    return func() uint64 {
        x, y = y, x+y
        return x
    }
}

var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890")

func RandomStr(num int) string {
    seed := time.Now().UnixNano()
    if seed <= 0 {
        seed = time.Now().UnixNano()
    }
    rand.Seed(seed)
    b := make([]rune, num)
    for i := range b {
        b[i] = letterRunes[rand.Intn(len(letterRunes))]
    }
    return string(b)
}

func RandomInt(min, max int) int {
    rand.Seed(time.Now().UnixNano())
    return rand.Intn(max-min+1) + min
}

func calCount() {
    timeInterval := time.Tick(time.Second)
    for {
        select {
        case i :=

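The web service listens on port 9909.

The server exposes two HTTP handlers:
test: runs a Fibonacci calculation whose length depends on the current second
data: runs 5000 Fibonacci iterations and returns a random string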

Run the program and open http://192.168.3.34:9909/debug/pprof/ to see the web view of the available profiles.

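The paths listed there mean the following:

/debug/pprof/profile: visiting this link starts CPU profiling for 30s and then produces a file for download.

/debug/pprof/block: records of goroutine blocking events. Sampling is off by default; call runtime.SetBlockProfileRate to turn it on (a rate of 1 samples every blocking event).

/debug/pprof/goroutine: records of all live goroutines. Sampled once, at the moment the profile is fetched.

/debug/pprof/heap: records of heap memory allocation. By default one sample is taken for every 512 KB allocated.

/debug/pprof/mutex: shows the holders of contended mutexes. Also off by default; enable it with runtime.SetMutexProfileFraction.

/debug/pprof/threadcreate: records of OS thread creation. Sampled once, at the moment the profile is fetched.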

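Beyond these pages, Go provides more convenient ways to analyze a program. Below we use the command line to get at the details.

We use wrk to hit both handlers so that the service is kept busy and the samples are more representative.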

wrk -c 20 -t 5 -d 3m http://192.168.3.34:9909/data
wrk -c 20 -t 5 -d 3m http://192.168.3.34:9909/test

Analyzing CPU usage

Use this command to analyze CPU usage:

go tool pprof httpdemo http://192.168.3.34:9909/debug/pprof/profile

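By default the Go runtime samples CPU usage at 100 Hz, that is, 100 times per second, or once every 10 milliseconds. Why this frequency? 100 Hz is high enough to produce useful data without stalling the program, and 100 is an easy number to convert with, for example when turning a total sample count into samples per second. What is actually sampled is the program counter on the stack of the goroutine that is currently running.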

The default sampling time is 30s; you can change it with the -seconds option. Once sampling finishes, pprof drops into its interactive command line.
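The sampling window can also be requested from the HTTP endpoint itself through its seconds query parameter. The sketch below builds such a URL; cpuProfileURL is a hypothetical helper, and the host is the one used throughout this article.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// cpuProfileURL builds the CPU profile endpoint for base with an explicit
// sampling window in seconds.
func cpuProfileURL(base string, seconds int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/debug/pprof/profile"
	q := url.Values{}
	q.Set("seconds", strconv.Itoa(seconds))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	u, err := cpuProfileURL("http://192.168.3.34:9909", 60)
	if err != nil {
		panic(err)
	}
	fmt.Println(u) // prints http://192.168.3.34:9909/debug/pprof/profile?seconds=60
}
```

The resulting URL can be given to go tool pprof in place of the plain profile URL, or fetched with any HTTP client to save the profile to a file.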

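Type help to see the available commands. Here are a few commonly used ones.

top: by default prints the 10 functions that use the most CPU. You can append a number to change how many are shown.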

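list: prints the annotated source of the functions whose names match the regular expression you give it. A loose pattern such as o matches almost every function; you can also give a specific function name.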

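For example, the handlerData function accounts for 74.81% of CPU time.

web: renders the profile as a graph in the browser (Graphviz must be installed), which shows CPU usage more intuitively.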

Analyzing memory usage

This works much like the CPU analysis. Use the command:

go tool pprof httpdemo http://192.168.3.34:9909/debug/pprof/heap

By default the heap profile shows only the memory currently in use. Add the -alloc_objects option to look at allocations made since the program started:

go tool pprof -alloc_objects httpdemo http://192.168.3.34:9909/debug/pprof/heap

The commands are the same as for CPU: top, list, and web. The only difference is that they now report memory usage, so I won't demonstrate them again.

Installing go-torch

An even more convenient tool is Uber's go-torch.

Installation is simple:

go get github.com/uber/go-torch
cd $GOPATH/src/github.com/uber/go-torch
git clone https://github.com/brendangregg/FlameGraph.git

Then copy flamegraph.pl from the FlameGraph directory to /usr/local/bin.

Flame graph for CPU

Use the command:

go-torch -u http://192.168.3.34:9909  --seconds 60 -f cpu.svg

This generates cpu.svg in the current directory; open it in a browser.

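The flame graph makes the application's problem much easier to see: handlerData takes up too much CPU time. The next step is to go into the code, analyze it, and optimize.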

Flame graph for memory

Use the command:

go-torch  http://192.168.3.34:9909/debug/pprof/heap --colors mem  -f mem.svg

This generates mem.svg in the current directory; open it in a browser.

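Analyzing a project with runtime/pprof

If your project is not a web service, for example an RPC service, use runtime/pprof instead. It provides many functions; the source is worth a read when you have time.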

I wrote a small helper package to call when profiling:

package profapp

import (
    "os"
    "runtime"
    "runtime/pprof"

    "go.uber.org/zap"
    "rrnc_im/lib/zaplogger"
)

// StartCpuProf starts writing a CPU profile to cpu.prof in the current directory.
func StartCpuProf() {
    f, err := os.Create("cpu.prof")
    if err != nil {
        zaplogger.Error("create cpu profile file error: ", zap.Error(err))
        return
    }
    if err := pprof.StartCPUProfile(f); err != nil {
        zaplogger.Error("can not start cpu profile,  error: ", zap.Error(err))
        f.Close()
    }
}

// StopCpuProf stops the CPU profile started by StartCpuProf.
func StopCpuProf() {
    pprof.StopCPUProfile()
}

//--------Mem
func ProfGc() {
    runtime.GC() // get up-to-date statistics
}

// SaveMemProf writes a heap profile to mem.prof.
func SaveMemProf() {
    f, err := os.Create("mem.prof")
    if err != nil {
        zaplogger.Error("create mem profile file error: ", zap.Error(err))
        return
    }
    if err := pprof.WriteHeapProfile(f); err != nil {
        zaplogger.Error("could not write memory profile: ", zap.Error(err))
    }
    f.Close()
}

// goroutine block
func SaveBlockProfile() {
    f, err := os.Create("block.prof")
    if err != nil {
        zaplogger.Error("create block profile file error: ", zap.Error(err))
        return
    }
    if err := pprof.Lookup("block").WriteTo(f, 0); err != nil {
        zaplogger.Error("could not write block profile: ", zap.Error(err))
    }
    f.Close()
}

Call these functions wherever you need profiling. For example, I exposed a few methods over RPC:

type TestProf struct {
}

func (*TestProf) StartCpuProAct(context.Context, *im_test.TestRequest, *im_test.TestRequest) error {
    profapp.StartCpuProf()
    return nil
}

func (*TestProf) StopCpuProfAct(context.Context, *im_test.TestRequest, *im_test.TestRequest) error {
    profapp.StopCpuProf()
    return nil
}

func (*TestProf) ProfGcAct(context.Context, *im_test.TestRequest, *im_test.TestRequest) error {
    profapp.ProfGc()
    return nil
}

func (*TestProf) SaveMemAct(context.Context, *im_test.TestRequest, *im_test.TestRequest) error {
    profapp.SaveMemProf()
    return nil
}

func (*TestProf) SaveBlockProfileAct(context.Context, *im_test.TestRequest, *im_test.TestRequest) error {
    profapp.SaveBlockProfile()
    return nil
}

Calling them:

profTest.StartCpuProAct(context.TODO(), &im_test.TestRequest{})
time.Sleep(time.Second * 30)
profTest.StopCpuProfAct(context.TODO(), &im_test.TestRequest{})
profTest.SaveMemAct(context.TODO(), &im_test.TestRequest{})
profTest.SaveBlockProfileAct(context.TODO(), &im_test.TestRequest{})

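The idea is the same: the profile files are written to the current directory. When generating flame graphs from them you can no longer give a URL; pass the file instead: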

go-torch httpdemo cpu.prof
go-torch httpdemo mem.prof
