The Transformer Model: The Attention Mechanism

2023-04-29 12:07:27

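The Transformer model comes from Google's 2017 paper "Attention Is All You Need". Encoder-Decoder architectures that existed at the time were all built on CNNs or RNNs. The Transformer drops both CNNs and RNNs and uses attention alone, so it is an Encoder-Decoder model based entirely on the attention mechanism.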

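The Transformer introduces the concept of self-attention: the whole architecture is a stack of self-attention layers and fully connected layers. The detailed structure is as follows:

[Figure: Transformer architecture]

In the figure, the left half is the Encoder and the right half is the Decoder.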
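To make "a stack of self-attention and fully connected layers" more concrete, here is a minimal NumPy sketch of the Encoder side only. It is an illustration under assumptions, not the paper's reference code: the names `encoder_layer`, `init_layer`, and the parameter dictionary are hypothetical, the weights are random, a single attention head is used, and layer normalization, positional encoding, and the Decoder are left out for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Queries, keys and values are all projections of the same input x.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def feed_forward(x, w1, b1, w2, b2):
    # Position-wise fully connected layers with a ReLU in between.
    return np.maximum(0.0, x @ w1 + b1) @ w2 + b2

def encoder_layer(x, p):
    # Residual connection around each sub-layer (layer norm omitted here).
    x = x + self_attention(x, p["w_q"], p["w_k"], p["w_v"])
    x = x + feed_forward(x, p["w1"], p["b1"], p["w2"], p["b2"])
    return x

def init_layer(rng, d_model, d_ff):
    # Hypothetical helper: small random weights, for illustration only.
    s = 0.1
    return {
        "w_q": rng.normal(0, s, (d_model, d_model)),
        "w_k": rng.normal(0, s, (d_model, d_model)),
        "w_v": rng.normal(0, s, (d_model, d_model)),
        "w1": rng.normal(0, s, (d_model, d_ff)),
        "b1": np.zeros(d_ff),
        "w2": rng.normal(0, s, (d_ff, d_model)),
        "b2": np.zeros(d_model),
    }

def encoder(x, layers):
    # The Encoder is the same layer type applied repeatedly.
    for p in layers:
        x = encoder_layer(x, p)
    return x

rng = np.random.default_rng(0)
d_model, d_ff, seq_len = 8, 32, 5
layers = [init_layer(rng, d_model, d_ff) for _ in range(2)]
x = rng.normal(size=(seq_len, d_model))
out = encoder(x, layers)
print(out.shape)  # (5, 8)
```

Each layer maps a `(seq_len, d_model)` array to another array of the same shape, which is what allows the layers to be stacked.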

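Key innovations: the Transformer uses only the attention mechanism, unlike traditional encoder-decoder models, which have to be combined with an RNN or a CNN. What is new is the use of Scaled Dot-Product Attention and Multi-Head Attention.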
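Scaled dot-product attention computes softmax(QKᵀ / √d_k)·V, and multi-head attention runs several such attentions in parallel on lower-dimensional projections and concatenates the results. The NumPy sketch below illustrates both. The function names, the random weight matrices, and the choice of a single unbatched sequence are assumptions made for the example; masking and dropout are not included.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    # Dot products are divided by sqrt(d_k) so they do not grow with the
    # key dimension and push the softmax into a saturated region.
    d_k = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    def split(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    # Attention is applied to every head independently (batched matmul).
    heads, _ = scaled_dot_product_attention(q, k, v)
    # Concatenate the heads back to (seq_len, d_model), then project.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 4, 8, 2
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

single, weights = scaled_dot_product_attention(x, x, x)
multi = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(single.shape, weights.shape, multi.shape)  # (4, 8) (4, 4) (4, 8)
```

Every row of `weights` sums to 1: each output position is a weighted average of the value vectors, with weights given by how well its query matches each key.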

The easiest-to-follow explanation of the Transformer is still "The Illustrated Transformer".

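Harvard has also published a detailed PyTorch implementation, with a Jupyter notebook that explains the code step by step; reading through it is rewarding in its own way.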

See also: "Attention Mechanisms in Deep Learning (2017 edition)"

"A Brief Reading of 'Attention is All You Need' (Introduction + Code)"

To be continued. I will keep adding to this section, because I have not fully understood it yet.
