
Bookmark | A Collection of Deep NLP Models Implemented in TensorFlow (with Resources)

Source: 深度學習與NLP

Article source: WeChat official account 資料派THU

This article collects and organizes a batch of deep NLP models implemented in TensorFlow, spanning deep learning and machine learning.

TensorFlow-based natural language processing models: a collection of machine learning and deep learning models for NLP problems, 100% Jupyter Notebooks with very concise code.

The resources were compiled from the web; original source:

https://github.com/huseinzol05

Table of Contents

  • Text classification
  • Chatbot
  • Neural Machine Translation
  • Embedded
  • Entity-Tagging
  • POS-Tagging
  • Dependency-Parser
  • Question-Answers
  • Supervised Summarization
  • Unsupervised Summarization
  • Stemming
  • Generator
  • Language detection
  • OCR (optical character recognition)
  • Speech to Text
  • Text to Speech
  • Text Similarity
  • Miscellaneous
  • Attention

Goal

The original implementations are somewhat complex and can be hard on beginners, so I have tried to simplify most of them. At the same time, many papers remain to be implemented; one step at a time.

Contents

Text classification (sketch below):

Link:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/text-classification
  1. Basic cell RNN
  2. Bidirectional RNN
  3. LSTM cell RNN
  4. GRU cell RNN
  5. LSTM RNN + Conv2D
  6. K-max Conv1d
  7. LSTM RNN + Conv1D + Highway
  8. LSTM RNN with Attention
  9. Neural Turing Machine
  10. Seq2Seq
  11. Bidirectional Transformers
  12. Dynamic Memory Network
  13. Residual Network using Atrous CNN + Bahdanau Attention
  14. Transformer-XL

The full list contains 66 notebooks.
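
To give a flavor of what these notebooks implement, here is a minimal tf.keras sketch in the spirit of item 8 (LSTM RNN with Attention). It is not the repo's code; the vocabulary size, sequence length, and class count are made-up placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical sizes for illustration; each notebook picks its own.
VOCAB_SIZE, MAX_LEN, NUM_CLASSES = 20000, 100, 2

inputs = tf.keras.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, 128)(inputs)              # token embeddings
h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
scores = layers.Dense(1)(h)                                # one score per timestep
weights = layers.Softmax(axis=1)(scores)                   # attention distribution
context = tf.reduce_sum(weights * h, axis=1)               # attention-weighted sum
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(context)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```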

Chatbot (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/chatbot
  1. Seq2Seq-manual
  2. Seq2Seq-API Greedy
  3. Bidirectional Seq2Seq-manual
  4. Bidirectional Seq2Seq-API Greedy
  5. Bidirectional Seq2Seq-manual + backward Bahdanau + forward Luong
  6. Bidirectional Seq2Seq-API + backward Bahdanau + forward Luong + Stack Bahdanau Luong Attention + Beam Decoder
  7. Bytenet
  8. Capsule layers + LSTM Seq2Seq-API + Luong Attention + Beam Decoder
  9. End-to-End Memory Network
  10. Attention is All you need
  11. Transformer-XL + LSTM
  12. GPT-2 + LSTM

The full list contains 51 notebooks.
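
Most items above are variations on one skeleton: an encoder reads the question, and a teacher-forced decoder emits the answer while attending over the encoder states. Here is a hedged tf.keras sketch of that skeleton with Luong-style (dot-product) attention; the sizes are invented, and greedy/beam decoding is omitted.

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, UNITS = 8000, 256   # hypothetical vocabulary and hidden sizes

# Encoder: embed the question, keep per-step outputs and the final state.
enc_in = tf.keras.Input(shape=(None,), dtype="int32")
enc_emb = layers.Embedding(VOCAB, UNITS)(enc_in)
enc_out, state_h, state_c = layers.LSTM(
    UNITS, return_sequences=True, return_state=True)(enc_emb)

# Decoder: teacher-forced on the answer, initialized from the encoder state.
dec_in = tf.keras.Input(shape=(None,), dtype="int32")
dec_emb = layers.Embedding(VOCAB, UNITS)(dec_in)
dec_out = layers.LSTM(UNITS, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])

# Luong-style attention: dot-product match of decoder steps to encoder steps.
context = layers.Attention()([dec_out, enc_out])
logits = layers.Dense(VOCAB)(layers.Concatenate()([dec_out, context]))

model = tf.keras.Model([enc_in, dec_in], logits)
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```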

Neural machine translation (English to Vietnamese):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/neural-machine-translation

The full list contains 49 notebooks.

Word embeddings (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/embedded
  1. Word Vector using CBOW sample softmax
  2. Word Vector using CBOW noise contrastive estimation
  3. Word Vector using skipgram sample softmax
  4. Word Vector using skipgram noise contrastive estimation
  5. Lda2Vec Tensorflow
  6. Supervised Embedded
  7. Triplet-loss + LSTM
  8. LSTM Auto-Encoder
  9. Batch-All Triplet-loss LSTM
  10. Fast-text
  11. ELMO (biLM)
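
Item 4 (skip-gram with noise contrastive estimation) reduces to a few lines: look up the center word's vector and score it against a handful of sampled negatives instead of the full vocabulary. A hedged sketch with invented sizes:

```python
import tensorflow as tf

VOCAB, DIM, NUM_SAMPLED = 10000, 128, 64   # hypothetical sizes

embeddings = tf.Variable(tf.random.uniform([VOCAB, DIM], -1.0, 1.0))
nce_weights = tf.Variable(tf.random.normal([VOCAB, DIM], stddev=DIM ** -0.5))
nce_biases = tf.Variable(tf.zeros([VOCAB]))

def skipgram_nce_loss(center_ids, context_ids):
    """NCE loss for predicting a context word from its center word."""
    embedded = tf.nn.embedding_lookup(embeddings, center_ids)     # (batch, DIM)
    labels = tf.cast(tf.reshape(context_ids, [-1, 1]), tf.int64)  # (batch, 1)
    return tf.reduce_mean(tf.nn.nce_loss(
        weights=nce_weights, biases=nce_biases, labels=labels,
        inputs=embedded, num_sampled=NUM_SAMPLED, num_classes=VOCAB))
```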

POS tagging (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/pos-tagging
  1. Bidirectional RNN + Bahdanau Attention + CRF
  2. Bidirectional RNN + Luong Attention + CRF
  3. Bidirectional RNN + CRF
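
All three items pair per-token emission scores from a BiRNN with a learned CRF transition matrix, so tag sequences are scored jointly rather than token by token. A minimal TF2 sketch follows; it assumes tensorflow_addons for the CRF ops (the TF1-era notebooks use tf.contrib.crf), and all sizes are placeholders.

```python
import tensorflow as tf
import tensorflow_addons as tfa   # TF2 home of the old tf.contrib.crf ops
from tensorflow.keras import layers

VOCAB, NUM_TAGS, UNITS = 5000, 17, 128   # hypothetical sizes

class BiRNNCRF(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.embed = layers.Embedding(VOCAB, UNITS)
        self.birnn = layers.Bidirectional(layers.LSTM(UNITS, return_sequences=True))
        self.proj = layers.Dense(NUM_TAGS)    # per-token emission scores
        self.transitions = tf.Variable(tf.random.normal([NUM_TAGS, NUM_TAGS]))

    def call(self, tokens):
        return self.proj(self.birnn(self.embed(tokens)))

    def crf_loss(self, tokens, tags, lengths):
        logits = self(tokens)
        ll, _ = tfa.text.crf_log_likelihood(logits, tags, lengths, self.transitions)
        # At inference, tfa.text.crf_decode(logits, self.transitions, lengths)
        # runs Viterbi decoding over the same scores.
        return -tf.reduce_mean(ll)   # maximize joint sequence likelihood
```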

Entity tagging:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/entity-tagging
  1. Char Ngrams + Bidirectional RNN + Bahdanau Attention + CRF
  2. Char Ngrams + Residual Network + Bahdanau Attention + CRF

Dependency parsing:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/dependency-parser
  1. Residual Network + Bahdanau Attention + CRF
  2. Residual Network + Bahdanau Attention + Char Embedded + CRF

Question answering (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/question-answer
  1. End-to-End Memory Network + Basic cell
  2. End-to-End Memory Network + GRU cell
  3. End-to-End Memory Network + LSTM cell
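
A single-hop simplification of the End-to-End Memory Network idea behind all three items: encode each story sentence as a bag of embeddings, softly match the query against those memories, and answer from the query plus the read-out. The sizes are invented, and the actual notebooks use multiple hops and separate input/output embedding matrices.

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, DIM = 2000, 64   # hypothetical vocabulary and embedding sizes

class MemoryNetwork(tf.keras.Model):
    """Single-hop memory network: attend over story sentences, then answer."""
    def __init__(self):
        super().__init__()
        self.embed = layers.Embedding(VOCAB, DIM)
        self.out = layers.Dense(VOCAB)   # logits over answer words

    def call(self, inputs):
        story, query = inputs   # (batch, sentences, words), (batch, words)
        memories = tf.reduce_sum(self.embed(story), axis=2)  # bag-of-words sentences
        q = tf.reduce_sum(self.embed(query), axis=1)         # bag-of-words query
        scores = tf.einsum("bmd,bd->bm", memories, q)        # query-memory match
        probs = tf.nn.softmax(scores)                        # attention over memories
        read = tf.einsum("bm,bmd->bd", probs, memories)      # weighted read-out
        return self.out(q + read)
```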

Stemming:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/stemming
  1. LSTM + Seq2Seq + Beam
  2. GRU + Seq2Seq + Beam
  3. LSTM + BiRNN + Seq2Seq + Beam
  4. GRU + BiRNN + Seq2Seq + Beam
  5. DNC + Seq2Seq + Greedy

Supervised summarization:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/summarization
  1. LSTM Seq2Seq using topic modelling
  2. LSTM Seq2Seq + Luong Attention using topic modelling
  3. LSTM Seq2Seq + Beam Decoder using topic modelling
  4. LSTM Bidirectional + Luong Attention + Beam Decoder using topic modelling
  5. LSTM Seq2Seq + Luong Attention + Pointer Generator

Unsupervised summarization:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/unsupervised-summarization
  1. Skip-thought Vector (unsupervised)
  2. Residual Network using Atrous CNN (unsupervised)
  3. Residual Network using Atrous CNN + Bahdanau Attention (unsupervised)

OCR (optical character recognition):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/ocr
  1. CNN + LSTM RNN

Speech to text (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/speech-to-text
  1. Tacotron
  2. Bidirectional RNN + Greedy CTC
  3. Bidirectional RNN + Beam CTC
  4. Seq2Seq + Bahdanau Attention + Beam CTC
  5. Seq2Seq + Luong Attention + Beam CTC
  6. Bidirectional RNN + Attention + Beam CTC
  7. Wavenet
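
Items 2 and 3 share one skeleton: a bidirectional RNN turns spectrogram frames into per-frame character logits, and CTC loss marginalizes over all alignments between frames and the transcript. A hedged sketch with invented feature and alphabet sizes:

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_MEL, NUM_CHARS, UNITS = 80, 30, 128   # hypothetical sizes

# Acoustic model: per-frame character logits, plus one extra CTC blank class.
inputs = tf.keras.Input(shape=(None, NUM_MEL))
h = layers.Bidirectional(layers.LSTM(UNITS, return_sequences=True))(inputs)
logits = layers.Dense(NUM_CHARS + 1)(h)
model = tf.keras.Model(inputs, logits)

def ctc_loss(labels, logits, label_len, logit_len):
    """CTC: sum over every frame alignment that collapses to the transcript."""
    return tf.reduce_mean(tf.nn.ctc_loss(
        labels, logits, label_len, logit_len,
        logits_time_major=False, blank_index=-1))

# Greedy decoding (item 2) collapses repeats and drops blanks;
# tf.nn.ctc_beam_search_decoder gives the beam variant (item 3).
```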

Text to speech:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/text-to-speech
  1. Seq2Seq + Luong Attention
  2. Seq2Seq + Bahdanau Attention

Generator (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/generator
  1. Character-wise RNN + LSTM
  2. Character-wise RNN + Beam search
  3. Character-wise RNN + LSTM + Embedding
  4. Word-wise RNN + LSTM
  5. Word-wise RNN + LSTM + Embedding
  6. Character-wise + Seq2Seq + GRU
  7. Word-wise + Seq2Seq + GRU
  8. Character-wise RNN + LSTM + Bahdanau Attention
  9. Character-wise RNN + LSTM + Luong Attention
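
A character-wise RNN generator is just a next-character language model plus an autoregressive sampling loop. A minimal sketch with placeholder sizes (the temperature knob is a common addition, not something specific to these notebooks):

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CHARS, UNITS = 128, 256   # hypothetical alphabet and hidden sizes

# Language model: predict the next character at every position.
model = tf.keras.Sequential([
    layers.Embedding(NUM_CHARS, 64),
    layers.LSTM(UNITS, return_sequences=True),
    layers.Dense(NUM_CHARS),
])

def sample(seed_ids, steps, temperature=1.0):
    """Extend a seed sequence one sampled character at a time."""
    ids = list(seed_ids)
    for _ in range(steps):
        logits = model(tf.constant([ids]))[0, -1] / temperature
        next_id = tf.random.categorical(logits[None, :], 1)[0, 0]
        ids.append(int(next_id))
    return ids
```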

Language detection:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/language-detection
  1. Fast-text Char N-Grams

Text similarity (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/text-similarity
  1. Character wise similarity + LSTM + Bidirectional
  2. Word wise similarity + LSTM + Bidirectional
  3. Character wise similarity Triplet loss + LSTM
  4. Word wise similarity Triplet loss + LSTM
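
The triplet-loss variants (items 3 and 4) train a single shared encoder so that a matching pair ends up closer than a mismatched pair by at least a margin. A sketch with invented sizes:

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, DIM, MARGIN = 10000, 128, 0.5   # hypothetical sizes

# Shared tower: anchor, positive, and negative all use the same weights.
encoder = tf.keras.Sequential([
    layers.Embedding(VOCAB, DIM),
    layers.Bidirectional(layers.LSTM(DIM)),
])

def triplet_loss(anchor, positive, negative):
    """Pull matching pairs together, push mismatches MARGIN apart."""
    a, p, n = encoder(anchor), encoder(positive), encoder(negative)
    d_pos = tf.reduce_sum(tf.square(a - p), axis=-1)
    d_neg = tf.reduce_sum(tf.square(a - n), axis=-1)
    return tf.reduce_mean(tf.maximum(d_pos - d_neg + MARGIN, 0.0))
```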

Attention mechanisms (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/attention
  1. Bahdanau
  2. Luong
  3. Hierarchical
  4. Additive
  5. Soft
  6. Attention-over-Attention
  7. Bahdanau API
  8. Luong API

Miscellaneous:

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/misc
  1. Attention heatmap on Bahdanau Attention
  2. Attention heatmap on Luong Attention

Non-deep learning (sketch below):

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/not-deep-learning
  1. Markov chatbot
  2. Decomposition summarization (3 notebooks)
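
The Markov chatbot needs no TensorFlow at all: count which words follow each n-gram in the training dialogue, then walk the chain. A toy pure-Python sketch (the order and function names are made up for illustration):

```python
import random
from collections import defaultdict

def build_chain(tokens, order=2):
    """Map each n-gram to the words observed to follow it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        chain[tuple(tokens[i:i + order])].append(tokens[i + order])
    return chain

def generate(chain, seed, steps=20):
    """Walk the chain from a seed n-gram, sampling one word at a time."""
    state, out = tuple(seed), list(seed)
    for _ in range(steps):
        followers = chain.get(state)
        if not followers:
            break
        word = random.choice(followers)
        out.append(word)
        state = state[1:] + (word,)
    return " ".join(out)
```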
