Synced with the WeChat official account arXiv每日學術速遞 (arXiv Daily Academic Digest)
【1】 SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement
Authors: Robert Rehr, Timo Gerkmann
Link: https://arxiv.org/abs/2004.03512
【2】 Homophone-based Label Smoothing in End-to-End Automatic Speech Recognition
Authors: Yi Zheng, Xuyong Dang
Link: https://arxiv.org/abs/2004.03437
【3】 Learning to fool the speaker recognition
Authors: Jiguo Li, Wen Gao
Comments: Accepted by ICASSP 2020
Link: https://arxiv.org/abs/2004.03434
【4】 Universal Adversarial Perturbations Generative Network for Speaker Recognition
Authors: Jiguo Li, Wen Gao
Comments: Accepted by ICME 2020
Link: https://arxiv.org/abs/2004.03428
【5】 Direct Speech-to-image Translation
Authors: Jiguo Li, Wen Gao
Link: https://arxiv.org/abs/2004.03413
【6】 Multi-Scale Aggregation Using Feature Pyramid Module for Text-Independent Speaker Verification
Authors: Youngmoon Jung, Hoirin Kim
Comments: Submitted to Interspeech 2020
Link: https://arxiv.org/abs/2004.0319
【7】 Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System
Authors: Gwenaelle Cunha Sergio, Minho Lee
Link: https://arxiv.org/abs/2004.02113
【8】 Simultaneous Denoising and Dereverberation Using Deep Embedding Features
Authors: Cunhang Fan, Zhengqi Wen
Link: https://arxiv.org/abs/2004.0242
【9】 Temporarily-Aware Context Modelling using Generative Adversarial Networks for Speech Activity Detection
Authors: Tharindu Fernando, Clinton Fookes
Link: https://arxiv.org/abs/2004.01546
【10】 Towards democratizing music production with AI-Design of Variational Autoencoder-based Rhythm Generator as a DAW plugin
Authors: Nao Tokui
Link: https://arxiv.org/abs/2004.01525
【11】 Can Machine Learning Be Used to Recognize and Diagnose Coughs?
Authors: Charles Bales, Ali Imran
Link: https://arxiv.org/abs/2004.01495
【12】 AI4COVID-19: AI Enabled Preliminary Diagnosis for COVID-19 from Cough Samples via an App
Authors: Ali Imran, Muhammad Nabeel
Link: https://arxiv.org/abs/2004.01275
【13】 Towards Relevance and Sequence Modeling in Language Recognition
Authors: Bharat Padi, Sriram Ganapathy
Link: https://arxiv.org/abs/2004.0122
【14】 Multi-Modal Video Forensic Platform for Investigating Post-Terrorist Attack Scenarios
Authors: Alexander Schindler, Ross King
Link: https://arxiv.org/abs/2004.01023
【15】 Full-Sum Decoding for Hybrid HMM based Speech Recognition using LSTM Language Model
Authors: Wei Zhou, Hermann Ney
Comments: Accepted at ICASSP 2020
Link: https://arxiv.org/abs/2004.00967
【16】 The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment
Authors: Wei Zhou, Hermann Ney
Comments: Accepted at ICASSP 2020
Link: https://arxiv.org/abs/2004.00960
【17】 iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning
Authors: Haoyu Li, Junichi Yamagishi
Comments: 5 pages, submitted to INTERSPEECH 2020
Link: https://arxiv.org/abs/2004.00932
【18】 Improving auditory attention decoding performance of linear and non-linear methods using state-space model
Authors: Ali Aroudi, Simon Doclo
Link: https://arxiv.org/abs/2004.0091
【19】 Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset
Authors: Lee Callender, Jesse Engel
Link: https://arxiv.org/abs/2004.00188
【20】 AM-MobileNet1D: A Portable Model for Speaker Recognition
Authors: João Antônio Chagas Nunes, Cleber Zanchettin
Link: https://arxiv.org/abs/2004.00132
【21】 VaPar Synth – A Variational Parametric Model for Audio Synthesis
Authors: Krishna Subramani, Alexandre D’Hooge
Comments: Accepted in ICASSP 2020
Link: https://arxiv.org/abs/2004.0000
【22】 Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement
Authors: Chao-Han Huck Yang, Chin-Hui Lee
Comments: The first draft was finished in August 2019. Accepted to IEEE ICASSP 2020
Link: https://arxiv.org/abs/2003.1391