Also published on the WeChat public account arXiv每日学术速递 (arXiv Daily Academic Digest)
【1】 The Perceptimatic English Benchmark for Speech Perception Models
Authors: Juliette Millet, Ewan Dunbar
Comments: Accepted to CogSci Conference 2020
Link: https://arxiv.org/abs/2005.03418
【2】 Crop Aggregating for short utterances speaker verification using raw waveforms
Authors: Seung-bin Kim, Ha-Jin Yu
Link: https://arxiv.org/abs/2005.03329
【3】 Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data
Authors: Seung-won Park, Myun-chul Joe
Comments: Submitted to Interspeech 2020
Link: https://arxiv.org/abs/2005.03295
【4】 ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Authors: Wei Han, Yonghui Wu
Link: https://arxiv.org/abs/2005.0319
【5】 Study of human phonation in a full body domain
Authors: Shakti Saurabh, Daniel Bodony
Link: https://arxiv.org/abs/2005.02168
【6】 End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Layer-wise Transfer Learning
Authors: Heng-Jui Chang, Lin-shan Lee
Comments: Submitted to INTERSPEECH 2020
Link: https://arxiv.org/abs/2005.0197