arXiv Daily Picks, 5.9: Daily Speech/Audio Paper Digest

Also published on the WeChat official account (arXiv每日学术速递)

【1】 The Perceptimatic English Benchmark for Speech Perception Models

Authors: Juliette Millet, Ewan Dunbar

Comments: Accepted to CogSci Conference 2020

Link: https://arxiv.org/abs/2005.03418

【2】 Crop Aggregating for short utterances speaker verification using raw waveforms

Authors: Seung-bin Kim, Ha-Jin Yu

Link: https://arxiv.org/abs/2005.03329

【3】 Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data

Authors: Seung-won Park, Myun-chul Joe

Comments: Submitted to Interspeech 2020

Link: https://arxiv.org/abs/2005.03295

【4】 ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

Authors: Wei Han, Yonghui Wu

Link: https://arxiv.org/abs/2005.0319

【5】 Study of human phonation in a full body domain

Authors: Shakti Saurabh, Daniel Bodony

Link: https://arxiv.org/abs/2005.02168

【6】 End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Layer-wise Transfer Learning

Authors: Heng-Jui Chang, Lin-shan Lee

Comments: Submitted to INTERSPEECH 2020

Link: https://arxiv.org/abs/2005.0197
