Speech and Text Technology Paper Reading: Will OpenAI's Latest Whisper ASR Take Off Like GPT-3?
Will OpenAI's Whisper ASR be as successful as GPT-3 was in the NLP domain? Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise, and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing.
#openai #whisper #asr #gpt3 #nlp #wav2vec #hubert #transformer #google #meta #microsoft #icml #nips