GitHub Trends

#python #callcenter #conformer #ctc_decode #deepspeech #fastspeech2 #language_model #mandarin_language #ngram #parallel_wavegan #punctuation_restoration #speech_alignment #speech_recognition #speech_to_text #speech_translation #streaming_asr #text_frontend #text_to_speech #transformer

https://github.com/PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio...

662 views, 10:56

GitHub Trends

#python #asr #deeplearning #generative_ai #large_language_models #machine_translation #multimodal #neural_networks #speaker_diariazation #speaker_recognition #speech_synthesis #speech_translation #tts

NVIDIA NeMo is a platform for building, customizing, and deploying generative AI models, including large language models (LLMs), vision language models, and speech AI. It provides pre-built code and checkpoints for training and fine-tuning, supports recent model architectures, and runs in cloud, data center, or edge environments. NeMo 2.0 adds Python-based configuration and a modular design, which makes it easier to experiment and to scale up. The stated benefit is that AI applications can be built faster and at lower cost, with straightforward deployment options[1][2][3].

https://github.com/NVIDIA/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech) - NVIDIA-NeMo/NeMo

548 views, 13:00
