GitHub Trends
10.1K subscribers
15.3K links
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
Download Telegram
#python #audio_generation #audio_synthesis #audioldm #audit #fastspeech2 #hifi_gan #music_generation #naturalspeech2 #singing_voice_conversion #speech_synthesis #text_to_audio #text_to_speech #vall_e #vits #voice_conversion

Amphion is a toolkit for generating audio, music, and speech. It helps researchers and engineers, especially beginners, by providing tools for various tasks like turning text into speech (TTS), singing voice conversion (SVC), and text to audio (TTA). Amphion includes visualizations to help understand how these models work, which is very useful for learning. It also offers different vocoders to produce high-quality audio and evaluation metrics to ensure the generated audio is good. This toolkit is free to use under the MIT License and can be installed easily using Python or Docker. Using Amphion, you can create high-quality audio and music with advanced features, making it a powerful tool for both research and practical applications.

https://github.com/open-mmlab/Amphion
👍1
#python #deep_learning #glow_tts #hifigan #melgan #multi_speaker_tts #python #pytorch #speaker_encoder #speaker_encodings #speech #speech_synthesis #tacotron #text_to_speech #tts #tts_model #vocoder #voice_cloning #voice_conversion #voice_synthesis

The new version of TTS (Text-to-Speech) from Coqui.ai, called TTSv2, is now available with several improvements. It supports 16 languages and has better performance overall. You can fine-tune the models using the provided code and examples. The TTS system can now stream audio with less than 200ms latency, making it very responsive. Additionally, you can use over 1,100 Fairseq models and new features like voice cloning and voice conversion. This update also includes faster inference with the Tortoise model and support for multiple speakers and languages. These enhancements make it easier and more efficient to generate high-quality speech from text.

https://github.com/coqui-ai/TTS
#python #singing_voice_conversion #voice_conversion

This tool helps you change voices in real-time or offline. It supports voice conversion, singing voice conversion, and can clone a voice with just 1-30 seconds of reference speech. You can use it for online meetings, gaming, or live streaming. The model is easy to fine-tune with custom data, requiring only one utterance per speaker. This makes it useful for creating personalized voice effects quickly and efficiently.

https://github.com/Plachtaa/seed-vc
👍2