#jupyter_notebook #dataset_analysis #deep_learning #gantts #glow_tts #melgan #multiband_melgan #python #pytorch #speaker_encoder #speech #tacotron #tacotron2 #tensorflow2 #text_to_speech #tts #vocoder
https://github.com/mozilla/TTS
GitHub - mozilla/TTS: Deep learning for Text to Speech (discussion forum: https://discourse.mozilla.org/c/tts)
#python #align_tts #deep_learning #glow_tts #hifigan #melgan #melgan_stft #pytorch #speaker_encoder #speaker_encodings #speech #tacotron #tensorflow2 #text_to_speech #tts #vocoder
https://github.com/coqui-ai/TTS
GitHub - coqui-ai/TTS: 🐸💬 A deep learning toolkit for Text-to-Speech, battle-tested in research and production
#python #deep_learning #pytorch #speech #speech_processing #speech_synthesis #text_to_speech #toolkit #tts
https://github.com/DigitalPhonetics/IMS-Toucan
GitHub - DigitalPhonetics/IMS-Toucan: Controllable and fast Text-to-Speech for over 7000 languages!
#typescript #ai #azure_openai_api #chat #chatglm #chatgpt #claude #dalle_3 #function_calling #gemini #gpt #gpt_4 #gpt_4_vision #knowledge_base #nextjs #ollama #openai #qwen2 #rag #tts
LobeChat is an open-source, modern chatbot framework that supports ChatGPT and other Large Language Models (LLMs). It offers several key features:
- **Multi-Provider Support**: works with multiple AI model providers like OpenAI, Google AI, and more.
- **Speech Synthesis and Voice Conversation**: supports text-to-speech and voice-based chat.
- **Visual Recognition**: can recognize and respond to images using models like GPT-4 Vision.
- **Text to Image Generation**: creates images from text prompts.
- **Plugin System**: extends functionality with plugins for tasks like web searches and document management.
- **One-Click Deployment**: lets you spin up your own instance quickly.
- **Custom Themes and Mobile Support**: offers customizable themes and an optimized mobile experience.
These features make LobeChat highly flexible and user-friendly, allowing you to create a personalized and powerful chatbot with minimal setup.
https://github.com/lobehub/lobe-chat
GitHub - lobehub/lobe-chat: 🤯 LobeHub - an open-source, modern-design AI Agent Workspace. Supports multiple AI providers, Knowledge Base (file upload / RAG), one-click MCP Marketplace install, and Artifacts / Thinking.
#python #speech_synthesis #text_to_speech #tts
The `edge-tts` module lets you use Microsoft Edge's text-to-speech service from Python code or from the command line. You can install it with `pip install edge-tts`. With this module, you can convert text to speech, change the voice and language, adjust the speech rate, volume, and pitch, and even play back the speech immediately. This makes it easy to create audio files from text and control how they sound, which is handy for applications like automated announcements or educational tools.
https://github.com/rany2/edge-tts
GitHub - rany2/edge-tts: Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge, Windows, or an API key
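To make that workflow concrete, here is a minimal sketch using the `edge_tts` Python package; the voice name and the rate/volume/pitch values are only examples (available voices can be listed with `edge-tts --list-voices`).
```python
# Minimal edge-tts sketch: synthesize a sentence to an MP3 file.
# Voice name and prosody values below are illustrative, not required settings.
import asyncio

import edge_tts


async def main() -> None:
    communicate = edge_tts.Communicate(
        "Hello, this was generated with edge-tts.",
        voice="en-US-AriaNeural",  # example voice; see `edge-tts --list-voices`
        rate="+10%",               # speak 10% faster
        volume="+0%",              # default volume
        pitch="+0Hz",              # default pitch (supported in recent versions)
    )
    await communicate.save("hello.mp3")  # write the synthesized audio to disk


asyncio.run(main())
```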
#cplusplus #ai #api #audio_generation #distributed #gemma #gpt4all #image_generation #kubernetes #llama #llama3 #llm #mamba #mistral #musicgen #p2p #rerank #rwkv #stable_diffusion #text_generation #tts
LocalAI is a free, open-source alternative to OpenAI that you can run on your own computer or server. It allows you to generate text, images, and audio locally without needing a GPU. You can use it with various models and it supports multiple functionalities like text-to-audio, audio-to-text, and image generation. LocalAI is easy to set up using an installer script or Docker, and it has a user-friendly web interface. This tool is beneficial because it saves you money by not requiring cloud services and gives you full control over your data privacy. Plus, it's community-driven, so there are many resources and integrations available to help you get started and customize it to your needs.
https://github.com/mudler/LocalAI
GitHub - mudler/LocalAI: The free, open-source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required.
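Because LocalAI exposes an OpenAI-compatible API, a hedged sketch of querying a local instance with the official `openai` Python client could look like this; the port assumes LocalAI's default of 8080, and the model name is a placeholder for whatever model you have installed.
```python
# Hedged sketch: chat completion against a local LocalAI server through its
# OpenAI-compatible endpoint. Assumes LocalAI is running on localhost:8080.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # LocalAI's OpenAI-compatible API
    api_key="not-needed",                 # a real key is not required by default
)

response = client.chat.completions.create(
    model="your-local-model",  # placeholder: use a model you have installed
    messages=[{"role": "user", "content": "Write a one-line greeting."}],
)
print(response.choices[0].message.content)
```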
#python #llama #transformer #tts #valle #vits #vqgan #vqvae
Fish Speech is a powerful tool that converts text into speech in many languages, including English, Japanese, Korean, Chinese, and more. You can use it by inputting a short vocal sample to generate high-quality speech. It supports multiple languages without needing phonemes and is highly accurate with low error rates. The tool is fast, with real-time processing on various devices, and has a user-friendly web and GUI interface. You can try the demo online or set it up locally. It's released under a CC BY-NC-SA 4.0 license, which means you can use and modify it freely, but you must give credit and share any changes under the same license. This tool helps you create realistic speech quickly and easily, making it useful for various applications like voice cloning and multilingual communication.
https://github.com/fishaudio/fish-speech
GitHub - fishaudio/fish-speech: SOTA Open Source TTS
#python #ai #alexa #amazon_echo #anyq #asr #bci #chatgpt #google_home #gpt3 #homeassistant #muse #openai #raspeberry_pi #snowboy #speaker #tts #unit
wukong-robot is a simple, flexible, and elegant Chinese voice dialogue robot / smart speaker project. It allows makers and hackers in China to quickly create personalized smart speakers. Here are the key benefits:
- **Pluggable Architecture**: you can customize and develop your own plugins for speech recognition, speech synthesis, and dialogue management.
- **Chinese Support**: built for Chinese voice dialogue out of the box.
- **Smart Home Integration**: works with ecosystems such as Siri, 小爱音箱 (Xiaomi's XiaoAI speaker), and HomeAssistant, allowing voice control of smart devices.
- **Easy Installation**: simple to set up and get running.
- **Deep Customization**: you can change the robot's name, choose different speech recognition and synthesis plugins, and even use a brain-computer interface (BCI) for wake-up.
- **Open API**: it provides an open API for more advanced functionality.
Overall, wukong-robot offers a highly customizable and flexible solution for creating smart speakers, making it a great choice for those who want to personalize their smart home experience.
https://github.com/wzpan/wukong-robot
GitHub - wzpan/wukong-robot: 🤖 wukong-robot is a simple, flexible, and elegant Chinese voice dialogue robot / smart speaker project. It supports multi-turn ChatGPT conversations and may be the first open-source smart speaker project to support brain-computer interaction.
#python #agent #ai #asr #cpp #gemini #golang #gpt_4 #gpt_4o #llm #low_latency #multimodal #nextjs14 #openai #python #rag #real_time #realtime #tts #vision #voice_assistant
The TEN Agent is a powerful tool that helps you create and manage AI agents with various capabilities like real-time vision, screen detection, and integration with services like Google Gemini Multimodal Live API, Weather Check, and Web Search. To use it, you need to set up your environment with Docker, Node.js, and specific API keys. You can follow simple steps to configure and start your agent locally. The benefits include easy integration of advanced AI features, a supportive community through Discord and GitHub discussions, and the ability to customize and extend your agents with ready-to-use extensions. This makes it easier to develop and deploy sophisticated AI applications quickly.
https://github.com/TEN-framework/TEN-Agent
GitHub - TEN-framework/ten-framework: Open-source framework for conversational voice AI agents
#python #deep_learning #glow_tts #hifigan #melgan #multi_speaker_tts #python #pytorch #speaker_encoder #speaker_encodings #speech #speech_synthesis #tacotron #text_to_speech #tts #tts_model #vocoder #voice_cloning #voice_conversion #voice_synthesis
The new version of Coqui's TTS (Text-to-Speech), featuring the XTTS v2 model, is now available with several improvements. It supports 16 languages and has better performance overall. You can fine-tune the models using the provided code and examples. The TTS system can now stream audio with less than 200 ms latency, making it very responsive. Additionally, you can use over 1,100 Fairseq models, along with new features like voice cloning and voice conversion. This update also includes faster inference with the Tortoise model and support for multiple speakers and languages. These enhancements make it easier and more efficient to generate high-quality speech from text.
https://github.com/coqui-ai/TTS
GitHub - coqui-ai/TTS: 🐸💬 A deep learning toolkit for Text-to-Speech, battle-tested in research and production
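As a rough sketch of the voice-cloning workflow described above, assuming the 🐸TTS package is installed (`pip install TTS`) and using the XTTS v2 model name from the project's documentation; the speaker WAV path is a placeholder for your own short reference clip.
```python
# Hedged sketch: zero-shot voice cloning with Coqui TTS and the XTTS v2 model.
# The model name follows the project's docs; file paths are placeholders.
import torch
from TTS.api import TTS

device = "cuda" if torch.cuda.is_available() else "cpu"

# Download (on first use) and load the multilingual XTTS v2 model.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

# Clone the voice in the reference clip and speak the given text.
tts.tts_to_file(
    text="Hello! This sentence was synthesized with a cloned voice.",
    speaker_wav="reference_speaker.wav",  # short clip of the target voice
    language="en",
    file_path="cloned_output.wav",
)
```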