GitHub Trends
10.1K subscribers
15.3K links
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#python #agents #ai #multimodal #real_time #video #voice #voice_assistant

The Agents framework helps you build AI-driven programs that interact with users in real time through text, audio, images, or video. It integrates with OpenAI's Realtime API for low-latency interactions and supports plugins for speech-to-text, text-to-speech, and other AI services. You can use it to create voice assistants, transcription agents, and more, and deploy them in local, self-hosted, or cloud environments.

https://github.com/livekit/agents
#python #beit #beit_3 #bitnet #deepnet #document_ai #foundation_models #kosmos #kosmos_1 #layoutlm #layoutxlm #llm #minilm #mllm #multimodal #nlp #pre_trained_model #textdiffuser #trocr #unilm #xlm_e

Microsoft is developing AI models through large-scale self-supervised pre-training across tasks, languages, and modalities. Models such as Foundation Transformers (Magneto) and Kosmos-2.5 are designed to generalize across language understanding, vision, speech, and multimodal tasks. Users benefit from state-of-the-art performance in document AI, speech recognition, machine translation, and more. Tools like TorchScale and Aggressive Decoding also improve stability, efficiency, and speed in model training and deployment.

https://github.com/microsoft/unilm
#javascript #agent_framework_javascript #ai_agents #crewai #custom_ai_agents #desktop_app #llama3 #llm #llm_application #llm_webui #lmstudio #local_llm #localai #multimodal #nodejs #ollama #rag #vector_database #webui

AnythingLLM is an all-in-one AI app that lets you chat with your documents, use AI agents, and manage multiple users without complicated setup. You can choose from various large language models (LLMs) and vector databases, and it supports document types such as PDF, TXT, and DOCX. It has a simple chat interface with drag-and-drop uploads and clear citations. You can run it locally or host it remotely, and it includes custom AI agents, multimodal support, and cost-saving measures for managing large documents.

https://github.com/Mintplex-Labs/anything-llm
#python #ai #cv #data_analytics #data_wrangling #embeddings #llm #llm_eval #machine_learning #mlops #multimodal

DataChain is a tool for managing and processing large amounts of data, especially for AI workloads. It organizes unstructured data from sources like cloud storage or local files into structured datasets. You can process this data in Python, without needing SQL or Spark, and use local AI models or APIs to enrich it. Key features include parallel processing, out-of-memory computing, and optimized vector searches. DataChain also integrates with libraries like PyTorch and TensorFlow, so you can export data for further analysis or model training.

https://github.com/iterative/datachain
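
To make the idea of turning loose files into a structured dataset with parallel processing concrete, here is a minimal sketch using only the Python standard library. It does not use DataChain's API; `FileRecord`, `describe`, and `build_dataset` are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileRecord:
    """One structured row describing an unstructured file (hypothetical schema)."""
    path: str
    suffix: str
    size_bytes: int


def describe(path: Path) -> FileRecord:
    # Extract a few structured fields from a single file.
    return FileRecord(str(path), path.suffix.lower(), path.stat().st_size)


def build_dataset(root: Path, workers: int = 4) -> list[FileRecord]:
    # Collect every file under root in a stable order, then describe them
    # in parallel; pool.map returns results in the same order as its input.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(describe, files))
```

Calling `build_dataset(Path("data"))` on a local folder returns a list of records that can be filtered or grouped like rows in a table. A real tool would add richer metadata, persistence, and processing of datasets that do not fit in memory.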
#rust #ai #computer_vision #llm #machine_learning #ml #multimodal #vision

ScreenPipe is an AI assistant that records your screen and voice 24/7, so the full context of your work is always available. It acts like a personal recorder that helps you remember everything. You can use it as a desktop app or a command-line tool, or integrate it into other applications. It is open source, so you can customize it to your needs. The project pitches this as a way to avoid missing important details and to stay prepared for the age of superintelligence, where data is crucial.

https://github.com/mediar-ai/screenpipe