GitHub Trends
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#python #chinese #clip #computer_vision #contrastive_loss #coreml_models #deep_learning #image_text_retrieval #multi_modal #multi_modal_learning #nlp #pretrained_models #pytorch #transformers #vision_and_language_pre_training #vision_language

This project provides a Chinese version of the CLIP (Contrastive Language-Image Pretraining) model, trained on a large dataset of Chinese image-text pairs. Here’s what you need to know:
- **Functionality**: The model lets you quickly compute text and image features, perform cross-modal retrieval (finding images based on text or vice versa), and run zero-shot image classification (classifying images without any labeled examples).
- **Performance**: The model has been tested on various datasets and shows strong results in zero-shot image classification and cross-modal retrieval tasks.
- **Resources**: The project includes pre-trained models, training and testing code, and detailed tutorials on how to use the model for different tasks.

Overall, this project makes it easy to work with Chinese text and images using advanced AI techniques, saving you time and effort.
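As a rough illustration of those tasks, here is a minimal zero-shot classification sketch assuming the repository's cn_clip package and its load_from_name/tokenize helpers; the checkpoint name, image path, and candidate labels are placeholders:

```python
# Minimal sketch, assuming cn_clip's README-style API; paths and labels are placeholders.
import torch
from PIL import Image
import cn_clip.clip as clip
from cn_clip.clip import load_from_name

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = load_from_name("ViT-B-16", device=device, download_root="./")
model.eval()

image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
texts = clip.tokenize(["一只猫", "一只狗"]).to(device)  # candidate labels in Chinese

with torch.no_grad():
    image_features = model.encode_image(image)   # image embedding
    text_features = model.encode_text(texts)     # text embeddings
    # Normalize, then compare for zero-shot classification
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # probability that the image matches each candidate label
```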

https://github.com/OFA-Sys/Chinese-CLIP
#python #bert #deep_learning #flax #hacktoberfest #jax #language_model #language_models #machine_learning #model_hub #natural_language_processing #nlp #nlp_library #pretrained_models #python #pytorch #pytorch_transformers #seq2seq #speech_recognition #tensorflow #transformer

The Hugging Face Transformers library provides thousands of pretrained models for various tasks like text, image, and audio processing. These models can be used for tasks such as text classification, image detection, speech recognition, and more. The library supports popular deep learning frameworks like JAX, PyTorch, and TensorFlow, making it easy to switch between them.

The benefit to the user is that you can quickly download and use these pretrained models with just a few lines of code, saving time and computational resources. You can also fine-tune these models on your own datasets and share them with the community. Additionally, the library offers a simple `pipeline` API for immediate use on different inputs, making it user-friendly for both researchers and practitioners. This helps in reducing compute costs and carbon footprint while enabling high-performance results across various machine learning tasks.
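For example, a sentiment classifier takes only a couple of lines with the `pipeline` API (the default checkpoint is downloaded automatically and may change between library versions):

```python
from transformers import pipeline

# Load a default pretrained sentiment-analysis model
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face Transformers makes NLP easy to use."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```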

https://github.com/huggingface/transformers
#jupyter_notebook #computer_vision #deep_learning #drug_discovery #forecasting #large_language_models #mxnet #nlp #paddlepaddle #pytorch #recommender_systems #speech_recognition #speech_synthesis #tensorflow #tensorflow2 #translation

This repository provides top-quality deep learning examples that are easy to train and deploy on NVIDIA GPUs. It includes a wide range of models for computer vision, natural language processing, recommender systems, speech to text, and more. These examples are updated monthly and come in Docker containers with the latest NVIDIA software, ensuring the best performance. The models support multiple GPUs and nodes, and some are optimized for Tensor Cores, which can significantly speed up training. This makes it easier for users to achieve high accuracy and performance in their deep learning projects.

https://github.com/NVIDIA/DeepLearningExamples
#python #agent #agents #ai_search #chatbot #chatgpt #data_pipelines #deep_learning #document_parser #document_understanding #genai #graph #graphrag #llm #nlp #pdf_to_text #preprocessing #rag #retrieval_augmented_generation #table_structure_recognition #text2sql

RAGFlow is an open-source tool that helps businesses answer questions accurately using large language models and deep document understanding. It extracts information from various complex data formats, such as Word documents, Excel files, and web pages, and provides grounded citations to support its answers. You can try a demo online or set it up on your own server using Docker. The setup is relatively straightforward, requiring a few steps like cloning the repository, building the Docker image, and configuring the system settings. RAGFlow offers key features like template-based chunking, reduced hallucinations, and compatibility with multiple data sources, making it a powerful tool for truthful question answering. This benefits users by providing reliable and explainable answers, streamlining their workflow, and supporting integration with their business systems.

https://github.com/infiniflow/ragflow
#python #emnlp2024 #knowledge_curation #large_language_models #naacl #nlp #report_generation #retrieval_augmented_generation

STORM is a system that helps you write Wikipedia-style articles by doing research with internet searches. Here’s how it benefits you:
- **Automated Research**: STORM conducts internet research, collects references, and generates an outline for your topic.
- **Easy Setup**: You can install STORM with `pip install knowledge-storm` and customize it to your needs (see the code sketch below).
- **Widely Used**: Over 70,000 people have used STORM, and it helps experienced Wikipedia editors in their pre-writing stage.

This system makes researching and writing articles much easier and more efficient.
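For the curious, here is a rough sketch of running the pipeline from Python after `pip install knowledge-storm`; the class names, LM wrapper, and retriever below follow the project's README-style examples and may differ between versions, so treat them as assumptions:

```python
# Sketch only: names follow README-style examples and may differ by version.
from knowledge_storm import STORMWikiLMConfigs, STORMWikiRunnerArguments, STORMWikiRunner
from knowledge_storm.lm import OpenAIModel   # assumed LM wrapper
from knowledge_storm.rm import YouRM         # assumed search retriever

lm = OpenAIModel(model="gpt-4o", max_tokens=2000)
lm_configs = STORMWikiLMConfigs()
for setter in (lm_configs.set_conv_simulator_lm, lm_configs.set_question_asker_lm,
               lm_configs.set_outline_gen_lm, lm_configs.set_article_gen_lm,
               lm_configs.set_article_polish_lm):
    setter(lm)  # use the same model for every pipeline stage

runner = STORMWikiRunner(
    STORMWikiRunnerArguments(output_dir="./storm_output"),
    lm_configs,
    YouRM(ydc_api_key="YOUR_KEY", k=3),
)
runner.run(
    topic="History of the transformer architecture",
    do_research=True,
    do_generate_outline=True,
    do_generate_article=True,
    do_polish_article=True,
)
```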

https://github.com/stanford-oval/storm
#mdx #deep_learning #hacktoberfest #nlp #transformers

The Hugging Face course teaches you how to use Transformers for natural language processing tasks. You'll learn about the Hugging Face ecosystem, including tools like Transformers, Datasets, Tokenizers, and Accelerate, as well as the Hugging Face Hub. This free course helps you understand how to fine-tune models and share your results. It's beneficial because it provides hands-on experience with popular AI libraries and allows you to build and showcase your own projects on the Hugging Face platform.

https://github.com/huggingface/course
#python #ai #artificial_intelligence #cython #data_science #deep_learning #entity_linking #machine_learning #named_entity_recognition #natural_language_processing #neural_network #neural_networks #nlp #nlp_library #python #spacy #text_classification #tokenization

spaCy is a powerful tool for understanding and processing human language. It helps computers analyze text by breaking it into parts like words, sentences, and entities (like names or places). This makes it useful for tasks such as identifying who is doing what in a sentence or finding specific information from large texts. Using spaCy can save time and improve accuracy compared to manual analysis. It supports many languages and integrates well with advanced models like BERT, making it ideal for real-world applications.
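A small example of that kind of analysis, assuming the English model has been installed with `python -m spacy download en_core_web_sm`:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Tokens with part-of-speech tags and syntactic roles ("who is doing what")
for token in doc:
    print(token.text, token.pos_, token.dep_)

# Named entities (organizations, places, money, ...)
for ent in doc.ents:
    print(ent.text, ent.label_)
```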

https://github.com/explosion/spaCy
#python #bot #bot_framework #botkit #bots #chatbot #chatbots #chatbots_framework #conversation_driven_development #conversational_agents #conversational_ai #conversational_bots #machine_learning #machine_learning_library #mitie #natural_language_processing #nlp #nlu #rasa #spacy #wit

Rasa is an open-source framework that helps build advanced chatbots. It allows developers to create contextual assistants that can have layered conversations, making interactions more natural. Rasa supports integration with various platforms like Facebook Messenger, Slack, and Google Home Actions. This flexibility and customization capability make it a popular choice for businesses to automate customer support and enhance user experience. By using Rasa, users can create intelligent chatbots that understand and respond to user inputs effectively, improving communication and engagement.

https://github.com/RasaHQ/rasa
#other #automl #chatgpt #data_analysis #data_science #data_visualization #data_visualizations #deep_learning #gpt #gpt_3 #jax #keras #machine_learning #ml #nlp #python #pytorch #scikit_learn #tensorflow #transformer

This is a comprehensive, regularly updated list of 920 top open-source Python machine learning libraries, organized into 34 categories like frameworks, data visualization, NLP, image processing, and more. Each project is ranked by quality using GitHub and package manager metrics, helping you find the best tools for your needs. Popular libraries like TensorFlow, PyTorch, scikit-learn, and Hugging Face transformers are included, along with specialized ones for time series, reinforcement learning, and model interpretability. This resource saves you time by guiding you to high-quality, actively maintained libraries for building, optimizing, and deploying machine learning models efficiently.

https://github.com/ml-tooling/best-of-ml-python
#other #artificial_intelligence #artificial_intelligence_projects #awesome #computer_vision #computer_vision_project #data_science #deep_learning #deep_learning_project #machine_learning #machine_learning_projects #nlp #nlp_projects #python

You can access a huge, constantly updated list of over 500 artificial intelligence projects with ready-to-use code covering machine learning, deep learning, computer vision, and natural language processing. This collection includes projects for beginners and advanced users, with links to tutorials, datasets, and real-world applications like chatbots, healthcare, and time series forecasting. Using this resource helps you learn AI by doing practical projects, speeding up your coding skills, and building a strong portfolio for jobs or research. It saves you time searching for quality projects and gives you tested, working code to study and modify.

https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code
#jupyter_notebook #chatgpt #finance #fingpt #fintech #large_language_models #machine_learning #nlp #prompt_engineering #pytorch #reinforcement_learning #robo_advisor #sentiment_analysis #technical_analysis

FinGPT is an open-source AI tool designed specifically for finance, helping you analyze financial news, predict stock prices, and get personalized investment advice quickly and affordably. Unlike costly models like BloombergGPT, FinGPT can be updated frequently with new data at a low cost, making it more accessible and timely. It uses advanced techniques like reinforcement learning from human feedback to tailor advice to your preferences, such as risk tolerance. You can use FinGPT for tasks like sentiment analysis, robo-advising, fraud detection, and portfolio optimization, helping you make smarter financial decisions with up-to-date insights.

https://github.com/AI4Finance-Foundation/FinGPT
#python #ai #context #embedded #faiss #knowledge_base #knowledge_graph #llm #machine_learning #memory #nlp #offline_first #opencv #python #rag #retrieval_augmented_generation #semantic_search #vector_database #video_processing

Memvid lets you store millions of text pieces inside a single MP4 video file using QR codes, making your data 50-100 times smaller than usual databases. You can search this video instantly in under 100 milliseconds without needing servers or internet after setup. It works offline, is easy to use with simple Python code, and supports PDFs and chat with your data. The upcoming version 2 will add features like continuous memory updates, shareable capsules, fast local caching, and better video compression, making your AI memory smarter, faster, and more flexible. This means you get a powerful, portable, and efficient way to manage and search huge knowledge bases quickly and easily.
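A short sketch of the encode-then-search flow; the MemvidEncoder/MemvidRetriever names and the file names below follow the project's README-style examples and are assumptions here:

```python
# Sketch only: assumes memvid's README-style encoder/retriever API.
from memvid import MemvidEncoder, MemvidRetriever

# Encode text chunks into a video "memory" plus a search index
encoder = MemvidEncoder()
encoder.add_chunks([
    "Paris is the capital of France.",
    "The Eiffel Tower was completed in 1889.",
])
encoder.build_video("memory.mp4", "memory_index.json")

# Semantic search over the video file, offline and without a server
retriever = MemvidRetriever("memory.mp4", "memory_index.json")
print(retriever.search("When was the Eiffel Tower built?", top_k=2))
```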

https://github.com/Olow304/memvid
#python #agent_framework #data_analysis #deep_research #deep_search #llms #multi_agent_system #nlp #public_opinion_analysis #python3 #sentiment_analysis

You can use the "Weibo Public Opinion Analysis System" (called "微舆") to automatically analyze public opinion from over 30 major social media platforms and millions of comments. It uses AI agents working together to monitor, search, analyze text and videos, and generate detailed reports based on real-time data. The system supports easy setup, custom models, and integration with your own databases, helping you understand public sentiment, trends, and make better decisions. It offers continuous monitoring, deep multi-angle analysis, and flexible report generation, all accessible by simply asking questions like chatting. This saves you time and gives clear insights into public opinion dynamics.

https://github.com/666ghj/BettaFish
#python #deep_learning #inference #llm #nlp #pytorch #transformer

Nano-vLLM is a small, fast, and easy-to-understand tool for running large language models offline. It matches the speed of bigger systems like vLLM but uses only about 1,200 lines of clean Python code, making it simple to read and modify. It includes smart features like prefix caching and tensor parallelism to boost performance. You can install it easily and run models like Qwen3-0.6B on your own GPU. This tool is great if you want fast, efficient AI inference without complex setups, ideal for learning, research, or small deployments on limited hardware.
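A quick sketch of offline generation, assuming the vLLM-style interface shown in the project's README; the model path is a placeholder for a locally downloaded checkpoint:

```python
# Sketch only: mirrors the README's vLLM-like API; adjust the model path.
from nanovllm import LLM, SamplingParams

llm = LLM("/path/to/Qwen3-0.6B", enforce_eager=True, tensor_parallel_size=1)
params = SamplingParams(temperature=0.6, max_tokens=128)

outputs = llm.generate(["Explain prefix caching in one sentence."], params)
print(outputs[0]["text"])
```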

https://github.com/GeeeekExplorer/nano-vllm
#python #language_models #linux #machine_translation #nlp #open_source #python #transformers #translation

Argos Translate is a free, open-source tool that lets you translate text offline using your own computer. It works as a Python library, command-line tool, or with a graphical interface, and supports many languages. You can install language packages for direct translations, and it can even translate between languages that don’t have a direct package by using a middle language. This means you can translate more language pairs, though the quality might be a little lower. Argos Translate is fast, private, and does not need an internet connection after setup, making it useful for secure or offline translation needs.
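A small example of the Python API, assuming the English-to-Spanish package is downloaded and installed as in the project's README:

```python
import argostranslate.package
import argostranslate.translate

# One-time setup: download and install the English -> Spanish package
argostranslate.package.update_package_index()
available = argostranslate.package.get_available_packages()
pkg = next(p for p in available if p.from_code == "en" and p.to_code == "es")
argostranslate.package.install_from_path(pkg.download())

# After installation, translation works fully offline
print(argostranslate.translate.translate("Hello, world!", "en", "es"))
```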

https://github.com/argosopentech/argos-translate
#python #gemini #gemini_ai #gemini_api #gemini_flash #gemini_pro #information_extration #large_language_models #llm #nlp #python #structured_data

**LangExtract** is a free Python library that uses AI models like Gemini to pull structured data—like names, emotions, or meds—from messy text such as reports or books. It links every fact to its exact spot in the original, creates interactive visuals for easy checks, handles huge files fast with chunking and parallel runs, and works with cloud or local models without fine-tuning. You benefit by quickly turning unstructured docs into reliable, organized data for analysis, saving time and boosting accuracy in fields like healthcare or research.
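A compact sketch of that extraction flow, assuming the library's `lx.extract` entry point and data classes as shown in its README; the prompt, example, and model id are placeholders, and cloud models require an API key:

```python
# Sketch only: assumes langextract's README-style extract() API; inputs are placeholders.
import langextract as lx

prompt = "Extract medication names and dosages mentioned in the text."
examples = [
    lx.data.ExampleData(
        text="The patient was given 250 mg of amoxicillin.",
        extractions=[
            lx.data.Extraction(extraction_class="medication", extraction_text="amoxicillin"),
            lx.data.Extraction(extraction_class="dosage", extraction_text="250 mg"),
        ],
    )
]

result = lx.extract(
    text_or_documents="She takes 10 mg of lisinopril every morning.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",  # placeholder model id
)
for e in result.extractions:
    print(e.extraction_class, "->", e.extraction_text)
```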

https://github.com/google/langextract