GitHub Trends
10.1K subscribers
15.3K links
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#go #approximate_nearest_neighbor_search #generative_search #grpc #hnsw #hybrid_search #image_search #information_retrieval #mlops #nearest_neighbor_search #neural_search #recommender_system #search_engine #semantic_search #semantic_search_engine #similarity_search #vector_database #vector_search #vector_search_engine #vectors #weaviate

Weaviate is a powerful, open-source vector database that uses machine learning to make your data searchable. It's fast, scalable, and flexible, allowing you to vectorize your data at import or upload your own vectors. Weaviate supports various modules for integrating with popular AI services like OpenAI, Cohere, and Hugging Face. It's designed for production use with features like scaling, replication, and security. You can use Weaviate for tasks beyond search, such as recommendations, summarization, and integration with neural search frameworks. It offers APIs in GraphQL, REST, and gRPC and has client libraries for several programming languages. This makes it easy to build applications like chatbots, recommendation systems, and image search tools quickly and efficiently. Joining the Weaviate community provides access to tutorials, demos, blogs, and forums to help you get started and stay updated.
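
To make the idea of vector search concrete, here is a minimal sketch in plain Python. It is not Weaviate's API: the `store` dictionary, the three-dimensional vectors, and the `nearest` helper are illustrative assumptions, and the brute-force scan shown here stands in for the approximate index (such as HNSW) that a real vector database would use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query, store, k=2):
    """Return the k object ids whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda oid: cosine_similarity(query, store[oid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 3-dimensional embeddings; real ones would come from a model
store = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.0],
    "car": [0.0, 0.1, 0.9],
}

print(nearest([1.0, 0.0, 0.0], store, k=2))
```

Scanning every vector like this costs time linear in the size of the store, which is why production systems rely on approximate nearest-neighbor indexes instead.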

https://github.com/weaviate/weaviate
#javascript #annotation #annotation_tool #annotations #boundingbox #computer_vision #data_labeling #dataset #datasets #deep_learning #image_annotation #image_classification #image_labeling #image_labelling_tool #label_studio #labeling #labeling_tool #mlops #semantic_segmentation #text_annotation #yolo

Label Studio is a free, open-source tool that helps you label different types of data like images, audio, text, videos, and more. It has a simple and user-friendly interface that makes it easy to prepare or improve your data for machine learning models. You can customize it to fit your needs and export labeled data in various formats. It supports multi-user labeling, multiple projects, and integration with machine learning models for pre-labeling and active learning. You can install it locally using Docker, pip, or other methods, or deploy it in cloud services like Heroku or Google Cloud Platform. This tool streamlines your data labeling process and helps you create more accurate ML models.

https://github.com/HumanSignal/label-studio
#python #analytics #dagster #data_engineering #data_integration #data_orchestrator #data_pipelines #data_science #etl #metadata #mlops #orchestration #scheduler #workflow #workflow_automation

Dagster is a tool that helps you manage and automate your data workflows. You can define your data assets, like tables or machine learning models, using Python functions. Dagster then runs these functions at the right time and keeps your data up-to-date. It offers features like integrated lineage and observability, making it easier to track and manage your data. This tool is useful for every stage of data development, from local testing to production, and it integrates well with other popular data tools. Using Dagster, you can build reusable components, spot data quality issues early, and scale your data pipelines efficiently. This makes your work more productive and helps maintain control over complex data systems.
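
The idea of defining data assets as Python functions and running them in dependency order can be sketched with the standard library alone. This is a toy illustration, not Dagster's API: the `toy_asset` decorator, the `ASSETS` registry, and `materialize_all` are hypothetical names invented for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: asset name -> (function, names of upstream assets)
ASSETS = {}


def toy_asset(*deps):
    """Register a function as an asset that depends on the named upstream assets."""
    def register(fn):
        ASSETS[fn.__name__] = (fn, deps)
        return fn
    return register


@toy_asset()
def raw_orders():
    # Stand-in for loading a table from a source system
    return [{"id": 1, "total": 20}, {"id": 2, "total": 35}]


@toy_asset("raw_orders")
def order_totals(raw_orders):
    # Derived asset computed from its upstream asset
    return sum(row["total"] for row in raw_orders)


def materialize_all():
    """Run every asset after its upstream assets and return the results by name."""
    graph = {name: set(deps) for name, (_, deps) in ASSETS.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        fn, deps = ASSETS[name]
        results[name] = fn(*(results[d] for d in deps))
    return results


print(materialize_all()["order_totals"])
```

A real orchestrator adds scheduling, lineage tracking, and observability on top of this ordering step.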

https://github.com/dagster-io/dagster
#jupyter_notebook #aws #data_science #deep_learning #examples #inference #machine_learning #mlops #reinforcement_learning #sagemaker #training

SageMaker-Core is a new Python SDK for Amazon SageMaker that makes it easier to work with machine learning resources. It provides an object-oriented interface for managing resources such as training jobs, models, and endpoints. The SDK simplifies code through resource chaining, which removes the need to specify parameters manually. It also includes code auto-completion, comprehensive documentation, and type hints, so code is faster to write and less error-prone. This helps developers customize their ML workloads more efficiently and streamline their development process.

https://github.com/aws/amazon-sagemaker-examples
#python #amd #cuda #gpt #inference #inferentia #llama #llm #llm_serving #llmops #mlops #model_serving #pytorch #rocm #tpu #trainium #transformer #xpu

vLLM is a library for easy, fast, and inexpensive serving of large language models (LLMs). Its speed comes from efficient memory management, continuous batching, and optimized CUDA kernels. vLLM supports many popular models and runs on a range of hardware, including NVIDIA GPUs and AMD CPUs and GPUs. It integrates with Hugging Face models and supports several decoding algorithms, which makes it flexible for anyone who needs to serve LLMs, whether for research or other applications. Install it with `pip install vllm`; detailed documentation is available on the project website.
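
Continuous batching, mentioned above, can be illustrated with a small simulation. This is a toy sketch and not vLLM's scheduler: the `continuous_batching` function, the request ids, and the token counts are assumptions made for illustration. The point it shows is that a finished request frees its slot right away, so a waiting request can join on the next step.

```python
from collections import deque


def continuous_batching(requests, max_batch):
    """Simulate decoding steps for requests mapping an id to tokens left to generate.

    Returns a list holding the ids that were active at each step.
    Token counts are assumed to be positive.
    """
    waiting = deque(requests.items())
    running = {}
    steps = []
    while waiting or running:
        # Fill free slots immediately instead of waiting for the whole batch to finish
        while waiting and len(running) < max_batch:
            rid, remaining = waiting.popleft()
            running[rid] = remaining
        steps.append(sorted(running))
        # One decoding step produces one token for every running request
        for rid in list(running):
            running[rid] -= 1
            if running[rid] <= 0:
                del running[rid]
    return steps


# Hypothetical workload: request "a" needs 1 token, "b" needs 3, "c" needs 2
schedule = continuous_batching({"a": 1, "b": 3, "c": 2}, max_batch=2)
for step, active in enumerate(schedule, start=1):
    print(step, active)
```

In this workload the simulation finishes in three steps, because "c" takes over the slot of "a" as soon as "a" completes. A static batch that waited for both "a" and "b" before admitting "c" would need five.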

https://github.com/vllm-project/vllm