GitHub Trends
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#cplusplus #compiler #cuda #jax #machine_learning #mlir #pytorch #runtime #spirv #tensorflow #vulkan

IREE is an MLIR-based compiler and runtime for running machine learning (ML) models on a wide range of hardware, from datacenter servers to mobile and edge devices. It lowers ML models to a unified intermediate representation, so the same model can be deployed to different targets. The project is still in its early stages but is under active development. It is aimed at developers who need to deploy models across several environments.

https://github.com/iree-org/iree
#python #amd #cuda #gpt #inference #inferentia #llama #llm #llm_serving #llmops #mlops #model_serving #pytorch #rocm #tpu #trainium #transformer #xpu

vLLM is a library for fast, easy, and low-cost inference and serving of large language models (LLMs). Its speed comes from features such as efficient memory management for attention keys and values (PagedAttention), continuous batching of incoming requests, and optimized CUDA kernels. vLLM supports many popular models and runs on a range of hardware, including NVIDIA GPUs, AMD CPUs and GPUs, and more. It integrates with Hugging Face models and supports several decoding algorithms, which makes it suitable for serving LLMs in research and other applications. You can install it with `pip install vllm`; detailed documentation is available on the project's website.

https://github.com/vllm-project/vllm
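
The continuous batching mentioned in the vLLM description can be illustrated with a toy scheduler. This is only a sketch of the general idea, not vLLM's scheduler: the function names and the model of "one decoding step produces one token per running request" are assumptions made for the example.

```python
from collections import deque


def continuous_batching(request_lengths, max_batch_size):
    """Toy simulation: request i needs request_lengths[i] decoding steps.

    A finished request frees its slot immediately, and a waiting request
    takes that slot before the next step (hypothetical model, not vLLM code).
    Returns (total_steps, {request_id: step_at_which_it_finished}).
    """
    waiting = deque(enumerate(request_lengths))
    running = {}      # request id -> tokens still to generate
    finished_at = {}
    step = 0
    while waiting or running:
        # Admit waiting requests into any free batch slots.
        while waiting and len(running) < max_batch_size:
            rid, length = waiting.popleft()
            running[rid] = length
        step += 1
        # One decoding step: every running request emits one token.
        for rid in list(running):
            running[rid] -= 1
            if running[rid] <= 0:
                del running[rid]
                finished_at[rid] = step
    return step, finished_at


def static_batching(request_lengths, max_batch_size):
    """Baseline: each fixed batch runs until its longest request is done."""
    steps = 0
    for start in range(0, len(request_lengths), max_batch_size):
        steps += max(request_lengths[start:start + max_batch_size])
    return steps


if __name__ == "__main__":
    lengths = [2, 8, 3, 1]
    print("continuous:", continuous_batching(lengths, max_batch_size=2))
    print("static:", static_batching(lengths, max_batch_size=2))
```

In this toy model the short requests no longer wait for the longest request in their batch, which is why the continuous variant needs fewer total steps for the same workload.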
#cplusplus #cuda #d3d12 #glsl #hlsl #shaders #vulkan

Slang is a shading language for building and maintaining large shader codebases. You write shaders once and compile them for several targets, such as D3D12, Vulkan, and Metal, without rewriting the code. Slang gives access to recent GPU features and supports neural graphics through automatic differentiation, which also makes it useful for machine learning. It has a module system for organizing code, generics for specializing shaders, and straightforward integration with existing HLSL and GLSL codebases. Tooling support includes IntelliSense and debugging, which helps when developing high-performance graphics applications across platforms.

https://github.com/shader-slang/slang
#cplusplus #cublas #cuda #cudnn #gpu #mlops #networking #nvml #remote_access

SCUDA is a GPU-over-IP bridge: it lets a machine without a GPU use GPUs attached to remote machines over the network. Programs that need a powerful GPU can therefore be launched from your local machine even if it has none. Typical uses include developing and testing applications against remote GPUs, training machine learning models from a laptop, running heavy data processing jobs, and fine-tuning pre-trained models. This saves the time and cost of having a GPU physically installed in every development machine.

https://github.com/kevmo314/scuda
#python #cuda #deepseek #deepseek_llm #deepseek_v3 #inference #llama #llama2 #llama3 #llama3_1 #llava #llm #llm_serving #moe #pytorch #transformer #vlm

SGLang is a fast serving framework for large language models and vision language models. Its backend runtime improves throughput with features such as prefix caching, continuous batching, and quantization. Its frontend language is flexible and easy to use, and it supports complex workflows such as chained generation calls and multi-modal inputs. SGLang supports many different models, and its extensive documentation and active community make it easier to get started and to resolve issues.

https://github.com/sgl-project/sglang
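
The prefix caching mentioned in the SGLang description can be sketched with a small trie over token sequences. This is an illustrative toy, not SGLang's implementation: the `PrefixCache` class and its method are hypothetical names, and a real runtime would store attention key/value tensors per cached token rather than empty dictionaries.

```python
class PrefixCache:
    """Toy prefix cache: a trie keyed by tokens (hypothetical, for illustration)."""

    def __init__(self):
        self.root = {}

    def match_and_insert(self, tokens):
        """Return how many leading tokens were already cached, then cache the rest.

        The returned count stands for work that can be reused; only the
        remaining tokens would need fresh computation.
        """
        node = self.root
        matched = 0
        for tok in tokens:
            if tok in node:
                # Still walking along a previously cached prefix.
                matched += 1
            else:
                # First divergence: from here on every node is new and empty,
                # so no later token can be counted as matched.
                node[tok] = {}
            node = node[tok]
        return matched


if __name__ == "__main__":
    cache = PrefixCache()
    system_prompt = ["You", "are", "helpful", "."]
    for user_turn in (["Hi"], ["Bye", "now"], ["Hi"]):
        prompt = system_prompt + user_turn
        reused = cache.match_and_insert(prompt)
        print(f"{reused} of {len(prompt)} tokens reused")
```

Requests that share a long system prompt reuse its cached prefix, so only the differing suffix has to be processed.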
#cplusplus #cpp #cuda #deep_learning #deep_learning_library #gpu #nvidia

CUTLASS is a collection of CUDA C++ template abstractions for high-performance matrix multiplication (GEMM) and related computations on NVIDIA GPUs. It decomposes these operations into reusable, composable parts, which makes it easier to build custom kernels. CUTLASS supports a range of data types and GPU architectures, including the new Blackwell SM100 architecture, so programs can be tuned for different hardware. Support for features such as Tensor Cores benefits users who need fast computation in fields like AI and scientific computing.

https://github.com/NVIDIA/cutlass
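
The idea of breaking a matrix multiplication into reusable parts can be shown with a tiled GEMM loop. The sketch below is plain Python and only illustrates the decomposition into tiles; it does not use CUTLASS, and the function name and `tile` parameter are assumptions made for the example.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (m x k) by b (k x n), working one tile at a time.

    The three outer loops pick an output tile and a slice of the shared
    dimension; the inner loops accumulate that tile's partial products.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                # Accumulate the contribution of one (tile x tile) block pair.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(tiled_matmul(a, b))
```

On a GPU, each level of such a tiling would be mapped to a level of the hardware hierarchy (thread blocks, warps, threads); here the tiles only change the order of the arithmetic, not the result.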
#cplusplus #cuda #cutlass #gpu #pytorch

Flux is a library that speeds up machine learning on GPUs by overlapping communication with computation. It supports several forms of parallelism in model training and inference, and it is compatible with PyTorch and multiple NVIDIA GPU architectures. Normally, sending data between GPUs (communication) and doing calculations (computation) happen one after the other; Flux lets them run at the same time. This overlap reduces overall training and inference time, which matters most for large or complex models.

https://github.com/bytedance/flux
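
The overlap of communication and computation described for Flux can be sketched as a simple two-stage pipeline. This is a generic CPU-side illustration with threads, not how Flux itself works on the GPU: `fetch` and `compute` are hypothetical stand-ins, and the sleeps only simulate transfer and calculation time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(chunk):
    """Stand-in for communication, e.g. receiving a chunk from another GPU."""
    time.sleep(0.01)
    return chunk


def compute(chunk):
    """Stand-in for computation on a chunk that has already arrived."""
    time.sleep(0.01)
    return sum(x * x for x in chunk)


def sequential(chunks):
    """Baseline: each chunk is fetched, then computed, with no overlap."""
    return [compute(fetch(c)) for c in chunks]


def overlapped(chunks):
    """Fetch chunk i+1 in the background while chunk i is being computed."""
    results = []
    if not chunks:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch, chunks[0])
        for next_chunk in chunks[1:]:
            data = pending.result()                   # wait for the current chunk
            pending = pool.submit(fetch, next_chunk)  # start the next transfer
            results.append(compute(data))             # compute while it is in flight
        results.append(compute(pending.result()))
    return results


if __name__ == "__main__":
    chunks = [[1, 2], [3, 4], [5, 6]]
    for fn in (sequential, overlapped):
        start = time.perf_counter()
        out = fn(chunks)
        print(fn.__name__, out, f"{time.perf_counter() - start:.3f}s")
```

Both functions return the same values; the overlapped version hides most of the simulated transfer time behind the computation of the previous chunk.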