GitHub Trends
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#python #billion_parameters #compression #data_parallelism #deep_learning #gpu #inference #machine_learning #mixture_of_experts #model_parallelism #pipeline_parallelism #pytorch #trillion_parameters #zero

DeepSpeed is a deep learning optimization library for training and serving very large AI models quickly and efficiently. It lets you train models with billions or even trillions of parameters far faster and more cheaply than conventional setups, using techniques such as the ZeRO memory optimizations and data, model, and pipeline parallelism. For example, it reports training ChatGPT-like models 15 times faster than current state-of-the-art systems. This makes it practical to work with large language models without massive resources, making AI more accessible and efficient for everyone.
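As a rough sketch (not taken from the repo's docs), a minimal DeepSpeed training loop in Python looks like this; the toy model, config values, and random data are placeholders, and a CUDA device is assumed:

    import torch
    import deepspeed

    # Toy model; in practice this would be a large transformer.
    model = torch.nn.Linear(1024, 1024)

    # Minimal config: Adam optimizer, FP16, ZeRO stage 1 memory partitioning.
    ds_config = {
        "train_batch_size": 8,
        "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
        "fp16": {"enabled": True},
        "zero_optimization": {"stage": 1},
    }

    # initialize() wraps the model in an engine that owns the optimizer,
    # parallelism, and ZeRO memory optimizations.
    engine, _, _, _ = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config
    )

    for _ in range(10):  # dummy training loop with random data
        x = torch.randn(8, 1024, device=engine.device, dtype=torch.half)
        y = torch.randn(8, 1024, device=engine.device, dtype=torch.half)
        loss = torch.nn.functional.mse_loss(engine(x), y)
        engine.backward(loss)  # engine-managed backward (handles loss scaling)
        engine.step()          # optimizer step plus ZeRO bookkeeping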

https://github.com/microsoft/DeepSpeed
#swift #inference #ios #macos #pretrained_models #speech_recognition #transformers #visionos #watchos #whisper

WhisperKit is a Swift package that lets Apple devices transcribe speech from audio files or live recordings using OpenAI's Whisper model. It runs entirely on-device, so no internet connection is needed once it is set up. You can add WhisperKit to a Swift project through the Swift Package Manager, or install a command-line version via Homebrew. This makes it possible to transcribe audio quickly and efficiently right on the device, which is useful for applications like voice assistants or transcription services.

https://github.com/argmaxinc/WhisperKit
#cplusplus #android #audio_processing #c_plus_plus #calculator #computer_vision #deep_learning #framework #graph_based #graph_framework #inference #machine_learning #mediapipe #mobile_development #perception #pipeline_framework #stream_processing #video_processing

MediaPipe is a tool that helps you add smart machine learning features to your apps and devices. It works on mobile, web, desktop, and other devices. You can use pre-made solutions for tasks like vision, text, and audio processing, or customize the models to fit your needs. MediaPipe also offers tools like Model Maker and Studio to help you create and test your solutions easily. This makes it easier to delight your customers with innovative features without needing deep machine learning expertise.
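As a small example (not from the post itself), here is the pre-made hand-landmark solution via MediaPipe's Python API; the image path is a placeholder:

    import cv2
    import mediapipe as mp

    # Pre-made hand-landmark detector from the Solutions API.
    hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2)

    image = cv2.imread("hand.jpg")  # placeholder path
    # MediaPipe expects RGB input; OpenCV loads images as BGR.
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            # Each detected hand has 21 normalized (x, y, z) landmarks.
            print(hand.landmark[0])  # landmark 0 is the wrist
    hands.close()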

https://github.com/google-ai-edge/mediapipe
#jupyter_notebook #aws #data_science #deep_learning #examples #inference #machine_learning #mlops #reinforcement_learning #sagemaker #training

SageMaker-Core is a new Python SDK for Amazon SageMaker that makes it easier to work with machine learning resources. It provides an object-oriented interface, so you can manage resources like training jobs, models, and endpoints more intuitively. The SDK simplifies code through resource chaining: the output of one resource, such as a trained model's artifacts, can be passed directly into the next, without manually re-specifying parameters. It also offers auto code completion, comprehensive documentation, and type hints, making code faster and less error-prone to write. This helps developers customize their ML workloads and streamline their development process.
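A purely hypothetical sketch of the object-oriented, chained style described above; the import path, class names, and parameters below are illustrative assumptions, not verified against the actual sagemaker-core API:

    # Hypothetical: names and signatures here are illustrative only.
    from sagemaker_core.resources import TrainingJob, Model

    # Resources are objects with typed attributes rather than raw dicts.
    job = TrainingJob.create(
        training_job_name="demo-job",        # assumed parameter names
        algorithm_specification=algo_spec,   # assumed, defined elsewhere
    )
    job.wait()  # block until the training job finishes

    # Resource chaining: pass the trained artifacts straight into a Model
    # instead of copying ARNs or S3 paths by hand.
    model = Model.create(
        model_name="demo-model",
        model_artifacts=job.model_artifacts,  # assumed attribute
    )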

https://github.com/aws/amazon-sagemaker-examples
#python #amd #cuda #gpt #inference #inferentia #llama #llm #llm_serving #llmops #mlops #model_serving #pytorch #rocm #tpu #trainium #transformer #xpu

vLLM is a library that makes it easy, fast, and cheap to use large language models (LLMs). It is designed to be fast with features like efficient memory management, continuous batching, and optimized CUDA kernels. vLLM supports many popular models and can run on various hardware including NVIDIA GPUs, AMD CPUs and GPUs, and more. It also offers seamless integration with Hugging Face models and supports different decoding algorithms. This makes it flexible and easy to use for anyone needing to serve LLMs, whether for research or other applications. You can install vLLM easily with `pip install vllm` and find detailed documentation on their website.
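A minimal offline-inference sketch using vLLM's Python API (the model name is just a small example):

    from vllm import LLM, SamplingParams

    # Any supported Hugging Face model works; OPT-125M is a small example.
    llm = LLM(model="facebook/opt-125m")

    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    # generate() batches the prompts internally via continuous batching.
    outputs = llm.generate(["The capital of France is", "Large language models are"], params)
    for out in outputs:
        print(out.prompt, "->", out.outputs[0].text)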

https://github.com/vllm-project/vllm
#cplusplus #caffe #convolution #deep_learning #deep_neural_networks #diy #graph_algorithms #inference #inference_engine #maxpooling #ncnn #pnnx #pytorch #relu #resnet #sigmoid #yolo #yolov5

This course, "_动手自制大模型推理框架_" (Build Your Own Large-Model Inference Framework), is a valuable resource for anyone interested in deep learning and model inference. It teaches you how to build a modern C++ project from scratch, designing and implementing a deep learning inference framework along the way. The course supports recent models such as Llama 3.2 and Qwen 2.5, and uses CUDA acceleration and Int8 quantization for better performance.

By taking this course, you will learn how to write efficient C++ code, manage projects with CMake and Git, design computational graphs, implement common operators like convolution and pooling, and optimize them for speed. This knowledge is valuable for job interviews and for advancing your skills in deep learning. The course also includes hands-on demos with models like UNet and YOLOv5.
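The course itself is in C++, but as a language-agnostic illustration of what "implementing a common operator" means, here is a naive max-pooling operator in Python (not course code):

    import numpy as np

    def maxpool2d(x: np.ndarray, k: int = 2, stride: int = 2) -> np.ndarray:
        """Naive max pooling over a (C, H, W) tensor, no padding."""
        c, h, w = x.shape
        out_h = (h - k) // stride + 1
        out_w = (w - k) // stride + 1
        out = np.empty((c, out_h, out_w), dtype=x.dtype)
        for i in range(out_h):
            for j in range(out_w):
                # Maximum over each k x k window, computed per channel.
                win = x[:, i*stride:i*stride+k, j*stride:j*stride+k]
                out[:, i, j] = win.max(axis=(1, 2))
        return out

    x = np.arange(32, dtype=np.float32).reshape(2, 4, 4)
    print(maxpool2d(x).shape)  # (2, 2, 2)

An optimized framework kernel computes the same result, but with vectorized loops, cache-friendly memory access, and optionally quantized arithmetic.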

https://github.com/zjhellofss/KuiperInfer
#shell #ai #containers #inference_server #llamacpp #llm #podman #vllm

RamaLama is a tool that makes working with AI models easy by running them in containers. It checks your system for GPU support and falls back to the CPU if no GPU is found. Because it uses container engines like Podman or Docker to run the models, you don't need to configure your host system at all. You can pull and run AI models from various registries with simple commands, across multiple types of hardware including CPUs and GPUs. This saves users from setting up complex environments and makes it easy to try out different models.

https://github.com/containers/ramalama
#python #cuda #deepseek #deepseek_llm #deepseek_v3 #inference #llama #llama2 #llama3 #llama3_1 #llava #llm #llm_serving #moe #pytorch #transformer #vlm

SGLang is a tool that makes working with large language models and vision language models much faster and more manageable. It has a fast backend runtime that optimizes model performance with features like prefix caching, continuous batching, and quantization. The frontend language is flexible and easy to use, allowing for complex tasks like chained generation calls and multi-modal inputs. SGLang supports many different models and has an active community behind it. This means you can get your models running quickly and efficiently, saving time and resources. Additionally, the extensive documentation and community support make it easier to get started and resolve any issues.
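As a small sketch of the frontend language (this assumes an SGLang server is already running; the URL and field names are placeholders):

    import sglang as sgl

    @sgl.function
    def explain(s, topic):
        # Chained generation: the second gen() call sees the first one's output.
        s += "One-line definition of " + topic + ": "
        s += sgl.gen("definition", max_tokens=32)
        s += "\nA short example: "
        s += sgl.gen("example", max_tokens=32)

    # Point the frontend at a running SGLang server (placeholder URL).
    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

    state = explain.run(topic="prefix caching")
    print(state["definition"])
    print(state["example"])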

https://github.com/sgl-project/sglang
#c_lang #convolutional_neural_network #convolutional_neural_networks #cpu #inference #inference_optimization #matrix_multiplication #mobile_inference #multithreading #neural_network #neural_networks #simd

XNNPACK is a highly optimized library that makes neural networks run faster on a wide range of devices, from smartphones and desktops to Raspberry Pi boards, and it supports many processor architectures and operating systems. It is not meant to be used by application developers directly; instead, it serves as a low-level backend that machine learning frameworks like TensorFlow Lite, PyTorch, and ONNX Runtime use to accelerate inference. Apps built on those frameworks therefore run neural networks more quickly and efficiently, saving time and improving performance.
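In practice you meet XNNPACK indirectly; for example, recent TensorFlow Lite builds route floating-point CPU inference through the XNNPACK delegate by default (the model path below is a placeholder):

    import numpy as np
    import tensorflow as tf

    # Recent TFLite builds apply the XNNPACK delegate to float models by default.
    interpreter = tf.lite.Interpreter(model_path="model.tflite", num_threads=4)
    interpreter.allocate_tensors()

    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Run one inference on random data of the expected input shape.
    interpreter.set_tensor(inp["index"], np.random.rand(*inp["shape"]).astype(np.float32))
    interpreter.invoke()
    print(interpreter.get_tensor(out["index"]).shape)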

https://github.com/google/XNNPACK