#jupyter_notebook #asr #asr_benchmark #colab #english #enterprise_grade_stt #german #pretrained_models #pytorch #silero_models #spanish #speech_recognition #speech_to_text #stt #stt_benchmark
https://github.com/snakers4/silero-models
GitHub - snakers4/silero-models: Silero Models: pre-trained text-to-speech models made embarrassingly simple
Silero Models: pre-trained text-to-speech models made embarrassingly simple - snakers4/silero-models
#python #image_classification #image_recognition #pretrained_models #knowledge_distillation #product_recognition #autoaugment #cutmix #randaugment #gridmask #deit #repvgg #swin_transformer #image_retrieval_system
https://github.com/PaddlePaddle/PaddleClas
GitHub - PaddlePaddle/PaddleClas: A treasure chest for visual classification and recognition powered by PaddlePaddle
A treasure chest for visual classification and recognition powered by PaddlePaddle - PaddlePaddle/PaddleClas
#python #caffe #computer_vision #coreml #edgetpu #keras #mediapipe #model #model_zoo #models #onnx #openvino #pretrained_models #pytorch #tensorflow #tensorflow_lite #tensorflowjs #tf_trt #tfjs #tflite #tflite_models
https://github.com/PINTO0309/PINTO_model_zoo
GitHub - PINTO0309/PINTO_model_zoo: A repository for storing models that have been inter-converted between various frameworks.…
A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8...
#python #bert #embedding #ernie #information_extraction #neural_search #nlp #paddlenlp #pretrained_models #question_answering #search_engine #semantic_analysis #sentiment_analysis #seq2seq #transformer #transformers #uie
https://github.com/PaddlePaddle/PaddleNLP
GitHub - PaddlePaddle/PaddleNLP: Easy-to-use and powerful LLM and SLM library with awesome model zoo.
Easy-to-use and powerful LLM and SLM library with awesome model zoo. - PaddlePaddle/PaddleNLP
#python #computer_vision #contrastive_loss #deep_learning #language_model #multi_modal_learning #pretrained_models #pytorch #zero_shot_classification
https://github.com/mlfoundations/open_clip
GitHub - mlfoundations/open_clip: An open source implementation of CLIP.
An open source implementation of CLIP. Contribute to mlfoundations/open_clip development by creating an account on GitHub.
#jupyter_notebook #overlapped_speech_detection #pretrained_models #pytorch #speaker_change_detection #speaker_diarization #speaker_embedding #speaker_recognition #speaker_verification #speech_activity_detection #speech_processing #voice_activity_detection
https://github.com/pyannote/pyannote-audio
GitHub - pyannote/pyannote-audio: Neural building blocks for speaker diarization: speech activity detection, speaker change detection…
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding - GitHub - pyannote/pyannote-audio: Neural build...
#jupyter_notebook #computer_vision #deep_learning #image_classification #imagenet #neural_network #object_detection #pretrained_models #pretrained_weights #pytorch #semantic_segmentation #transfer_learning
https://github.com/Deci-AI/super-gradients
GitHub - Deci-AI/super-gradients: Easily train or fine-tune SOTA computer vision models with one open source training library.…
Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS. - Deci-AI/super-gradients
#python #deeplab_v3_plus #deeplabv3 #fpn #hacktoberfest #image_processing #image_segmentation #imagenet #linknet #models #pretrained_backbones #pretrained_models #pretrained_weights #pspnet #pytorch #segmentation #segmentation_models #semantic_segmentation #unet #unet_pytorch #unetplusplus
https://github.com/qubvel/segmentation_models.pytorch
GitHub - qubvel-org/segmentation_models.pytorch: Semantic segmentation models with 500+ pretrained convolutional and transformer…
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones. - qubvel-org/segmentation_models.pytorch
#python #augmix #convnext #distributed_training #dual_path_networks #efficientnet #image_classification #imagenet #maxvit #mixnet #mobile_deep_learning #mobilenet_v2 #mobilenetv3 #nfnets #normalization_free_training #pretrained_models #pretrained_weights #pytorch #randaugment #resnet #vision_transformer_models
PyTorch Image Models (`timm`) is a library of state-of-the-art image models, layers, utilities, optimizers, and training scripts. Its key benefits:
- **Model Variety**: `timm` offers over 300 pre-trained models from families such as Vision Transformers, ResNets, and EfficientNets, so you can choose the model that best fits your task.
- **Feature Extraction**: You can extract features at different levels of the network using `features_only=True` and `out_indices`, which makes the models usable as backbones in other applications.
- **Augmentation and Regularization**: It provides augmentation techniques such as AutoAugment and RandAugment, and regularization methods such as DropPath and DropBlock, to improve model performance.
- **Reference Training Scripts**: It includes high-performance training, validation, and inference scripts that support multiple GPUs and mixed-precision training.
Overall, `timm` simplifies working with deep learning models for image tasks by providing a unified interface and extensive tools for training and evaluation.
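To make the `features_only=True` / `out_indices` idea concrete, here is a minimal sketch in plain Python. It does not call `timm`; in the library the selection happens when you pass those arguments to `timm.create_model`. The function name `selected_feature_sizes` and the five downsampling factors are assumptions chosen for illustration (they match a typical ResNet-style backbone, and other architectures may differ).

```python
# Toy illustration only: this does not call timm. It mimics how `out_indices`
# picks a subset of a backbone's feature levels.

# Assumption: a ResNet-style backbone whose five feature levels downsample
# the input by 2, 4, 8, 16 and 32.
ASSUMED_REDUCTIONS = (2, 4, 8, 16, 32)


def selected_feature_sizes(input_size, out_indices, reductions=ASSUMED_REDUCTIONS):
    """Return (reduction, height, width) for each selected feature level."""
    height, width = input_size
    selected = []
    for index in out_indices:
        reduction = reductions[index]
        # Ceiling division: padded stride-2 layers round odd sizes up.
        selected.append((reduction, -(-height // reduction), -(-width // reduction)))
    return selected


if __name__ == "__main__":
    # Ask for the three middle levels of a 224x224 input.
    for reduction, h, w in selected_feature_sizes((224, 224), out_indices=(1, 2, 3)):
        print(f"reduction {reduction:>2}: {h}x{w} feature map")
```

Picking only some levels like this is what lets a classification backbone feed a detection or segmentation head, which usually wants a few resolutions rather than the final pooled output.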
https://github.com/huggingface/pytorch-image-models
GitHub - huggingface/pytorch-image-models: The largest collection of PyTorch image encoders / backbones. Including train, eval…
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V...
#python #chinese #clip #computer_vision #contrastive_loss #coreml_models #deep_learning #image_text_retrieval #multi_modal #multi_modal_learning #nlp #pretrained_models #pytorch #transformers #vision_and_language_pre_training #vision_language
This project is a Chinese version of the CLIP (Contrastive Language-Image Pretraining) model, trained on a large dataset of Chinese text and images. Here’s what you need to know:
- **Capabilities**: The model helps you compute text and image features, perform cross-modal retrieval (finding images based on text or vice versa), and do zero-shot image classification (classifying images without any labeled examples).
- **Performance**: The model has been tested on various datasets and shows strong performance in zero-shot image classification and cross-modal retrieval tasks.
- **Resources**: The project includes pre-trained models, training and testing code, and detailed tutorials on how to use the model for different tasks.
Overall, this project makes it easy to work with Chinese text and images, saving you time and effort.
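Zero-shot classification with a CLIP-style model boils down to comparing one image embedding against the text embeddings of candidate labels. The sketch below shows that step in plain Python, under loud assumptions: the three-dimensional vectors are made up and stand in for real encoder outputs, the function names are hypothetical, and the `scale` of 100 is an illustrative temperature, not a value taken from Chinese-CLIP.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def zero_shot_classify(image_embedding, label_embeddings, scale=100.0):
    """Return {label: probability} via cosine similarity followed by softmax."""
    image = l2_normalize(image_embedding)
    logits = {}
    for label, embedding in label_embeddings.items():
        text = l2_normalize(embedding)
        logits[label] = scale * sum(a * b for a, b in zip(image, text))
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits.values())
    exps = {label: math.exp(value - max_logit) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


if __name__ == "__main__":
    # Made-up embeddings for the labels cat, dog, and car (not model outputs).
    label_embeddings = {
        "猫": [0.9, 0.1, 0.0],
        "狗": [0.1, 0.9, 0.0],
        "汽车": [0.0, 0.1, 0.9],
    }
    image_embedding = [0.8, 0.2, 0.1]
    probabilities = zero_shot_classify(image_embedding, label_embeddings)
    print(max(probabilities, key=probabilities.get))
```

Cross-modal retrieval uses the same similarity score; instead of a softmax over labels, you rank a collection of images (or texts) by their similarity to the query.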
https://github.com/OFA-Sys/Chinese-CLIP
GitHub - OFA-Sys/Chinese-CLIP: Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation. - OFA-Sys/Chinese-CLIP
#swift #inference #ios #macos #pretrained_models #speech_recognition #transformers #visionos #watchos #whisper
WhisperKit is a tool that lets your Apple devices recognize speech from audio files or live recordings using OpenAI's Whisper model. It runs locally on your device, so it doesn't need an internet connection once set up. To use it, you can add WhisperKit to your Swift project through the Swift Package Manager or install a command-line version using Homebrew. Because transcription happens on the device, it is useful for applications like voice assistants or transcription services.
https://github.com/argmaxinc/WhisperKit
GitHub - argmaxinc/WhisperKit: On-device Speech Recognition for Apple Silicon
On-device Speech Recognition for Apple Silicon. Contribute to argmaxinc/WhisperKit development by creating an account on GitHub.
#python #bert #deep_learning #flax #hacktoberfest #jax #language_model #language_models #machine_learning #model_hub #natural_language_processing #nlp #nlp_library #pretrained_models #pytorch #pytorch_transformers #seq2seq #speech_recognition #tensorflow #transformer
The Hugging Face Transformers library provides thousands of pretrained models for various tasks like text, image, and audio processing. These models can be used for tasks such as text classification, image detection, speech recognition, and more. The library supports popular deep learning frameworks like JAX, PyTorch, and TensorFlow, making it easy to switch between them.
The benefit to the user is that you can quickly download and use these pretrained models with just a few lines of code, saving time and computational resources. You can also fine-tune these models on your own datasets and share them with the community. Additionally, the library offers a simple `pipeline` API for immediate use on different inputs, making it user-friendly for both researchers and practitioners. This helps in reducing compute costs and carbon footprint while enabling high-performance results across various machine learning tasks.
https://github.com/huggingface/transformers
GitHub - huggingface/transformers: 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models…
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training. - GitHub - huggingface/t...
#python #chinese #flash_attention #large_language_models #llm #natural_language_processing #pretrained_models
The Qwen series includes language models and chat models that can be used for tasks such as chatting, content creation, information extraction, summarization, translation, and coding. Key benefits and features:
- **Model Sizes**: Qwen offers base language models (Qwen-1.8B, Qwen-7B, Qwen-14B, Qwen-72B) and chat models (Qwen-1.8B-Chat, Qwen-7B-Chat, Qwen-14B-Chat, Qwen-72B-Chat) with different sizes and capabilities.
- **Quantization**: The models are available in quantized forms (Int4 and Int8), which reduce memory usage and improve inference speed without significant performance degradation.
- **Tool Use and Agents**: The models can use tools, act as agents, or even interpret code, with good performance on code execution and tool-use benchmarks.
- **Deployment**: Deployment options include vLLM, FastChat, Web UI demos, CLI demos, and OpenAI-style APIs.
- **Finetuning**: Scripts are provided for finetuning the models using full-parameter, LoRA, and Q-LoRA methods.
Overall, Qwen models offer robust performance, flexibility, and ease of use, making them suitable for a wide range of applications.
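To get a feel for why the Int8 and Int4 variants matter, here is a back-of-the-envelope estimate of weight storage alone. It is a rough sketch, not a measurement: it assumes the unquantized weights are stored in 16 bits, takes the parameter counts at face value from the model names, and ignores quantization scales, activations, and the KV cache. The names `BYTES_PER_PARAM`, `NOMINAL_PARAMS`, and `weight_memory_gb` are made up for this example.

```python
# Rough weight-memory arithmetic. Assumptions: 16-bit baseline weights,
# nominal parameter counts read off the model names, and no overhead from
# quantization scales, activations, or the KV cache.
BYTES_PER_PARAM = {"16-bit": 2.0, "Int8": 1.0, "Int4": 0.5}
NOMINAL_PARAMS = {
    "Qwen-1.8B": 1.8e9,
    "Qwen-7B": 7e9,
    "Qwen-14B": 14e9,
    "Qwen-72B": 72e9,
}


def weight_memory_gb(num_params, bytes_per_param):
    """Approximate weight storage in GB, using 1 GB = 1e9 bytes."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in NOMINAL_PARAMS.items():
        row = ", ".join(
            f"{fmt}: {weight_memory_gb(params, size):.1f} GB"
            for fmt, size in BYTES_PER_PARAM.items()
        )
        print(f"{name}: {row}")
```

Under these assumptions Int4 needs about a quarter of the 16-bit weight storage, which is often the difference between a model fitting on a single GPU or not. Actual usage will be higher than these figures.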
https://github.com/QwenLM/Qwen
GitHub - QwenLM/Qwen: The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. - QwenLM/Qwen