GitHub Trends – Telegram

GitHub Trends

@githubtrending

10.1K subscribers

15.3K links

See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis


#python #artificial_intelligence #attention_mechanism #computer_vision #image_classification #transformers

vit-pytorch is a comprehensive implementation of Vision Transformers (ViT) in PyTorch, offering many models and techniques for image classification.

**Key information**
- The repository provides multiple ViT variants, including the original ViT, Simple ViT, NaViT, Deep ViT, CaiT, Token-to-Token ViT, CCT, Cross ViT, PiT, LeViT, CvT, Twins SVT, RegionViT, CrossFormer, ScalableViT, SepViT, MaxViT, NesT, MobileViT, XCiT, and others.
- Each variant introduces different architectural improvements, such as efficient attention mechanisms, multi-scale processing, and new embedding techniques.
- The implementation includes pre-trained models and supports techniques such as masked image modeling, distillation, and self-supervised learning.

**Benefits**
- **Flexibility** Users can choose from a wide range of ViT models tailored to different needs, such as efficiency, accuracy, or specific tasks.
- **Performance** Some models, like NaViT and ScalableViT, are designed to use less compute and training time.
- **Research** The many research ideas and techniques included let users explore new approaches in vision transformer research.

Overall, this repository is a broad toolkit for anyone working with vision transformers, covering both practical models and current research ideas.
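To make the ViT idea concrete: a Vision Transformer first cuts the image into fixed-size patches and treats each flattened patch as one token for the transformer encoder. The snippet below is a minimal pure-Python sketch of that patch step only. The `patchify` helper and the tiny 4x4 "image" are illustrative assumptions, not code from the repository.

```python
def patchify(image, patch):
    """Split a 2D image (list of rows) into flattened square patches.

    Hypothetical helper for illustration; patches are returned in
    row-major order, each flattened into a single list (one "token").
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([
                image[row][col]
                for row in range(top, top + patch)
                for col in range(left, left + patch)
            ])
    return patches


# A toy 4x4 single-channel image with pixel values 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
tokens = patchify(image, patch=2)
print(len(tokens))  # 4 patches, each holding 2 * 2 = 4 pixel values
```

With a 224x224 input and 16x16 patches, the same arithmetic gives (224 / 16) ** 2 = 196 patch tokens.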

https://github.com/lucidrains/vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch - lucidrains/vit-pytorch

👍1

503 views · 14:30

#python #annotation #annotation_tool #annotations #boundingbox #computer_vision #computer_vision_annotation #dataset #deep_learning #image_annotation #image_classification #image_labeling #image_labelling_tool #imagenet #labeling #labeling_tool #object_detection #pytorch #semantic_segmentation #tensorflow #video_annotation

CVAT is a powerful tool for annotating videos and images, especially useful for computer vision projects. It helps developers and companies annotate data quickly and efficiently. You can use CVAT online for free or subscribe for more features like unlimited data and integrations with other tools. It also offers a self-hosted option with enterprise support. CVAT supports many annotation formats and has automatic labeling options to speed up your work. It's widely used by many teams worldwide, making it a reliable choice for your data annotation needs.

https://github.com/cvat-ai/cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale. - cvat-ai/cvat

556 views · 00:00

#python #auto_regressive_model #autoregressive_models #diffusion_models #generative_ai #generative_model #gpt #gpt_2 #image_generation #large_language_models #neurips #transformers #vision_transformer

VAR (Visual Autoregressive Modeling) is a new approach to image generation. Instead of the usual raster-scan "next-token prediction", it uses "next-scale prediction": the image is generated from coarse to fine, with each step predicting a whole token map at the next higher resolution. According to the authors, this is the first time GPT-style autoregressive models surpass diffusion models in image generation. VAR also exhibits power-law scaling laws like those seen in large language models. You can try it interactively on a demo website, and the repository provides scripts and models for exploring the technical details.
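The difference between next-token and next-scale prediction can be seen by counting sequential generation steps. The sketch below is illustrative only: the `SCALES` schedule of token-map side lengths is an assumption chosen for the example, and the helper names are hypothetical rather than taken from the repository.

```python
# Assumed coarse-to-fine schedule: side length of the token map at each scale.
SCALES = [1, 2, 3, 4, 5, 6, 8, 10, 13, 16]


def next_token_steps(side):
    """Raster-scan generation emits one token per sequential step."""
    return side * side


def next_scale_steps(scales):
    """Next-scale generation emits one whole token map per sequential step."""
    return len(scales)


def total_tokens(scales):
    """Total tokens produced across all scales (side * side per scale)."""
    return sum(side * side for side in scales)


print(next_token_steps(16))      # 256 sequential steps for a 16x16 token map
print(next_scale_steps(SCALES))  # 10 sequential steps, one per scale
print(total_tokens(SCALES))      # 680 tokens in total across all scales
```

Under these assumptions the coarse-to-fine scheme produces more tokens overall but needs far fewer sequential steps, because all tokens within a scale are predicted in parallel.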

https://github.com/FoundationVision/VAR

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Predi...

👍1😁1

411 views · 13:30

#python #3d_creation #3d_generation #aigc #diffusion_models #generative_model #image_to_3d

DreamCraft3D is a method to create highly detailed and realistic 3D objects using a combination of 2D reference images and advanced algorithms. It ensures that the 3D objects look consistent from all angles and have realistic textures. This is achieved by using a special technique called "Bootstrapped Score Distillation" which improves both the shape and texture of the 3D object in a way that reinforces each other. The benefit to the user is that they can generate very realistic 3D models quickly and accurately, which can be useful for various applications such as video games, movies, and architectural design.

https://github.com/deepseek-ai/DreamCraft3D

[ICLR 2024] Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior - deepseek-ai/DreamCraft3D

❤1

416 views · 12:30

#python #image_processing #ocr #pdf #python #tesseract

OCRmyPDF is a tool that makes scanned PDF files searchable. It adds an OCR text layer to the PDF, so you can search for words or copy and paste text from the document. It supports many languages, fixes misrotated or crooked pages, and optimizes the file size. It runs on Linux, Windows, and macOS and uses multiple CPU cores to speed up processing. This makes scanned documents easier to work with and keeps your archive searchable.
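As a rough sketch of how those features map onto the command line, the helper below assembles an `ocrmypdf` invocation from Python. The `build_ocrmypdf_command` function and the file names are hypothetical. The flags used (`--language`, `--rotate-pages`, `--deskew`, `--optimize`, `--jobs`) are standard OCRmyPDF options, but check `ocrmypdf --help` for your installed version.

```python
def build_ocrmypdf_command(src, dst, language="eng", jobs=2):
    """Return an argument list for the ocrmypdf CLI (hypothetical helper).

    --rotate-pages fixes misrotated pages, --deskew straightens crooked
    scans, --optimize controls file-size optimization, and --jobs sets
    how many CPU cores are used in parallel.
    """
    return [
        "ocrmypdf",
        "--language", language,
        "--rotate-pages",
        "--deskew",
        "--optimize", "1",
        "--jobs", str(jobs),
        src,
        dst,
    ]


command = build_ocrmypdf_command("scan.pdf", "scan_searchable.pdf", jobs=4)
print(" ".join(command))
```

To actually run it, pass the list to `subprocess.run(command, check=True)` on a machine where OCRmyPDF and Tesseract are installed.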

https://github.com/ocrmypdf/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched - ocrmypdf/OCRmyPDF

523 views · 11:30

#kotlin #aes_256 #android #background_removal #clean_architecture #crop #djvu #edit_photo #exif #f_droid #filter_image #image_manipulation #jetpack_compose #jxl #kotlin #material_you #ocr_recognition #pdf #psd #qrcode_scanner #watermark

Image Toolbox is a powerful and versatile image editing tool that lets you do many things with your photos. You can crop, apply over 230 different filters, edit EXIF data, remove backgrounds, and even convert images to PDFs. It also allows you to add stickers and text, extract text from images in over 120 languages, and encrypt files with AES-256 encryption. You can resize images using various scaling algorithms, convert between multiple image formats, and create collages. The app also supports GIF, WEBP, APNG, and JXL conversions, document scanning, QR code scanning and creation, and more. It has a simple interface but offers many advanced features, making it useful for both photographers and developers.

https://github.com/T8RIN/ImageToolbox

🖼️ Image Toolbox is a powerful app for advanced image manipulation. It offers dozens of features, from basic tools like crop and draw to filters, OCR, and a wide range of image processing options -...

458 views · 11:30

#go #automation #c #go #golang #hook #image #mouse #opencv #robot #robotgo #rpa #window

Robotgo is a tool that helps automate tasks on your computer using the Go programming language. It can control the mouse and keyboard, capture screenshots, and work with windows. This means you can use it to automatically do things like scrolling, clicking, or typing text. Robotgo works on Windows, Mac, and Linux systems, making it very versatile. Using Robotgo can save time by automating repetitive tasks, allowing you to focus on more important things.

https://github.com/go-vgo/robotgo

RobotGo, Go Native cross-platform RPA, GUI automation, Auto test and Computer use @vcaesar - go-vgo/robotgo

574 views · 15:00

#jupyter_notebook #cnn #colab #colab_notebook #computer_vision #deep_learning #deep_neural_networks #fourier #fourier_convolutions #fourier_transform #gan #generative_adversarial_network #generative_adversarial_networks #high_resolution #image_inpainting #inpainting #inpainting_algorithm #inpainting_methods #pytorch

LaMa is an image inpainting model for removing objects from images. It uses fast Fourier convolutions, which give the network a receptive field that covers the whole image, so it is good at filling in large missing areas. LaMa also generalizes to high-resolution images even though it is trained on smaller ones. This means you can use it to remove unwanted objects from photos and have the filled-in region look natural.
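The "whole image at once" property can be illustrated with a toy 1D example: a filter applied in the frequency domain lets every output sample depend on every input sample, while a small spatial kernel only reaches its neighbours. This is a simplified sketch with hand-picked weights and a naive DFT. It is not LaMa's actual layer, which uses learned weights inside a convolutional network.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def spectral_filter(x, weights):
    """Scale each frequency coefficient, then transform back to the signal domain."""
    scaled = [w * c for w, c in zip(weights, dft(x))]
    return [value.real for value in idft(scaled)]


def local_filter(x, kernel=(0.25, 0.5, 0.25)):
    """Ordinary circular 3-tap convolution: each output sees only 3 inputs."""
    n = len(x)
    return [sum(kernel[j] * x[(i + j - 1) % n] for j in range(3)) for i in range(n)]


def count_changed(filter_fn, x, index, eps=1e-9):
    """Count the outputs that change when a single input sample is perturbed."""
    bumped = list(x)
    bumped[index] += 1.0
    before, after = filter_fn(x), filter_fn(bumped)
    return sum(abs(a - b) > eps for a, b in zip(before, after))


signal = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
weights = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.125, 0.25, 0.5]  # illustrative values

changed_spectral = count_changed(lambda x: spectral_filter(x, weights), signal, 3)
changed_local = count_changed(local_filter, signal, 3)
print(changed_spectral, changed_local)  # 8 3
```

Perturbing one sample changes all 8 outputs of the spectral filter but only 3 outputs of the local kernel, which is the intuition behind using Fourier-domain operations to fill large masked regions.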

https://github.com/advimman/lama

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022 - advimman/lama

514 views · 15:30

#python #3d #3d_aigc #3d_generation #diffusion_models #hunyuan3d #image_to_3d #shape #shape_generation #text_to_3d #texture_generation

Hunyuan3D 2.0 is a powerful tool that creates detailed 3D models with textures in two steps: first building the shape, then adding colors and materials. It works efficiently on standard computers (as low as 5GB VRAM for basic models) and offers multiple ways to use it, like coding, Blender plugins, or online demos, making it accessible for creating game-ready 3D assets, VR/AR content, or custom designs without needing advanced hardware.

https://github.com/Tencent/Hunyuan3D-2

High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. - Tencent-Hunyuan/Hunyuan3D-2

521 views · 12:00

#python #diffusion_models #dit #image_to_video #image_to_video_generation #text_to_video #text_to_video_generation

LTX-Video is a powerful AI model that creates high-quality, realistic videos in real time, running faster than you can watch them. It can generate videos from text descriptions, images, or existing videos, and supports advanced features like keyframe animation and video extension. You can use it online or run it locally with easy setup. It offers great control over video details, smooth motion, and works well even on consumer hardware. This helps you quickly create custom videos for storytelling, social media, or prototyping, saving time and boosting creativity with detailed, lifelike results.

https://github.com/Lightricks/LTX-Video

Official repository for LTX-Video. Contribute to Lightricks/LTX-Video development by creating an account on GitHub.

🔥1

463 views · 12:00

#python #comfyui #diffusion_models #dit #image_to_video #image_to_video_generation #text_to_image #text_to_image_generation

ComfyUI-LTXVideo is a tool that helps create high-quality videos from images using AI. It offers features like key frame control, improved video quality, and faster generation speeds. This means you can make smooth videos with fewer errors and more control over how they look. It also supports commercial use, so you can use the videos for business projects. The tool is designed to work well with consumer-grade GPUs, making it accessible to more users. Overall, it helps you create professional-looking videos quickly and easily.

https://github.com/Lightricks/ComfyUI-LTXVideo

LTX-Video Support for ComfyUI. Contribute to Lightricks/ComfyUI-LTXVideo development by creating an account on GitHub.

🔥1

456 views · 12:30

#python #ai #ai_art #art #asset_generator #chatbot #deep_learning #desktop_app #image_generation #mistral #multimodal #privacy #pygame #pyside6 #python #self_hosted #speech_to_text #stable_diffusion #text_to_image #text_to_speech #text_to_speech_app

AI Runner is a tool that lets you run AI on your own computer without an internet connection. It supports **voice chatbots**, **text-to-image** generation, and **image editing**, and you can create AI personalities for more interesting conversations. Because everything runs locally, your data stays private. AI Runner needs a capable machine with a strong GPU, such as an NVIDIA RTX 3060 or better.

https://github.com/Capsize-Games/airunner

Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows - Capsize-Games/airunner

465 views · 11:30

#python #face_animation #image_animation #video_editing #video_generation

LivePortrait is a tool that uses AI to animate still photos, making them look like videos. It works by identifying key facial features and adding realistic movements. This technology helps create lifelike videos that can be used for personalized communication. The benefit to users is that they can easily create engaging animated portraits from static images, which can be fun and useful for various applications like social media or storytelling.

https://github.com/KwaiVGI/LivePortrait

Bring portraits to life! Contribute to KlingTeam/LivePortrait development by creating an account on GitHub.

412 views · 12:30

#typescript #alternative #converter #data_manipulation #developer_tools #devtools #frontend #good_first_issue #image_manipulation #image_processing #javascript #pdf_manipulation #productivity #react #self_hosted #swissarmyknife #tools #typescript #video_manipulation #webapp #website

OmniTools is a self-hosted web app that helps with many tasks like image and video editing, number crunching, and more. It offers tools for resizing images, converting videos, calculating dates, and generating prime numbers. You can run it on your own computer using Docker, which means your data stays local. This app is open-source and free, allowing you to contribute new features or tools easily. Using OmniTools simplifies many everyday tasks and keeps your data private.

https://github.com/iib0011/omni-tools

Self-hosted collection of powerful web-based tools for everyday tasks. No ads, no tracking, just fast, accessible utilities right from your browser! - iib0011/omni-tools

👍1

353 views · 12:00

#rust #2d_graphics #art #compositor #design #graphic_design #graphics_editor #image_generation #image_manipulation #image_processing #node_editor #node_graph #photo_editing #photo_editor #procedural #procedural_art #procedural_drawing #svg_editor #vector_editor

Graphite is a free, open-source 2D graphics editor that combines vector and raster tools with a unique hybrid workflow using layers and nodes. It lets you create detailed vector art and designs with nondestructive editing, meaning you can change your work anytime without losing quality. The node-based system offers powerful, flexible control like visual programming, while the layer system keeps things simple and familiar. This makes it easy to create complex graphics, animations, and effects all in one tool. Graphite is still evolving but aims to be a versatile, all-in-one creative platform accessible to everyone, helping you unleash your artistic potential efficiently.

https://github.com/GraphiteEditor/Graphite

Open source comprehensive 2D content creation tool suite for graphic design, digital art, and interactive real-time motion graphics — featuring node-based procedural editing - GraphiteEditor/Graphite

❤2

354 views · 11:30

#python #deep_learning #diffusion #flax #flux #hacktoberfest #image_generation #image2image #image2video #jax #latent_diffusion_models #pytorch #score_based_generative_modeling #stable_diffusion #stable_diffusion_diffusers #text2image #text2video #video2video

The Hugging Face Diffusers library is a powerful and easy-to-use tool for generating images, audio, and 3D molecular structures using advanced diffusion models. It offers ready-to-use pretrained models and flexible components like pipelines, schedulers, and model building blocks, allowing you to quickly create or customize your own diffusion-based projects. Installation is simple via pip or conda, and you can generate high-quality outputs with just a few lines of code. This library benefits you by making cutting-edge AI generation accessible, customizable, and efficient, whether you want to run models or train your own.

https://github.com/huggingface/diffusers

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. - huggingface/diffusers

456 views · 00:00

#vue #canvas_editor #design #design_editor #editor #fabricjs #image_editor #poster #svg_editor #vue_fabric

You can use a powerful open-source image editor built with fabric.js and Vue that lets you easily design images by dragging and dropping. It supports many features like importing PSD and JSON files, exporting PNG and SVG, layers, gradients, custom fonts, cropping, filters, and more. You can customize fonts, templates, right-click menus, and shortcuts, and extend it with plugins. This editor is lightweight and simple to use, making it great for quick image editing without complex tools. It also offers a paid version with full backend support and batch image generation, helping you save time and reduce development effort.

https://github.com/ikuaitu/vue-fabric-editor

快图设计-基于fabric.js和Vue的开源图片编辑器，可自定义字体、素材、设计模板。fabric.js and Vue based image editor, can customize fonts, materials, design templates. - ikuaitu/vue-fabric-editor

362 views · 12:00

#typescript #ai #ai_chatbot #angular #chat #chatbot #chatgpt #cohere #component #files #huggingface #image #nextjs #openai #react #react_chatbot #solid #speech #svelte #vue

Deep Chat is an easy-to-add AI chat tool for your website that connects with popular AI services like ChatGPT and HuggingFace or your own custom APIs using just one line of code. It supports text, voice input, speech-to-text, text-to-speech, file sharing, webcam photos, and audio recording, making conversations more interactive. You can customize everything from avatars to message styles and run small AI models directly in the browser without servers. It works with major web frameworks and offers features like local message storage and focus mode for a modern chat experience. This helps you quickly add a powerful, flexible AI chatbot that fits your needs and improves user engagement.

https://github.com/OvidijusParsiunas/deep-chat

Fully customizable AI chatbot component for your website - OvidijusParsiunas/deep-chat

401 views · 12:30

#python #blind_watermark #image_processing #watermark #watermark_image

You can add invisible watermarks to images using a Python tool based on DWT-DCT-SVD techniques, which hides your watermark securely without changing the image's appearance. This watermark can be embedded and later extracted even if the image is rotated, cropped, resized, or altered by noise or brightness changes. You can use it easily via command line or Python code, protecting your images from unauthorized use while keeping them visually unchanged. This helps prove ownership and maintain image authenticity without affecting quality or usability. The tool supports embedding text, images, or bit arrays as watermarks and works on Windows, Linux, and macOS.
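The "blind" part, extracting the watermark without the original image, can be sketched with a toy quantization scheme: each watermark bit is written into the position of a coefficient within a fixed-size cell, so the bit can be read back from the marked value alone. Everything below is an illustrative assumption (helper names, the step size of 36, and plain numbers standing in for transform coefficients). It is not the library's code and it skips the DWT, DCT, and SVD stages entirely.

```python
def text_to_bits(text):
    """Encode text as a flat list of 0/1 bits (8 bits per UTF-8 byte)."""
    return [int(bit) for byte in text.encode("utf-8") for bit in format(byte, "08b")]


def bits_to_text(bits):
    """Decode a flat list of 0/1 bits back into text."""
    data = bytes(
        int("".join(str(bit) for bit in bits[i:i + 8]), 2)
        for i in range(0, len(bits), 8)
    )
    return data.decode("utf-8")


def embed_bit(value, bit, step=36.0):
    """Snap a coefficient to 1/4 (bit 0) or 3/4 (bit 1) of its quantization cell."""
    return (value // step + 0.25 + 0.5 * bit) * step


def extract_bit(value, step=36.0):
    """Read the bit back from the marked value alone, with no original needed."""
    return 1 if value % step > step / 2 else 0


message = "owner: alice"
bits = text_to_bits(message)

# Stand-in "coefficients"; in a real scheme these would come from the image.
coefficients = [100.0 + 7.3 * i for i in range(len(bits))]
marked = [embed_bit(value, bit) for value, bit in zip(coefficients, bits)]

# Simulate mild distortion (smaller than a quarter of the step size).
distorted = [value + (5.0 if i % 2 == 0 else -5.0) for i, value in enumerate(marked)]

recovered = bits_to_text([extract_bit(value) for value in distorted])
print(recovered)  # owner: alice
```

A larger step makes the watermark survive stronger distortion at the cost of changing the coefficients more, which is the usual robustness versus invisibility trade-off.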

https://github.com/guofei9987/blind_watermark

Blind&Invisible Watermark ，图片盲水印，提取水印无须原图！. Contribute to guofei9987/blind_watermark development by creating an account on GitHub.

❤1

548 views · 12:00

#python #audio_generation #diffusion #image_generation #inference #model_serving #multimodal #pytorch #transformer #video_generation

vLLM-Omni is a free, open-source framework for serving AI models for text, images, video, and audio quickly and cheaply. It builds on vLLM and gets its speed from efficient memory management, overlapping tasks, and flexible resource sharing across GPUs. It claims 2x higher throughput and 35% lower latency, and setup is simple: it works with Hugging Face models and exposes an OpenAI-compatible API. That makes it a good fit for building multimodal apps such as chatbots or media generators without high costs.

https://github.com/vllm-project/vllm-omni

A framework for efficient model inference with omni-modality models - vllm-project/vllm-omni

265 views · 15:30