xNul/code-llama-for-vscode
Use Code Llama with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot.
Language: Python
#assistant #code #code_llama #codellama #continue #continuedev #copilot #llama #llama2 #llamacpp #llm #local #meta #ollama #studio #visual #vscode
Stars: 170 Issues: 3 Forks: 6
https://github.com/xNul/code-llama-for-vscode
  
  tairov/llama2.mojo
Inference Llama 2 in one file of pure 🔥
#inference #llama #llama2 #modular #mojo #parallelize #performance #simd #tensor #vectorization
Stars: 200 Issues: 0 Forks: 7
https://github.com/tairov/llama2.mojo
  
  Fuzzy-Search/realtime-bakllava
llama.cpp with the BakLLaVA model describing what it sees
Language: Python
#bakllavva #cpp #demo_application #inference #llama #llamacpp #llm
Stars: 141 Issues: 1 Forks: 15
https://github.com/Fuzzy-Search/realtime-bakllava
  
  lxe/llavavision
A simple "Be My Eyes" web app with a llama.cpp/llava backend
Language: JavaScript
#ai #artificial_intelligence #computer_vision #llama #llamacpp #llm #local_llm #machine_learning #multimodal #webapp
Stars: 284 Issues: 0 Forks: 7
https://github.com/lxe/llavavision
  
  SqueezeAILab/LLMCompiler
LLMCompiler: An LLM Compiler for Parallel Function Calling
Language: Python
#efficient_inference #function_calling #large_language_models #llama #llama2 #llm #llm_agent #llm_agents #llm_framework #llms #natural_language_processing #nlp #parallel_function_call #transformer
Stars: 216 Issues: 0 Forks: 11
https://github.com/SqueezeAILab/LLMCompiler
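
The core idea is to have an LLM plan tool calls as a dependency graph and run the independent ones concurrently instead of sequentially. A minimal Python sketch of that scheduling idea (not LLMCompiler's actual API; the tools and their latencies below are hypothetical):

```python
# Conceptual sketch of parallel function calling, not LLMCompiler's API.
# A real planner would be an LLM emitting a dependency graph of tool calls;
# here the plan is hard-coded to show how independent calls run concurrently.
import asyncio

async def search_weather(city: str) -> str:      # hypothetical tool
    await asyncio.sleep(1.0)                     # stands in for a real API call
    return f"Weather in {city}: sunny"

async def search_population(city: str) -> str:   # hypothetical tool
    await asyncio.sleep(1.0)
    return f"Population of {city}: ~2.1M"

async def main() -> None:
    # The two calls share no data dependency, so the "compiler" can schedule
    # them in parallel rather than one after another.
    results = await asyncio.gather(
        search_weather("Paris"),
        search_population("Paris"),
    )
    print(results)  # finishes in ~1s total instead of ~2s

asyncio.run(main())
```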
  
  Writesonic/GPTRouter
Smoothly manage multiple LLMs (OpenAI, Anthropic, Azure) and image models (DALL-E, SDXL), speed up responses, and ensure non-stop reliability.
Language: TypeScript
#anthropic #azure_openai #cohere #google_gemini #langchain #llama_index #llm #llmops #llms #mlops #openai #palm_api
Stars: 130 Issues: 6 Forks: 16
https://github.com/Writesonic/GPTRouter
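
The reliability piece boils down to routing each request across providers with fallback. A conceptual Python sketch of that pattern, assuming placeholder provider clients (GPTRouter itself is a TypeScript service with its own API):

```python
# Conceptual provider-fallback routing, not GPTRouter's actual API.
from typing import Callable, Optional, Sequence

def call_openai(prompt: str) -> str:     # placeholder client
    raise NotImplementedError

def call_anthropic(prompt: str) -> str:  # placeholder client
    raise NotImplementedError

def route(prompt: str, providers: Sequence[Callable[[str], str]]) -> str:
    # Try each provider in priority order and fall through on failure,
    # so a single outage or rate limit does not take the application down.
    last_error: Optional[Exception] = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:         # timeouts, rate limits, outages, ...
            last_error = exc
    raise RuntimeError("all providers failed") from last_error

# Usage: route("Hello!", [call_openai, call_anthropic])
```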
  
  SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C
#falcon #large_language_models #llama #llm #llm_inference #local_inference
Stars: 792 Issues: 8 Forks: 32
https://github.com/SJTU-IPADS/PowerInfer
  
  hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Language: Python
#artificial_intelligence #deep_learning #gpt #inference #llama #llama2 #llm_inference #llm_serving
Stars: 299 Issues: 3 Forks: 14
https://github.com/hpcaitech/SwiftInfer
  
  mishushakov/llm-scraper
Turn any webpage into structured data using LLMs
Language: TypeScript
#ai #browser #browser_automation #gpt #langchain #llama #llm #openai #playwright #puppeteer #scraper
Stars: 332 Issues: 6 Forks: 23
https://github.com/mishushakov/llm-scraper
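
The underlying pattern is: fetch the page, hand the HTML plus a target schema to an LLM, and parse the structured reply. A generic Python sketch of that pattern (llm-scraper itself is TypeScript with Playwright and Zod schemas; `call_llm` below is a placeholder, not a real client):

```python
# Generic webpage -> structured data pattern, not llm-scraper's API.
import json
import urllib.request

SCHEMA = {"title": "string", "author": "string", "published": "YYYY-MM-DD"}

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your chat-completion client here")

def scrape(url: str) -> dict:
    html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
    prompt = (
        "Extract the following fields from the HTML below; answer with JSON only.\n"
        f"Schema: {json.dumps(SCHEMA)}\n\nHTML:\n{html[:20000]}"  # truncate long pages
    )
    return json.loads(call_llm(prompt))
```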
  
  mbzuai-oryx/LLaVA-pp
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Language: Python
#conversation #llama_3_llava #llama_3_vision #llama3 #llama3_llava #llama3_vision #llava #llava_llama3 #llava_phi3 #llm #lmms #phi_3_llava #phi_3_vision #phi3 #phi3_llava #phi3_vision #vision_language
Stars: 297 Issues: 2 Forks: 13
https://github.com/mbzuai-oryx/LLaVA-pp
  
  WinampDesktop/winamp
Iconic media player
Language: C++
#llama #media #player #winamp
Stars: 1751 Issues: 15 Forks: 428
https://github.com/WinampDesktop/winamp
  vietanhdev/llama-assistant
AI-powered assistant to help you with your daily tasks, powered by Llama 3.2. It can recognize your voice, process natural language, and perform various actions based on your commands: summarizing text, rephrasing sentences, answering questions, writing emails, and more.
Language: Python
#llama #llama_3_2 #llama3 #llava #moondream #owen #personal_assistant #private_gpt
Stars: 170 Issues: 0 Forks: 12
https://github.com/vietanhdev/llama-assistant
  
  maiqingqiang/NotebookMLX
📋 NotebookMLX - an open-source version of NotebookLM (a port of NotebookLlama)
Language: Jupyter Notebook
#ai #llama #mlx #notebookllama #notebooklm
Stars: 121 Issues: 1 Forks: 7
https://github.com/maiqingqiang/NotebookMLX
  
  edwko/OuteTTS
Interface for OuteTTS models.
Language: Python
#gguf #llama #text_to_speech #transformers #tts
Stars: 278 Issues: 6 Forks: 13
https://github.com/edwko/OuteTTS
  
  papersgpt/papersgpt-for-zotero
Zotero AI plugin for chatting with papers using ChatGPT, Gemini, Claude, Llama 3.2, QwQ-32B-Preview, Marco-o1, Gemma, Mistral, and Phi-3.5
Language: JavaScript
#ai #chatgpt #claude #gemini #gemma #llama #marco_o1 #mistral #paper #phi_3 #qwq_32b_preview #summary #zotero #zotero_plugin
Stars: 232 Issues: 3 Forks: 1
https://github.com/papersgpt/papersgpt-for-zotero
  
  zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight
  
  ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that efficiently supports understanding of images, high-resolution images, and videos.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini
  
  therealoliver/Deepdive-llama3-from-scratch
Build Llama 3 inference step by step: grasp the core concepts, master the process derivation, and implement the code.
Language: Jupyter Notebook
#attention #attention_mechanism #gpt #inference #kv_cache #language_model #llama #llm_configuration #llms #mask #multi_head_attention #positional_encoding #residuals #rms #rms_norm #rope #rotary_position_encoding #swiglu #tokenizer #transformer
Stars: 388 Issues: 0 Forks: 28
https://github.com/therealoliver/Deepdive-llama3-from-scratch
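
As a taste of the building blocks the notebook derives, here is a minimal NumPy RMSNorm, the normalization Llama models use in place of LayerNorm (shapes and epsilon below are illustrative, not taken from the repo):

```python
# Minimal RMSNorm sketch: normalize by the root-mean-square of the last
# dimension, then apply a learned per-channel scale.
import numpy as np

def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

hidden = np.random.randn(4, 4096).astype(np.float32)  # (tokens, hidden_dim)
gamma = np.ones(4096, dtype=np.float32)                # learned scale
print(rms_norm(hidden, gamma).shape)                   # (4, 4096)
```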
  
  dipampaul17/KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
Language: Python
#apple_silicon #generative_ai #kv_cache #llama_cpp #llm #m1 #m2 #m3 #memory_optimization #metal #optimization #quantization
Stars: 222 Issues: 1 Forks: 5
https://github.com/dipampaul17/KVSplit
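
A quick back-of-the-envelope check of the memory claim, assuming 8-bit keys and 4-bit values against an FP16 baseline. The theoretical saving is 62.5%; quantization block overhead explains why the repo reports ~59%. The model shape below is an assumed Llama-7B-class example, not taken from the repo:

```python
# KV-cache size for a given context length and per-tensor bit widths.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, key_bits, value_bits):
    elems_per_token = n_layers * n_kv_heads * head_dim
    return context_len * elems_per_token * (key_bits + value_bits) / 8

args = dict(n_layers=32, n_kv_heads=32, head_dim=128, context_len=8192)
fp16 = kv_cache_bytes(**args, key_bits=16, value_bits=16)
k8v4 = kv_cache_bytes(**args, key_bits=8, value_bits=4)
print(f"FP16 KV cache: {fp16 / 2**20:.0f} MiB")
print(f"K8V4 KV cache: {k8v4 / 2**20:.0f} MiB ({1 - k8v4 / fp16:.1%} smaller)")
```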
  
  NU-QRG/optiml
Acceleration library for LLM agents.
Language: C++
#llama #llm
Stars: 198 Issues: 7 Forks: 44
https://github.com/NU-QRG/optiml
