#go #argo #argo_cd #cd #ci_cd #cicd #continuous_delivery #continuous_deployment #devops #docker #gitops #helm #ksonnet #kubernetes #pipeline
https://github.com/argoproj/argo-cd
GitHub - argoproj/argo-cd: Declarative Continuous Deployment for Kubernetes
#java #airbyte #connectors #data #data_analysis #data_ingestion #data_integration #data_science #data_transfers #elt #etl #incremental_updates #integration #open_source #pipeline #pipelines #replications
https://github.com/airbytehq/airbyte
GitHub - airbytehq/airbyte: The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted. ...
#python #cleandata #data_engineering #data_profilers #data_profiling #data_quality #data_science #data_unit_tests #datacleaner #datacleaning #dataquality #dataunittest #eda #exploratory_analysis #exploratory_data_analysis #exploratorydataanalysis #mlops #pipeline #pipeline_debt #pipeline_testing #pipeline_tests
https://github.com/great-expectations/great_expectations
GitHub - great-expectations/great_expectations: Always know what to expect from your data.
#python #data_parallelism #deep_learning #distributed_training #hpc #large_scale #model_parallelism #pipeline_parallelism
https://github.com/hpcaitech/ColossalAI
GitHub - hpcaitech/ColossalAI: Making large AI models cheaper, faster and more accessible
#java #data #data_engineering #data_orchestration #data_orchestrator #data_pipeline #dataflow #elt #etl #kestra #orchestration #pipeline #scheduler #workflow #workflow_automation #workflow_engine
https://github.com/kestra-io/kestra
GitHub - kestra-io/kestra: Orchestrate everything - from scripts to data, infra, AI, and business - as code, with UI and AI Copilot. Simple. Fast. Scalable.
#python #computer_vision #convolutional_networks #embedding_vectors #embeddings #feature_extraction #feature_vector #image_processing #image_retrieval #machine_learning #milvus #pipeline #towhee #transformer #unstructured_data #video_processing #vision_transformer #vit
https://github.com/towhee-io/towhee
GitHub - towhee-io/towhee: Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
#jupyter_notebook #ai #aihub #argo #automl #gpt #inference #kubeflow #kubernetes #llmops #mlops #notebook #pipeline #pytorch #spark #vgpu #workflow
https://github.com/tencentmusic/cube-studio
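GitHub - tencentmusic/cube-studio: cube studio is an open-source, cloud-native, one-stop AI platform for machine learning, deep learning, and large models. It covers the full MLOps workflow: a compute-rental platform, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving with vGPU virtualization, edge computing, an annotation platform with automated labeling, SFT fine-tuning / reward-model / reinforcement-learning training for large models such as DeepSeek, multi-node large-model inference with vLLM / Ollama / MindIE, private knowledge bases, an AI model marketplace...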
#python #billion_parameters #compression #data_parallelism #deep_learning #gpu #inference #machine_learning #mixture_of_experts #model_parallelism #pipeline_parallelism #pytorch #trillion_parameters #zero
DeepSpeed is a deep learning optimization library for training and serving large AI models quickly and efficiently. It supports training models with billions or even trillions of parameters, with the aim of cutting both training time and cost. For example, the project reports that it can train ChatGPT-like models 15 times faster than state-of-the-art systems. This makes it easier to work with large language models without needing massive resources.
https://github.com/microsoft/DeepSpeed
GitHub - deepspeedai/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
#cplusplus #android #audio_processing #c_plus_plus #calculator #computer_vision #deep_learning #framework #graph_based #graph_framework #inference #machine_learning #mediapipe #mobile_development #perception #pipeline_framework #stream_processing #video_processing
MediaPipe is a tool that helps you add smart machine learning features to your apps and devices. It works on mobile, web, desktop, and other devices. You can use pre-made solutions for tasks like vision, text, and audio processing, or customize the models to fit your needs. MediaPipe also offers tools like Model Maker and Studio to help you create and test your solutions easily. This makes it easier to delight your customers with innovative features without needing deep machine learning expertise.
https://github.com/google-ai-edge/mediapipe
GitHub - google-ai-edge/mediapipe: Cross-platform, customizable ML solutions for live and streaming media.
#typescript #cd #ci #git #gitlab #gitlab_ci #local #pipeline #push #uncomitted #untracked
You can run GitLab CI pipelines locally using `gitlab-ci-local`, so you no longer have to push changes just to test your `.gitlab-ci.yml` files. The tool runs jobs with either a shell executor or a Docker executor, which removes the need for development-specific scripts. It also offers convenience features such as CLI options, environment files, bash aliases, and tab completion. You can list pipeline jobs before running them and customize variables and artifacts.
https://github.com/firecow/gitlab-ci-local
GitHub - firecow/gitlab-ci-local: Tired of pushing to test your .gitlab-ci.yml?
#python #artificial_intelligence #dag #data_science #data_visualization #dataflow #developer_tools #machine_learning #notebooks #pipeline #python #reactive #web_app
Marimo is a reactive notebook for Python that makes working with notebooks easier and more reliable. Here's what it offers:
- **Reactive**: When you run a cell or interact with UI elements, marimo automatically updates dependent cells, keeping your code and outputs consistent.
- **Reproducible**: Marimo ensures no hidden state and deterministic execution, making your work reliable.
- **Git-friendly**: Notebooks are stored as `.py` files, making version control easy.
- **Modern Editor**: It includes features like GitHub Copilot, AI assistants, and more quality-of-life tools.
Using marimo helps you avoid errors, keeps your code organized, and makes sharing and deploying your work simpler.
https://github.com/marimo-team/marimo
GitHub - marimo-team/marimo: A reactive notebook for Python: run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor....
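To make the "dependent cells update automatically" idea concrete, here is a minimal plain-Python sketch of reactive execution. It is not marimo's API or implementation: the `ReactiveNotebook` class, its methods, and the cell names are all hypothetical, and each cell is assumed to define exactly one variable.

```python
from graphlib import TopologicalSorter


class ReactiveNotebook:
    """Toy model of reactive execution; names are illustrative, not marimo's API."""

    def __init__(self):
        self.cells = {}  # cell name -> (variables read, variable defined, function)
        self.state = {}  # variable name -> current value

    def set_cell(self, name, reads, defines, fn):
        self.cells[name] = (tuple(reads), defines, fn)

    def _affected(self, name):
        # Collect the edited cell plus every cell that transitively reads its output.
        affected = {name}
        grew = True
        while grew:
            grew = False
            produced = {self.cells[c][1] for c in affected}
            for other, (reads, _, _) in self.cells.items():
                if other not in affected and produced.intersection(reads):
                    affected.add(other)
                    grew = True
        return affected

    def run(self, name):
        affected = self._affected(name)
        producer = {defines: c for c, (_, defines, _) in self.cells.items()}
        # Map each affected cell to the affected cells it depends on.
        graph = {
            c: {producer[r] for r in self.cells[c][0] if producer.get(r) in affected}
            for c in affected
        }
        order = list(TopologicalSorter(graph).static_order())
        for c in order:
            reads, defines, fn = self.cells[c]
            self.state[defines] = fn(*(self.state[r] for r in reads))
        return order


nb = ReactiveNotebook()
nb.set_cell("a", [], "x", lambda: 2)
nb.set_cell("b", ["x"], "y", lambda x: x * 10)
nb.set_cell("c", ["y"], "z", lambda y: y + 1)
nb.run("a")                       # runs a, then b, then c
nb.set_cell("a", [], "x", lambda: 3)
rerun_order = nb.run("a")         # editing a re-runs its dependents too
```

Because every run recomputes downstream cells in dependency order, no cell can be left holding a stale value, which is the "no hidden state" property in miniature.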
#rust #events #forwarder #logs #metrics #observability #parser #pipeline #router #rust #stream_processing #vector
Vector is a tool for managing your observability data, such as logs and metrics. It lets you collect, transform, and route your data to any vendor you choose, giving you full control. The project describes Vector as reliable, secure, and fast (up to 10x faster than alternatives). It helps reduce costs, improve data quality, and consolidate agents. With strong community support and extensive documentation, Vector is used by many big companies and is downloaded over 100,000 times daily.
https://github.com/vectordotdev/vector
GitHub - vectordotdev/vector: A high-performance observability data pipeline.
#python #automation #data #data_engineering #data_ops #data_science #infrastructure #ml_ops #observability #orchestration #pipeline #prefect #python #workflow #workflow_engine
Prefect is a tool that helps you automate and manage data workflows in Python. It makes it easy to turn your scripts into reliable and flexible workflows that can handle unexpected changes. With Prefect, you can schedule tasks, retry failed operations, and monitor your workflows. You can install it using `pip install -U prefect` and start creating workflows with just a few lines of code. This helps data teams work more efficiently, reduce errors, and save time. You can also use Prefect Cloud for more advanced features and support.
https://github.com/PrefectHQ/prefect
GitHub - PrefectHQ/prefect: Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
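As a rough illustration of what "retry failed operations" means in practice, the sketch below implements retries with the standard library only. It deliberately does not use Prefect, and the `with_retries` decorator and `flaky_fetch` function are hypothetical names invented for this example.

```python
import functools
import time


def with_retries(retries=3, delay_seconds=0.0):
    """Hypothetical decorator: re-run a function up to `retries` extra times on failure."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(retries + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == retries:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay_seconds)
        return wrapper
    return decorator


calls = {"count": 0}


@with_retries(retries=2)
def flaky_fetch():
    # Fails on the first two calls, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = flaky_fetch()
```

An orchestrator adds scheduling, state tracking, and monitoring on top of this basic pattern; the sketch covers only the retry loop.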
#python #cloud_native #cncf #deep_learning #docker #fastapi #framework #generative_ai #grpc #jaeger #kubernetes #llmops #machine_learning #microservice #mlops #multimodal #neural_search #opentelemetry #orchestration #pipeline #prometheus
Jina-serve is a tool that helps you build and deploy AI services easily. It supports major machine learning frameworks and allows you to scale your services from local development to production quickly. You can use it to create AI services that communicate via gRPC, HTTP, and WebSockets. It has features like built-in Docker integration, one-click cloud deployment, and support for Kubernetes and Docker Compose, making it easy to manage and scale your AI applications. This makes it simpler for you to focus on the core logic of your AI projects without worrying about the technical details of deployment and scaling.
https://github.com/jina-ai/serve
GitHub - jina-ai/serve: ☁️ Build multimodal AI applications with cloud-native stack
#python #cleandata #data_engineering #data_profilers #data_profiling #data_quality #data_science #data_unit_tests #datacleaner #datacleaning #dataquality #dataunittest #eda #exploratory_analysis #exploratory_data_analysis #exploratorydataanalysis #mlops #pipeline #pipeline_debt #pipeline_testing #pipeline_tests
GX Core is a powerful tool for ensuring data quality. It allows you to write simple tests, called "Expectations," to check if your data meets certain standards. This helps teams work together more effectively and keeps everyone informed about the data's quality. You can automatically generate reports, making it easy to share results and preserve your organization's knowledge about its data. To get started, you just need to install GX Core in a Python virtual environment and follow some simple steps. This makes managing data quality much simpler and more efficient.
https://github.com/great-expectations/great_expectations
GitHub - great-expectations/great_expectations: Always know what to expect from your data.
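The idea of an "Expectation" as a unit test for data can be sketched in plain Python. The functions below are standalone and hypothetical: they borrow the naming style of expectations but are not the GX Core API, and the result dictionary shape is an assumption made for this example.

```python
def expect_column_values_to_not_be_null(rows, column):
    """Hypothetical check: every row has a non-None value in `column`."""
    failures = [i for i, row in enumerate(rows) if row.get(column) is None]
    return {"success": not failures, "unexpected_rows": failures}


def expect_column_values_to_be_between(rows, column, min_value, max_value):
    """Hypothetical check: non-null values in `column` lie within [min_value, max_value]."""
    failures = [
        i
        for i, row in enumerate(rows)
        if row.get(column) is not None
        and not (min_value <= row[column] <= max_value)
    ]
    return {"success": not failures, "unexpected_rows": failures}


rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 212},
]

id_check = expect_column_values_to_not_be_null(rows, "id")
null_check = expect_column_values_to_not_be_null(rows, "age")
range_check = expect_column_values_to_be_between(rows, "age", 0, 120)
```

Each check returns a small report rather than raising, so results can be collected and shared, which mirrors the reporting idea described above.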
#python #ai #big_model #data_parallelism #deep_learning #distributed_computing #foundation_models #heterogeneous_training #hpc #inference #large_scale #model_parallelism #pipeline_parallelism
Colossal-AI is a tool that aims to make large AI models faster, cheaper, and easier to use. It uses parallelism techniques (data, model, and pipeline parallelism) to speed up training of big models and to reduce the hardware needed, which saves time and money. Colossal-AI also supports applications across areas like medicine, video generation, and chatbots, making it versatile for developers.
https://github.com/hpcaitech/ColossalAI
GitHub - hpcaitech/ColossalAI: Making large AI models cheaper, faster and more accessible
#java #automation #data_orchestration #devops #high_availability #infrastructure_as_code #java #low_code #lowcode #orchestration #pipeline #pipeline_as_code #workflow
Kestra is an open-source platform for managing complex workflows. Workflows are defined in simple YAML and can be triggered on schedules or by real-time events. Kestra supports many plugins, allowing integration with various data sources and tools, which makes it easy to automate tasks like data processing and infrastructure management. The platform is scalable, fault-tolerant, and offers real-time monitoring, making it useful for teams handling large data pipelines and complex workflows.
https://github.com/kestra-io/kestra
GitHub - kestra-io/kestra: Orchestrate everything - from scripts to data, infra, AI, and business - as code, with UI and AI Copilot. Simple. Fast. Scalable.
#rust #ai #change_data_capture #context_engineering #data #data_engineering #data_indexing #data_infrastructure #data_processing #etl #hacktoberfest #help_wanted #indexing #knowledge_graph #llm #pipeline #python #rag #real_time #rust #semantic_search
**CocoIndex** is a fast, open-source Python tool (Rust core) for transforming data into AI-ready formats like vector indexes or knowledge graphs. Define simple data flows in ~100 lines of code using plug-and-play blocks for sources, embeddings, and targets. Install via `pip install cocoindex`, add Postgres, and run. It auto-syncs fresh data with minimal recompute on changes, tracking lineage. **You save time building scalable RAG/semantic search pipelines, avoiding complex ETL and stale data issues for production-ready AI apps.**
https://github.com/cocoindex-io/cocoindex
GitHub - cocoindex-io/cocoindex: Data transformation framework for AI. Ultra performant, with incremental processing. 🌟 Star if you like it!
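One common way to get "minimal recompute on changes" is to fingerprint each source and re-run the transformation only when the fingerprint differs. The sketch below shows that general technique in plain Python; it is not CocoIndex's API or its actual mechanism, and `IncrementalIndex`, `sync`, and the word-count transform are hypothetical stand-ins.

```python
import hashlib


class IncrementalIndex:
    """Toy incremental processor: recompute only sources whose content changed."""

    def __init__(self, transform):
        self.transform = transform  # stand-in for an expensive step such as embedding
        self.fingerprints = {}      # source key -> content hash seen at last sync
        self.outputs = {}           # source key -> transformed result

    def sync(self, sources):
        recomputed = []
        for key, text in sources.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.fingerprints.get(key) != digest:
                self.outputs[key] = self.transform(text)
                self.fingerprints[key] = digest
                recomputed.append(key)
        # Drop results for sources that no longer exist.
        for key in list(self.outputs):
            if key not in sources:
                del self.outputs[key]
                del self.fingerprints[key]
        return recomputed


index = IncrementalIndex(transform=lambda text: len(text.split()))
first = index.sync({"a.md": "hello world", "b.md": "foo"})
second = index.sync({"a.md": "hello world", "b.md": "foo bar"})
```

On the second sync only `b.md` is reprocessed, because the content hash of `a.md` is unchanged.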