#rust #arrow #dataframe #datafusion #distributed #java #jvm #kotlin #kubernetes #scala #spark
https://github.com/ballista-compute/ballista
ballista-compute/ballista: Distributed compute platform implemented in Rust and powered by Apache Arrow.
#scala #ai #apache_spark #azure #big_data #cognitive_services #data_science #databricks #deep_learning #http #lightgbm #machine_learning #microsoft #ml #model_deployment #onnx #opencv #pyspark #spark #synapse
https://github.com/microsoft/SynapseML
microsoft/SynapseML: Simple and Distributed Machine Learning.
#java #airflow #azkaban #dataworks #davinci #etl #flink #governance #griffin #hadoop #hive #hue #kettle #linkis #scriptis #spark #supperset #tableau #visualis #workflow #zeppelin
https://github.com/WeBankFinTech/DataSphereStudio
WeBankFinTech/DataSphereStudio: DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, ...
#hcl #argocd #aws #aws_eks_cluster #bottlerocket #cluster_autoscaler #eks #eks_addons #eks_clusters #eks_fargate #fargate #fluentbit #fluxcd #helm_charts #ingress_controller #kubernetes #spark #spark_operator #terraform #traefik #yunikorn
https://github.com/aws-ia/terraform-aws-eks-blueprints
aws-ia/terraform-aws-eks-blueprints: Configure and deploy complete EKS clusters.
#scala #etl_pipeline #flink #one_stop_solution #spark #streaming #streaming_warehouse #streamx
https://github.com/streamxhub/streamx
apache/incubator-streampark: StreamPark, make stream processing easier! An easy-to-use streaming application development framework and operation platform.
#cplusplus #big_data #clickhouse #distributed_database #lakehouse #olap_database #spark #sql #ytsaurus
https://github.com/ytsaurus/ytsaurus
ytsaurus/ytsaurus: YTsaurus is a scalable and fault-tolerant open-source big data platform.
#jupyter_notebook #ai #aihub #argo #automl #gpt #inference #kubeflow #kubernetes #llmops #mlops #notebook #pipeline #pytorch #spark #vgpu #workflow
https://github.com/tencentmusic/cube-studio
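tencentmusic/cube-studio: Cube Studio is an open-source, cloud-native, one-stop AI platform for machine learning, deep learning, and large models. It covers the full MLOps algorithm pipeline: a compute leasing platform, online notebook development, drag-and-drop pipeline orchestration of task flows, multi-node multi-GPU distributed training, hyperparameter search, inference serving with vGPU virtualization, edge computing, a labeling platform with automated annotation, SFT fine-tuning / reward model / reinforcement learning training for large models such as DeepSeek, multi-node large-model inference with vLLM/Ollama/MindIE, private knowledge bases, an AI model marketplace...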
#java #bigquery #database #dbt #delta_lake #elt #etl #hadoop #hive #hudi #iceberg #lakehouse #olap #query_engine #real_time #redshift #snowflake #spark #sql
Apache Doris is a high-performance, real-time analytical database. It has a simple architecture, supports standard SQL, and is compatible with MySQL tools. It is designed for fast queries under heavy data loads, which suits report analysis, ad-hoc queries, unified data warehouses, and data lake queries. It also supports federated queries across data sources and integrates with tools such as Spark and Flink.
https://github.com/apache/doris
apache/doris: Apache Doris is an easy-to-use, high-performance and unified analytics database.