GitHub Trends
10.1K subscribers
15.3K links
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#python #analytics #dagster #data_engineering #data_integration #data_orchestrator #data_pipelines #data_science #etl #metadata #mlops #orchestration #scheduler #workflow #workflow_automation

Dagster is an orchestrator for managing and automating data workflows. You define data assets, such as tables or machine learning models, as Python functions; Dagster runs those functions at the right time and keeps the assets up to date. Integrated lineage and observability make it easier to track and manage your data. It covers every stage of data development, from local testing to production, and integrates with other popular data tools. With Dagster you can build reusable components, spot data quality issues early, and scale your pipelines while keeping control over complex data systems.
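To make the "assets as Python functions" idea concrete, here is a toy sketch in plain Python. It does not use Dagster's API: the `asset` decorator, the `ASSETS` registry, and `materialize_all` are hypothetical names invented for this illustration, and the sketch assumes that an asset's upstream dependencies are simply its parameter names.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry for this sketch; not part of Dagster.
ASSETS = {}


def asset(fn):
    """Register fn as an asset; its parameter names are its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


@asset
def raw_orders():
    # Stand-in for a table loaded from a source system.
    return [{"id": 1, "amount": 30}, {"id": 2, "amount": 12}]


@asset
def order_total(raw_orders):
    # Derived asset: depends on raw_orders through its parameter name.
    return sum(row["amount"] for row in raw_orders)


def materialize_all():
    """Run every registered asset once, upstream assets first."""
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in ASSETS.items()
    }
    results = {}
    for name in TopologicalSorter(deps).static_order():
        fn = ASSETS[name]
        results[name] = fn(**{dep: results[dep] for dep in deps[name]})
    return results


results = materialize_all()
print(results["order_total"])  # 42
```

The dependency graph here is what a lineage view would draw: `order_total` is downstream of `raw_orders`, so it is always computed second. A real orchestrator adds scheduling, storage, and observability on top of this ordering step.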

https://github.com/dagster-io/dagster
#java #data_catalog #data_discovery #data_governance #datahub #metadata

DataHub is a free, open-source platform that helps you find and understand your data. It acts as a catalog for all your data, making it easier to discover, manage, and use. Built by Acryl Data and LinkedIn, DataHub supports modern data stacks and offers real-time metadata graphs, integrations with various tools, and a user-friendly interface. You can try it with the hosted demo or follow the quickstart guide to set it up locally. You can also join the community on Slack or attend town hall meetings to stay updated and connected with other users. The result is simpler data management, better collaboration, and improved data visibility across your organization.

https://github.com/datahub-project/datahub
#java #ai_catalog #data_catalog #datalake #federated_query #lakehouse #metadata #metalake #model_catalog #opendatacatalog #skycomputing #stratosphere

Apache Gravitino is a tool for managing metadata across different sources and regions. It is released under the Apache 2.0 license, so you can use, modify, and distribute it freely for any purpose, including commercial projects. Businesses can integrate Gravitino into their systems without worrying about royalties or strict usage restrictions. Users can manage complex data environments while keeping full control over how they use and customize the software.

https://github.com/apache/gravitino