GitHub Trends
10.1K subscribers
15.3K links
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#java #batch #cdc #change_data_capture #data_integration #data_pipeline #distributed #elt #etl #flink #kafka #mysql #paimon #postgresql #real_time #schema_evolution

Flink CDC is a data integration tool that moves and transforms data in real time or in batches. You describe a pipeline in a YAML file that states where the data comes from, where it goes, and how it is transformed along the way. Features include full database synchronization, synchronization of sharded tables, schema evolution, and data transformation. To use it, set up an Apache Flink cluster, download Flink CDC, write a YAML file that defines your data source and sink, and run the job. This lets you keep data in sync across different databases without writing custom integration code.
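
As a rough illustration of the YAML step, a pipeline definition might look like the sketch below. It follows the source / sink / pipeline layout used in the project's quickstart; the connector types (MySQL to Doris), hostnames, ports, credentials, table pattern, and file name are placeholder assumptions, not values taken from this post.

```yaml
# mysql-to-doris.yaml (hypothetical file name)
# Sketch of a Flink CDC pipeline definition; all values are placeholders.
source:
  type: mysql            # assumed source connector
  hostname: localhost
  port: 3306
  username: root
  password: "change-me"
  tables: app_db.\.*     # regex: every table in the assumed database app_db

sink:
  type: doris            # assumed sink connector
  fenodes: 127.0.0.1:8030
  username: root
  password: ""

pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2

# Submitting the job, assuming a running Flink cluster and FLINK_HOME set:
#   bash bin/flink-cdc.sh mysql-to-doris.yaml
```

Which connectors and options are available depends on the Flink CDC release you download, so check the documentation for your version before relying on any of these keys.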

https://github.com/apache/flink-cdc
#java #alldata #cloudeon #cube_studio #datahub #datart #datasophon #datavines #dinky #dolphinscheduler #griffin #hudi #iceberg #openmetadata #paimon #streampark

AllData is a data platform that combines data integration, data quality management, report analytics, and machine learning. Its architecture is customizable and built to integrate open-source projects, with plans to release 30 new open-source project frameworks by the end of 2024. You can choose between the open-source version and the commercial version; the commercial one is described as more stable, with fewer bugs, and adds features such as real-time development, an offline platform, and BI reporting. The platform helps you manage and use data in one place, which can speed up day-to-day work and data analysis.

https://github.com/alldatacenter/alldata