Open-source repositories tagged with #data-ingestion, ranked by health score.
SeaTunnel is a multimodal, high-performance, distributed tool for integrating massive volumes of data.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.