Apache Airflow is an open-source platform to programmatically author, schedule, and monitor data pipelines as directed acyclic graphs (DAGs). Originally created at Airbnb in 2014 and later donated to the Apache Software Foundation, it has become the de facto standard for batch workflow orchestration, with a rich ecosystem of providers for cloud services, databases, and SaaS systems.
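The DAG model behind Airflow can be sketched with Python's standard library alone: tasks are nodes, dependencies are edges, and the scheduler runs each task only after everything it depends on has finished — a topological order of the graph. A minimal sketch (hypothetical task names; this is not the Airflow API):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

# A scheduler must run every task after its dependencies complete;
# a topological order of the DAG gives exactly that sequence.
order = list(TopologicalSorter(pipeline).static_order())
print(order)  # extract, then transform, then load, then report
```

In real Airflow DAG files the same structure is declared with operators and dependency arrows (e.g. `extract >> transform`), and the scheduler derives the execution order for you.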
Data Pipeline service catalog
Showing Data Pipeline services from 8,000+ services.
Apache Spark is an open-source unified analytics engine for large-scale data processing with built-in modules for SQL, streaming, machine learning, and graph processing. Created at UC Berkeley in 2009, Spark largely supplanted Hadoop MapReduce as the dominant batch-processing engine and underpins commercial platforms including Databricks, AWS EMR, and Google Dataproc.
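The batch pattern Spark generalizes — map a function over records, then aggregate by key — can be illustrated with the standard library (hypothetical data; this is not Spark's API, which would partition the records across a cluster):

```python
from collections import Counter
from itertools import chain

# Hypothetical input records; in Spark these would be an RDD or DataFrame
# distributed across many executors.
lines = ["spark streaming sql", "spark ml", "sql graph"]

# Map step: split each line into words.
words = chain.from_iterable(line.split() for line in lines)

# Reduce-by-key step: count occurrences per word.
counts = Counter(words)
print(counts["spark"])  # 2
```

In PySpark the same computation is expressed with transformations such as `flatMap` and `reduceByKey`, which the engine executes in parallel.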
Managed Apache Airflow platform for building, running, and observing data pipelines.
iPaaS platform for connecting applications and data across hybrid IT environments.
Effortlessly centralize all the data you need so your team can deliver better insights, faster.
Hightouch is the Agentic Marketing Platform powered by the industry-leading Composable CDP: AI marketing that actually knows your brand, customers, and business.
Enterprise cloud data management platform for integration, governance, and quality.
API integration and management platform connecting cloud, on-premise, and SaaS applications.
Polytomic combines ETL and Reverse ETL in a single platform so you can move data to and from your data warehouse without managing brittle pipelines across multiple tools. Built for enterprise scale, Polytomic handles billions of rows, complex models, and fine-grained access controls with reliability and speed.
Talend is now part of Qlik. Seamlessly integrate, transform, and govern data across any environment with Qlik Talend Cloud — built for AI, analytics, and trusted decisions.